ARK (Archival Resource Key): a Persistent Identifier Solution
Author: John A. Kunze, University of California, email@example.com
An Archival Resource Key (ARK) is a URL that enables persistent, long-term digital reference to information objects of any kind. The California Digital Library (CDL) uses the “noid” software to generate ARKs. An ARK consists of a unique, immutable sequence of characters that follows the label “ark:” and includes the naming organization, identified by a NAAN. The protocol and hostname of a URL may optionally precede the label.
An ARK [1] is a URL created to support persistent, long-term access to information objects. ARKs can identify objects of any type: digital documents, databases, images, software, and websites, as well as physical objects (books, bones, statues, etc.) and intangible objects (chemicals, diseases, vocabulary terms, performances).
ARKs and other persistent identifiers are necessary and useful because both the protocols used to access objects (such as http and ftp) and the sites that host the objects are subject to change. An ARK contains parts that are impervious to such changes and parts that are flexible enough to support changing user service expectations around a stable object core.
An ARK is represented by a sequence of characters that contains the label “ark:”, optionally preceded by the protocol name (“http://”) and hostname that begin every URL. That first part of the URL, the “Name Mapping Authority” (NMA), is mutable and replaceable, as neither the web server itself nor the current web protocols are expected to last longer than the identified objects. It is possible to use an NMA hostname that is longer-lived than that of your own organization (as has been done effectively with hosts such as http://n2t.info/ and http://doi.org/).
The immutable, globally unique identifier follows the “ark:” label. This includes a “Name Assigning Authority Number” (NAAN) identifying the naming organization, followed by the name that it assigns to the object. Here is a diagrammed example:
    http://example.org/ark:/13030/654xz321/s3/f8.05v.tiff
    \________________/ \__/ \___/\_______/\_____________/
      (replaceable)      |    |      |       Qualifier
            |    ARK-Label    |      |    (NMA-supported)
            |                 |      |
Name Mapping Authority (NMA)  |  Name (NAA-assigned)
                              |
            Name Assigning Authority Number (NAAN)
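The anatomy in the diagram can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names `Ark`, `parse_ark`, and `with_nma` are hypothetical and not part of noid or the ARK specification, and everything after the assigned name is treated as the qualifier, as in the diagram.

```python
from typing import NamedTuple, Optional


class Ark(NamedTuple):
    """Parts of an ARK, named after the labels in the diagram (illustrative)."""
    nma: Optional[str]  # replaceable protocol + hostname, or None if absent
    naan: str           # Name Assigning Authority Number
    name: str           # name assigned by the naming organization
    qualifier: str      # NMA-supported suffix, possibly empty

    @property
    def core(self) -> str:
        # The immutable part: label, NAAN, and assigned name.
        return f"ark:/{self.naan}/{self.name}"


def parse_ark(text: str) -> Ark:
    """Split an ARK string at the 'ark:/' label (simplified, not a validator)."""
    nma, label, rest = text.partition("ark:/")
    if not label:
        raise ValueError(f"no 'ark:/' label in {text!r}")
    parts = rest.split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"missing NAAN or name in {text!r}")
    qualifier = "/" + parts[2] if len(parts) == 3 else ""
    return Ark(nma or None, parts[0], parts[1], qualifier)


def with_nma(ark: Ark, new_nma: str) -> str:
    """Re-host an ARK under a different Name Mapping Authority."""
    return f"{new_nma.rstrip('/')}/{ark.core}{ark.qualifier}"


ark = parse_ark("http://example.org/ark:/13030/654xz321/s3/f8.05v.tiff")
rehosted = with_nma(ark, "http://n2t.info/")
```

Here `ark.core` stays the same no matter which hostname serves the object, which is the property that makes the NMA replaceable.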
The NAAN used above, 13030, represents the California Digital Library (CDL). A sampling of other institutions registered for ARK assignment includes:
12025 US National Library of Medicine
13030 California Digital Library
13960 Internet Archive
27927 Portico/Ithaka Electronic-Archiving Initiative
12148 National Library of France
78319 Google
64269 Digital Curation Centre
To generate ARKs, the California Digital Library (CDL) uses the open-source “noid” (nice opaque identifiers, rhymes with “employed”) software [2]. The noid software can also serve as an institution’s “identifier resolver”. Please send email to firstname.lastname@example.org if you are interested in generating ARKs.
An ARK provides extra services above and beyond those of an ordinary URL. Instead of connecting to one thing, an ARK should connect to three things:
- the object itself,
- a brief metadata record if you append a single question mark to the ARK, and
- a maintenance commitment from the current server when you append two question marks.
In a web browser, for example, if you enter an ARK followed by a single question mark (here, the ARK that appears in the where: field below), it returns a brief machine- and eye-readable metadata record, such as:
erc:
who: (:unav) unavailable
what: Truckee River, below Truckee Station, looking towards Eastern Summit. -- Photographer's number: 222 -- Photographer's series: Central Pacific Railroad, California.
when: (:unav) unavailable
where: http://ark.cdlib.org/ark:/13030/tf5p30086k
It is a side-benefit of ARKs that an object’s metadata doesn’t need an identifier different from that for the object, which cuts in half the number of identifiers that need to be generated and managed.
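The three services and the record shown above can be sketched in Python as follows. This is a minimal illustration, not a resolver: the names `inflect` and `parse_erc` are hypothetical, and the parser assumes a simple layout of one `label: value` pair per line with indented continuation lines, which may not cover every record a server returns.

```python
from typing import Dict


def inflect(ark_url: str, level: int) -> str:
    """Append 0, 1, or 2 question marks: object, metadata, or commitment."""
    if level not in (0, 1, 2):
        raise ValueError("level must be 0, 1, or 2")
    return ark_url.rstrip("?") + "?" * level


def parse_erc(record: str) -> Dict[str, str]:
    """Read 'label: value' lines into a dict (assumed layout, see lead-in)."""
    fields: Dict[str, str] = {}
    last_key = None
    for line in record.splitlines():
        stripped = line.strip()
        if not stripped or stripped == "erc:":
            continue  # skip blank lines and the record header
        if line[0].isspace() and last_key is not None:
            fields[last_key] += " " + stripped  # indented continuation line
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'label: value' line
        last_key = key.strip()
        fields[last_key] = value.strip()
    return fields


RECORD = """erc:
who: (:unav) unavailable
what: Truckee River, below Truckee Station, looking towards Eastern Summit.
  -- Photographer's number: 222
  -- Photographer's series: Central Pacific Railroad, California.
when: (:unav) unavailable
where: http://ark.cdlib.org/ark:/13030/tf5p30086k
"""

fields = parse_erc(RECORD)
metadata_url = inflect(fields["where"], 1)    # brief metadata record
commitment_url = inflect(fields["where"], 2)  # maintenance commitment
```

Because the same identifier serves the object and its metadata, the sketch needs only one string per object; the question marks select the service.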
- [1] The complete ARK specification: http://www.cdlib.org/inside/diglib/ark/arkspec.pdf
- [2] The noid software documentation: http://www.cdlib.org/inside/diglib/ark/noid.pdf