The KIM Technology Watch Report: http://metadaten-twr.org

Archive for the ‘Knowledge Organization standards’ Category

New thesaurus standard ISO 25964 Part 1 published

Monday, October 24th, 2011 by traugottkoch

After three and a half years of intensive work and a long formal process, ISO TC46/SC9 WG8 has finalized work on Part 1 of the new thesaurus standard. It was published by ISO on August 18, 2011.

ISO 25964-1 “Information and documentation — Thesauri and interoperability with other vocabularies — Part 1: Thesauri for information retrieval”, is a standard for development of thesauri and exchange of thesaurus data, covering both mono- and multilingual thesauri.

It has been developed (cf. [1]) as an extensive revision and extension of ISO 2788 and ISO 5964 (1985/6), based on BS 8723 (2005-2008), to include the following main sections:
1 Scope
2 Terms and definitions
3 Symbols, abbreviated terms and other conventions
4 Thesaurus overview and objectives
5 Concepts and their scope in a thesaurus
6 Thesaurus terms
7 Complex concepts
8 The equivalence relationship, in a monolingual context
9 Equivalence across languages
10 Relationships between concepts
11 Facet analysis
12 Presentation and layout
13 Managing thesaurus construction and maintenance
14 Guidelines for thesaurus management software
15 Data model
16 Integration of thesauri with applications
17 Exchange formats
18 Protocols
Annex A (informative) Examples of displays found in published thesauri
Annex B (informative) XML Schema for data exchange
Bibliography
Index

Part 1 can be bought and downloaded from ISO [2] or ordered from national standards bodies such as BSI, DIN, ANSI and AFNOR. It is available as PDF or in print, in English only. Standards distributors may sell it as well, and certain reference libraries include it in their holdings.

The XML schema developed for the exchange of thesaurus data (an annex to the standard) is, however, freely available, cf. [3]. It is underpinned by a UML data model that provides for all the features a thesaurus may include, e.g. terms, relations among terms, concepts, relations among concepts, arrays and concept groups. Some constraints have been incorporated in the schema to ensure the quality of the represented thesauri.
The schema can be accessed without charge or password at the NISO site [4], which also offers an introduction to the schema, an XML test example, HTML documentation files and an errata page for the standard.

A presentation at the recent NKOS 2011 workshop at TPDL in Berlin [5] focused on the XML schema and compared it with the SKOS schema, pointing to necessary SKOS extensions.

Part 2 of the new standard, “Interoperability with other vocabularies”, covers mapping between thesauri and other types of vocabulary, such as classification schemes, file plans for records management, taxonomies, subject heading schemes, ontologies, terminologies, name authority lists, and synonym rings.

As ISO DIS 25964-2 (Draft International Standard), it was submitted in mid-October for circulation among, and comments from, the member bodies of TC46/SC9 (Information and documentation / Identification and description).
Publication of Part 2 can be expected in 2012.
National standards bodies make the text available under differing policies. BSI will most probably make it available to everybody during a commenting period, free of charge and without a password (cf. [6]).

[1] http://metadaten-twr.org/2009/05/27/new-iso-thesaurus-standard-under-development/
[2] http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=53657
[3] http://metadaten-twr.org/2010/02/12/thesaurus-data-model-and-schema-in-isodis-25964-1-available-for-public-comment/
[4] http://www.niso.org/schemas/iso25964/
[5] http://www.comp.glam.ac.uk/pages/research/hypermedia/nkos/nkos2011/presentations/DextreClarke_DeSmedt_ISO25964.pdf
[6] http://metadaten-twr.org/2009/11/06/commenting-and-balloting-on-the-new-draft-international-thesaurus-standard-opened/


New geographic vocabularies available to machines and humans at LC and OCLC

Tuesday, April 12th, 2011 by traugottkoch

Early this year, a couple of new geographic vocabularies became freely available, to be accessed by machines and humans via web services.

The “Library of Congress Authorities and Vocabularies Service” (http://id.loc.gov) made available three new vocabularies, the MARC Code Lists for: a) Geographic Areas, b) Countries (with mappings to equivalent ISO 3166 codes) and c) Languages (mapped to ISO 639-1, 639-2 and 639-5). The same site has offered Library of Congress Subject Headings (LCSH) since 2009.
The main goal is to provide machine access to LC data and selected links to other vocabularies, e.g. in the context of Linked Data. The data is encoded in SKOS/RDF.
Individual concepts are accessible via a web browsing and searching interface for human users or programmatically via content-negotiation.
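
As a rough illustration of programmatic access via content negotiation, the following Python sketch builds an HTTP request that asks for an RDF serialization instead of the HTML page. The concept URI and the helper names are assumptions made for this example, and the media types a server actually honours should be checked against its documentation.

```python
import urllib.request

# Assumption: an illustrative URI only; substitute a real concept URI
# taken from http://id.loc.gov.
EXAMPLE_URI = "http://id.loc.gov/vocabulary/countries/xx"


def build_rdf_request(uri: str,
                      media_type: str = "application/rdf+xml") -> urllib.request.Request:
    """Build a GET request whose Accept header asks for RDF rather than HTML."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


def fetch(uri: str, media_type: str = "application/rdf+xml") -> bytes:
    """Send the request and return the response body (needs network access)."""
    request = build_rdf_request(uri, media_type)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()
```

The same URI thus serves a human-readable page to a browser and machine-readable data to a client that asks for it; only the Accept header differs.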

OCLC offers a demonstration Web Service for FAST geographic headings. The complete FAST (Faceted Application of Subject Terminology) vocabulary contains over 1.6 million authority records, reworking LCSH for easier use and application (http://www.oclc.org/research/activities/fast/).
The RESTful Web Service (http://www.oclc.org/developer/services/MapFAST) takes the chosen geographic coordinates and returns a ranked list of FAST headings near the specified location. Alternative name forms, the type of geographic feature, selected events at the location and other information from the authority records are displayed. Developers can use the Web Service to build their own applications (mobile, geolocation services).
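
The idea of a ranked list of headings near a location can be sketched locally in Python. This is not the OCLC implementation or its API: the record type, the sample headings and their coordinates are invented for illustration, and great-circle (haversine) distance is merely one plausible ranking criterion.

```python
import math
from typing import List, NamedTuple


class Heading(NamedTuple):
    """Hypothetical, simplified stand-in for a geographic authority record."""
    label: str
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def rank_nearby(headings: List[Heading], lat: float, lon: float,
                limit: int = 10) -> List[Heading]:
    """Return up to `limit` headings, nearest to (lat, lon) first."""
    ordered = sorted(headings, key=lambda h: haversine_km(lat, lon, h.lat, h.lon))
    return ordered[:limit]


# Invented sample data; labels and coordinates are approximations.
SAMPLE = [
    Heading("Germany--Munich", 48.137, 11.575),
    Heading("Germany--Berlin", 52.52, 13.405),
    Heading("Germany--Mannheim", 49.4875, 8.466),
]
```

Calling `rank_nearby(SAMPLE, 52.39, 13.06)` with coordinates close to Potsdam places the Berlin heading first.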

The MapFAST demonstrator (http://experimental.worldcat.org/mapfast), using the same Web Service, is a mashup prototype that uses a Google Maps interface to present FAST Geographic authority records and, via links, allows geographic subject searching in WorldCat.org or Google Books. The prototype demonstrates a strength of the subject faceting approach of FAST over coordinated subject headings.

This information is based on press releases and mails from LC and OCLC, respectively.


SKOS: a language for transferring thesauri to the Semantic Web

Wednesday, January 19th, 2011 by Kai Eckert

Author: Kai Eckert, Universitätsbibliothek Mannheim

The Semantic Web, or Linked Data, has the potential to revolutionize the availability of data and knowledge, as well as access to them. Knowledge organization systems such as thesauri, which index and structure data by subject, can make a major contribution here. Unfortunately, many of these systems are still available only in book form or in specialized applications. So how can they be used for the Semantic Web? The Simple Knowledge Organization System (SKOS) offers a way to “translate” knowledge organization systems into a form that can be cited on the Web and linked with other resources. (more…)


Thesaurus data model and schema in ISO/DIS 25964-1 available for public comment

Friday, February 12th, 2010 by traugottkoch

ISO 25964 “Thesauri and interoperability with other vocabularies” is the first thesaurus standard explicitly featuring a data model for monolingual and multilingual thesauri, recommendations for exchange formats and protocols and an XML schema. Part 1: “Thesauri for information retrieval”, officially numbered ISO/DIS 25964-1, has been released as a draft available for public comment until the end of February 2010 (the ballot will end March 26).

Clause 15 describes the data model underpinning an XML schema, presented both in UML (Fig. 15) and in tabular format. The full range of options described in the standard is accommodated. It models the logical structure of thesaurus data, not necessarily the way data is held within a given computer system. The five main classes are Thesaurus, ThesaurusConcept, ThesaurusTerm, ThesaurusArray and Note. How to view the text and comment on it is described in an earlier blog message [1].
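
To give a feel for how the five main classes relate, here is a heavily simplified Python sketch. Only the class names are taken from the draft; every attribute, the `preferred` flag and the `preferred_term` helper are assumptions made for illustration and do not reproduce the attributes or associations defined in Clause 15.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Note:
    text: str
    lang: str = "en"


@dataclass
class ThesaurusTerm:
    lexical_value: str
    lang: str = "en"
    preferred: bool = False  # assumption: a flag instead of separate term classes


@dataclass
class ThesaurusConcept:
    identifier: str
    terms: List[ThesaurusTerm] = field(default_factory=list)
    notes: List[Note] = field(default_factory=list)
    broader: List[str] = field(default_factory=list)  # identifiers of broader concepts

    def preferred_term(self, lang: str = "en") -> Optional[ThesaurusTerm]:
        """Return the preferred term in the given language, if there is one."""
        for term in self.terms:
            if term.preferred and term.lang == lang:
                return term
        return None


@dataclass
class ThesaurusArray:
    identifier: str
    members: List[str] = field(default_factory=list)  # identifiers of sibling concepts


@dataclass
class Thesaurus:
    identifier: str
    concepts: List[ThesaurusConcept] = field(default_factory=list)
    arrays: List[ThesaurusArray] = field(default_factory=list)
```

The sketch keeps concepts and terms apart, so that one concept can carry several terms in several languages, which is what allows a single model to serve both monolingual and multilingual thesauri.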

The XML schema for data exchange, derived from the data model, is included in the standard as an informative appendix (Annex B) and is available free of charge at [2]. The schema may be used for electronically transmitting a whole thesaurus or portions of a thesaurus. Everybody is invited to give feedback on the draft schema by advancing from that page to the comments page for the schema and clicking on the “Add a Comment” link there. A test XML document using the schema is also available. All comments will be open for public viewing. Links to the schema are also provided at the ISO 25964 public project page [3].

One of the predecessors to the new thesaurus standard is BS 8723. A data model and XML schema developed in the context of this standard and documented in Part 5 of the standard (BS 8723-5), dealing with exchange formats and protocols for interoperability, is freely available for comparison at [4].

[1] http://metadaten-twr.org/2009/11/06/commenting-and-balloting-on-the-new-draft-international-thesaurus-standard-opened/
[2] http://www.niso.org/schemas/iso25964/
[3] http://www.niso.org/workrooms/iso25964/
[4] http://schemas.bs8723.org/Model.aspx


Commenting and balloting on the new draft international thesaurus standard opened

Friday, November 6th, 2009 by traugottkoch

ISO/DIS 25964-1 (compare our news item at [1]) has recently been released by ISO TC46/SC 9 as a draft available for public comment, in English only. It is available from ISO as PDF (5 MB) or in a paper version, both at a cost of 98 CHF [2].

Information and documentation. Thesauri and interoperability with other vocabularies, Part 1:
Thesauri for information retrieval
Revises: ISO 2788:1986. Revises: ISO 5964:1985
134 pages

From now until 26 March 2010, the ISO balloting process will take place, inviting comments and voting through the national standards bodies.
The publication, commenting and balloting procedure is, however, different in every country.

UK
In the UK, the DIS will be available for anyone to purchase from BSI at 20 pounds. This may sound like a high price, but it is much less than the price at ISO or the one applied once the standard is eventually approved. BSI will soon also provide a tool for viewing the draft online and submitting comments, whether or not you live in the UK [3]. The BSI national committee will review all of the comments received before formulating a UK response to the DIS.

In some other countries, the DIS might only be submitted to the national TC 46/SC9 committee for comments and voting. For details, you will have to ask your national standards body.

Germany
Please find below the details regarding Germany, received from DIN NABD (translated from German):

The text will be made accessible to the 21 members of NABD 9, and only they can comment directly or take part in the balloting. The Livelink notification e-mail for the balloting, with free access to the text of the DIS, will be sent promptly to the members of NABD 9. The NABD secretariat organizes the German balloting via the Livelink system. From there, the consolidated result is passed on to ISO.

German colleagues who know members of the NABD can pass their comments on to them.
Comments submitted as blog comments here in the TWR blog or sent to me by email (traugott.koch@mpdl.mpg.de) can be forwarded by me to the NABD secretariat as part of my consolidated statement.

[1] http://metadaten-twr.org/2009/05/27/new-iso-thesaurus-standard-under-development/

[2] http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=53657

[3] http://drafts.bsigroup.com/



