New thesaurus standard ISO 25964 Part 1 published

Monday, October 24th, 2011 by traugottkoch

After three and a half years of intensive work and a long formal process, ISO TC46/SC9 WG8 has finalized work on Part 1 of the new thesaurus standard. It was published by ISO on August 18, 2011.

ISO 25964-1 “Information and documentation — Thesauri and interoperability with other vocabularies — Part 1: Thesauri for information retrieval” is a standard for the development of thesauri and the exchange of thesaurus data, covering both monolingual and multilingual thesauri.

It has been developed (cf. [1]) as an extensive revision and extension of ISO 2788 and ISO 5964 (1985/6), based on BS 8723 (2005-2008), to include the following main sections:
1 Scope
2 Terms and definitions
3 Symbols, abbreviated terms and other conventions
4 Thesaurus overview and objectives
5 Concepts and their scope in a thesaurus
6 Thesaurus terms
7 Complex concepts
8 The equivalence relationship, in a monolingual context
9 Equivalence across languages
10 Relationships between concepts
11 Facet analysis
12 Presentation and layout
13 Managing thesaurus construction and maintenance
14 Guidelines for thesaurus management software
15 Data model
16 Integration of thesauri with applications
17 Exchange formats
18 Protocols
Annex A (informative) Examples of displays found in published thesauri
Annex B (informative) XML Schema for data exchange

Part 1 can be bought and downloaded from ISO [2] or ordered from national standards bodies such as BSI, DIN, ANSI and AFNOR. It is available as PDF or in print, in English only. Standards distributors may sell it as well, and certain reference libraries include it in their holdings.

The XML schema developed for the exchange of thesaurus data (an annex to the standard) is, however, freely available, cf. [3]. It is underpinned by a UML data model that provides for all the features a thesaurus may include, e.g. terms, relations among terms, concepts, relations among concepts, arrays, concept groups etc. Some constraints have been incorporated in the schema to ensure the quality of the represented thesauri.
The schema can be accessed without charge or password control at the NISO site [4], which contains the schema and an introduction to it, an XML test example, HTML documentation files and an errata page for the standard.

A presentation at the recent NKOS 2011 workshop at TPDL in Berlin [5] focused on the XML schema and compared it with the SKOS schema, pointing to necessary SKOS extensions.

Part 2 of the new standard, “Interoperability with other vocabularies”, covers mapping between thesauri and other types of vocabulary, such as classification schemes, file plans for records management, taxonomies, subject heading schemes, ontologies, terminologies, name authority lists, and synonym rings.

As ISO DIS 25964-2 (Draft International Standard), it was submitted in mid-October for circulation among the member bodies of TC46/SC9 (Information and documentation / Identification and description) and for comments.
Publication of Part 2 can be expected in 2012.
National standards bodies make the text available, though under differing policies. BSI will most probably make it available to everybody during a commenting period, free of charge and without a password (cf. [6]).


Thesaurus data model and schema in ISO/DIS 25964-1 available for public comment

Friday, February 12th, 2010 by traugottkoch

ISO 25964 “Thesauri and interoperability with other vocabularies” is the first thesaurus standard explicitly featuring a data model for monolingual and multilingual thesauri, recommendations for exchange formats and protocols and an XML schema. Part 1: “Thesauri for information retrieval”, officially numbered ISO/DIS 25964-1, has been released as a draft available for public comment until the end of February 2010 (the ballot will end March 26).

Clause 15 describes the data model underpinning the XML schema, presented both in UML (Fig. 15) and in tabular format. The full range of options described in the standard is accommodated. The model captures the logical structure of thesaurus data, not necessarily the way the data is held within a given computer system. The five main classes are Thesaurus, ThesaurusConcept, ThesaurusTerm, ThesaurusArray and Note. How to view the text and comment on it has been described in an earlier blog message [1].
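To make the five main classes more tangible, here is a minimal Python sketch. Only the class names come from the data model; every attribute and method name (lexical_value, broader_ids, member_ids, narrower_of and so on) is an assumption made for illustration and is not taken from the standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Note:
    text: str
    lang: str = "en"


@dataclass
class ThesaurusTerm:
    lexical_value: str  # hypothetical attribute name
    lang: str = "en"


@dataclass
class ThesaurusConcept:
    identifier: str
    preferred_terms: List[ThesaurusTerm] = field(default_factory=list)
    non_preferred_terms: List[ThesaurusTerm] = field(default_factory=list)
    broader_ids: List[str] = field(default_factory=list)  # links to broader concepts
    notes: List[Note] = field(default_factory=list)


@dataclass
class ThesaurusArray:
    identifier: str
    member_ids: List[str] = field(default_factory=list)  # grouped sibling concepts


@dataclass
class Thesaurus:
    identifier: str
    concepts: Dict[str, ThesaurusConcept] = field(default_factory=dict)
    arrays: List[ThesaurusArray] = field(default_factory=list)

    def add_concept(self, concept: ThesaurusConcept) -> None:
        self.concepts[concept.identifier] = concept

    def narrower_of(self, concept_id: str) -> List[str]:
        # Derive narrower concepts by inverting the stored broader links.
        return sorted(
            c.identifier
            for c in self.concepts.values()
            if concept_id in c.broader_ids
        )


thesaurus = Thesaurus("demo")
thesaurus.add_concept(ThesaurusConcept("c1", [ThesaurusTerm("vehicles")]))
thesaurus.add_concept(
    ThesaurusConcept(
        "c2",
        preferred_terms=[ThesaurusTerm("cars"), ThesaurusTerm("Autos", "de")],
        non_preferred_terms=[ThesaurusTerm("automobiles")],
        broader_ids=["c1"],
        notes=[Note("Use for passenger road vehicles.")],
    )
)
thesaurus.arrays.append(ThesaurusArray("a1", ["c2"]))
print(thesaurus.narrower_of("c1"))
```

The sketch keeps terms separate from concepts, so a multilingual thesaurus simply attaches terms in several languages to the same concept. As the clause itself stresses, this is a logical view and says nothing about how a real system would store the data.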

The XML schema for data exchange, derived from the data model, is included in the standard as an informative appendix (Annex B), and is available free of charge at [2]. The schema may be used for electronically transmitting a whole thesaurus or portions of a thesaurus. Everybody is invited to give feedback on the draft schema by advancing from that page to the comments page for the schema and clicking on the “Add a Comment” link there. A test XML document using the schema is also available. All comments will be open for public viewing. Links to the schema are also provided at the ISO 25964 public project page [3].
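The idea of transmitting either a whole thesaurus or only a portion of it can be sketched with Python's standard library. This is an illustration only: the element and attribute names (Thesaurus, Concept, PreferredTerm, id, lang) and the helper export_concepts are hypothetical and do not reproduce the ISO 25964 schema, which should be consulted for the real structure.

```python
import xml.etree.ElementTree as ET


def export_concepts(thesaurus_id, concepts, only_ids=None):
    """Serialize all concepts, or only those whose ids are in only_ids."""
    # Element names below are placeholders, not the names used in Annex B.
    root = ET.Element("Thesaurus", {"id": thesaurus_id})
    for concept_id, terms in concepts.items():
        if only_ids is not None and concept_id not in only_ids:
            continue
        concept_el = ET.SubElement(root, "Concept", {"id": concept_id})
        for lang, value in terms.items():
            term_el = ET.SubElement(concept_el, "PreferredTerm", {"lang": lang})
            term_el.text = value
    return ET.tostring(root, encoding="unicode")


concepts = {
    "c1": {"en": "vehicles", "de": "Fahrzeuge"},
    "c2": {"en": "cars", "de": "Autos"},
}
whole = export_concepts("demo", concepts)
portion = export_concepts("demo", concepts, only_ids={"c2"})
print(portion)
```

A receiving system would validate such a document against the published schema before importing it; that step is omitted here because it needs a schema-aware XML library.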

One of the predecessors to the new thesaurus standard is BS 8723. A data model and XML schema developed in the context of that standard and documented in its Part 5 (BS 8723-5), which deals with exchange formats and protocols for interoperability, are freely available for comparison at [4].


Commenting and balloting on the new draft international thesaurus standard opened

Friday, November 6th, 2009 by traugottkoch

ISO/DIS 25964-1 (compare our news item at [1]) has recently been released by ISO TC46/SC9 as a draft available for public comment, in English only. It is available from ISO as a PDF (5 MB) or on paper, both at a cost of 98 CHF [2].

Information and documentation. Thesauri and interoperability with other vocabularies, Part 1:
Thesauri for information retrieval
Revises: ISO 2788:1986. Revises: ISO 5964:1985
134 pages

From now until 26 March 2010, the ISO balloting process will take place, inviting comments and voting through the national standards bodies.
The publication, commenting and balloting procedure differs from country to country, however.

In the UK, the DIS will be available for anyone to purchase from BSI at 20 pounds. This may sound like a high price, but it is much less than the price at ISO or the price that will apply once the standard is approved. BSI will soon also provide a tool for viewing the draft online and submitting comments, whether or not you live in the UK [3]. The BSI national committee will review all of the comments received before formulating a UK response to the DIS.

In some other countries, the DIS might only be submitted to the national TC 46/SC9 committee for comments and voting. For details, you will have to ask your national standards body.

Please find below the details regarding Germany, as received from DIN NABD (translated from German):

The text will be made available to the 21 members of NABD 9, and only they can comment directly or take part in the balloting. The Livelink notification e-mail for the balloting, with free access to the text of the DIS, will be sent promptly to the members of NABD 9. The NABD secretariat organizes the German balloting via the Livelink system. From there, the consolidated result is passed on to ISO.

German colleagues who know members of NABD can pass their comments on to those persons.
Comments posted as blog comments here on the TWR blog, or sent to me by email, can be forwarded by me to the NABD secretariat as part of my consolidated statement.




Conceptual model for Subject Authority data — FRSAD

Tuesday, September 8th, 2009 by traugottkoch

In 2005, IFLA started “Functional Requirements for Subject Authority Records (FRSAR)” as a working group in the FRBR (Functional Requirements for Bibliographic Records) family. The group was to focus on subject authority data (information about subjects from authority files) and its use in a wide range of applications, and on the semantics, structures and interoperability issues of such data, independent of any implementation or specific context.

In June this year, the FRSAR working group published a second draft of a “Conceptual Model” [1]. It focuses on general functional requirements and the potential of subject authority data for broad sharing and use.

This draft was open for comments and review until the end of July, so that the Working Group could discuss it during the IFLA 2009 conference in mid-August in Milan. Further comments can be sent to

The core of the model on the aboutness of works is the following:

work <<has as subject/is subject of>> thema <<has appellation/is appellation of>> nomen

The relationships between the three entities are many-to-many relationships and bi-directional. However, in a given controlled vocabulary and within a domain, a nomen should be an appellation of only one thema.
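The cardinalities described above can be sketched in Python. This is an illustrative reading of the model, not part of the FRSAD draft: the class names SubjectVocabulary and SubjectIndex, their methods and the sample identifiers are all hypothetical. Works and themas are linked many-to-many, a thema may have several nomens, and within one vocabulary a nomen is only allowed to name a single thema.

```python
from collections import defaultdict


class SubjectVocabulary:
    """One controlled vocabulary: each nomen may name only one thema."""

    def __init__(self, name):
        self.name = name
        self._thema_of = {}                 # nomen -> thema
        self._nomens_of = defaultdict(set)  # thema -> set of nomens

    def add_appellation(self, thema, nomen):
        existing = self._thema_of.get(nomen)
        if existing is not None and existing != thema:
            raise ValueError(
                f"{nomen!r} already names {existing!r} in {self.name}"
            )
        self._thema_of[nomen] = thema
        self._nomens_of[thema].add(nomen)

    def thema_for(self, nomen):
        return self._thema_of[nomen]

    def nomens_for(self, thema):
        return set(self._nomens_of.get(thema, set()))


class SubjectIndex:
    """Many-to-many, bi-directional link between works and themas."""

    def __init__(self):
        self._themas_of = defaultdict(set)
        self._works_about = defaultdict(set)

    def assign(self, work, thema):
        self._themas_of[work].add(thema)
        self._works_about[thema].add(work)

    def themas_of(self, work):
        return set(self._themas_of.get(work, set()))

    def works_about(self, thema):
        return set(self._works_about.get(thema, set()))


vocab = SubjectVocabulary("demo-vocabulary")
vocab.add_appellation("thema:mercury-planet", "Mercury (planet)")
vocab.add_appellation("thema:mercury-planet", "Merkur (Planet)")
vocab.add_appellation("thema:mercury-element", "Mercury (element)")

index = SubjectIndex()
index.assign("work:A", "thema:mercury-planet")
index.assign("work:A", "thema:mercury-element")
index.assign("work:B", "thema:mercury-planet")

try:
    # Reusing a nomen for a second thema in the same vocabulary is rejected.
    vocab.add_appellation("thema:mercury-element", "Mercury (planet)")
    conflict_rejected = False
except ValueError:
    conflict_rejected = True
```

Across different vocabularies the same nomen may still name different themas, which is why the uniqueness check lives on the vocabulary object and not on a global table.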

“Thema” is defined as “any entity that can be subject of a work”. Thema includes any of the entities originally defined by FRBR: work, expression, manifestation, item; person, corporate body; concept, object, event, place; and all other subjects a “work” might have.
The entity “Nomen” and the relationships ‘Thema has appellation Nomen / Nomen is appellation of Thema’ are new proposals of the working group. “Nomen” is any sign or sequence of signs (alphanumeric characters, symbols etc.) by which a “Thema” is known, referred to, or addressed.

Two co-chairs of the working group, Marcia Zeng and Maja Zumer, compare the FRSAR model with related models (the new thesaurus standards BS 8723 and ISO 25964-1; SKOS, OWL and the DCMI Abstract Model) in a paper presented at IFLA 2009 [2]. They conclude that these models match the FRSAR conceptual model rather well, and thus that subject authority data modeled according to FRSAD and encoded in SKOS or OWL will have a high potential for interoperability and will contribute to linked data and the semantic web.

[1] IFLA (2009). Functional Requirements for Subject Authority Data (FRSAD). A Conceptual Model. IFLA Working Group on FRSAR. 2nd draft 2009-06-10.

[2] Zeng, Marcia and Zumer, Maja (2009). Introducing FRSAD and mapping it with SKOS and other models. 75th IFLA General Conference, Papers, 23-27 August 2009, Milan, Italy. Available in 5 languages.

