The lack of standardised access and interchange formats for knowledge organisation systems (KOS) is a barrier to their interoperability and wider use in automated Web and retrieval applications. Programmatic access to thesaurus (and other types of KOS) resources requires a commonly agreed distributed service protocol, building on lower-level standards such as Web services. This paper reflects on our experience of building a Web demonstrator of some novel thesaurus browsing and search tools, developed as part of a research project on the role of the thesaurus in controlled vocabulary retrieval applications. The Web system provides dynamically generated interface components for finding terms, browsing the thesaurus, building a query and returning ranked results from a collections database using term expansion. We designed a custom application programming interface of lower-level thesaurus functions to support the various user interface requirements of the demonstrator. Based on our experience of developing the system, we review the literature on protocols for distributed access to thesauri and offer suggestions for the further development of thesaurus service protocols. The FACET project and its semantic expansion, ranked result and multi-concept matching capabilities are briefly outlined. We provide a detailed description of key elements of the Web demonstrator and their rationale, together with a discussion of the data elements required by the different interface components. Existing proposals for thesaurus service protocols (Ceres, Zthes and ADL) are reviewed. The paper concludes by reflecting on lessons from constructing the Web demonstrator and the implications for separating the service protocol from the interface. We argue that basing distributed protocol services on the atomic elements of thesaurus data structures and standard relationships is not necessarily the best approach.
Client interfaces with components similar to those of the Web demonstrator require a service-oriented approach, with base services that group primitive KOS data elements (via their relationships) into composites. This leads to a proposal for a novel, unified semantic expansion service, which can be used both for specifying composite display formats and for query expansion. Thesaurus (KOS) representations and service protocols for retrieval are closely linked. A service protocol should be explicitly expressed in terms of a well-defined but extensible set of KOS data elements and relationships.
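The idea of a single expansion service serving both display and query expansion can be illustrated with a minimal sketch. It is not the FACET implementation or any of the reviewed protocols: the `Concept` class, the `expand` function, the relationship costs, the cost budget and the sample terms are all assumptions introduced here for illustration. The sketch treats the thesaurus as a graph and collects every concept reachable from a start term within a traversal budget, with the set of permitted relationship types acting as the parameter that distinguishes one use from another.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical traversal costs per relationship type (BT = broader term,
# NT = narrower term, RT = related term). These weights are assumptions.
DEFAULT_COSTS = {"NT": 1.0, "BT": 1.0, "RT": 2.0}


@dataclass
class Concept:
    term: str
    # Maps a relationship type to the list of target terms.
    relations: dict = field(default_factory=dict)


def expand(thesaurus, start, budget, costs=DEFAULT_COSTS):
    """Return {term: cheapest traversal cost} for concepts within the budget.

    Relationship types missing from `costs` are not traversed, so the same
    function can serve a display component (for example, narrower terms only)
    and query expansion (all relationship types).
    """
    best = {start: 0.0}
    queue = deque([start])
    while queue:
        term = queue.popleft()
        for rel, targets in thesaurus[term].relations.items():
            step = costs.get(rel)
            if step is None:
                continue  # relationship type excluded from this expansion
            for target in targets:
                if target not in thesaurus:
                    continue  # ignore dangling references
                total = best[term] + step
                if total <= budget and total < best.get(target, float("inf")):
                    best[target] = total
                    queue.append(target)  # revisit with the cheaper cost
    return best


# Illustrative sample data, not taken from any real thesaurus.
thesaurus = {
    "furniture": Concept("furniture", {"NT": ["chairs", "tables"]}),
    "chairs": Concept("chairs", {"BT": ["furniture"], "NT": ["armchairs"], "RT": ["stools"]}),
    "tables": Concept("tables", {"BT": ["furniture"]}),
    "armchairs": Concept("armchairs", {"BT": ["chairs"]}),
    "stools": Concept("stools", {"RT": ["chairs"]}),
}

# Query expansion: follow all relationship types up to a cost of 2.
query_terms = expand(thesaurus, "chairs", 2.0)

# Composite display: only the immediate narrower terms of the concept.
narrower_view = expand(thesaurus, "chairs", 1.0, costs={"NT": 1.0})
```

In this sketch the returned costs could feed a ranking step for query expansion, while the restricted call returns the composite a hierarchy browser would need in a single request rather than one request per atomic relationship.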
Journal of Digital Information
Published - 2004