Query expansion via conceptual distance in thesaurus indexed collections

Research output: Contribution to journal › Article › peer-review


Abstract

Purpose
– The purpose of this paper is to explore query expansion via conceptual distance in thesaurus‐indexed collections.

Design/methodology/approach
– An extract of the National Museum of Science and Industry's collections database, indexed with the Getty Art and Architecture Thesaurus (AAT), was the dataset for the research. The system architecture and algorithms for semantic closeness and the matching function are outlined. Standalone and web interfaces are described and formative qualitative user studies are discussed. One user session is discussed in detail, together with a scenario based on a related public inquiry. Findings are set in the context of the literature on thesaurus‐based query expansion. This paper discusses the potential of query expansion techniques using the semantic relationships in a faceted thesaurus.

Findings
– Thesaurus‐assisted retrieval systems have potential for multi‐concept descriptors, permitting very precise queries and indexing. However, indexer and searcher may differ in terminology judgments and there may not be any exactly matching results. The integration of semantic closeness in the matching function permits ranked results for multi‐concept queries in thesaurus‐indexed applications. An in‐memory representation of the thesaurus semantic network allows a combination of automatic and interactive control of expansion, as well as control of expansion on individual query terms.
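
The idea of ranking by semantic closeness can be illustrated with a minimal sketch. It is not the paper's algorithm: the relationship costs, the toy thesaurus fragment, the object records, and the names `expand`, `match` and `rank` are all assumptions made for illustration. Each query term is expanded over an in‐memory graph up to a per‐term cost budget, and an item scores the average, over query terms, of its closest index term.

```python
import heapq

# Hypothetical traversal costs per relationship type (not taken from the paper).
COST = {"BT": 1.0, "NT": 1.0, "RT": 1.5}

# Toy thesaurus fragment, illustrative only: concept -> [(relationship, neighbour)].
THESAURUS = {
    "timepieces": [("NT", "clocks"), ("NT", "watches")],
    "clocks": [("BT", "timepieces"), ("NT", "mantel clocks"), ("RT", "clockmaking")],
    "watches": [("BT", "timepieces")],
    "mantel clocks": [("BT", "clocks")],
    "clockmaking": [("RT", "clocks")],
    "copper alloy": [("NT", "brass"), ("NT", "bronze")],
    "brass": [("BT", "copper alloy")],
    "bronze": [("BT", "copper alloy")],
}

# Hypothetical indexed objects: identifier -> index terms.
COLLECTION = {
    "obj1": ["mantel clocks", "brass"],
    "obj2": ["clocks", "bronze"],
    "obj3": ["watches"],
}


def expand(term, budget):
    """Map each concept within `budget` traversal cost of `term` to a closeness in (0, 1]."""
    best = {term: 0.0}
    heap = [(0.0, term)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        for rel, nxt in THESAURUS.get(node, []):
            new_cost = cost + COST[rel]
            if new_cost <= budget and new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    # Linear decay: the term itself scores 1.0, concepts at the budget limit stay above 0.
    return {concept: 1.0 - cost / (budget + 1.0) for concept, cost in best.items()}


def match(query, index_terms):
    """Average, over query terms, of the best closeness to any index term.

    `query` maps each query term to its own expansion budget (0 means exact match only).
    """
    total = 0.0
    for term, budget in query.items():
        near = expand(term, budget)
        total += max((near.get(t, 0.0) for t in index_terms), default=0.0)
    return total / len(query)


def rank(query, collection):
    """Return (score, item) pairs with a non-zero score, best first."""
    scored = [(match(query, terms), item) for item, terms in collection.items()]
    return sorted((p for p in scored if p[0] > 0), key=lambda p: p[0], reverse=True)


if __name__ == "__main__":
    for score, item in rank({"mantel clocks": 3.0, "brass": 3.0}, COLLECTION):
        print(f"{item}: {score:.3f}")
```

In this sketch, setting one term's budget to 0 switches expansion off for that term alone, which is one simple way to picture control of expansion on individual query terms.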

Originality/value
– The application of semantic expansion to browsing may be useful in interface options where thesaurus structure is hidden.
Original language: English
Pages (from-to): 509-533
Number of pages: 25
Journal: Journal of Documentation
Volume: 62
Issue number: 4
Publication status: Published - 2006

Keywords

  • controlled language construction
  • Controlled languages
  • semantics
  • knowledge management systems
