Connecting archaeological data and grey literature via semantic cross search

Research output: Contribution to journal › Article › peer-review

Abstract

The advantages of making research data widely available have been increasingly recognised in recent years. In archaeology, this has been difficult to achieve at a detailed data level. Repositories of curated excavation datasets are emerging, such as the Archaeology Data Service in the UK, tDAR in the USA, and Arachne in Germany, but cross search across different organisational data structures remains difficult. Increasingly, such digital libraries are including the 'grey literature' of excavation reports that have not been formally published. However, these reports are not meaningfully connected with other online data. Significant problems impede achieving the full potential of exposing excavation datasets for wider reuse, even supposing that cross search is technically possible. Problems of semantic interoperability include differing terminology and database schemas. Different archaeological teams may use different terms to mean the same thing, and the same term may be used for different things. In addition, database structure varies, and similar entities may have different names and field structures, making it difficult to search like with like. This paper discusses results from STAR, an AHRC-funded research project that addressed the issue of semantic interoperability. STAR has demonstrated that semantic interoperability can be achieved by mapping and extracting different datasets, together with key concepts from OASIS reports, to a core CRM-EH ontology via a central RDF-based triple store.
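As a rough illustration of the interoperability problem described in the abstract, the sketch below maps two small datasets with different field names and terms onto shared concept labels, stores the result as simple subject-predicate-object triples, and runs one cross search over both. It is a minimal sketch under stated assumptions, not the STAR implementation: the records, field names, mapping tables, the `has_type` predicate, and the `concept:` labels are all hypothetical and are not taken from CRM-EH or from any project dataset.

```python
# Illustrative sketch only. All field names, terms, and concept labels below are
# hypothetical; they are not taken from the STAR project or the CRM-EH ontology.

# Two datasets that record the same kind of entity with different schemas and terms.
dataset_a = [
    {"id": "A1", "feature_type": "posthole"},
    {"id": "A2", "feature_type": "ditch"},
]
dataset_b = [
    {"ref": "B7", "ctx_kind": "post hole"},
    {"ref": "B8", "ctx_kind": "pit"},
]

# Schema mapping: which local field holds the identifier and which holds the type.
SCHEMA_MAP = {
    "a": {"id_field": "id", "type_field": "feature_type"},
    "b": {"id_field": "ref", "type_field": "ctx_kind"},
}

# Terminology mapping: different local terms resolve to one shared concept label.
TERM_MAP = {
    "posthole": "concept:posthole",
    "post hole": "concept:posthole",
    "ditch": "concept:ditch",
    "pit": "concept:pit",
}


def to_triples(source, records):
    """Convert one dataset's records into (subject, predicate, object) triples."""
    mapping = SCHEMA_MAP[source]
    triples = []
    for record in records:
        subject = f"{source}:{record[mapping['id_field']]}"
        local_term = record[mapping["type_field"]].strip().lower()
        triples.append((subject, "has_type", TERM_MAP[local_term]))
    return triples


def cross_search(store, concept):
    """Return every subject, from any dataset, typed with the given concept."""
    return sorted(s for s, p, o in store if p == "has_type" and o == concept)


# A plain list stands in for the central triple store.
store = to_triples("a", dataset_a) + to_triples("b", dataset_b)
matches = cross_search(store, "concept:posthole")
print(matches)
```

Here a single query for the shared posthole concept returns records from both datasets, even though one records "posthole" in a `feature_type` field and the other records "post hole" in a `ctx_kind` field. A real system would use an RDF store and ontology classes in place of the list and string labels.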
Original language: English
Journal: Internet Archaeology
Issue number: 30
Early online date: 1 Jul 2011
Publication status: Published - 1 Jul 2011

Keywords

  • semantic interoperability
  • natural language processing
  • knowledge organization systems
