Automatic Metadata Generation in an Archaeological Digital Library: Semantic Annotation of Grey Literature

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review


This paper discusses the automatic generation of rich metadata for semantic search of archaeological excavation reports. An extension of the CIDOC CRM for the archaeological domain acts as a core ontology. This enables cross-search between diverse excavation datasets and ‘grey literature’ excavation reports originating from the Archaeology Data Service OASIS library. Rich metadata is automatically extracted from the reports, directed by the CRM, via a three-phase process of semantic enrichment employing the GATE toolkit. The metadata is expressed as XML annotations coupled with the reports and also as RDF metadata, both represented as CRM entities, qualified by SKOS archaeological concepts. A web portal delivers the annotated XML files for visual inspection, while the STAR research demonstrator offers unified search of excavation data and grey literature in terms of the conceptual structure. Initial evaluation results show operational precision and recall rates for three different semantic expansion configurations of the system.
Original language: English
Title of host publication: Computational Linguistics
Editors: Adam Przepiórkowski, Maciej Piasecki, Krzysztof Jassem, Piotr Fuglewicz
Number of pages: 16
ISBN (Electronic): 978-3-642-34399-5
ISBN (Print): 978-3-642-34398-8
Publication status: Published - 2013

Publication series

Name: Studies in Computational Intelligence
ISSN (Print): 1860-949X


  • Automatic Metadata Generation
  • Digital Archaeology
  • Digital Library
  • GATE
  • Knowledge Organization Systems
  • Information Extraction
  • Semantic Annotation
  • Semantic Search
  • SKOS

