Abstract
This paper discusses the automatic generation of rich metadata for semantic search of archaeological excavation reports. An extension of the CIDOC CRM for the archaeological domain acts as a core ontology. This enables cross-search between diverse excavation datasets and ‘grey literature’ excavation reports originating from the Archaeology Data Service OASIS library. Rich metadata is automatically extracted from the reports, directed by the CRM, via a three-phase process of semantic enrichment employing the GATE toolkit. The metadata is expressed as XML annotations coupled with the reports and also as RDF metadata, both represented as CRM entities qualified by SKOS archaeological concepts. A web portal delivers the annotated XML files for visual inspection, while the STAR research demonstrator offers unified search of excavation data and grey literature in terms of the conceptual structure. Initial evaluation results show operational precision and recall rates for three different semantic expansion configurations of the system.
Original language | English |
---|---|
Title of host publication | Computational Linguistics |
Editors | Adam Przepiórkowski, Maciej Piasecki, Krzysztof Jassem, Piotr Fuglewicz |
Publisher | Springer |
Pages | 187-202 |
Number of pages | 16 |
ISBN (Electronic) | 978-3-642-34399-5 |
ISBN (Print) | 978-3-642-34398-8 |
Publication status | Published - 2013 |
Publication series
Name | Studies in Computational Intelligence |
---|---|
Publisher | Springer |
Volume | 458 |
ISSN (Print) | 1860-949X |
Keywords
- Automatic Metadata Generation
- CIDOC CRM
- Digital Archaeology
- Digital Library
- GATE
- Knowledge Organization Systems
- Information Extraction
- Semantic Annotation
- Semantic Search
- SKOS