Archaeological reports contain a great deal of information that conveys facts and findings in different ways. This kind of information is highly relevant to the research and analysis of archaeological evidence, but at the same time it can hinder the accurate indexing of documents with respect to positive assertions. The paper presents a method for adapting the biomedicine-oriented negation algorithm NegEx to the context of archaeology and discusses the evaluation results of the modified negation detection module. The performance of the module is compared against a "Gold Standard", and the evaluation results are encouraging, delivering overall scores of 89% Precision, 80% Recall and 83% F-Measure. The paper addresses limitations and future improvements of the current work and highlights the need for ontological modelling to accommodate negative assertions. It concludes that adaptation of the NegEx algorithm to the archaeology domain is feasible and that rule-based information extraction techniques are capable of identifying a large portion of negated phrases in archaeological grey literature.

Original language: English
Title of host publication: Metadata and Semantics Research
Subtitle of host publication: 7th Research Conference, MTSR 2013, Thessaloniki, Greece, November 19-22, 2013. Proceedings
Editors: Emmanouel Garoufallou, Jane Greenberg
Publisher: Springer Verlag
Pages: 188-200
Number of pages: 13
ISBN (Electronic): 978-3-319-03437-9
ISBN (Print): 978-3-319-03436-2
State: Published - 2013
Event: 7th Research Conference on Metadata and Semantics Research (MTSR) - Thessaloniki, Greece
Duration: 19 Nov 2013 - 22 Nov 2013

Publication series

Name: Communications in Computer and Information Science
Publisher: SPRINGER-VERLAG BERLIN
Volume: 390
ISSN (Print): 1865-0929

Conference

Conference: 7th Research Conference on Metadata and Semantics Research (MTSR)
Country: Greece
City: Thessaloniki
Period: 19/11/13 - 22/11/13

Research areas

• Negation Detection, Semantic Technologies, Digital Humanities, CIDOC-CRM, Semantic Annotation, Natural Language Processing

ID: 496698