The paper discusses the application of Natural Language Processing (NLP) techniques to classical art texts, with the aim of semantic annotation through rule-based Information Extraction (IE) combined with ontological and domain vocabulary input. CASIE (Classical Art Semantics Information Extraction) is a pilot collaborative project between the Hypermedia Research Unit (University of South Wales) and the Beazley Archive (Oxford University). It aims to automatically extract information about cultural objects from scholarly texts on classical art and to represent this information in terms of the ISO metadata standard for cultural heritage, the International Council of Museums' CIDOC Conceptual Reference Model (CRM). In total, 12 documents (fascicules, i.e. high-quality catalogues) were processed. They originate from the Corpus Vasorum Antiquorum (CVA) collection, which contains over 350 high-quality catalogues of mostly ancient Greek painted pottery, illustrating more than 100,000 vases. The extracted information was expressed as interoperable RDF graphs consistent with the CLAROS project format. CIDOC-CRM plays a central role in enabling semantic interoperability across the range of datasets that contribute to CLAROS. The CASIE pilot enabled a complementary exploitation of terminological and ontological resources via rule-based information extraction techniques, delivering semantic annotation with respect to the CRM in the broader field of digital humanities.
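As a rough illustration of the general idea described in the abstract (vocabulary-driven, rule-based extraction whose output is expressed as CRM-typed RDF), the following minimal sketch matches a few vocabulary terms in a catalogue sentence and emits N-Triples. It is not the CASIE pipeline: the `VOCABULARY` entries, the `example.org` namespace, the function name `extract_triples`, and the choice of CRM classes and properties are all assumptions made for illustration.

```python
import re

# CIDOC-CRM namespace as commonly published; the classes and properties
# chosen below are illustrative assumptions, not the mapping used by CASIE.
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/casie-sketch/"  # hypothetical namespace

# Hypothetical mini-vocabulary: term -> (CRM class of the term,
# CRM property linking the described object to the term).
VOCABULARY = {
    "amphora": ("E55_Type", "P2_has_type"),
    "kylix": ("E55_Type", "P2_has_type"),
    "clay": ("E57_Material", "P45_consists_of"),
}


def extract_triples(text, object_id):
    """Return N-Triples lines for vocabulary terms found in `text`."""
    subject = f"<{EX}object/{object_id}>"
    # The described vase itself is typed as a CRM object.
    triples = [f"{subject} <{RDF_TYPE}> <{CRM}E22_Man-Made_Object> ."]
    for term, (crm_class, crm_property) in VOCABULARY.items():
        # Rule: a whole-word, case-insensitive match of the vocabulary term.
        if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
            node = f"<{EX}term/{term}>"
            triples.append(f"{node} <{RDF_TYPE}> <{CRM}{crm_class}> .")
            triples.append(f"{subject} <{CRM}{crm_property}> {node} .")
    return triples


if __name__ == "__main__":
    for line in extract_triples("Red-figure kylix, fine orange clay.", "vase-1"):
        print(line)
```

A real system of the kind the abstract describes would rely on much richer linguistic rules and curated domain vocabularies; the sketch only shows how a matched term can become a typed node linked to the described object.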
Original language: English
Title of host publication: Conference of the British Chapter of the International Society for Knowledge Organization, London, UK, 8-9 July 2013
Number of pages: 10
State: Published - 8 Jul 2013
Event: 3rd ISKO UK biennial conference - University College London, London, United Kingdom
Duration: 7 Jul 2013 to 8 Jul 2013

Conference

Conference: 3rd ISKO UK biennial conference
Country: United Kingdom
City: London
Period: 7/07/13 to 8/07/13

ID: 2130753