Abstract
This paper presents the development of advanced tools and technologies for extracting relevant information and suitable metadata from the textual documentation produced by Italian archaeological research. A set of Natural Language Processing tools was developed to recognize and annotate various archaeological entities in Italian-language textual reports. The CIDOC CRM is the ontology chosen for encoding the resulting output, in order to maximise the standardisation of the produced metadata and guarantee interoperability with archaeological information already held in other semantically enabled digital archives. The work took place as part of the development of the TEXTCROWD platform for the European Open Science Cloud for Research Pilot Project.
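As a rough illustration of the kind of annotation described in the abstract, the following minimal sketch tags Italian terms in a sentence and labels each match with a CIDOC CRM class. It is not the TEXTCROWD implementation: the dictionary `GAZETTEER`, the `Annotation` type, and the function `annotate` are hypothetical names introduced here, and plain dictionary lookup stands in for the NLP tools, whose actual design the abstract does not specify. The class labels follow the CIDOC CRM 6.x naming (E22 is called "Human-Made Object" in later versions).

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical toy gazetteer mapping Italian terms to CIDOC CRM classes.
# A real system would use trained NER models and a far larger vocabulary.
GAZETTEER = {
    "anfora": "E22 Man-Made Object",
    "necropoli": "E53 Place",
    "età del bronzo": "E4 Period",
}


@dataclass
class Annotation:
    start: int      # character offset where the match begins
    end: int        # character offset just past the match
    text: str       # matched surface form
    crm_class: str  # CIDOC CRM class assigned to the match


def annotate(text: str) -> List[Annotation]:
    """Return gazetteer matches in `text`, ordered by position."""
    annotations = []
    for term, crm_class in GAZETTEER.items():
        # Word boundaries avoid matching inside longer words.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append(
                Annotation(match.start(), match.end(), match.group(0), crm_class)
            )
    return sorted(annotations, key=lambda a: a.start)


if __name__ == "__main__":
    sentence = "Un'anfora dell'età del bronzo rinvenuta nella necropoli."
    for ann in annotate(sentence):
        print(ann.start, ann.end, ann.text, ann.crm_class)
```

Keeping character offsets alongside each class label means the annotations can later be serialised as metadata without altering the source report.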
Original language | English |
---|---|
Title of host publication | 2018 3rd Digital Heritage International Congress (DigitalHERITAGE) held jointly with 2018 24th International Conference on Virtual Systems & Multimedia (VSMM 2018) |
Editors | Alonzo C. Addison, Harold Thwaites |
Publisher | Institute of Electrical and Electronics Engineers |
Number of pages | 8 |
ISBN (Electronic) | 978-1-7281-0292-4, 978-1-7281-0293-1 |
Publication status | Published - 11 Dec 2018 |
Event | Digital Heritage 2018 - 3rd International Congress & Expo: New Realities: Authenticity & Automation in the Digital Age - San Francisco, United States. Duration: 26 Oct 2018 → 30 Oct 2018 |
Conference
Conference | Digital Heritage 2018 - 3rd International Congress & Expo |
---|---|
Abbreviated title | DH2018 |
Country/Territory | United States |
City | San Francisco |
Period | 26/10/18 → 30/10/18 |
Keywords
- NLP
- NER
- Italian language archaeology
- textual documents
- Grey Literature
- Metadata integration
- Standards
- CIDOC CRM