Reflections on experience with archaeological controlled vocabularies in indexing and retrieval

Douglas Tudhope*, Ceri Binding

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


Abstract

In the STAR project, which investigated semantic integration, we employed thesauri from the Forum on Information Standards in Heritage (FISH) and word lists from Historic England recording manuals. Semantic integration allowed search across both archaeological datasets and grey literature reports via data extraction and natural language processing (NLP) (Tudhope et al. 2011; Vlachidis & Tudhope 2016). The ARIADNE and ARIADNEplus European Infrastructure projects confronted multilingual issues in seeking to integrate archaeological data and reports written in various partner languages and indexed with the partners' own controlled vocabularies (CVs). We developed tools to help map partner CVs to a central ‘mapping hub’, the Getty Art and Architecture Thesaurus (AAT). This allowed search across partner data and reports in different languages, as well as query expansion using the AAT’s hierarchical structure (Binding & Tudhope 2016). We have also provided tools to express the English, Scottish and Welsh FISH vocabularies (including Gaelic and Welsh language versions) as Linked Open Data (HeritageData), facilitating programmatic use. We are currently collaborating with the Archaeology Data Service (ADS) on CV-based NLP tools that make automatic indexing suggestions for OASIS, the online index of fieldwork events and their unpublished reports, drawing on the FISH vocabularies (Monuments, Objects, Periods) employed in OASIS subject indexing.

Reflections from this experience are discussed. These include the potential of mapping between CVs, the possible need for an enhanced entry vocabulary (synonyms etc.) in CVs when they are used in NLP, and the challenge of compound phrases that combine concepts, which may merit a faceted approach. There may be a need to draw on standard CVs from other domains (e.g. for scientific areas). It is possible to index with multiple CVs. It is important to consider use cases; the indexing requirements of grey literature may differ from those of academic journal publishing. CVs should be continually maintained and allowed to evolve, with attention to potential gaps or bias of different kinds.
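
The mapping-hub idea described in the abstract can be illustrated with a minimal Python sketch. It is not the ARIADNE tooling: the concept identifiers, the small hierarchy, the partner terms and the function names below are all hypothetical and chosen only for illustration. A query term in one language is resolved to a hub concept, the hub hierarchy is used to collect narrower concepts, and every partner term mapped to any of those concepts is returned.

```python
# Illustrative only: identifiers and terms are hypothetical, not real AAT data.

# Hub hierarchy expressed as child concept -> broader (parent) concept.
HUB_BROADER = {
    "hub:long_barrow": "hub:barrow",
    "hub:round_barrow": "hub:barrow",
    "hub:barrow": "hub:burial_site",
}

# Partner vocabulary mappings: (language, local term) -> hub concept.
PARTNER_TO_HUB = {
    ("en", "long barrow"): "hub:long_barrow",
    ("nl", "grafheuvel"): "hub:barrow",
    ("de", "Grabhügel"): "hub:barrow",
    ("it", "necropoli"): "hub:burial_site",
}


def narrower_closure(concept, broader):
    """Return the concept together with all of its descendants in the hub."""
    result = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in broader.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def expand_query(lang, term):
    """Map a partner term to the hub, expand downwards, and return
    all partner terms (in any language) mapped to the expanded concepts."""
    hub_concept = PARTNER_TO_HUB.get((lang, term))
    if hub_concept is None:
        return []  # term is not mapped to the hub
    targets = narrower_closure(hub_concept, HUB_BROADER)
    return sorted(key for key, concept in PARTNER_TO_HUB.items()
                  if concept in targets)


print(expand_query("nl", "grafheuvel"))
```

In this toy data, a query for the Dutch term also retrieves the German term mapped to the same hub concept and the English term mapped to a narrower one, while the Italian term mapped to a broader concept is left out.
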
Original language: English
Title: Session 320, A controlled vocabulary for archaeology: a necessary requirement for the development of a sustainable research practice into the 21st century
Status: Published - 1 Sept 2023
Event: European Association of Archaeologists 29th Annual Meeting - Belfast, United Kingdom
Duration: 30 Aug 2023 to 2 Sept 2023
https://www.e-a-a.org/EAA2023/Home/EAA2023/Home.aspx?hkey=c376135d-4d51-4a35-ae41-569699c7e496

Conference

Conference: European Association of Archaeologists 29th Annual Meeting
Abbreviated title: EAA 2023
Country/Territory: United Kingdom
City: Belfast
Period: 30/08/23 to 2/09/23