TY - JOUR
T1 - KOS-based enrichment of archaeological fieldwork reports
AU - Binding, Ceri
AU - Tudhope, Doug
N1 - Extended version of ISKO-UK 2023 Conference paper selected for publication in Knowledge Organization Journal (DT, 4.7.24). Publication switched to OA Winter 2024 so OA and licence details updated 06/01/2025, NR.
PY - 2024/8/12
Y1 - 2024/8/12
AB - Semantic enrichment techniques and tools based on knowledge organization systems (KOS) have an important role to play in supporting information discovery. This paper reports on work investigating and developing automatic indexing techniques (for final intellectual judgment) based on KOS. Within the UK, the OASIS online index of fieldwork events and their unpublished reports represents a major initiative to make archaeological fieldwork available to a wider public. OASIS is hosted by the Archaeology Data Service and is funded by Historic England and Historic Environment Scotland. A wide variety of organisations provide OASIS reports. Subject indexing is inconsistent and sometimes sparse, although use of standard KOS from the Forum on Information Standards in Heritage is encouraged. Results from a case study for an automatic (KOS-based) subject indexing recommendation system are reported. Findings include the need to extend the KOS entry vocabularies and the need for post-processing filters to prioritise subject indexing significant for the document in question. The paper reflects on the experience with future work in mind, including discussion of evaluation issues and positioning the approach within the context of previous work on subject indexing, automatic indexing for Name Authorities and Named Entity Recognition (NER). The techniques followed in the case study can be characterised as a hybrid approach. The purpose for which the indexing is applied is a key distinguishing feature. In this case, the purpose or indexing policy for OASIS goes beyond overall aboutness to request indexers to include significant objects or artefacts found during the project. Future work will investigate contextual patterns reflecting significance and incorporate those patterns in post-processing prioritisation measures.
KW - automatic subject indexing
KW - knowledge organization systems
KW - named entity recognition
U2 - 10.5771/0943-7444-2024-5-292
DO - 10.5771/0943-7444-2024-5-292
M3 - Article
SN - 0943-7444
VL - 51
SP - 292
EP - 299
JO - Knowledge Organization
JF - Knowledge Organization
IS - 5
ER -