Is Medieval Distant Viewing Possible? : Extending and Enriching Annotation of Legacy Image Collections using Visual Analytics
arXiv (2022)
Abstract
Distant viewing approaches have typically used image datasets close to the
contemporary image data used to train machine learning models. Working with
images from other historical periods requires expert-annotated data, and the
quality of labels is crucial for the quality of results. Especially when
working with cultural heritage collections that contain myriad uncertainties,
annotating data, or re-annotating legacy data, is an arduous task. In this
paper, we describe working with two pre-annotated sets of medieval manuscript
images that exhibit conflicting and overlapping metadata. Since a manual
reconciliation of the two legacy ontologies would be very expensive, we aim (1)
to create a more uniform set of descriptive labels to serve as a "bridge" in
the combined dataset, and (2) to establish a high-quality hierarchical
classification that can be used as a valuable input for subsequent supervised
machine learning. To achieve these goals, we developed visualization and
interaction mechanisms, enabling medievalists to combine, regularize and extend
the vocabulary used to describe these, and other cognate, image datasets. The
visual interfaces provide experts with an overview of relationships in the data
going beyond the sum total of the metadata. Word and image embeddings, as well
as co-occurrences of labels across the datasets, enable batch re-annotation of
images and recommendation of label candidates, and support composing a
hierarchical classification of labels.
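
As a rough illustration of how label co-occurrences could drive the recommendation of label candidates, the following is a minimal sketch and not the paper's implementation. The function names (`cooccurrence_counts`, `recommend_labels`), the scoring rule (summed pair counts), and the toy annotations are all assumptions made for this example; the abstract does not specify how candidates are scored, and the embedding-based signals it mentions are left out here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(annotations):
    """Count how often each unordered pair of labels appears on the same image."""
    counts = Counter()
    for labels in annotations.values():
        # Sorting gives each pair a canonical (alphabetical) order.
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
    return counts


def recommend_labels(image_labels, counts, top_k=3):
    """Rank labels not yet on the image by summed co-occurrence with its labels."""
    present = set(image_labels)
    scores = Counter()
    for (a, b), n in counts.items():
        if a in present and b not in present:
            scores[b] += n
        elif b in present and a not in present:
            scores[a] += n
    # Sort by score (descending), then alphabetically for deterministic output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, _ in ranked[:top_k]]


# Hypothetical toy annotations (image id -> labels); not data from the paper.
annotations = {
    "img1": ["king", "crown", "throne"],
    "img2": ["king", "crown"],
    "img3": ["bishop", "mitre"],
    "img4": ["king", "sword"],
}
counts = cooccurrence_counts(annotations)
suggestions = recommend_labels(["king"], counts)
print(suggestions)
```

In this toy setting, an image labelled only "king" would be offered "crown" first, because that pair co-occurs most often. A fuller version would presumably combine such counts with word and image embedding similarity, as the abstract describes.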