Annotation in Digital Humanities (annDH)

Language and Computation Courses

Workshop

Annotation in Digital Humanities (annDH): How Can Linguistics/Computational Linguistics Help with Annotation in DH
Sandra Kübler (Indiana University, USA) and Heike Zinsmeister (University of Hamburg, Germany)

Linguistic annotation is one of the core interfaces between linguistics and computational linguistics. It has also become a central interface between computational linguistics (CL) and digital humanities (DH). Texts are preprocessed and annotated, e.g. with parts of speech, to support distant reading and other visualization applications, topic and network analyses, text mining, and question answering for humanist research questions. In these applications, the annotation is a means to an end and mostly invisible to the humanist researchers.

In this workshop, we will push the boundary of this interface and focus on annotation beyond the standard linguistic categories, looking at categories and relations relevant for humanist research questions themselves, such as metaphors, stereotypes, entities, causation of historical events, narratives, or philosophical reasoning. In this area, CL cannot necessarily provide tools, but it can provide methodology and best practices. Thus, lessons learned in linguistic annotation can be repurposed for annotation in DH. This includes CL support of the epistemological process of developing the annotation categories themselves, which are often inductively (or abductively) derived in a hermeneutically cyclic way. Also included in the scope of the workshop is research on the data types in the digital humanities, which mostly concern non-canonical language and thus pose challenges for automated annotation.