Computational Approaches to Narrative Space in 19th and 20th Century Novels (CANSpiN)

Ulrike Henny-Krahmer, Roger Labahn & Holger Helbig

Literary theory provides elaborate instruments for conceptualizing and describing narrative space, understood here as the concrete space of the narrated world in which characters live, act, and move. To use the existing concepts in computational analyses, however, they need to be formalized and mapped to features that can be captured on the linguistic surface of the texts. For distant reading approaches in particular, algorithms are needed that can annotate these features automatically. So far, only basic preliminary work has been done on the computational, quantitative analysis of narrative space. To provide a solid basis for further computational research, we aim to develop methods for the recognition of spatial entities, i.e. all kinds of references to narrative space in literary texts, such as toponyms, general space-related nouns, or deictic expressions. For this task, we use existing semantic word nets and machine learning techniques, in particular large-scale neural language models, which can take the linguistic context of the relevant spatial references into account. Building on this basic recognition of spatial entities, we develop methods to classify them further into higher-level categories: mentions of space that are directly relevant to the plot, descriptions of the setting, and other references to spatial entities. A third aim of the project is to approach the semantic structure of narrative worlds and the symbolic meaning of subspaces by connecting occurrences of spatial references in the texts with topics and sentiments. The research is empirical and uses corpora of Spanish and German novels from the 19th and 20th centuries, chosen in order to develop multilingual and/or language-independent solutions. Gold standard reference data are created by manually annotating parts of the corpora and are then used to evaluate the methods.
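To make the recognition and evaluation steps more concrete, the following minimal Python sketch tags spatial references with a small lexicon and scores the result against manually annotated spans. It is an illustration under stated assumptions, not the project's method: the lexicon entries, the label names (TOPONYM, SPACE_NOUN, DEICTIC), and the functions tag_spatial_entities and evaluate are all hypothetical, and the simple lexicon lookup stands in for the word nets and context-aware language models mentioned above.

```python
import re
from typing import NamedTuple


class Span(NamedTuple):
    """A labelled character span in a text."""
    start: int
    end: int
    label: str


# Hypothetical toy lexicon (assumption): a real system would draw on a
# semantic word net or a language model rather than fixed word lists.
LEXICON = {
    "TOPONYM": {"madrid", "berlin"},
    "SPACE_NOUN": {"house", "street", "garden"},
    "DEICTIC": {"here", "there"},
}


def tag_spatial_entities(text: str) -> list[Span]:
    """Return a span for every token found in the toy lexicon."""
    spans = []
    for match in re.finditer(r"\w+", text):
        token = match.group().lower()
        for label, words in LEXICON.items():
            if token in words:
                spans.append(Span(match.start(), match.end(), label))
                break  # at most one label per token
    return spans


def evaluate(predicted: list[Span], gold: list[Span]) -> tuple[float, float, float]:
    """Exact-match precision, recall, and F1 against gold standard spans."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


text = "She left the house in Madrid and never came back there."
predicted = tag_spatial_entities(text)
for span in predicted:
    print(text[span.start:span.end], span.label)
```

Calling evaluate(predicted, gold) with a list of manually annotated Span objects yields the three scores. Exact span matching is only one possible criterion; partial-overlap matching is a common alternative when annotation boundaries vary.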
From a literary-historical perspective, we examine how the representation of space in the novels connects to questions of national, regional, and other kinds of space-related identity, following the concept of “imagined communities” (Anderson). To this end, we link the analysis of spatial references in the texts to author nationalities, the places of publication of the novels, and the topics of the texts. The corpus-based studies address existing literary-historical questions, such as the role of novels in the process of nation-building in the 19th century and their function in the formation of two German identities in 20th-century post-war literature. They do so with a new set of computational methods and on a broader empirical basis than before, in order to substantiate and extend previous research results. More information can be found on the project website: