Computer-aided Analysis of Unreliability and Truth in Fiction – Interconnecting and Operationalizing Narratology (CAUTION)

Jonas Kuhn & Janina Jacke

The CAUTION project aims to strengthen interconnections across subfields of literary studies, (computational) linguistics and digital humanities (DH), addressing the aesthetic phenomenon of unreliable narration. Arthur Schnitzler’s novella ‘Andreas Thameyers letzter Brief’ (1902) is a typical example. The narrator tries to convince his addressees that his wife has been faithful, but to a commonsensical reader the ‘evidence’ he provides tells quite a different story: she gave birth to a dark-skinned baby nine months after being alone with foreigners from an exotic country.
The phenomenon presents challenges to approaches to text description and interpretation that emphasize text-based operationalization of concepts. The research literature abounds with discussions of interpretive implications for concrete examples of unreliable narration – but can text-independent guidelines be devised that lead to an intersubjectively stable assessment of plausible readings by trained literary scholars?
In this respect, unreliable narration is paradigmatic for phenomena whose key characteristics can only be captured at a ‘depth’ of analysis exceeding the level of literal semantic interpretation. Documenting the processes of (re)constructing non-literal meanings is notoriously difficult as it involves long inference chains and implicit or background knowledge. Most DH work on literary texts has so far addressed aspects associated with surface properties (e.g. style, topic structure, or networks of textually evident cross-character relations): Recent Natural Language Processing methods require large amounts of data, which are easier to obtain for surface-level phenomena. Moreover, annotated datasets for systematic narratological analysis are only beginning to become available. Hence it is currently hard to assess whether the generalizing potential, e.g., in neural modeling architectures, could capture aspects of the process of assigning non-literal meanings to texts: Theoretically grounded reference data for testing are missing.
Unreliable narration is an apt case for experimenting with the annotation of derived meanings and thus filling the gap. Attributing unreliability to a narrator involves inferences beyond plain linguistic interpretation. However, since the focus is on concrete questions about what happens in the fictional world of the narrative (i.e. content-specifying questions as opposed to content-transcending ones), the phenomenon of unreliability offers good starting points for research into literary principles of interpretation. If different readers of the same text suspect unreliability, then their inference chains will have a similar logical structure.
In CAUTION we work with a basic corpus of eight German-language short stories from the 19th to the 21st century, which are annotated extensively and collaboratively, and an expanded corpus of approximately 30 additional German-language short stories and novels, which are explored using various text-mining techniques and partially annotated. Our collaborative text annotation will target this parallelism in reasoning about the narrator: the annotators categorize textual indicators of unreliable narration and identify and document the relevant assumptions. We formalize the interpretive strategies a reader can adopt using a multi-agent belief revision system from AI. Working on a corpus of systematically chosen texts, we will work cyclically towards a catalogue of formalizations from which annotators can choose to capture their reading of a text. Since inference chains for non-trivial stories are complicated, it is only through computational inference engines that the alternative formulations can be checked and improved, with the aim of suitably capturing patterns in the entire corpus.
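To make the idea of belief revision concrete, the following is a minimal sketch in Python, not the project's actual system. It simplifies heavily: a single reader agent rather than a multi-agent setup, atomic propositions, and a plain priority ordering in which world knowledge outranks the narrator's claims. All names (`Belief`, `Reader`, `apply_rule`, the proposition labels and priorities) are hypothetical and chosen only to mirror the Schnitzler example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Belief:
    proposition: str  # atomic statement about the fictional world
    value: bool       # whether the statement is taken to be true
    source: str       # e.g. "narrator" or "world_knowledge" (assumed labels)
    priority: int     # assumed ordering: the higher priority wins a conflict


@dataclass
class Reader:
    beliefs: dict = field(default_factory=dict)   # proposition -> Belief
    rejected: list = field(default_factory=list)  # beliefs that lost a conflict

    def revise(self, new: Belief) -> None:
        """Add a belief; on contradiction keep the higher-priority one."""
        old = self.beliefs.get(new.proposition)
        if old is None:
            self.beliefs[new.proposition] = new
        elif old.value == new.value:
            # Same content: remember only the stronger support.
            if new.priority > old.priority:
                self.beliefs[new.proposition] = new
        elif new.priority > old.priority:
            self.rejected.append(old)
            self.beliefs[new.proposition] = new
        else:
            self.rejected.append(new)

    def holds(self, proposition: str) -> bool:
        belief = self.beliefs.get(proposition)
        return belief is not None and belief.value


def apply_rule(reader: Reader, premises: list, conclusion: Belief) -> None:
    """One inference step: if all premises hold, revise with the conclusion."""
    if all(reader.holds(p) for p in premises):
        reader.revise(conclusion)


reader = Reader()
# The narrator's claims enter with low priority.
reader.revise(Belief("wife_faithful", True, "narrator", 1))
reader.revise(Belief("wife_alone_with_strangers", True, "narrator", 1))
reader.revise(Belief("child_resembles_strangers", True, "narrator", 1))
# A commonsense inference contradicts the narrator's conclusion.
apply_rule(
    reader,
    ["wife_alone_with_strangers", "child_resembles_strangers"],
    Belief("wife_faithful", False, "world_knowledge", 2),
)
unreliable_claims = [b.proposition for b in reader.rejected if b.source == "narrator"]
```

In this toy run the reader accepts the narrator's factual reports but rejects his evaluative claim, which is one simple way to represent a suspicion of unreliability as a recorded conflict in an inference chain.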
In a complementary approach intertwined with collaborative annotation, we use existing text-mining techniques, annotation results, and relevant research literature on unreliable narratives to develop a heuristic for identifying textual indicators of unreliable narration, which allows the exploration of larger corpora with respect to unreliability and related phenomena from the field of truth in fiction. Accordingly, CAUTION’s objective is not only to study the phenomenon of unreliable narration with mixed methods, but also to address it on a more general level in two ways: From a literary-theoretical and methodological point of view, the project is concerned with the investigation of the principles of (content-specific) interpretation; for the field of Computational Literary Studies, the project is also to be understood as a test case of computer-aided research into interpretation-dependent text phenomena.
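A heuristic of this kind could, in its simplest form, be a lexicon of surface markers counted per text. The sketch below is purely illustrative and is not the heuristic developed in the project: the indicator categories and the German phrases in `INDICATORS` are invented for the example, and the function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical indicator lexicon; categories and phrases are assumptions
# made for illustration, not results of the project.
INDICATORS = {
    "truth_assertion": [r"\bich schwöre\b", r"\bdie reine wahrheit\b", r"\bwahrhaftig\b"],
    "reader_address": [r"\bglauben sie mir\b", r"\bsie werden verstehen\b"],
    "memory_hedge": [r"\bwenn ich mich recht erinnere\b", r"\bich weiß nicht mehr\b"],
}


def indicator_counts(text: str) -> Counter:
    """Count case-insensitive matches per indicator category."""
    lowered = text.lower()
    counts = Counter()
    for category, patterns in INDICATORS.items():
        for pattern in patterns:
            counts[category] += len(re.findall(pattern, lowered))
    return counts


def indicator_density(text: str) -> float:
    """Indicator matches per 1000 word tokens (0.0 for an empty text)."""
    tokens = re.findall(r"\w+", text)
    if not tokens:
        return 0.0
    return 1000 * sum(indicator_counts(text).values()) / len(tokens)
```

Such counts can only rank texts or passages for closer reading. Whether a marked passage actually signals unreliability remains a matter of interpretation, which is why the heuristic is tied to the annotation results.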