Advanced sentiment analysis for understanding affective-aesthetic responses to literary texts: A computational and experimental psychology approach to children’s literature

Arthur Jacobs, Gerhard Lauer & Jana Lüdtke

Emotional involvement is of pivotal importance when children learn to read, tell, and share stories. This crucial dimension of cultural literacy has received surprisingly little attention within literary studies, psychology, and digital humanities. For a large-scale, data-driven approach, the most promising method for assessing emotional information in children’s reading material is sentiment analysis. It allows the analysis of larger text corpora to find verbal emotional patterns potentially guiding young readers’ affective-aesthetic responses to literary texts – to characters, events, narrator/voice, and poem lines. Consequently, it facilitates modelling the role of emotions in the interaction of emerging literary literacy and social-cognitive development. However, standard sentiment analysis tools were developed in the (industry-driven) framework of opinion mining, do not involve concepts and theories of emotion from psychology, and need domain adaptation to literary discourse. The main aim of CHYLSA is to supply exactly what is missing: to develop and validate sentiment analysis for computational literary studies.

In this follow-up proposal, CHYLSA II, we continue to develop advanced sentiment analysis for use in computational literary studies in general and for children’s and youth literature in particular. In line with CHYLSA I, we continue to work on corpora, further develop the advanced sentiment analysis tool ‘SentiArt’, and run cross-validations of emotion predictions by machine and by humans. 1) In contrast to CHYLSA I, we restrict ourselves to corpora of children’s and youth books and texts that are widely read today, and we no longer include historical texts that are no longer read. Instead of including historical texts that are alien to today’s readers, we transform selections of the already collected texts into easy-to-read versions of the same texts.
We also prepare the corpora (as training sets and as a database for experimental use) to be publicly available in accordance with the FAIR principles within NFDI Text+. 2) We validate and adjust the sentiment analysis tool ‘SentiArt’ by further annotating training sets and by cross-validating it in experiments on the emotions of readers in these age groups. Going beyond CHYLSA I, we now integrate aspect-oriented transformer models into the development of the sentiment analysis tool in order to understand the relation between emotions and aspects. 3) We test the validity of the tool by predicting children’s reading behaviour, following the understanding of emotions in the affective neuroscience approach of Jaak Panksepp and the fundamental distinction between valence and arousal. Going beyond CHYLSA I, we now include text complexity as one of the major dimensions for testing. We hope that the COVID-19 pandemic will become endemic, so that experiments, especially those with children, will be easier to run than in the previous CHYLSA I project.