2nd Conference for Computational Literary Studies, Würzburg, 23.06.2023
Luca Giovannini — Daniil Skorinkin
University of Potsdam, Germany
This presentation: plu.sh/libretti
Working definition: modern dramatic texts where music plays a central role
Born in the early 17th century in Italy and rapidly exported across Europe
Traditional scholarship focused more on music than on words
Even librettology is still largely non-computational
Is it possible to consider libretti a unitary genre with its own structural features?
Do libretti possess a peculiar "genre signal" which sets them apart from contemporary comedies and tragedies?
How did the structure of libretti evolve compared to the other genres?
(☞ Fischer et al. 2017, dracor.org)
libretto (55 🇩🇪 / 58 🇫🇷)
libretto as tag
'subtitle' containing one of these labels for operatic subgenres (e.g. drame lyrique, opéra-ballet, Singspiel, Spieloper)
DraCor plays without genre tags?
Libretti not identified as such?
🇩🇪 + 51%
🇫🇷 + 55%
Vectorisation of plays according to structural features
(cf. Szemes and Vida 2022)
EDA on different textual aspects
num_of_segments, num_of_speakers,
num_of_person_groups, word_count_sp,
word_count_stage, average_degree, density,
average_clustering, max_degree,
num_of_connected_components,
diameter, average_path_length
A combination of network measures and size statistics
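The network measures listed above can be illustrated with a small, dependency-free sketch. It assumes (this is not stated on the slides) that each play is represented as a list of segments, each holding the speakers present, and that two speakers are linked when they share a segment. The toy play and the function names are hypothetical; the size statistics (word counts, person groups) are left out because they need the full text. Path-based measures are computed on the largest connected component, which is one common convention rather than necessarily the one used here.

```python
from collections import deque
from itertools import combinations


def build_cooccurrence(segments):
    """Undirected co-presence graph: speakers are nodes, and an edge links
    two speakers who appear in the same segment."""
    graph = {}
    for speakers in segments:
        for s in speakers:
            graph.setdefault(s, set())
        for a, b in combinations(sorted(set(speakers)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def bfs_distances(graph, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def connected_components(graph):
    seen, components = set(), []
    for node in graph:
        if node not in seen:
            comp = set(bfs_distances(graph, node))
            seen |= comp
            components.append(comp)
    return components


def local_clustering(graph, node):
    """Share of a node's neighbour pairs that are themselves connected."""
    nbs = graph[node]
    k = len(nbs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbs, 2) if b in graph[a])
    return 2 * links / (k * (k - 1))


def structural_features(segments):
    graph = build_cooccurrence(segments)
    n = len(graph)
    m = sum(len(nbs) for nbs in graph.values()) // 2  # number of edges
    degrees = [len(nbs) for nbs in graph.values()]
    components = connected_components(graph)
    largest = max(components, key=len)
    # Assumption: diameter and path length are taken on the largest component.
    lengths = [d for src in largest
               for tgt, d in bfs_distances(graph, src).items() if tgt != src]
    return {
        "num_of_segments": len(segments),
        "num_of_speakers": n,
        "average_degree": sum(degrees) / n,
        "max_degree": max(degrees),
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "average_clustering": sum(local_clustering(graph, v) for v in graph) / n,
        "num_of_connected_components": len(components),
        "diameter": max(lengths, default=0),
        "average_path_length": sum(lengths) / len(lengths) if lengths else 0.0,
    }


# Hypothetical toy play: each inner list holds the speakers of one segment.
toy_play = [["A", "B"], ["B", "C"], ["A", "C", "D"], ["E"]]
features = structural_features(toy_play)
print(features)
```

Each play then becomes one row of a feature table, which is the form needed for clustering and classification. A graph library such as networkx offers equivalent measures and would be the more usual choice for a real corpus.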
Results were unsatisfying: no meaningful clustering, no signs of libretto being a unitary genre
Semi-automatic labelling of libretti as comic/non-comic, based on their subtitles (e.g. komische Oper → comic libretto)
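A minimal sketch of how such subtitle-based labelling could work. Only "komische Oper" comes from the slide; the other markers and the function name are hypothetical placeholders, and plays without a usable subtitle are returned as None so that they can be checked by hand (the "semi-" in semi-automatic).

```python
import re

# Hypothetical marker list: only "komische oper" is taken from the slides.
COMIC_MARKERS = ("komische oper", "opéra bouffe", "opera buffa")


def label_libretto(subtitle):
    """Return 'comic', 'non-comic', or None when manual review is needed."""
    normalised = re.sub(r"\s+", " ", subtitle.strip().lower())
    if not normalised:
        return None  # no subtitle available: leave for manual labelling
    if any(marker in normalised for marker in COMIC_MARKERS):
        return "comic"
    return "non-comic"


for subtitle in ["Komische Oper in drei Akten", "Drame lyrique en quatre actes", ""]:
    print(repr(subtitle), "->", label_libretto(subtitle))
```

Plain substring matching keeps the rule set easy to audit. A real marker list would have to be checked against the subtitles actually present in the corpus.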
Results: clustering still messy, BUT significant topological patterns emerge
(Plot annotations: comic space; tragic zone; non-comic libretti)
measuring statistical significance of feature variation
training a Random Forest Classifier
single out the most significant features for further inspection
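The classifier step can be sketched as follows, assuming scikit-learn and NumPy are available. The data are synthetic stand-ins, not the DraCor corpus: only the first feature is made to differ between the two classes, so that the importance ranking has a known answer. The hyperparameters are illustrative and not taken from the slides.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
feature_names = ["word_count_stage", "density", "diameter"]

# Synthetic data (assumption): 100 "libretti" (1) and 100 "other plays" (0).
n = 100
labels = np.array([1] * n + [0] * n)
X = rng.normal(size=(2 * n, len(feature_names)))
X[labels == 1, 0] += 3.0  # only word_count_stage separates the classes

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X, labels)

# Impurity-based importances sum to 1; higher means more useful for splitting.
ranking = sorted(zip(feature_names, clf.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances can be biased when features are correlated, as network measures often are, so the ranking is best read as a pointer for further inspection rather than as a result in itself.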
word_count_stage
word_count_sp
num_of_connected_components
density
num_of_speakers
diameter
word_count_sp
num_of_person_groups
average_degree
four-class implementation
plotting each play individually
LOWESS-based smoothing curves to make trends visible
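To show what LOWESS smoothing does, here is a simplified, dependency-free sketch: for every point it fits a weighted straight line to the nearest neighbours, using tricube weights, and takes the fitted value. It omits the robustness iterations of full LOWESS. The x-axis is assumed to be the year of each play, and the data are invented. In practice a library routine, such as the `lowess` function in statsmodels, would normally be used.

```python
def lowess(xs, ys, frac=0.5):
    """Simplified LOWESS: local weighted linear fits with tricube weights."""
    n = len(xs)
    k = min(n, max(2, round(frac * n)))  # points contributing to each local fit
    smoothed = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [max(0.0, 1 - (abs(x - x0) / h) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed


# Invented example: a rising trend hidden under alternating noise.
years = list(range(1700, 1900, 10))
values = [0.1 * (y - 1700) + (-1) ** i * 5 for i, y in enumerate(years)]
trend = lowess(years, values, frac=0.5)
for year, raw, smooth in zip(years, values, trend):
    print(year, round(raw, 1), round(smooth, 1))
```

The `frac` parameter controls how much of the data feeds each local fit: larger values give a smoother curve, smaller values follow individual plays more closely.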
Libretti have consistently less spoken text (above) and more stage directions (below)
🇩🇪 num_of_person_groups / word_count_sp
🇫🇷 density / num_of_speakers
The two types of French libretti (blue) are structurally more distinct than the German ones (orange)
Comparison: topic modelling (Schöch 2017)
Individual structural features might be useful for distinguishing one (sub-)genre from another
However, it is generally not easy to distinguish between plays formalised as vectors of multiple features
Drama often seems too homogeneous, in terms of structural properties, for discriminative clustering
Need to find better features (or construct better measures) // rethink operationalisation patterns