Multi-Author Writing Style Analysis: Dataset Explorer
Thesis Type | Bachelor |
Thesis Status | Currently running |
Student | David Schabert |
Start | |
Thesis Supervisor | |
Contact | |
Research Field | |
Multi-author writing style analysis investigates methods for distinguishing the writing styles of different authors. It is a precursor to tasks such as identifying changes in authorship within a text or determining the author of a given piece of writing, known as authorship attribution. Advancing this field requires models developed and trained for multi-author analysis, as well as a sufficient amount of training data: texts written by multiple authors, with labels specifying the author of each section.
The goal of this thesis is to develop a dataset explorer application for such datasets. The envisioned web-based platform aims to offer users a suite of visual exploration tools, including: (i) general statistical insights into the dataset, (ii) exploration of paragraph similarities within documents across a variety of features and similarity metrics (e.g., latent representations), and (iii) feedback on false positive and false negative predictions for a given style change detection approach.
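To make tools (ii) and (iii) more concrete, the following is a minimal sketch, not the thesis implementation. It assumes character trigram counts as a stand-in stylometric feature and cosine similarity as the metric; the function names (`char_ngrams`, `similarity_matrix`, `boundary_errors`) and the data layout (one author label per paragraph, one predicted flag per paragraph boundary) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams (assumed stand-in feature)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty paragraph has no defined direction
    return dot / (norm_a * norm_b)


def similarity_matrix(paragraphs, n=3):
    """Pairwise paragraph similarities within one document."""
    vectors = [char_ngrams(p, n) for p in paragraphs]
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


def boundary_errors(authors, predicted_changes):
    """Split wrong style change predictions into false positives and negatives.

    authors: one author label per paragraph (hypothetical layout).
    predicted_changes: one bool per boundary between consecutive paragraphs.
    Returns two lists of boundary indices.
    """
    true_changes = [a != b for a, b in zip(authors, authors[1:])]
    if len(true_changes) != len(predicted_changes):
        raise ValueError("expected one prediction per paragraph boundary")
    pairs = list(zip(true_changes, predicted_changes))
    false_positives = [i for i, (t, p) in enumerate(pairs) if p and not t]
    false_negatives = [i for i, (t, p) in enumerate(pairs) if t and not p]
    return false_positives, false_negatives


if __name__ == "__main__":
    doc = ["abc abc", "abc abc", "xyz xyz"]
    for row in similarity_matrix(doc):
        print([round(value, 2) for value in row])
    print(boundary_errors(["A", "A", "B"], [True, False]))
```

In an explorer of this kind, the matrix could back a heatmap view per document, and the two index lists could be used to highlight the boundaries where a given approach predicted a change that is not in the labels, or missed one that is.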