Null models in authorship analysis - an alternative approach to established methods in stylometry

1. Abstract

In stylometry, authorship questions are currently approached as two separate tasks: authorship attribution and authorship verification. Recent research has often interpreted both as classification tasks and concentrated on further improving the accuracy of these two procedures. This study proposes an alternative approach to authorship questions, treating them as hypothesis tests based on an empirical null model. The central question this approach revolves around is: what does a text distance value of X actually mean for my authorship question? This essay outlines how a null model can be derived from empirical observations to answer this question. The approach makes it possible to choose a rejection criterion for the null hypothesis that two texts were written by different people, and yields reasonable estimates of the alpha and beta errors.
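
To make the idea concrete, the following is a minimal sketch of how such a test could be set up. It is not the procedure of this study. It assumes that a sample of text distances between pairs known to come from different authors serves as the empirical null distribution, that a second sample of same-author distances is available for estimating the beta error, and that smaller distances indicate greater stylistic similarity. The function names, the simple order-statistic rule for the threshold, and the toy distance values are all illustrative assumptions.

```python
import math


def rejection_threshold(null_distances, alpha):
    """Pick a distance threshold from the empirical null sample.

    Assumption: H0 ("different authors") is rejected when an observed
    distance is strictly below the threshold. Taking the order statistic
    at index floor(alpha * n) ensures that at most a share alpha of the
    null sample lies strictly below it.
    """
    ordered = sorted(null_distances)
    k = math.floor(alpha * len(ordered))
    return ordered[k]


def error_rates(null_distances, same_author_distances, threshold):
    """Estimate alpha and beta errors empirically for a given threshold."""
    # Alpha: different-author pairs wrongly declared "same author".
    alpha_hat = sum(d < threshold for d in null_distances) / len(null_distances)
    # Beta: same-author pairs for which H0 is wrongly retained.
    beta_hat = sum(d >= threshold for d in same_author_distances) / len(
        same_author_distances
    )
    return alpha_hat, beta_hat


def reject_null(distance, threshold):
    """Return True if H0 ("different authors") is rejected."""
    return distance < threshold


# Toy values, invented purely for illustration (not measured data).
null_distances = [
    0.82, 0.91, 1.05, 0.97, 1.10, 0.88, 1.21, 0.76, 0.99, 1.03,
    0.94, 1.15, 0.85, 1.08, 0.90, 1.01, 0.79, 1.12, 0.96, 0.65,
]
same_author_distances = [0.41, 0.55, 0.62, 0.70, 0.48, 0.81, 0.59, 0.66, 0.73, 0.52]

alpha = 0.05
threshold = rejection_threshold(null_distances, alpha)
alpha_hat, beta_hat = error_rates(null_distances, same_author_distances, threshold)

print("threshold:", threshold)
print("estimated alpha error:", alpha_hat)
print("estimated beta error:", beta_hat)
print("reject H0 for distance 0.5:", reject_null(0.5, threshold))
```

In this sketch, the question "what does a distance value of X mean?" is answered by locating X relative to the threshold: a value below it would be unusual for texts by different authors at the chosen alpha level, while the beta estimate indicates how often same-author pairs would fail to fall below it.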