The Words Themselves: A Content-Based Approach to Quote Attribution

1. Abstract

Quote attribution is the identification of the speaker of a quotation in a given text. It requires reasoning about conversational patterns and contextual clues, and is especially complex in literary texts. We present a semi-supervised iterative classification approach to quote attribution that is based on ideas from computational stylometry, using the content of the quotation to distinguish between speakers. We achieve an accuracy of 77.3% on the QuoteLi quote attribution corpus. Despite certain limitations, we show that our method is a competitive alternative to systems based on contextual clues, and a viable complement to them.

Adam Hammond (adam.hammond@utoronto.ca), University of Toronto, Canada; Krishnapriya Vishnubhotla, University of Toronto, Canada; and Graeme Hirst, University of Toronto, Canada
