Identifying relations between characters in Afrikaans, Tshivenḓa, and Xitsonga books

1. Abstract

The usefulness of computational linguistic tools, such as named entity recognition (NER) systems, in linguistic or literary studies of under-resourced languages is still relatively unexplored. We applied NER systems to one Afrikaans novel and two scanned dramas, one in Tshivenḓa and one in Xitsonga. Personal relations are identified through the co-occurrence of character names within sentences, and these relationships are visualized using Gephi, following the approach by Van de Ven et al. (2018). The research identified several practical problems: low-quality OCR, low-quality NER, a limited number of named entities (NEs), and language-specific issues.
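The sentence-level co-occurrence step can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a fixed list of character names in place of NER output, a naive punctuation-based sentence splitter, and exact whole-word name matching. The function names `cooccurrence_edges` and `to_edge_csv` and the sample names are hypothetical. The output is an edge list with `Source`, `Target`, and `Weight` columns, a layout that Gephi's spreadsheet import can read.

```python
import csv
import io
import re
from collections import Counter
from itertools import combinations


def cooccurrence_edges(text, names):
    """Count how often each pair of names appears in the same sentence."""
    # Assumption: sentences end in '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    edges = Counter()
    for sentence in sentences:
        # Assumption: a character is "present" if the name occurs as a whole word.
        present = sorted(
            name for name in names
            if re.search(r"\b" + re.escape(name) + r"\b", sentence)
        )
        # Every unordered pair of names in the sentence gets one more co-occurrence.
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges


def to_edge_csv(edges):
    """Serialize the counts as an undirected, weighted edge list."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


# Hypothetical sample text and names, for illustration only.
sample = "Anna met Pieter. Pieter and Sara argued. Anna, Pieter and Sara left."
edges = cooccurrence_edges(sample, ["Anna", "Pieter", "Sara"])
print(to_edge_csv(edges))
```

In practice the name list would come from the NER system, so OCR errors and missed entities propagate directly into missing or spurious edges, which is consistent with the problems reported above.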

Menno van Zaanen (respect.mlambo@nwu.ac.za), South African Centre for Digital Language Resources, South Africa, Benito Trollip, South African Centre for Digital Language Resources, South Africa, Phathuthsedzo Ramukhadi, South African Centre for Digital Language Resources, South Africa and Respect Mlambo, South African Centre for Digital Language Resources, South Africa
