Information extraction across textual corpora: semi-automatic text-tagging workflow with Chinese local gazetteers

1. Abstract

Many humanities projects depend on extracting information from texts. Since 2013, we have been developing “Local Gazetteers Research Tools” (LoGaRT), whose text-tagging component is designed for that purpose. This poster introduces how information extraction and organization are implemented in LoGaRT and discusses how the text-tagging component could be applied to other corpora with consistent internal structures.

Calvin Yeh (cyeh@mpiwg-berlin.mpg.de), Max Planck Institute for the History of Science; Sean Wang (swang@mpiwg-berlin.mpg.de), Max Planck Institute for the History of Science; and Shih-Pei Chen, Max Planck Institute for the History of Science
