Visualizing the Transcribe Bentham corpus
How can we gain an overview of the 17,000 pages of Bentham's manuscripts made available by Transcribe Bentham? Methods that provide such an overview may help domain experts find the parts of the corpus relevant to their research. In this work, we apply computational techniques to visualize the corpus, providing a general view of its content.
First, a lexical extraction was performed to select the terms used to model the corpus. Then, term clusters were created based on the similarity between the terms' contexts, and visualized as corpus maps. The maps provide an overview of the corpus as a whole, as well as of the terms that are more prominent in particular periods of the corpus. The question of how to evaluate these corpus maps is also discussed.
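
To make the clustering step concrete, the following is a minimal sketch of grouping terms by the similarity of their contexts. It is not the implementation used in this work: the toy sentences, the term list, the window size, the cosine measure, the single-link grouping and the 0.6 threshold are all assumptions chosen for illustration, and the layout step that turns clusters into a map is omitted.

```python
from collections import Counter
from math import sqrt

# Toy data (hypothetical, not drawn from the Bentham corpus).
sentences = [
    "the judge shall punish the offender",
    "the court shall punish the offender",
    "the judge shall hear the witness",
    "the court shall hear the witness",
    "pleasure is the measure of utility",
    "pain is the measure of utility",
]
terms = ["judge", "court", "pleasure", "pain"]


def context_vectors(sentences, terms, window=2):
    """Count the words occurring within `window` tokens of each term."""
    vectors = {t: Counter() for t in terms}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok in vectors:
                start = max(0, i - window)
                end = min(len(tokens), i + window + 1)
                for j in range(start, end):
                    if j != i:
                        vectors[tok][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_terms(vectors, threshold=0.6):
    """Single-link grouping: join terms whose similarity reaches the threshold."""
    names = sorted(vectors)
    parent = {t: t for t in names}

    def find(t):
        # Follow parent links to the cluster representative.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for t in names:
        clusters.setdefault(find(t), []).append(t)
    return sorted(clusters.values())


clusters = cluster_terms(context_vectors(sentences, terms))
print(clusters)
```

On this toy input, terms that occur in the same surroundings ("judge" and "court", "pleasure" and "pain") end up in the same cluster. A real corpus would call for stop-word filtering, weighting of the context counts and a more robust clustering method.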