Digitization & citizen science

By Warwick History ‘Post-Doc’ club

Warwick History ‘Post-Doc’ club present: DIGITIZATION & CITIZEN SCIENCE – Inviting collaborative research to transcribe handwriting

Location

Online

Good to know

Highlights

  • 1 hour, 30 minutes
  • Online

About this event

with Harry Smith and Emily Vine of the Material Culture of Wills, 1540-1790 project at the University of Exeter

Advances in machine learning present new opportunities for historical research, allowing new forms of analysis, and enabling digitization and transcription to happen at a far greater pace than was previously possible. This is particularly true for the early modern period since earlier automatic transcription methods using optical character recognition (OCR) performed poorly when applied to manuscript sources.

The rapid improvement of handwritten text recognition (HTR) methods has, however, opened up the possibility of digitizing manuscript sources at a scale not previously possible. Any output from such models needs to be checked, and the use of volunteer labour to do so is increasingly common. In this session we discuss the use of HTR models to transcribe 25,000 wills from the sixteenth, seventeenth and eighteenth centuries as part of the Material Culture of Wills project.

This project used the HTR platform Transkribus to transcribe the contents of these wills, and has also worked with a small group of expert volunteers (to produce training data) and a wider pool of volunteers through the Zooniverse platform (to check model outputs). This combination of human and algorithmic labour has proved successful in developing accurate transcriptions of our wills sample. We will discuss our approach to this task and the lessons learnt.

All welcome (not restricted to post-docs).


Organized by

Warwick History ‘Post-Doc’ club

Free
Oct 22 · 9:00 AM PDT