SADiLaR GROBID-Dictionaries workshop (Stellenbosch)



Stellenbosch Institute for Advanced Studies

Stellenbosch, Western Cape 7599

South Africa


SADiLaR GROBID-Dictionaries workshop

Over the last decade there has been a worldwide effort to make physical lexical resources, such as dictionaries, available in digital format. Many digitised lexical resources remain unexploited because their content is unstructured, and, given their complexity, structuring them manually is costly. Over the same period, many natural language processing tools have become available to the lexicography community, but the limited usability of several e-lexicography tools is a serious obstacle for researchers with little or no background in computer science.

GROBID-Dictionaries is the first machine learning infrastructure for automatically structuring digitised dictionaries, originally in PDF format, independently of the language or the lexicographic school or style. The system allows for the cascading extraction of lexical information using pre-trained models and the serialisation of each structuring level in a TEI-compliant output. The goal of the workshop is for participants to become familiar with the training process for each model in the infrastructure and to learn how to use these models to drastically speed up the encoding of lexical samples in TEI.

Recommended reading


Lexicographers, librarians, and scholars involved in the digitisation of dictionaries and lexical resources are invited.

Workshop location

University of Stellenbosch


Participation in the workshop is FREE.

NB: Limited space available.

Coffee, tea and a light lunch will be provided.


Please register on or before 28 October 2018.

If you have any questions, please contact Roald Eiselen via e-mail -

Phone (SADiLaR office): 018 285 2046, or phone Ms Charmaine Jacobs: 076 529 7888