LexiVault Workshop

Join us at the LexiVault workshop for a deep dive into the user-friendly web tool for querying annotated lexicons of low-resource languages.

Date and time

Thu, 12 Jun 2025 10:00 - Fri, 13 Jun 2025 16:00 GMT+1

Location

Francis Bancroft Building

Room 1.06, Francis Bancroft Building, London E1 4NS, United Kingdom

Agenda

10:00 AM - 12:00 PM

Day 1: Morning Session

1:00 PM - 4:00 PM

Day 1: Afternoon Session

10:00 AM - 12:00 PM

Day 2: Morning Session

1:00 PM - 4:00 PM

Day 2: Afternoon Session

About this event

  • Event lasts 1 day 6 hours

LexiVault workshop (led by Samantha Wray, the LexiVault Lead Developer)

LexiVault is an open-source, user-friendly web tool, developed as part of the SAVANT project, for querying annotated lexicons. It has been primarily developed for, but is not restricted to, the support of psycholinguistic research on low-resource languages. Psycholinguistically relevant measures from word frequency to phonological neighborhood density are readily available for well-resourced languages, whereas lesser-studied languages come with substantial overhead for the researcher to build corpora and calculate these measures from scratch. LexiVault aims to close that gap.


Currently, the tool hosts lexicons of Tagalog, Bangla, and multiple Arabic dialects, with searchable annotations including part-of-speech tags, morpheme frequency, transition probability, and more. We'd like to expand our offerings while helping you convert your bits and bobs of language data into a usable, shareable resource! This workshop is intended for those with any amount of corpus or behavioral data that they would like to process or annotate further for storage and use on the LexiVault site.

The focus of this two-day workshop will differ from individual to individual depending on the starting state of your dataset and your interests, but could take the following forms:

  • Automatic transcription of auditory data to create a text corpus from speech
  • Stemming a text corpus to create a list of morphemes and their frequencies
  • Part-of-speech tagging a text corpus
  • Calculating minimal pairs and phonological neighborhood density from a text corpus

And finally, all paths lead to your resource being in a form you (and others!) can easily query in the future.
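To give a rough idea of what calculating minimal pairs and phonological neighborhood density can involve, here is a minimal Python sketch. It is an illustration only, not LexiVault's own implementation. It assumes each word is a string in which every character stands for one phoneme, and it counts as neighbors any two forms that differ by exactly one substitution, insertion, or deletion. The function names and the toy lexicon are hypothetical.

```python
from itertools import combinations


def differs_by_one_edit(a, b):
    """True if a and b differ by exactly one substitution, insertion, or deletion."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    shorter, longer = sorted((a, b), key=len)
    # Deleting one segment from the longer form must yield the shorter one.
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def neighborhood_density(lexicon):
    """Return (density per form, list of minimal pairs) for a list of forms."""
    forms = sorted(set(lexicon))
    density = {form: 0 for form in forms}
    minimal_pairs = []
    for a, b in combinations(forms, 2):
        if differs_by_one_edit(a, b):
            density[a] += 1
            density[b] += 1
            # Minimal pairs are the same-length neighbors (one substitution).
            if len(a) == len(b):
                minimal_pairs.append((a, b))
    return density, minimal_pairs


# Toy lexicon, invented for illustration (one character = one phoneme).
toy_lexicon = ["kat", "bat", "kab", "at", "dog"]
density, minimal_pairs = neighborhood_density(toy_lexicon)
print(density)
print(minimal_pairs)
```

The pairwise comparison is quadratic in lexicon size, which is fine for small word lists. Larger lexicons would likely need an indexed approach, and real data would need a proper phonemic transcription rather than raw spelling.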

Frequently asked questions

What is the room number?

Room 1.06, Francis Bancroft Building

Organised by