In this tutorial we will look at some fundamental skills that will get you started with text mining. To kick off, we will learn how to tell the computer what to search for. We will start with simple search operations and explore their limitations. After that, we will look at more complex search operations. We will also introduce the first data wrangling steps, for example converting text data into other formats for further processing.
First, we will look at how a computer searches in a text. We then cover relatively simple searching in documents and identify its limits. This is followed by an explanation of more complex search operations, along with practical exercises. We end by reasoning about complex search and replace operations (and potentially a brief explanation of the theory behind them).
This tutorial is for people who want to improve their search skills. It is accessible to humanities and social sciences students and researchers with no prior exposure to programming. We will not be covering any advanced text mining strategies or tools. The skills learned will also be applicable in other aspects of research, e.g. literature reviews.
Learning outcomes:
- Perform simple and more complex search operations in texts
- Automatically identify patterns in text (such as dates, numbers, etc.)
- Perform search and replace operations that allow for simple data wrangling (in other words, to convert text into data formats for further processing)
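
To give a feel for these three outcomes, here is a minimal sketch in Python. It assumes that the "more complex search operations" are pattern-based searches such as regular expressions, which this overview does not name explicitly; the sample sentence, the DD/MM/YYYY date format, and all variable names are made up for illustration.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "The meeting on 12/03/2021 was moved to 05/04/2021."

# 1. Simple search: is this exact string present in the text?
has_meeting = "meeting" in text

# 2. Pattern search: two digits, a slash, two digits, a slash, four digits.
#    The parentheses capture day, month, and year as separate groups.
date_pattern = re.compile(r"(\d{2})/(\d{2})/(\d{4})")
found_dates = [match.group(0) for match in date_pattern.finditer(text)]

# 3. Search and replace: rewrite each DD/MM/YYYY date as YYYY-MM-DD
#    by reordering the captured groups.
converted = date_pattern.sub(r"\3-\2-\1", text)

print(has_meeting)
print(found_dates)
print(converted)
```

The simple search can only find the exact string it is given, whereas the pattern describes the shape of a date and therefore finds both dates without knowing them in advance. The replace step reuses the same pattern to turn the dates into a format that sorts correctly and is easier to process further.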