Web scraping with R

Online event

A one-day course for learning and practising effective skills in collecting and analysing online data using R.

About this event


In recent years, social scientists have become increasingly interested in collecting and analysing data from online sources. Online data come in many shapes and forms – from traditionally offline data made available online (e.g., newspaper articles, speeches) to new kinds of data (e.g., social media).

Despite the growing interest in such data and in the online environment in general, learning to access the data is seldom part of university curricula. This workshop will provide an introduction to the two most prominent ways of collecting such data: APIs (application programming interfaces) and screen scraping.

The workshop will include hands-on exercises in R. To get the most out of the workshop, participants should ideally have some prior experience with R (installing and loading packages, assigning variables, using existing functions).

Participants will learn:

- What characterises online data, and what its advantages and disadvantages are

- How to access the data in R, using both APIs and screen scraping of static websites

- How to process the data into a structured format


Participants will need:

- R and RStudio installed

About the instructor

Renata Topinkova is a PhD candidate in Sociology at the Czech Academy of Sciences. Her interests lie in data-heavy quantitative research, including the study of behaviour on online dating platforms, age homophily in partner preferences, and the relationship between capital and housework. More details about her research can be found here.

