October 2016 (Senate House, London)
We are very excited to invite you to this intensive course on data science.
Tickets are available for individual days and for multiple days. Discounts apply to multi-day purchases and to student, academic and charity staff. For four or more attendees from the same company, a bespoke package is often cheaper; please contact us directly at firstname.lastname@example.org
- Comprehensive set of printed notes;
- Each course comes with its own R package that contains exercises and solutions;
- Attendance certificate from Jumping Rivers;
- Networking opportunities;
- Small class sizes;
- Central London location.
Oct 7th: R for Big data
This unit is a practical introduction to dealing with large data sets in R. We'll cover hardware, programming with Rcpp, out-of-memory datasets and SparkR.
Participants are expected to have previous R experience; in particular, they should be familiar with the topics in the Programming with R course.
The course will be structured as follows:
- Hardware: a brief overview of CPU, memory sizes and RAM. The benefit of switching to the cloud.
- Rcpp: leveraging C++ for slow operations
- The remainder of the course will consider three classes of data sets:
  - Large in-memory data sets: the dplyr package
  - Out-of-memory data sets: ff and the bigmemory suite of packages
  - Distributed data sets: Spark
Participants are encouraged to bring their own data sets and associated problems to the event.