1. Course description.
This two-day course covers one of the most exciting and current topics within the R community. Although R has traditionally not been used for Big Data analytics because of its memory limitations, recent R packages provide much-needed connectivity for out-of-memory processing with popular Big Data tools such as Hadoop, Spark, and SQL and NoSQL databases. During this intense, skills-oriented course you will learn:
- to use third-party R packages that support parallel computing to increase the speed and processing capabilities of R,
- to work on large data sets in the Cloud (Microsoft Azure and Amazon EC2) through R deployed on the server,
- to implement the MapReduce framework through Hadoop straight from the R console,
- to manage the Hadoop Distributed File System (HDFS) and the HBase database through R,
- to connect to major relational SQL-based database management systems (RDBMSs) and extract, aggregate and manage their data using a variety of R packages,
- to apply NoSQL queries to access, transform and manipulate large data sets in MongoDB using R packages,
- to improve the data flow and speed of processing of large data sets through R’s connectivity with Spark,
- to implement selected Big Data tools in the Big Data Product Cycle with R.
During this course attendees will run R on a Hadoop/Spark cluster and will accomplish several simple parallel jobs.
The course will be presented by Simon Walkowiak - the author of "Big Data Analytics with R" and Mind Project's expert in Big Data architecture for predictive modelling. Simon has delivered numerous "Big Data Methods in R" training courses at various institutions, financial/business organisations, governmental departments and UK universities (including the Big Data & Analytics Summer School organised by the Institute for Analytics and Data Science). He is also a former Data Curator at the UK Data Archive - the largest socio-economic digital data depository in Europe.
The course will run for two days from 9:30am until ~5:00pm and will consist of alternating lecture-style presentations and practical tutorials. The example datasets used during tutorial sessions will come from the social sciences, economics and business; however, the contents may vary depending on the specific interests of participants (based on the Participant's Skills Inventory). There will be two 15-minute coffee/tea breaks and one 1-hour lunch break each day.
3. What is included?
Apart from the contents of the course, Mind Project will provide the participants with the following:
- a digital Course Manual (on a USB memory stick) including all presentation slides, R course code and a list of reference books and online resources,
- additional home exercises and all data sets available to download,
- Wi-Fi access,
- Central London location - at the Ironmongers' Hall, next to the Barbican station and the Museum of London,
- networking opportunity,
- Mind Project course attendance certificate.
4. Further instructions.
- To benefit fully from the course, we recommend that attendees bring their personal laptops to the session with the most recent versions of R and RStudio installed and at least one of the following web browsers: Chrome, Safari, Mozilla Firefox or Internet Explorer. As R is a free environment, you can download it directly from the www.r-project.org website; RStudio is available at https://www.rstudio.com/products/rstudio/#Desktop. Please contact us should you have any questions or issues with the installation process. No specific R packages need to be installed before the course (the course tutors will explain which packages are needed during the training).
- This course is targeted at users with some R experience (preferably at Intermediate level) and an interest in Big Data analytics. Our “Applied Data Science in R” training course is a good prerequisite for this course.
- Participants are encouraged to complete the online Participant's Skills Inventory available at http://mindproject.co.uk/skillsinventory.html to allow Mind Project and our course tutors to customise the contents of the course depending on the level of participants' knowledge and their areas of interest. The data obtained through the Participant's Skills Inventory will be kept fully confidential and will only be used to provide quality data analysis training.
- By purchasing a place on one of our courses you agree to the Terms and Conditions available at http://mindproject.co.uk/trainingterms.html. Please read the Terms and Conditions before making a booking.
- The deadline for registrations on this training course is Tuesday, 11th of October 2016 at 16:00 London (UK) time. However, Mind Project reserves the right to end the registration process earlier if all places are booked before the deadline.
Should you have any questions please contact Mind Project Ltd at firstname.lastname@example.org or by phone on 0203 322 3786. Please visit the course website at http://mindproject.co.uk/bigdata-london-oct16.html.