Materials


The following is a list of online courses and other resources that might be useful preparation for the MS ADS program. Completion of these courses does not replace the official program prerequisites. Rather, this page is mainly intended for prospective or entering students who may wish to reinforce their preparation before the program starts.


Python

R

SQL

Data Visualization

SAS

Microsoft Excel

Version Control (GitHub)

Datasets

Machine Learning and Statistics

Effective Writing




Python
Anaconda Installation Installation guide for Anaconda, a popular open-source Python distribution for data science
Getting Started with Jupyter Notebooks Installation of Jupyter Notebook, a web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text
Python Cheatsheet Cheatsheet of basic and advanced Python syntax
Python for Non-Programmers Tutorials for non-programmers to get started with Python
Programming in Python Introduction to programming in Python for people with little or no previous programming experience
LearnPython.org Short, interactive tutorial for those who just need a quick way to pick up Python syntax
Python Data Science Handbook Free online book by Jake VanderPlas covering IPython, NumPy, pandas, Matplotlib, and scikit-learn
Learn Python, Break Python: A Beginner’s Guide to Programming Hands-on introduction to the Python programming language
R
Install R and RStudio A step-by-step installation guide for beginners
RStudio Cheatsheets Cheatsheet of basic and advanced R syntax
R Tutorial Developed for students who are new to R but have had some basic experience working with computers
Programming with R Basic concepts that the R programming language depends on
Introduction to Data Analysis with R by David Langer Videos A playlist providing end-to-end data science training, including data exploration, data wrangling, data analysis, data visualization, feature engineering, and machine learning
R for Data Science by Garrett Grolemund and Hadley Wickham Free e-book that teaches you how to get data into R, get it into the most useful structure, transform it, visualise it and model it
Advanced R by Hadley Wickham A book designed primarily for R users who want to improve their programming skills and understanding of the language
Text Mining with R: A Tidy Approach by Julia Silge and David Robinson An introduction to text mining using the tidytext package and other tidy tools in R
SQL
How to Install SQL Server A tutorial on how to install SQL Server
SQL Cheat Sheet Cheatsheet of most commonly used SQL statements for your reference
SQL Server Microsoft Documentation Tutorials that help you learn new functionality in SQL Server
Intro to SQL by Khan Academy (Course) Comprehensive video series that covers the most important SQL topics
Databases and SQL How to use a database to explore the expeditions data
Head First SQL by Lynn Beighley Basic concepts of SQL and its applications
SQL Tutorial by w3schools Tutorial on how to use SQL in MySQL, SQL Server, MS Access, Oracle, Sybase, Informix, Postgres, and other database systems
Data Visualization
Data Visualisation Resources A comprehensive list of different types of data visualizations
Tableau Free Training Videos Tutorial on how to prepare, analyze, and share your data
Microsoft Power BI Guided Learning Microsoft's sequenced collection of courses for understanding the extensive capabilities of Power BI
Using the ggplot library in R Video series on how to visualize data with the ggplot library in R
R Shiny Tutorials on using Shiny, an R package that makes it easy to build interactive web apps straight from R
Using the matplotlib library in Python Video series on how to visualize data with the matplotlib library in Python
Python Seaborn Tutorial Tutorials on Seaborn, a high-level interface to matplotlib
Bokeh Tutorials on Bokeh, an interactive visualization library for modern web browsers
SAS
Free SAS E-Learning for Academics Free online courses and teaching materials from SAS
SAS Learning Module by UCLA Freely accessible SAS Learning Modules from UCLA
Microsoft Excel
Excel Training Documents A comprehensive guide on how to use Excel effectively
Excel 2013 tutorial Tutorial on how to create formulas and charts, use functions, format cells, and do more with your spreadsheets
Microsoft Excel by Analytics Vidhya Basic Excel operations, leading up to advanced features such as pivot tables and conditional formatting
Excel for Windows training Excel tutorial provided directly by Microsoft
Version Control (GitHub)
GitHub Getting started with GitHub
GitHub Learning Lab Grow your skills by completing fun, realistic projects from GitHub
Version Control with Git Basic concepts of version control and how to use GitHub to keep track of what you’ve done and to collaborate with other people
Git Cheatsheets Cheatsheets of basic and advanced Git commands
Datasets
UC Irvine Machine Learning Repository Currently maintaining 557 data sets as a service to the machine learning community
Biomedical & Life Sciences Datasets Library resources in Biology by Brown University
Awesome Public Datasets A list of high-quality, topic-centric public data sources collected from blogs, answers, and user responses
Kaggle Datasets Over 19,000 public datasets and 200,000 public notebooks to support your analysis
The home of the U.S. Government’s open data Data, tools, and resources to conduct research, develop web and mobile applications, and design data visualizations
Bureau of Labor Statistics BLS Public Data gives the public access to raw economic data from all BLS programs
Amazon Web Services (AWS) datasets Helps people discover and share datasets that are available via AWS resources
BigQuery Public Datasets Public datasets that you can analyze using either legacy SQL or standard SQL queries
YouTube-8M Segments Dataset A large-scale labeled video dataset that consists of millions of YouTube video IDs, with high-quality machine-generated annotations from a diverse vocabulary of 3,800+ visual entities
Harvard Dataverse A repository for research data by Harvard University
Large Health Data Sets A freely accessible repository for healthcare data
World Bank Open Data Free and open access to global development data
Popular Open Source Datasets Popular open source and public data sets, data visualization, data analytics and data lakes
Machine Learning and Statistics
Think Stats: Probability and Statistics for Programmers by Allen B. Downey A computational approach to using statistics to explore large datasets
Think Bayes An introduction to Bayesian statistics using computational methods
Introduction to Probability An introductory probability course
Statistical inference for data science A brief but rigorous treatment of statistical inference intended for practicing data scientists
An Introduction to Statistical Learning with Applications in R An introduction to statistical learning methods, aimed at upper-level undergraduate students, master's students, and Ph.D. students in the non-mathematical sciences
The Elements of Statistical Learning: Data Mining, Inference, and Prediction A valuable resource for statisticians and anyone interested in data mining in science or industry
Machine Learning by Andrew Ng A comprehensive series on machine learning
Machine Learning Yearning A free e-book from Andrew Ng that teaches you how to structure machine learning projects
Neural Networks and Deep Learning A free online book on neural networks and deep learning, techniques that provide strong solutions to many problems in image recognition, speech recognition, and natural language processing
Mathematics for Machine Learning A series of courses on the foundational mathematics for machine learning, offered by Imperial College London
Effective Writing
Effective Writing in the Information Age Improving your technical writing skills
Ten Simple Rules for Better Figures Ten simple rules for effective visualization of data
APA Style Introduction A workshop that provides an overview of APA (American Psychological Association) style in technical writing
APA Videos by Walden University Videos on APA style
APA 7 Tutorial on APA 7th edition style