The following is a list of online courses and other resources that might be useful preparation for the MS ADS program. Completion of these courses does not replace the official program prerequisites. Rather, this page is mainly intended for prospective or entering students who may wish to reinforce their preparation before the program starts.
• Python
• R
• SQL
• SAS
• Datasets
• Machine Learning and Statistics
Python | |
---|---|
Anaconda Installation | Installation of Anaconda, a popular open-source platform for Python in data science |
Getting Started with Jupyter Notebooks | Installation of Jupyter Notebook, a web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text |
Visual Studio (VS) | A versatile code editor supporting multiple programming languages with features for development operations, debugging, and version control |
Python for Non-Programmers | Tutorials for non-programmers to get started with Python |
Programming in Python | Introduction to programming in Python for people with little or no previous programming experience |
LearnPython.org | Short, interactive tutorial for those who just need a quick way to pick up Python syntax |
Python Data Science Handbook | Free online book by Jake VanderPlas covering IPython, NumPy, pandas, Matplotlib, and scikit-learn |
Learn Python, Break Python: A Beginner’s Guide to Programming | Hands-on introduction to the Python programming language |
R | |
Install R and RStudio | A step-by-step installation guide for beginners |
RStudio Cheatsheets | Cheatsheets covering basic and advanced R syntax |
R Tutorial | Developed for students who are new to R but have had some basic experience working with computers |
Programming with R | Basic concepts that the R programming language depends on |
Introduction to Data Analysis with R by David Langer Videos | A playlist providing end-to-end data science training, including data exploration, data wrangling, data analysis, data visualization, feature engineering, and machine learning |
R for Data Science by Garrett Grolemund and Hadley Wickham | Free e-book that teaches you how to get data into R, get it into the most useful structure, transform it, visualise it and model it |
Advanced R by Hadley Wickham | A book designed primarily for R users who want to improve their programming skills and understanding of the language |
Text Mining with R: A Tidy Approach by Julia Silge and David Robinson | An introduction to text mining using the tidytext package and other tidy tools in R |
SQL | |
MySQL Workbench from Oracle | A visual tool for database design, SQL development, and server management |
How to Install SQL Server | A tutorial on how to install SQL Server |
SQL Cheat Sheet | Cheatsheet of the most commonly used SQL statements |
SQL Server Microsoft Documentation | Tutorials that help you learn new functionality in SQL Server |
Intro to SQL by Khan Academy (Course) | Comprehensive video series that covers the most important SQL topics |
Databases and SQL | How to use a database to explore the expeditions data |
Head First SQL by Lynn Beighley | Basic concepts of SQL and its applications |
SQL Tutorial by w3schools | SQL tutorial on how to use SQL in: MySQL, SQL Server, MS Access, Oracle, Sybase, Informix, Postgres, and other database systems |
Data Visualization | |
Data Visualisation Resources | A comprehensive list of different types of data visualizations |
Tableau Free Training Videos | Tutorial on how to prepare, analyze, and share your data |
Microsoft Power BI Guided Learning | A sequenced collection of courses from Microsoft that introduces the extensive capabilities of Power BI |
Using the ggplot library in R | Video series on how to visualize data with the ggplot library in R |
R Shiny | Tutorials on using Shiny, an R package that makes it easy to build interactive web apps straight from R |
Using the matplotlib library in Python | Video series on how to visualize data with the matplotlib library in Python |
Python Seaborn Tutorial | Tutorials on Seaborn, a high-level interface to matplotlib |
Bokeh | Tutorials on Bokeh, an interactive visualization library for modern web browsers |
SAS | |
Free SAS E-Learning for Academics | Free online courses and teaching materials from SAS |
SAS Learning Module by UCLA | Freely accessible SAS Learning Modules from UCLA |
Microsoft Excel | |
Excel Training Documents | A comprehensive guide on how to use Excel effectively |
Excel 2013 tutorial | Tutorial on how to create formulas and charts, use functions, format cells, and do more with your spreadsheets |
Microsoft Excel by Analytics Vidhya | Basic Excel operations, leading up to advanced features such as pivot tables and conditional formatting |
Excel for Windows training | Excel tutorial provided directly by Microsoft |
Version Control (GitHub) | |
GitHub | Getting started with GitHub |
GitHub Learning Lab | Grow your skills by completing fun, realistic projects from GitHub |
Version Control with Git | Basic concepts of version control and how to use Git to keep track of what you’ve done and to collaborate with other people |
Git Cheatsheets | Cheatsheets of basic and advanced Git commands |
Datasets | |
UC Irvine Machine Learning Repository | Currently maintaining 557 data sets as a service to the machine learning community |
Biomedical & Life Sciences Datasets | Library resources in Biology by Brown University |
Awesome Public Datasets | A list of high-quality, topic-centric public data sources, collected from blogs, answers, and user responses |
Kaggle Datasets | Over 19,000 public datasets and 200,000 public notebooks to support your analyses |
The home of the U.S. Government’s open data | Data, tools, and resources to conduct research, develop web and mobile applications, and design data visualizations |
Bureau of Labor Statistics | BLS Public Data gives the public access to raw economic data from all BLS programs |
Amazon Web Services (AWS) datasets | Helps people discover and share datasets that are available via AWS resources |
BigQuery Public Datasets | Public datasets are available for you to analyze using either legacy SQL or standard SQL queries |
YouTube-8M Segments Dataset | A large-scale labeled video dataset that consists of millions of YouTube video IDs, with high-quality machine-generated annotations from a diverse vocabulary of 3,800+ visual entities |
Harvard Dataverse | A repository for research data by Harvard University |
Large Health Data Sets | A freely accessible repository for healthcare data |
World Bank Open Data | Free and open access to global development data |
Popular Open Source Datasets | Popular open source and public data sets, data visualization, data analytics and data lakes |
Machine Learning and Statistics | |
Think Stats: Probability and Statistics for Programmers by Allen B. Downey | Computational approach to the use of statistics to explore large datasets |
Think Bayes | An introduction to Bayesian statistics using computational methods |
Introduction to Probability | An introductory probability course |
Statistical inference for data science | A brief but rigorous treatment of statistical inference intended for practicing data scientists |
An Introduction to Statistical Learning with Applications in R | An introduction to statistical learning methods, aimed at upper-level undergraduate students, master's students, and Ph.D. students in the non-mathematical sciences |
The Elements of Statistical Learning: Data Mining, Inference, and Prediction | A valuable resource for statisticians and anyone interested in data mining in science or industry |
Machine Learning by Andrew Ng | A comprehensive series on machine learning |
Machine Learning Yearning | A free e-book from Andrew Ng that teaches you how to structure machine learning projects |
Neural Networks and Deep Learning | A free online book on neural networks and deep learning, techniques that provide strong solutions to many problems in image recognition, speech recognition, and natural language processing |
Mathematics for Machine Learning | A series of courses on the foundational mathematics for machine learning, offered by Imperial College London |
Effective Writing | |
Effective Writing in the Information Age | Improving your technical writing skills |
Ten Simple Rules for Better Figures | Ten simple rules for effective visualization of data |
APA Style Introduction | A workshop that provides an overview of APA (American Psychological Association) style in technical writing |
APA Videos by Walden University | Videos on APA style |
APA 7 | Tutorial on APA 7 |