2017-18 Open-Course Challenges: Data Science on IS

Inspired by Scott H. Young's MIT Challenge, I decided to design an Open-Course Challenge for myself. I hope that I can accomplish these goals in time.

Good luck!


  1. I am a first-year PhD candidate in Information Systems, so the main goal of this challenge is to consolidate my academic foundation and broaden my knowledge. Instead of aiming to obtain certificates (as on a MOOC), I try to selectively learn the materials that are beneficial for my research and study.
  2. As I’m interested in Machine Learning, Financial Engineering and FinTech, the learning materials focus on these areas.
  3. I’ve collected all the courses that I think are useful for me and listed them here. However, you don’t need to finish all of them (which is almost impossible!). Just select the courses that are beneficial for your research and enjoy them!
  4. Disclaimer: This is a collection of online materials; all rights are reserved by the original authors according to the relevant references. If any copyright infringement exists, please email me via the Email button under the Overview sidebar on the right side of the page.

General Information

Total Number Total Progress
OnProgress 4.77/13

  • $\mathbb{UG}$: Undergraduate Level
  • $\mathbb{G}$: Graduate Level
  • : It is best to watch the videos at normal speed.
  • : You can watch the videos at 1.25X speed.
  • : You can watch the videos at 1.5X speed or faster.


Math Tree


Linear Algebra & Vector Calculus

  • $\mathbb{UG}$ MIT 18.02 Multivariable Calculus, Fall 2007, by Denis Auroux: Course Page, Videos
    • Prerequisites: MIT 18.01 - Single Variable Calculus
    • Course Description:
      • This course covers vector and multi-variable calculus. It is the second semester in the freshman calculus sequence. Topics include vectors and matrices, partial derivatives, double and triple integrals, and vector calculus in 2 and 3-space.
      • MIT OpenCourseWare offers another version of 18.02, from the Spring 2006 term. Both versions cover the same material, although they are taught by different faculty and rely on different textbooks. Multivariable Calculus (18.02) is taught during the Fall and Spring terms at MIT, and is a required subject for all MIT undergraduates.
  • $\mathbb{UG}$ MIT 18.06 Linear Algebra by Gilbert Strang: Course Page|Spring2005, Videos|Spring2005, Course Page|Fall2011, Videos-Recitations|Fall2011, Course Page|Spring2010 15/35
    • Prerequisites: MIT 18.02 - Multivariable Calculus
    • Course Description: This is a basic subject on matrix theory and linear algebra. Emphasis is given to topics that will be useful in other disciplines, including systems of equations, vector spaces, determinants, eigenvalues, similarity, and positive definite matrices.

Real Analysis

  • $\mathbb{UG}$MIT 18.100C - Real Analysis by Paul Seidel, Fall 2012: Course Page

Functional Analysis

  • $\mathbb{UG}$MIT 18.102 - Introduction to Functional Analysis by Richard Melrose, Spring 2009: Course Page
    • Prerequisites:
      • Analysis I (18.100)
      • Linear Algebra (18.06), Linear Algebra (18.700), or Algebra I (18.701)
    • Course Description: This is an undergraduate course. It will cover normed spaces, completeness, functionals, Hahn-Banach theorem, duality, operators; Lebesgue measure, measurable functions, integrability, completeness of L-p spaces; Hilbert space; compact, Hilbert-Schmidt and trace class operators; as well as the spectral theorem.
  • $\mathbb{UG}$UCCS Math 535 - Applied Functional Analysis by Greg Morrow: Course Page, Videos
    • Course Description: This course is an introduction to the basic concepts, methods and applications of functional analysis. Topics covered will include metric spaces, normed spaces, Hilbert spaces, linear operators, spectral theory, fixed point theorems and approximation theorems.
  • $\mathbb{UG}$ Coursera École Centrale - Introduction to Functional Analysis: Videos 1/56

Numerical Analysis

  • $\mathbb{UG}$MIT 18.330 Introduction to Numerical Analysis by Laurent Demanet, Spring 2012: Course Page
    • Prerequisites: Calculus (18.01), Calculus (18.02), and Differential Equations (18.03). Some exposure to linear algebra (matrices) at the level of Linear Algebra (18.06) helps, but is not required. The assignments will involve basic computer programming in the language of your choice (Matlab recommended; this class encourages you to learn Matlab if you don’t already know it).
    • Course Description: This course analyzed the basic techniques for the efficient numerical solution of problems in science and engineering. Topics spanned root finding, interpolation, approximation of functions, integration, differential equations, direct and iterative methods in linear algebra.
  • $\mathbb{G}$ MIT 18.085 Computational Science and Engineering I by Gilbert Strang, Fall 2008: Course Page, Videos
    • Prerequisites: Calculus of Several Variables (18.02) and Differential Equations (18.03) or Honors Differential Equations (18.034)
    • Textbook: Strang, Gilbert. Computational Science and Engineering. Wellesley, MA: Wellesley-Cambridge Press, 2007. ISBN: 9780961408817.
    • Course Description: This course provides a review of linear algebra, including applications to networks, structures, and estimation, Lagrange multipliers. Also covered are: differential equations of equilibrium; Laplace’s equation and potential flow; boundary-value problems; minimum principles and calculus of variations; Fourier series; discrete Fourier transform; convolution; and applications.
    • Note: This course was previously called “Mathematical Methods for Engineers I.”
    • Course Outline: This course has four major topics:
      • Applied linear algebra (so important!)
      • Applied differential equations (for engineering and science)
      • Fourier methods
      • Algorithms (lu, qr, eig, svd, finite differences, finite elements, FFT)
  • $\mathbb{G}$ MIT 18.086 - Mathematical Methods for Engineers II by Gilbert Strang, Spring 2006: Course Page, Videos
    • Prerequisites: Calculus (18.02), Differential Equations (18.03) or Honors Differential Equations (18.034).
    • Textbook: Strang, Gilbert. Introduction to Applied Mathematics. Wellesley, MA: Wellesley-Cambridge Press, 1986. ISBN: 9780961408800.
    • Course Description: This graduate-level course is a continuation of Mathematical Methods for Engineers I (18.085). Topics include numerical methods; initial-value problems; network flows; and optimization.
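
Root finding is the first topic listed for MIT 18.330 above. As a small illustration of what that topic involves, here is a minimal sketch of the bisection method in plain Python. It is not taken from any of the courses (18.330 itself recommends Matlab); the function name `bisect` and its parameters are hypothetical names chosen for this example.

```python
import math


def bisect(f, lo, hi, tol=1e-10, max_iter=200):
    """Approximate a root of f in [lo, hi] by repeatedly halving the interval.

    Assumes f is continuous and that f(lo) and f(hi) have opposite signs.
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        # Stop on an exact root or once the bracket is narrower than the tolerance.
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        # Keep the half-interval on which the sign change still occurs.
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Example: the positive root of x^2 - 2 is sqrt(2).
root = bisect(lambda x: x * x - 2.0, 0.0, 2.0)
```

Bisection converges slowly (one bit of accuracy per step) but never loses the bracket, which is why it is usually the first root finder taught before Newton's method.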

Stochastic Analysis

  • $\mathbb{UG}$MIT EECS 6.262 - Discrete Stochastic Processes by Robert Gallager, Spring 2011: 1h20min/Lec, 25 Lectures, Course Page, Videos
    • Prerequisites: MIT 6.041/6.431 Probabilistic Systems Analysis and Applied Probability
    • Course Description: Discrete stochastic processes are essentially probabilistic systems that evolve in time via random changes occurring at discrete fixed or random intervals. This course aims to help students acquire both the mathematical principles and the intuition necessary to create, analyze, and understand insightful models for a broad range of these processes. The range of areas for which discrete stochastic-process models are useful is constantly expanding, and includes many applications in engineering, physics, biology, operations research and finance.

Optimization


  • $\mathbb{UG}$Stanford EE364 Convex Optimization: Videos 1/37
    • Part A: by Sanjay Lall: 1h15min/Lec, 19 Lectures, Course Page 1, Course Page 2
      • Prerequisites:
        • Good knowledge of linear algebra (as in EE263), and exposure to probability. Exposure to numerical computing, optimization, and application fields helpful but not required; the applications will be kept basic and simple.
        • You will use one of CVX (Matlab), CVXPY (Python), or Convex.jl (Julia), to write simple scripts, so basic familiarity with elementary programming will be required. We refer to CVX, CVXPY, and Convex.jl collectively as CVX.
      • Course Description: Concentrates on recognizing and solving convex optimization problems that arise in applications. Convex sets, functions, and optimization problems. Basics of convex analysis. Least-squares, linear and quadratic programs, semidefinite programming, minimax, extremal volume, and other problems. Optimality conditions, duality theory, theorems of alternative, and applications. Interior-point methods. Applications to signal processing, statistics and machine learning, control and mechanical engineering, digital and analog circuit design, and finance.
    • Part B by Stephen Boyd and John Duchi: 1h15min/Lec, 18 Lectures, Course Page 1, Course Page 2, Videos
      • Prerequisites: EE364a
      • Course Description: Continuation of 364a. Subgradient, cutting-plane, and ellipsoid methods. Decentralized convex optimization via primal and dual decomposition. Alternating projections. Exploiting problem structure in implementation. Convex relaxations of hard problems, and global optimization via branch & bound. Robust optimization. Selected applications in areas such as control, circuit design, signal processing, and communications. Course requirements include a project or final exam (chosen individually as desired).
  • $\mathbb{UG}$CUHK ENGG 5501 - Foundations of Optimization by Anthony Man-Cho So: Course Page|2015-16, Course Page|2017-18
    • Course Description: In this course we will develop the basic machinery for formulating and analyzing various optimization problems. Topics include convex analysis, linear and conic linear programming, nonlinear programming, optimality conditions, Lagrangian duality theory, and basics of optimization algorithms. Applications from different fields, such as combinatorial optimization, communications, computational economics and finance, machine learning, and signal and image processing, will be used to complement the theoretical developments. No prior optimization background is required for this class. However, students should have workable knowledge in multivariable calculus, real analysis, linear algebra and matrix theory.
    • Very detailed lecture notes

Computer Science

Data Scientist Track


Data structure and algorithms

  • $\mathbb{UG}$MIT 6.006 - Introduction to Algorithms by Erik Demaine and Srinivas Devadas, Fall 2011: 50min/Lec, 47 Lectures, Course Page, Course Page|MIT OpenCourseWare, Videos 3/47
    • Prerequisite: A firm grasp of Python and a solid background in discrete mathematics are necessary prerequisites to this course. You are expected to have mastered the material presented in 6.01 Introduction to EECS I and 6.042J Mathematics for Computer Science.
    • Course Description: This course provides an introduction to mathematical modeling of computational problems. It covers the common algorithms, algorithmic paradigms, and data structures used to solve these problems. The course emphasizes the relationship between algorithms and programming, and introduces basic performance measures and analysis techniques for these problems.
  • $\mathbb{UG}$MIT 6.046J - Design and Analysis of Algorithms by Erik Demaine, Srinivas Devadas and Nancy Lynch, Spring 2015: Course Page, Course Page|MIT OpenCourseWare, Videos
    • Prerequisites: This course is the header course for the Theory of Computation concentration. You are expected, and strongly encouraged, to have taken:
      • 6.006 Introduction to Algorithms
      • 6.042J / 18.062J Mathematics for Computer Science
    • Course Description: This is an intermediate algorithms course with an emphasis on teaching techniques for the design and analysis of efficient algorithms, emphasizing methods of application. Topics include divide-and-conquer, randomization, dynamic programming, greedy algorithms, incremental improvement, complexity, and cryptography.
  • $\mathbb{G}$MIT 18.409 - Algorithmic Aspects of Machine Learning by Ankur Moitra, Spring 2015: Course Page, Course Page|MIT OpenCourseWare, Videos
    • Prerequisites: You will need a strong background in algorithms, probability and linear algebra.
      • 6.046J / 18.410J Design and Analysis of Algorithms or equivalent
      • 6.041SC Probabilistic Systems Analysis and Applied Probability or 18.440 Probability and Random Variables or equivalent.
    • Course Description: This course is organized around algorithmic issues that arise in machine learning. Modern machine learning systems are often built on top of algorithms that do not have provable guarantees, and it is the subject of debate when and why they work. In this class, we focus on designing algorithms whose performance we can rigorously analyze for fundamental machine learning problems.

Databases


  • Stanford Introduction to Databases by Jennifer Widom: Course Page, Collection of course materials, Videos
    • Course Description: This course covers database design and the use of database management systems for applications. It includes extensive coverage of the relational model, relational algebra, and SQL. It also covers XML data including DTDs and XML Schema for validation, and the query and transformation languages XPath, XQuery, and XSLT. The course includes database design in UML, and relational design principles based on dependencies and normal forms. Many additional key database topics from the design and application-building perspective are also covered: indexes, views, transactions, authorization, integrity constraints, triggers, on-line analytical processing (OLAP), JSON, and emerging NoSQL systems. Working through the entire course provides comprehensive coverage of the field, but most of the topics are also well-suited for “a la carte” learning.

Distributed System

  • Stanford CS246 Mining Massive Data Sets by Jeff Ullman, Winter 2017: Course Page, Videos
    • Prerequisites: Students are expected to have the following background:
      • Knowledge of basic computer science principles and skills, at a level sufficient to write a reasonably non-trivial computer program (e.g., CS107 or CS145 or equivalent are recommended).
      • Good knowledge of Java will be extremely helpful since most assignments will require the use of Hadoop which is written in Java.
      • Familiarity with the basic probability theory (CS109 or Stat116 or equivalent is sufficient but not necessary).
      • Familiarity with writing rigorous proofs (at a minimum, at the level of CS 103).
      • Familiarity with basic linear algebra (e.g., any of Math 51, Math 103, Math 113, CS 205, or EE 263 would be much more than necessary).
      • Familiarity with algorithmic analysis (e.g., CS 161 would be much more than necessary).
        The recitation sessions in the first weeks of the class will give an overview of the expected background.
    • Course Description: The course will discuss data mining and machine learning algorithms for analyzing very large amounts of data. The emphasis will be on Map Reduce as a tool for creating parallel algorithms that can process very large amounts of data. Topics include: Frequent itemsets and Association rules, Near Neighbor Search in High Dimensional Data, Locality Sensitive Hashing (LSH), Dimensionality reduction, Recommendation Systems, Clustering, Link Analysis, Large scale supervised machine learning, Data streams, Mining the Web for Structured Data, Web Advertising.
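
The CS246 description above emphasizes Map Reduce as a tool for building parallel algorithms. To make the idea concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps for a word count. It does not use Hadoop or any real cluster API; the function names `map_phase`, `shuffle`, and `reduce_phase` and the sample documents are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


documents = ["big data is big", "data streams"]
# Each document could be mapped on a different machine; here we simply chain the results.
mapped = chain.from_iterable(map_phase(doc) for doc in documents)
counts = reduce_phase(shuffle(mapped))
```

In a real cluster the map calls run in parallel on separate machines and the shuffle moves data over the network; the sketch only shows how the three steps fit together.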

Statistics & Probability

Probability Cheatsheet:

Statistics Cheatsheet:



  • $\mathbb{UG}$MIT 6.041(UG)/6.431(G) - Probabilistic Systems Analysis and Applied Probability by John Tsitsiklis, Fall 2010: 25 Lectures, Course Page, Videos 61/76
    • Prerequisites: MIT 18.02 - Multivariable Calculus
    • Textbook: Bertsekas, Dimitri, and John Tsitsiklis. Introduction to Probability. 2nd ed. Athena Scientific, 2008. ISBN: 9781886529236.
    • Course Description: Welcome to 6.041/6.431, a subject on the modeling and analysis of random phenomena and processes, including the basics of statistical inference. Nowadays, there is broad consensus that the ability to think probabilistically is a fundamental component of scientific literacy. The aim of this class is to introduce the relevant models, skills, and tools, by combining mathematics with conceptual understanding and intuition.
  • $\mathbb{UG}$ MIT 18.440 - Probability and Random Variables by Scott Sheffield, Spring 2014: Course Page
    • Prerequisites: 18.02SC Multivariable Calculus
    • Textbook: Ross, Sheldon. A First Course in Probability. 8th ed. Pearson Prentice Hall, 2009. ISBN: 9780136033134.
    • Course Description: This course introduces students to probability and random variables. Topics include distribution functions, binomial, geometric, hypergeometric, and Poisson distributions. The other topics covered are uniform, exponential, normal, gamma and beta distributions; conditional probability; Bayes theorem; joint distributions; Chebyshev inequality; law of large numbers; and central limit theorem.
  • $\mathbb{G}$MIT 18.175 - Theory of Probability by Scott Sheffield, Spring 2014: Course Page
    • Prerequisites: MIT 18.100C Real Analysis
    • Course Description: This course covers topics such as sums of independent random variables, central limit phenomena, infinitely divisible laws, Levy processes, Brownian motion, conditioning, and martingales.
  • $\mathbb{UG}$/$\mathbb{G}$MIT 18.650 - Statistics for Applications by Philippe Rigollet, Fall 2016: 1h15min/Lec, 21 Lectures, Course Page, Videos 16/22
    • Prerequisites: Probability theory at the level of 18.440 Probability and Random Variables. Some linear algebra (matrices, vectors, eigenvalues).
    • Course Description: This course offers an in-depth look at the theoretical foundations for statistical methods that are useful in many applications. The goal is to understand the role of mathematics in the research and development of efficient statistical methods.
    • Rigollet’s style is very competent, so I suggest watching the videos at normal speed.
  • $\mathbb{G}$ CMU 36-401 Modern Regression by Larry Wasserman: Course Page
    • Prerequisites: A minimum grade of C in any one of the pre-requisites is required. A grade of C is required to move on to 36-402 or any 36-46x course.
      • At least a C grade in (36-226 or 36-625 or 73-407 or 36-310) and (21-240 or 21-241).
    • Textbook: Applied Linear Regression Models, Fourth Edition by Kutner, Nachtsheim and Neter.
    • Course Description: This course is an introduction to applied data analysis. We will explore data sets, examine various models for the data, assess the validity of their assumptions, and determine which conclusions we can make (if any). Data analysis is a bit of an art; there may be several valid approaches. We will strongly emphasize the importance of critical thinking about the data and the question of interest. Our overall goal is to use a basic set of modeling tools to explore and analyze data and to present the results in a scientific report. The course includes a review and discussion of exploratory methods, informal techniques for summarizing and viewing data. We then consider simple linear regression, a model that uses only one predictor. After briefly reviewing some linear algebra, we turn to multiple linear regression, a model that uses multiple variables to predict the response of interest. For all models, we will examine the underlying assumptions. More specifically, do the data support the assumptions? Do they contradict them? What are the consequences for inference? Finally, we will explore extra topics such as nonlinear regression or regression with time-dependent data.
  • $\mathbb{G}$ CMU 36-705 Intermediate Statistics
    • by Larry Wasserman, Fall 2016: Course Page, Videos 1/18
    • by Siva Balakrishnan, Fall 2017: Course Page
    • Textbook: Wasserman, L. (2004). All of Statistics: A concise course in statistical inference.
    • Prerequisites: I assume that the material in Chapters 1-3 of the book (basic probability) is familiar to you. If not, then you should take 36-700 Probability and Mathematical Statistics I.
    • Course Description: This course will cover the fundamentals of theoretical statistics. We will cover Chapters 1-12 from the text plus some supplementary material. This course is excellent preparation for advanced work in statistics and machine learning.
  • $\mathbb{G}$MIT 18.S997 High-Dimensional Statistics by Philippe Rigollet, Spring 2015: 5 Chapters, Course Page 1/5
    • This course is mainly about learning a regression function from a collection of observations. In this chapter, after defining this task formally, we give an overview of the course and the questions around regression. We adopt the statistical learning point of view where the task of prediction prevails. Nevertheless many interesting questions will remain unanswered when the last page comes: testing, model selection, implementation…

Statistical Learning

  • $\mathbb{G}$ Stanford Online-StatLearning Statistical Learning by Trevor Hastie and Rob Tibshirani: 10 Lectures, Course Page, Notes & Exercises 4/10
    • Prerequisites: First courses in statistics, linear algebra, and computing.
    • Textbook: James, Gareth, et al. An Introduction to Statistical Learning. Vol. 112. New York: Springer, 2013.
    • Course Description:
      • This is an introductory-level course in supervised learning, with a focus on regression and classification methods. The syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso); nonlinear models, splines and generalized additive models; tree-based methods, random forests and boosting; support-vector machines. Some unsupervised learning methods are discussed: principal components and clustering (k-means and hierarchical).
      • This is not a math-heavy class, so we try and describe the methods without heavy reliance on formulas and complex mathematics. We focus on what we consider to be the important elements of modern data analysis. Computing is done in R. There are lectures devoted to R, giving tutorials from the ground up, and progressing with more detailed sessions that implement the techniques in each chapter.
  • $\mathbb{G}$CMU 10-702/36-702 - Statistical Machine Learning by Larry Wasserman: Course Page|2016Spring, Videos|2016Spring, Course Page|2017Spring, Videos|2017Spring 1/24
    • Prerequisites: You should have taken 10-701 Introduction to Machine Learning and 36-705 Intermediate Statistics. If you did not take these courses, it is your responsibility to do background reading to make sure you understand the concepts in those courses. We will assume that you are familiar with the following concepts:
      1. Convergence in probability and convergence in distribution.
      2. The central limit theorem and the law of large numbers.
      3. Maximum likelihood, Fisher information.
      4. Bayesian inference.
      5. Regression.
      6. The Bias-Variance tradeoff.
      7. Bayes classifiers; linear classifiers; support vector machines.
      8. Determinants, eigenvalues and eigenvectors.
    • Course Description: Statistical Machine Learning is a second graduate level course in advanced machine learning, assuming students have taken Machine Learning (10-715) - 2014Fall, 2015Fall and Intermediate Statistics (36-705). The term “statistical” in the title reflects the emphasis on statistical theory and methodology. The course combines methodology with theoretical foundations. Theorems are presented together with practical aspects of methodology and intuition to help students develop tools for selecting appropriate methods and approaches to problems in their own research. The course includes topics in statistical theory that are important for researchers in machine learning, including nonparametric theory, consistency, minimax estimation, and concentration of measure.
  • $\mathbb{G}$MIT 9.520 - Statistical Learning Theory and Applications by Tomaso Poggio and Lorenzo Rosasco: Course Page|2015Fall, Course Page|2017Fall, Videos|Fall2015
    • Prerequisites: We will make extensive use of basic notions of calculus, linear algebra and probability. The essentials are covered in class and in the math camp material. We will introduce a few concepts in functional/convex analysis and optimization. Note that this is an advanced graduate course and some exposure on introductory Machine Learning concepts or courses is expected. Students are also expected to have basic familiarity with MATLAB/Octave.
    • Course Description: The course covers foundations and recent advances of Machine Learning from the point of view of Statistical Learning and Regularization Theory … The goal of the course is to provide students with the theoretical knowledge and the basic intuitions needed to use and develop effective machine learning solutions to challenging problems.

Statistical Physics

  • $\mathbb{G}$MIT 8.333 - Statistical Mechanics I: Statistical Mechanics of Particles by Mehran Kardar: Course Page, Videos
    • Prerequisites: 8.044 Elementary Statistical Mechanics and 8.05 Quantum Physics II.
    • Course Description: Statistical Mechanics is a probabilistic approach to equilibrium properties of large numbers of degrees of freedom. In this two-semester course, basic principles are examined. Topics include: Thermodynamics, probability theory, kinetic theory, classical statistical mechanics, interacting systems, quantum statistical mechanics, and identical particles.
  • $\mathbb{G}$MIT 8.334 - Statistical Mechanics II: Statistical Physics of Fields by Mehran Kardar: Course Page, Videos
    • Prerequisites: MIT 8.333 - Statistical Mechanics
    • Course Description: This is the second term in a two-semester course on statistical mechanics. Basic principles are examined in this class, such as the laws of thermodynamics and the concepts of temperature, work, heat, and entropy. Topics from modern statistical mechanics are also explored, including the hydrodynamic limit and classical field theories.


  • $\mathbb{UG}$Charles University in Prague NMFM404 Selected Software Tools for Finance and Insurance by Michal Pešta: Course Page
    • Course Structure:
      • Robust regression
      • Logistic regression and Exact logistic regression
      • Multinomial logistic regression and Ordinal logistic regression
      • Probit regression
      • Poisson regression and Negative binomial regression
      • Zero-inflated Poisson regression and Zero-inflated NB regression
      • Zero-truncated Poisson regression and Zero-truncated NB regression
      • Survival data analysis
      • Proportional hazards
      • Tobit regression
      • Truncated regression and Interval regression
      • Generalized linear mixed models
      • Rector’s day
      • Mixed effects logistic regression
  • $\mathbb{UG}$ ISU - STAT 544 Introduction to Bayesian Statistics by Jarad Niemi: 28 Lectures, Course Page, Videos
    • Prerequisite: STAT 543 or equivalent, e.g. Econ 672.
    • Textbook: Gelman, Andrew, et al. Bayesian data analysis. Vol. 2. Boca Raton, FL: CRC press, 2014.
    • Course Description: Specification of probability models; subjective, conjugate, and noninformative prior distributions; hierarchical models; analytical and computational techniques for obtaining posterior distributions; model checking, model selection, diagnostics; comparison of Bayesian and traditional methods.
  • $\mathbb{UG}$ Statistics by @Professor Leonard on YouTube: 28 videos, 923K+ views, Videos
  • $\mathbb{UG}$MIT 18.05 Introduction to Probability and Statistics by Jeremy Orloff and Jonathan Bloom, Spring 2014: Course Page
    • Course Description: This course provides an elementary introduction to probability and statistics with applications. Topics include: basic combinatorics, random variables, probability distributions, Bayesian inference, hypothesis testing, confidence intervals, and linear regression.
  • $\mathbb{UG}$ Harvard Stat 110 Probability by Joe Blitzstein: Course Page, Videos, GitHub
    • Joe Blitzstein (Professor of the Practice in Statistics and Co-Director of Undergraduate Studies in Statistics, Harvard University) has taught Statistics 110: Probability at Harvard each year since 2006. The on-campus Stat 110 course has grown from 80 students to over 300 students per year in that time. The lecture videos are available on iTunes U and YouTube.
    • Course Description: Stat 110 is an introduction to probability as a language and set of tools for understanding statistics, science, risk, and randomness. The ideas and methods are useful in statistics, science, engineering, economics, finance, and everyday life. Topics include the following. Basics: sample spaces and events, conditioning, Bayes’ Theorem. Random variables and their distributions: distributions, moment generating functions, expectation, variance, covariance, correlation, conditional expectation. Univariate distributions: Normal, t, Binomial, Negative Binomial, Poisson, Beta, Gamma. Multivariate distributions: joint, conditional, and marginal distributions, independence, transformations, Multinomial, Multivariate Normal. Limit theorems: law of large numbers, central limit theorem. Markov chains: transition probabilities, stationary distributions, reversibility, convergence. The prerequisites are calculus (mainly single variable) and familiarity with matrices.
  • $\mathbb{G}$ MIT 15.075J - Statistical Thinking and Data Analysis by Cynthia Rudin, Fall 2011: Course Page
    • Prerequisites: An understanding of Calculus and 6.041 Probabilistic Systems Analysis and Applied Probability or 18.440 Probability and Random Variables.
    • Textbook: Tamhane, Ajit C., and Dorothy D. Dunlop. Statistics and Data Analysis: From Elementary to Intermediate. Prentice Hall, 1999. ISBN: 9780137444267.
    • Course Description: This course is an introduction to statistical data analysis. Topics are chosen from applied probability, sampling, estimation, hypothesis testing, linear regression, analysis of variance, categorical data analysis, and nonparametric statistics.
  • $\mathbb{G}$MIT 18.465 - Topics in Statistics: Nonparametrics and Robustness by Prof. Richard Dudley, Spring 2005: Course Page
    • Prerequisites: Statistics for Applications (18.443) or Statistical Inference (the former 18.441)
    • Course Description:
      • This graduate-level course focuses on one-dimensional nonparametric statistics developed mainly from around 1945 and deals with order statistics and ranks, allowing very general distributions.
      • For multidimensional nonparametric statistics, an early approach was to choose a fixed coordinate system and work with order statistics and ranks in each coordinate. A more modern method, to be followed in this course, is to look for rotationally or affine invariant procedures. These can be based on empirical processes as in computer learning theory.
      • Robustness, which developed mainly from around 1964, provides methods that are resistant to errors or outliers in the data, which can be arbitrarily large. Nonparametric methods tend to be robust.
  • $\mathbb{G}$MIT 18.465 - Topics in Statistics: Statistical Learning Theory by Prof. Dmitry Panchenko, Spring 2007: Course Page
    • Prerequisites: Theory of Probability (18.175) and either Statistical Learning Theory and Applications (9.520) or Machine Learning (6.867)
    • Course Description: The main goal of this course is to study the generalization ability of a number of popular machine learning algorithms such as boosting, support vector machines and neural networks. Topics include Vapnik-Chervonenkis theory, concentration inequalities in product spaces, and other elements of empirical process theory.
  • $\mathbb{G}$MIT 18.655 - Mathematical Statistics by Dr. Peter Kempthorne, Spring 2016: Course Page
    • Textbook: Bickel, Peter J., and Kjell A. Doksum. Mathematical Statistics: Basic Ideas and Selected Topics, Volume 1. 2nd edition. Chapman and Hall / CRC, 2015. ISBN: 9781498723800.
    • Course Description: This course provides students with decision theory, estimation, confidence intervals, and hypothesis testing. It introduces large sample theory, asymptotic efficiency of estimates, exponential families, and sequential analysis.


Machine Learning

Machine Learning Cheat Sheet

Essential Cheat Sheets for deep learning and machine learning researchers: GitHub


Artificial Intelligence

  1. $\mathbb{UG}$MIT 6.034 Artificial Intelligence by Patrick H. Winston, Fall 2010: 23 Lectures, Course Page, Videos 6/30
    • Course Description: This course introduces students to the basic knowledge representation, problem solving, and learning methods of artificial intelligence. Upon completion of 6.034, students should be able to develop intelligent systems by assembling solutions to concrete computational problems; understand the role of knowledge representation, problem solving, and learning in intelligent-system engineering; and appreciate the role of problem solving, vision, and language in understanding human intelligence from a computational perspective.
  2. $\mathbb{UG}$Berkeley CS188 Introduction to Artificial Intelligence by Pieter Abbeel, Spring 2014: 25 Lectures, Course Page, Videos 1/25
    • Prerequisites
      • CS 61A Structure and Interpretation of Computer Programs and CS 61B Data Structures: Prior computer programming experience and a good understanding of clean, elegant implementations of nontrivial algorithms and data structures.
      • CS 70 Discrete Mathematics and Probability Theory: Facility with basic concepts of propositional logic and probability is expected, as well as the ability to understand and construct proofs. CS 70 is the better choice for this course, but Math 55 could be substituted. Some linear algebra and calculus will be used, but the necessary content will be covered in the course.
    • Course Description: This course will cover the ideas and techniques underlying the design of intelligent computer systems. Topics include search, game playing, knowledge representation, inference, planning, reasoning under uncertainty, machine learning, robotics, perception, and language understanding. You will learn to build intelligent agents - systems that perceive and act - for fully observable, partially observable and adversarial settings. Your agents will generate provably successful plans in deterministic environments, draw inferences in uncertain environments, and optimize actions for arbitrary reward structures. They will learn from observation and from rewards. The techniques you learn in this course apply to a wide variety of artificial intelligence problems and will serve as the foundation for further study in any application area you choose to pursue.
  3. $\mathbb{UG}$Udacity|Intro to Artificial Intelligence by Sebastian Thrun: Course Page, Videos
    • Course Description: Artificial Intelligence (AI) is a field that has a long history but is still constantly and actively growing and changing. In this course, you’ll learn the basics of modern AI as well as some of the representative applications of AI. Along the way, we also hope to excite you about the numerous applications and huge possibilities in the field of AI, which continues to expand human capability beyond our imagination.
  4. $\mathbb{G}$Stanford CS221 - Artificial Intelligence: Principles and Techniques by Percy Liang and Stefano Ermon, Fall 2017: Course Page, Course Materials
    • Prerequisites: This course is fast-paced and covers a lot of ground, so it is important that you have a solid foundation on both the theoretical and empirical fronts. You should have taken the following classes (or their equivalents):
      • Programming (CS 106A, CS 106B, CS 107)
      • Discrete math (CS 103)
      • Probability (CS 109)
    • Course Description: What do web search, speech recognition, face recognition, machine translation, autonomous driving, and automatic scheduling have in common? These are all complex real-world problems, and the goal of artificial intelligence (AI) is to tackle these with rigorous mathematical tools. In this course, you will learn the foundational principles that drive these applications and practice implementing some of these systems. Specific topics include machine learning, search, game playing, Markov decision processes, constraint satisfaction, graphical models, and logic. The main goal of the course is to equip you with the tools to tackle new AI problems you might encounter in life.

Machine Learning

  1. $\mathbb{UG}$Stanford CS229 - Machine Learning by Andrew Ng: 1h15min/Lec, 20 Lectures, Course Page, Videos-1, Videos-2, Code-Matlab|CourseAce, Code-Python|icrtiou 19/20
    • One of the most classic open courses for Machine Learning
    • Prerequisites: Students are expected to have the following background:
      • Knowledge of basic computer science principles and skills, at a level sufficient to write a reasonably non-trivial computer program.
      • Familiarity with probability theory (CS 109 or STATS 116)
      • Familiarity with linear algebra (any one of Math 104, Math 113, or CS 205 should be sufficient)
    • Course Description: This course provides a broad introduction to machine learning and statistical pattern recognition. Topics include: supervised learning (generative/discriminative learning, parametric/non-parametric learning, neural networks, support vector machines); unsupervised learning (clustering, dimensionality reduction, kernel methods); learning theory (bias/variance tradeoffs; VC theory; large margins); reinforcement learning and adaptive control. The course will also discuss recent applications of machine learning, such as to robotic control, data mining, autonomous navigation, bioinformatics, speech recognition, and text and web data processing.
  2. $\mathbb{UG}$NTU Machine Learning Foundations by Hsuan-Tien Lin (林軒田): Course Page|Part I Mathematical Foundations, Course Page|Part II Algorithmic Foundations, Videos
    • Intermediate level
    • Course Description: Machine learning is the study that allows computers to adaptively improve their performance with experience accumulated from the data observed. Our two sister courses teach the most fundamental algorithmic, theoretical and practical tools that any user of machine learning needs to know. This first course of the two would focus more on mathematical tools, and the other course would focus more on algorithmic tools.
  3. $\mathbb{UG}$NTU Machine Learning Techniques by Hsuan-Tien Lin (林軒田): Course Page, Videos
    • Prerequisite:
      • Machine Learning Foundations by Hsuan-Tien Lin or equivalent
      • Recommended reading: Learning from Data
    • This course covers:
      • Embedding numerous features: linear support vector machine, dual support vector machine, kernel support vector machine, soft-margin support vector machine, kernel logistic regression, support vector regression
      • Combining predictive features: blending and bagging, adaptive boosting, decision tree, random forest, gradient boosted decision tree
      • Distilling hidden features: neural network, deep learning, radial basis function network, matrix factorization
  4. $\mathbb{G}$ Columbia 4772 Advanced Machine Learning by Tony Jebara, Spring 2015: Course Page
    • Prerequisites: COMS W4771 Machine Learning or permission. Background in linear algebra and statistics.
    • Course Description: Advanced topics in machine learning including: Linear Modeling, Nonlinear Dimension Reduction, Maximum Entropy, Exponential Family Models, Conditional Random Fields, Graphical Models, Structured Support Vector Machines, Feature Selection, Kernel Selection, Meta-Learning, Multi-Task Learning, Semi-Supervised Learning, Graph-Based Semi-Supervised Learning, Approximate Inference, Clustering, and Boosting.
  5. $\mathbb{G}$ CMU 10-701/15-781 - Machine Learning
    • To choose between the Introduction to Machine Learning courses (10-401, 10-601, 10-701, and 10-715), please read the Intro to ML Course Comparison.
    • by Tom Mitchell: Course Page|Spring2011, Videos|Spring2011
      • Prerequisites: Students entering the class are expected to have a pre-existing working knowledge of probability, linear algebra, statistics and algorithms, though the class has been designed to allow students with a strong numerate background to catch up and fully participate. In addition, recitation sessions will be held to review some basic concepts.
      • Course Description: This course covers the theory and practical algorithms for machine learning from a variety of perspectives. We cover topics such as Bayesian networks, decision tree learning, Support Vector Machines, statistical learning methods, unsupervised learning and reinforcement learning. The course covers theoretical concepts such as inductive bias, the PAC learning framework, Bayesian learning methods, margin-based learning, and Occam’s Razor. Short programming assignments include hands-on experiments with various learning algorithms, and a larger course project gives students a chance to dig into an area of their choice. This course is designed to give a graduate-level student a thorough grounding in the methodologies, technologies, mathematics and algorithms currently needed by people who do research in machine learning.
    • by Alexander J. Smola: Course Page|Fall2013, Videos|Fall2013
    • by Alexander J. Smola: Course Page|Spring2015, Videos|Spring2015
      • More difficult than T. Mitchell’s 10-701.
      • Prerequisites:
        • Basic probability and statistics are a plus.
        • Basic linear algebra (matrices, vectors, eigenvalues) is a plus. Knowing functional analysis would be great but not required.
        • Ability to write code that exceeds ‘Hello World’. Preferably beyond Matlab or R.
        • Basic knowledge of optimization. Having attended a convex optimization class would be great but the recitations will cover this.
        • You should have no trouble answering the questions of the self evaluation handed out for the 10-601 Introduction to Machine Learning course.
      • Course Description: This course is designed to give PhD students a thorough grounding in the methods, theory, mathematics and algorithms needed to do research and applications in machine learning. The topics of the course draw from machine learning, classical statistics, data mining, Bayesian statistics and information theory. Students entering the class with a pre-existing working knowledge of probability, statistics and algorithms will be at an advantage, but the class has been designed so that anyone with a strong numerate background can catch up and fully participate.
    • by Eric Xing and Ziv Bar-Joseph: Course Page|Fall2015, Videos|Fall2015
      • Course Description: Machine learning studies the question “how can we build computer programs that automatically improve their performance through experience?” This includes learning to perform many types of tasks based on many types of experience. For example, it includes robots learning to better navigate based on experience gained by roaming their environments, medical decision aids that learn to predict which therapies work best for which diseases based on data mining of historical health records, and speech recognition systems that learn to better understand your speech based on experience listening to you. This course is designed to give PhD students a thorough grounding in the methods, theory, mathematics and algorithms needed to do research and applications in machine learning. The topics of the course draw from machine learning, from classical statistics, from data mining, from Bayesian statistics and from information theory. Students entering the class with a pre-existing working knowledge of probability, statistics and algorithms will be at an advantage, but the class has been designed so that anyone with a strong numerate background can catch up and fully participate.
  6. $\mathbb{G}$CMU 10-715 Advanced Introduction to Machine Learning: Eric Xing|2014Fall, Alex Smola|2015Fall, Videos|2015Fall
    • Prerequisites: Students entering the class are expected to have a pre-existing strong working knowledge of algorithms, linear algebra, probability, and statistics. If you are interested in this topic, but do not have the required background or are not planning to work on a PhD thesis with machine learning as the main focus, you might consider the general graduate Machine Learning course (10-701) or the Masters-level Machine Learning course (10-601).
    • Course Description: This course is designed for Ph.D. students whose primary field of study is machine learning, or who intend to make machine learning methodological research a main focus of their thesis. It will give students a thorough grounding in the algorithms, mathematics, theories, and insights needed to do in-depth research and applications in machine learning. The topics of this course will in part parallel those covered in the general graduate machine learning course (10-701), but with a greater emphasis on depth in theory and algorithms. The course will also include additional advanced topics such as RKHS and representer theory, Bayesian nonparametrics, additional material on graphical models, manifolds and spectral graph theory, reinforcement learning and online learning, etc.

Deep Learning

  1. $\mathbb{UG}$Coursera - Neural Networks for Machine Learning by Geoffrey Hinton: 16 Lectures, Course Page 1, Course Page 2, Videos 34/78
    • Prerequisites:
      • Calculus: (MAT135H1, MAT136H1)/MAT135Y1/MAT137Y1/MAT157Y1
      • Linear Algebra: MAT223H1/MAT240H1
      • Statistics and Probability: STA247H1/STA255H1/STA257H1
    • Course Description: Learn about artificial neural networks and how they’re being used for machine learning, as applied to speech and object recognition, image segmentation, modeling language and human motion, etc. We’ll emphasize both the basic algorithms and the practical tricks needed to get them to work well. Please be advised that the course is suited for an intermediate level learner - comfortable with calculus and with experience programming (Python).
  2. $\mathbb{UG}$Coursera - Neural Networks and Deep Learning by Andrew Ng: 46 parts, Course Page, Videos
    • Prerequisites:
      • Expected:
        • Programming: Basic Python programming skills, with the capability to work effectively with data structures.
      • Recommended:
        • Mathematics: Matrix vector operations and notation.
        • Machine Learning: Understanding how to frame a machine learning problem, including how data is represented will be beneficial. (If you have taken Stanford CS229 - Machine Learning by Andrew Ng, you have much more than the needed level of knowledge.)
    • Course Description: In this course, you will learn the foundations of deep learning. This course also teaches you how Deep Learning actually works, rather than presenting only a cursory or surface-level description. So after completing it, you will be able to apply deep learning to your own applications. If you are looking for a job in AI, after this course you will also be able to answer basic interview questions. When you finish this class, you will:
      • Understand the major technology trends driving Deep Learning
      • Be able to build, train and apply fully connected deep neural networks
      • Know how to implement efficient (vectorized) neural networks
      • Understand the key parameters in a neural network’s architecture
  3. $\mathbb{UG}$UCambridge - Course on Information Theory, Pattern Recognition, and Neural Networks by David MacKay: 1h30min/Lec, 16 Lectures, Course Page, Videos-1, Videos-2 5/16
    • Prerequisites:
      • Probability and statistics
      • Linear algebra
      • Statistical physics
    • Course Description: This course covers an introduction to information theory, entropy and data compression, communication over noisy channels, statistical inference, data modelling and pattern recognition, approximation of probability distributions, and neural networks and content-addressable memories.
  4. $\mathbb{UG}$Stanford CS231N - Convolutional Neural Networks for Visual Recognition Spring 2017 by Fei-Fei Li: 1h15min/Lec, 16 Lectures, Course Page, Videos
    • Prerequisites:
      • Proficiency in Python, high-level familiarity in C/C++
      • College Calculus, Linear Algebra (e.g. MATH 19 or 41, MATH 51)
      • Basic Probability and Statistics (e.g. CS 109 or other stats course)
      • Equivalent knowledge of CS229 (Machine Learning)
    • Course Description: This course is a deep dive into details of the deep learning architectures with a focus on learning end-to-end models for visual recognition tasks, particularly image classification. During the 10-week course, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. The final assignment will involve training a multi-million parameter convolutional neural network and applying it on the largest image classification dataset (ImageNet). We will focus on teaching how to set up the problem of image recognition, the learning algorithms (e.g. backpropagation), practical engineering tricks for training and fine-tuning the networks and guide the students through hands-on assignments and a final course project. Much of the background and materials of this course will be drawn from the ImageNet Challenge.

Natural Language Processing

  1. $\mathbb{UG}$Stanford CS124 - Natural Language Processing by Dan Jurafsky & Chris Manning: 23 Lectures, Course Page, Resources, Videos 37/102
    • Course Description: This course covers a broad range of topics in natural language processing, including word and sentence tokenization, text classification and sentiment analysis, spelling correction, information extraction, parsing, meaning extraction, and question answering. We will also introduce the underlying theory from probability, statistics, and machine learning that is crucial for the field, and cover fundamental algorithms like n-gram language modeling, naive Bayes and maxent classifiers, sequence models like Hidden Markov Models, probabilistic dependency and constituent parsing, and vector-space models of meaning.
  2. $\mathbb{UG}$Stanford CS224N - Deep Learning for Natural Language Processing by Chris Manning: 1h15min/Lec, 18 Lectures, Course Page, Videos 2/18
    • Prerequisites:
      • Proficiency in Python
      • College Calculus, Linear Algebra (e.g. MATH 51, CME 100)
      • Basic Probability and Statistics (e.g. CS 109 or other stats course)
      • Foundations of Machine Learning
    • Course Description: The course provides a thorough introduction to cutting-edge research in deep learning applied to NLP. On the model side we will cover word vector representations, window-based neural networks, recurrent neural networks, long-short-term-memory models, recursive neural networks, convolutional neural networks as well as some recent models involving a memory component. Through lectures and programming assignments students will learn the necessary engineering tricks for making neural networks work on practical problems.
  3. $\mathbb{UG}$Columbia COMS W4705 - Natural Language Processing by Michael Collins: Course Page, Videos
    • Course Description: This course covers an introduction to NLP; language modeling; tagging and hidden Markov models; parsing and context-free grammars; probabilistic context-free grammars; lexicalized PCFGs; the IBM translation models; phrase-based translation models; log-linear models and log-linear models for tagging; and global linear models, including global linear models for tagging and for dependency parsing.
  4. $\mathbb{G}$MIT 6.864 Advanced Natural Language Processing: Course Page Fall 2010, Course Page Fall 2012, Course Page Fall 2016
    • Course Description: The need to study human languages from a computational perspective has never been greater. Much of the vast amounts of information available today is in a textual form, requiring us to develop automated tools to search, extract, translate, and summarize the data. This course on natural language processing (NLP) focuses exactly on such problems, covering syntactic, semantic and discourse processing models, and their applications to information extraction, machine translation, and text summarization. As a new feature this year, the course will emphasize deep learning techniques for NLP, introducing them in parallel and comparatively with more traditional approaches to NLP.

Probabilistic Graphical Models

  1. $\mathbb{UG}$/$\mathbb{G}$Coursera - Probabilistic Graphical Models by Daphne Koller: 15 Lectures, Course Page 1 - Representation, Course Page 2 - Inference, Course Page 3 - Learning, Textbook, Videos 35/94
    • Prerequisite: Familiarity with programming, basic linear algebra (matrices, vectors, matrix-vector multiplication), and basic probability (random variables, basic properties of probability) is assumed. Basic calculus (derivatives and partial derivatives) would be helpful and would give you additional intuitions about the algorithms, but isn’t required to fully complete this course.
    • Course Description:
      • Part 1 Representation: This course describes the two basic PGM representations: Bayesian Networks, which rely on a directed graph; and Markov networks, which use an undirected graph. The course discusses both the theoretical properties of these representations as well as their use in practice. The (highly recommended) honors track contains several hands-on assignments on how to represent some real-world problems. The course also presents some important extensions beyond the basic PGM representation, which allow more complex models to be encoded compactly.
      • Part 2 Inference: This course addresses the question of probabilistic inference: how a PGM can be used to answer questions. Even though a PGM generally describes a very high dimensional distribution, its structure is designed so as to allow questions to be answered efficiently. The course presents both exact and approximate algorithms for different types of inference tasks, and discusses where each could best be applied. The (highly recommended) honors track contains two hands-on programming assignments, in which key routines of the most commonly used exact and approximate algorithms are implemented and applied to a real-world problem.
      • Part 3 Learning: This course addresses the question of learning: how a PGM can be learned from a data set of examples. The course discusses the key problems of parameter estimation in both directed and undirected models, as well as the structure learning task for directed models. The (highly recommended) honors track contains two hands-on programming assignments, in which key routines of two commonly used learning algorithms are implemented and applied to a real-world problem.
  2. $\mathbb{G}$CMU 10-708 - Probabilistic Graphical Models by Eric Xing, Spring 2014: Course Page, Videos
    • Prerequisites: In order to take this class, students are required to have successfully completed Machine Learning 10-701/15-781, or an equivalent class.
    • Course Description:
      • Many of the problems in artificial intelligence, statistics, computer systems, computer vision, natural language processing, and computational biology, among many other fields, can be viewed as the search for a coherent global conclusion from local information. The probabilistic graphical models framework provides a unified view for this wide range of problems, enabling efficient inference, decision-making and learning in problems with a very large number of attributes and huge datasets. This graduate-level course will provide you with a strong foundation for both applying graphical models to complex problems and for addressing core research topics in graphical models.
      • The class will cover three aspects: the core representation, including Bayesian and Markov networks, and dynamic Bayesian networks; probabilistic inference algorithms, both exact and approximate; and learning methods for both the parameters and the structure of graphical models. Students entering the class should have a pre-existing working knowledge of probability, statistics, and algorithms, though the class has been designed to allow students with a strong mathematical background to catch up and fully participate.
      • It is expected that after taking this class, students will have obtained sufficient working knowledge of multi-variate probabilistic modeling and inference for practical applications, should be able to formulate and solve a wide range of problems in their own domain using graphical models, and can advance into more specialized technical literature by themselves.

Other courses

  1. $\mathbb{UG}$Berkeley CS 294 Practical Machine Learning by Michael I. Jordan, Fall 2009: Course Page
    • Prerequisites: some prior exposure to probability and to linear algebra.
    • Course Description: This course introduces core statistical machine learning algorithms in a (relatively) non-mathematical way, emphasizing applied problem-solving. The prerequisites are light; some prior exposure to basic probability and to linear algebra will suffice.
  2. $\mathbb{UG}$Harvard CS 109 - Data Science by Joe Blitzstein: 1h15min/Lec, 20 Lectures, Course Page|2014, Course Page|2015, Videos|2014, Videos|2015, Official Resources
    • Prerequisites: Both undergraduates and graduate students are welcome to take the course. You should have:
      • programming knowledge at the level of CS 50 (or above)
      • statistics knowledge at the level of Stat 100 (or above).
    • Course Description: Learning from data in order to gain useful predictions and insights. This course introduces methods for five key facets of an investigation: data wrangling, cleaning, and sampling to get a suitable data set; data management to be able to access big data quickly and reliably; exploratory data analysis to generate hypotheses and intuition; prediction based on statistical methods such as regression and classification; and communication of results through visualization, stories, and interpretable summaries. We will be using Python for all programming assignments and projects.
  3. $\mathbb{UG}$CMU 10-601 - Machine Learning by Tom Mitchell, Spring 2015: Course Page, Videos|CMU, Videos
    • Prerequisites: Students entering the class are expected to have a pre-existing working knowledge of probability, linear algebra, statistics and algorithms, though the class has been designed to allow students with a strong numerate background to catch up and fully participate. In addition, recitation sessions will be held to review some basic concepts.
    • Course Description: This course covers the theory and practical algorithms for machine learning from a variety of perspectives. We cover topics such as Bayesian networks, decision tree learning, Support Vector Machines, statistical learning methods, unsupervised learning and reinforcement learning. The course covers theoretical concepts such as inductive bias, the PAC learning framework, Bayesian learning methods, margin-based learning, and Occam’s Razor. Short programming assignments include hands-on experiments with various learning algorithms. This course is designed to give a graduate-level student a thorough grounding in the methodologies, technologies, mathematics and algorithms currently needed by people who do research in machine learning.
    • To choose between the Introduction to Machine Learning courses (10-401, 10-601, 10-701, and 10-715), please read the Intro to ML Course Comparison.
    • Quora|Which is better at CMU: 10-601 or 10-701?
  4. $\mathbb{G}$ CMU 10-707 Topics in Deep Learning by Russ Salakhutdinov, Fall 2017: Course Page, Videos
    • Prerequisites: This is an advanced graduate course, designed for Masters and Ph.D. level students, and will assume a reasonable degree of mathematical maturity:
      • ML: 10-701 or 10-715, and strong programming skills.
    • Course Description: The goal of this course is to introduce students to the recent and exciting developments of various deep learning methods. Some topics to be covered include: restricted Boltzmann machines (RBMs) and their multi-layer extensions Deep Belief Networks and Deep Boltzmann machines; sparse coding, autoencoders, variational autoencoders, convolutional neural networks, recurrent neural networks, generative adversarial networks, and attention-based models with applications in vision, NLP, and multimodal learning. We will also address mathematical issues, focusing on efficient large-scale optimization methods for inference and learning, as well as training density models with intractable partition functions.
  5. $\mathbb{G}$UBC CPSC540 Machine Learning by Nando de Freitas: 21 Lectures, Course Page, Videos
  6. Oxford Deep learning by Nando de Freitas, 2015: Course Page, Videos
  7. $\mathbb{UG}$Machine Learning by @mathematicalmonk (Jeff Miller at Duke): 160 videos, 909k+ views, Videos
  8. WUSTL CSE 515T Bayesian Methods in Machine Learning by Roman Garnett, Spring 2015: Course Page
  9. $\mathbb{G}$MIT 6.867 - Machine Learning by Tommi Jaakkola: Course Page
    • Course Description: 6.867 is an introductory course on machine learning which gives an overview of many concepts, techniques, and algorithms in machine learning, beginning with topics such as classification and linear regression and ending up with more recent topics such as boosting, support vector machines, hidden Markov models, and Bayesian networks. The course will give the student the basic ideas and intuition behind modern machine learning methods as well as a bit more formal understanding of how, why, and when they work. The underlying theme in the course is statistical inference as it provides the foundation for most of the methods covered.
  10. $\mathbb{UG}$MIT 6.803/6.833 The Human Intelligence Enterprise: Course Page 2002, Course Page 2006, Course Page 2017
  11. $\mathbb{G}$NYU Probabilistic Graphical Models by David Sontag, Spring 2013: Course Page
  12. $\mathbb{G}$Stanford CS224U - Natural Language Understanding by Bill MacCartney: Course Page
  13. $\mathbb{G}$Stanford CS224M: Multi-Agent Systems by Yoav Shoham: Course Page, Videos
  14. $\mathbb{G}$Harvard CS281: Advanced Machine Learning by Ryan Adams, Fall 2013: Course Page
  15. $\mathbb{G}$Stanford CS 228 Probabilistic Graphical Models by Stefano Ermon, Winter 2016: Course Page
  16. $\mathbb{G}$MIT 18.657 - Mathematics of Machine Learning by Philippe Rigollet, Fall 2015: Course Page
    • P. Rigollet has another course MIT 18.650.
  17. $\mathbb{G}$NYU DS-GA-1003/CSCI-GA.2567 - Machine Learning and Computational Statistics by David Sontag, Spring 2014: Course Page
  18. Willamette University CS449 - Neural Networks: Course Page, No Videos found
  19. MLSS 2013 Tübingen: Graphical Models by Christopher Bishop: Videos
  20. Dragon Star Program (龙星计划), Tsinghua University - Machine Learning by Kai Yu (余凯) and Tong Zhang (张潼): 50min/Lec, 19 Lectures, Videos
  21. Coursera|Johns Hopkins University - Data Science Specialization: 10 courses, Home Page
    • Beginner Specialization. No prior experience required.
    • This Specialization covers the concepts and tools you’ll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. In the final Capstone Project, you’ll apply the skills learned by building a data product using real-world data. At completion, students will have a portfolio demonstrating their mastery of the material.
    1. The Data Scientist’s Toolbox
    2. R Programming
    3. Getting and Cleaning Data
    4. Exploratory Data Analysis
    5. Reproducible Research
    6. Statistical Inference
    7. Regression Models
    8. Practical Machine Learning
    9. Developing Data Products
    10. Data Science Capstone
  22. Coursera|University of Michigan - Applied Data Science with Python Specialization: 5 courses, Home Page
    • Intermediate Specialization. Some related experience required.
    • The 5 courses in this University of Michigan specialization introduce learners to data science through the Python programming language. This skills-based specialization is intended for learners who have a basic Python or programming background, and want to apply statistical, machine learning, information visualization, text analysis, and social network analysis techniques through popular Python toolkits such as pandas, matplotlib, scikit-learn, nltk, and networkx to gain insight into their data.
    • Introduction to Data Science in Python (course 1), Applied Plotting, Charting & Data Representation in Python (course 2), and Applied Machine Learning in Python (course 3) should be taken in order and prior to any other course in the specialization. After completing those, courses 4 and 5 can be taken in any order. All 5 are required to earn a certificate.
    1. Introduction to Data Science in Python
    2. Applied Plotting, Charting & Data Representation in Python
    3. Applied Machine Learning in Python
    4. Applied Text Mining in Python
    5. Applied Social Network Analysis in Python
  23. Machine Learning with Python by @sentdex: 72 videos, 1,817,742 views, Videos


Introduction to Artificial Intelligence

  • Artificial Intelligence: A Modern Approach by S. Russell, P. Norvig: Amazon, Book Page

Machine Learning

Deep Learning

Natural Language Processing

  • Foundations of Statistical Natural Language Processing: Amazon
  • Statistical Natural Language Processing, 2nd Edition (统计自然语言处理) by Zong Chengqing (宗成庆): Amazon, Douban Books

Information Retrieval

  • Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schütze: Amazon, Book Page

Recommender System


List of Lists

Notes, Blog, Talks and so on

  • Microsoft-Social Computing (Asia)
  • Geoffrey E. Hinton’s Neural Network Tutorials
  • Michael I. Jordan’s Courses
  • Professor Di Cook, Department of Econometrics and Business Statistics, Monash Uni
    • Professor Di Cook : I am a Fellow of the American Statistical Association, and Ordinary Member of the R Foundation. My research is in data visualisation, exploratory data analysis, multivariate methods, data mining and statistical computing. I have developed methods for visualising high-dimensional data using tours, projection pursuit, manual controls for tours, pipelines for interactive graphics, a grammar of graphics for biological data, and visualizing boundaries in high-d classifiers…
  • R-bloggers
    • R news and tutorials contributed by (580) R bloggers
  • Data Science Greedy
    • A passionate data scientist, machine learning professional, business analytics guy, visualization, big data, Data management, research professional with a passion for data and its application to problem solving.
  • Andrew Patton’s MATLAB code page
    • This page contains some of the MATLAB code I’ve written during the course of my research. If you find any mistakes or bugs in the code please let me know.
      This code is being released under a BSD license, which means that you can do pretty much whatever you want with it, including make money by selling it.
  • Datatau
    • Hacker News for Data
  • Blog of Sebastian Ruder
    • I’m a PhD student in Natural Language Processing and a research scientist at AYLIEN. I blog about Machine Learning, Deep Learning, NLP, and startups.
  • VideoLectures.NET: Machine Learning
    • VideoLectures.net is the world’s biggest academic online video repository with 14,251 video lectures delivered by 10,763 presenters since 2006. It is hosted at Jozef Stefan Institute in Slovenia, Europe. All content is released under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
  • Dataschool
    • You’re trying to launch your career in data science, and I want to help you reach that goal! My name is Kevin Markham. I’m a data scientist and a teacher.
  • A Machine Learning-Based Trading Strategy Using Sentiment Analysis Data by Tucker Balch (Lucena Research): Videos
  • Expectation Maximization Algorithm by Victor Lavrenko (20171017)
  • Machine Learning Mastery-Start Here!
    • Do You Need Help Getting Started with Applied Machine Learning? This is The Step-by-Step Guide that You’ve Been Looking For!
  • DATAQUEST:Learn real-world data science skills
    • Our practical approach teaches you Data Science using interactive coding challenges.
    • You’ll learn how to think like a data scientist. You’ll learn how to solve problems like a data scientist. You’ll work on the same projects a data scientist works on. And eventually, you’ll become a data scientist.
  • Tianqi Chen’s Project
    • As a large-scale machine learning researcher, I like to build real things that can be used in production. I build some very widely used machine learning and systems packages. I am initiator DMLC group, to make large-scale machine learning widely available to the community. Here I list projects that I created and heavily involved in. Some of these projects are also great example of large-scale machine learning research. Most of these projects are actively developed and maintained as open source package. I am honored to work with many outstanding collaborators on these projects.
  • Getting Started with Kaggle: House Prices Competition
    • Founded in 2010, Kaggle is a Data Science platform where users can share, collaborate, and compete. One key feature of Kaggle is “Competitions”, which offers users the ability to practice on real world data and to test their skills with, and against, an international community.
  • Honglang Wang’s Blog|Machine Learning Books Suggested by Michael I. Jordan from Berkeley
    • There has been a Machine Learning (ML) reading list of books on Hacker News for a while, where Professor Michael I. Jordan recommended some books to start on ML for people who are going to devote many decades of their lives to the field, and who want to get to the research frontier fairly quickly. Recently he articulated the relationship between CS and Stats amazingly well in his reddit AMA, in which he also added some books that dig still further into foundational topics. I just list them here for people’s convenience and my own reference.
  • CMU CS Andrew W. Moore’s Tutorial
    • The following links point to a set of tutorials on many aspects of statistical data mining, including the foundations of probability, the foundations of statistical data analysis, and most of the classic machine learning and data mining algorithms.

Finance & Economics

Financial Theories

Investment Decision-Making Problem


Financial Theory

  • $\mathbb{G}$MIT 15.401 - Finance Theory I by Andrew Lo, Fall 2008: 1h10min/Lec, 20 Lectures, Course Page, Videos
    • Course Description: This course introduces the core theory of modern financial economics and financial management, with a focus on capital markets and investments. Topics include functions of capital markets and financial intermediaries, asset valuation, fixed-income securities, common stocks, capital budgeting, diversification and portfolio selection, equilibrium pricing of risky assets, the theory of efficient markets, and an introduction to derivatives and options.
  • $\mathbb{G}$MIT 15.402 - Finance Theory II by Dirk Jenter & Katharina Lewellen, Spring 2003: Course Page
    • Prerequisites: Finance Theory I (15.401). It will also help if you know some basic economics and accounting.
    • Course Description: The objective of this course is to learn the financial tools needed to make good business decisions. The course presents the basic insights of corporate finance theory, but emphasizes the application of theory to real business decisions. Each session involves class discussion, some centered on lectures and others around business cases.

Economics & Econometrics

  • $\mathbb{G}$MIT 14.382 - Econometrics by Victor Chernozhukov, Spring 2017: Course Page
    • Prerequisites: MIT 14.381 Statistical Method in Economics or permission of the instructor.
    • Course Description: The course will cover several key models as well as identification and estimation methods used in modern econometrics. We shall begin with exploring some leading models of econometrics, then seeing structures, then providing methods of identification, estimation, and inference. You will get lots of hands-on experience with using the methods on real data sets.
  • $\mathbb{G}$MIT 14.387 - Applied Econometrics: Mostly Harmless Big Data by Joshua Angrist and Victor Chernozhukov, Fall 2014: Course Page
    • Prerequisites: MIT 14.382 Econometrics is the prerequisite for this course.
    • Course Description: This course covers empirical strategies for applied micro research questions. Our agenda includes regression and matching, instrumental variables, differences-in-differences, regression discontinuity designs, standard errors, and a module consisting of 8-9 lectures on the analysis of high-dimensional data sets a.k.a. “Big Data”.
  • $\mathbb{UG}$Economics 421 - Econometrics by Mark Thoma: 1h20min/Lec, 19 Lectures, Course Page, Videos
    • Course Description: We covered the following topics in the course: Assumptions required for estimates to be BLUE, Hypothesis testing, Heteroskedasticity, Autocorrelation, Testing for ARCH errors, Stochastic Regressors and Measurement Errors, Simultaneous equation models, Multicollinearity, Specification tests, Qualitative and limited dependent variables, Maximum likelihood etc.
  • $\mathbb{UG}$Yale Econ 159 - Game Theory by Ben Polak: Course Page, Videos
    • Prerequisites: Introductory microeconomics (115 or equivalent) is required. Intermediate micro (150/2) is not required, but it is recommended. We will use calculus (mostly one variable) in this course. We will also refer to ideas like probability and expectation. Some may prefer to take the course next academic year once they have more background. Students who have already taken Econ 156b should not enroll in this class.
    • Course Description: This course is an introduction to game theory and strategic thinking. Ideas such as dominance, backward induction, Nash equilibrium, evolutionary stability, commitment, credibility, asymmetric information, adverse selection, and signaling are discussed and applied to games played in class and to examples drawn from economics, politics, the movies, and elsewhere.
  • $\mathbb{UG}$NTU IM 7011 - Information Economics by Ling-Chieh Kung, Fall 2014: Course Page, Videos
    • Prerequisites: Students need to know the basic ideas of calculus, optimization, and probability. Some knowledge about game theory will be helpful.
    • Course Description: There are four modules in this course: decentralization and inefficiency, the screening theory, the signaling theory, and final project presentations. The course starts with discussions about the incentive issues in decentralized systems. We then spend most of our time studying the economics of information to understand the impacts of possessing or lacking information. The focus will be on adverse selection, one of the most well studied types of information asymmetry in the field of economics. We will discuss how one may screen others’ private information and signal one’s own private information. Finally, students’ final project presentations conclude this course.

Financial Engineering

  • $\mathbb{G}$Udacity|Georgia Tech CS7646 - Machine Learning for Trading by Tucker Balch: 28 parts, Course Page|GaTech_Spring 2018, Course Page|Udacity, Videos, Resources 1/28
    • Course Description: This course introduces students to the real world challenges of implementing machine learning based trading strategies including the algorithmic steps from information gathering to market orders. The focus is on how to apply probabilistic machine learning approaches to trading decisions. We consider statistical approaches like linear regression, KNN and regression trees and how to apply them to actual stock trading situations.
    • ML4T Software Setup
  • $\mathbb{UG}$Coursera - Financial Engineering and Risk Management Part I by Martin Haugh and Garud Iyengar: 51 parts, Course Page
    • Course Description: Financial Engineering is a multidisciplinary field drawing from finance and economics, mathematics, statistics, engineering and computational methods. The emphasis of FE & RM Part I will be on the use of simple stochastic models to price derivative securities in various asset classes including equities, fixed income, credit and mortgage-backed securities. We will also consider the role that some of these asset classes played during the financial crisis. A notable feature of this course will be an interview module with Emanuel Derman, the renowned “quant” and best-selling author of “My Life as a Quant”. We hope that students who complete the course will begin to understand the “rocket science” behind financial engineering but perhaps more importantly, we hope they will also understand the limitations of this theory in practice and why financial models should always be treated with a healthy degree of skepticism. The follow-on course FE & RM Part II will continue to develop derivatives pricing models but it will also focus on asset allocation and portfolio optimization as well as other applications of financial engineering such as real options, commodity and energy derivatives and algorithmic trading.
  • $\mathbb{UG}$MIT 18.S096 - Topics in Mathematics with Applications in Finance: 1h20min/Lec, 26 Lectures, Course Page, Videos
    • Prerequisites:
      • MIT 18.01 Single Variable Calculus
      • MIT 18.02 Multivariable Calculus
      • MIT 18.03 Differential Equations
      • MIT 18.05 Introduction to Probability and Statistics or 18.440 Probability and Random Variables
      • MIT 18.06 Linear Algebra
    • Course Description: The purpose of the class is to expose undergraduate and graduate students to the mathematical concepts and techniques used in the financial industry. Mathematics lectures are mixed with lectures illustrating the corresponding application in the financial industry. MIT mathematicians teach the mathematics part while industry professionals give the lectures on applications in finance.
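For a taste of what "simple stochastic models to price derivative securities" (from the FE & RM Part I description above) can look like in practice, here is a minimal sketch of a binomial tree for a European call option. It is my own illustration rather than material from any of the courses listed; the function name `binomial_call_price` and all parameter names are made up for this example, and the tree uses the Cox-Ross-Rubinstein parameterization as an assumption.

```python
import math


def binomial_call_price(spot, strike, rate, sigma, maturity, steps):
    """Price a European call on a Cox-Ross-Rubinstein binomial tree.

    All names here are illustrative assumptions, not taken from a course.
    """
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))   # up-move factor per step
    down = 1.0 / up                        # down-move factor per step
    growth = math.exp(rate * dt)           # risk-free growth per step
    q = (growth - down) / (up - down)      # risk-neutral probability of an up move

    # Option payoffs at maturity, indexed by the number of up moves j.
    values = [
        max(spot * up**j * down**(steps - j) - strike, 0.0)
        for j in range(steps + 1)
    ]

    # Roll back through the tree: discounted risk-neutral expectation.
    for _ in range(steps):
        values = [
            (q * values[j + 1] + (1.0 - q) * values[j]) / growth
            for j in range(len(values) - 1)
        ]
    return values[0]


price = binomial_call_price(spot=100.0, strike=100.0, rate=0.05,
                            sigma=0.2, maturity=1.0, steps=500)
print(round(price, 4))
```

As the number of steps grows, the tree price approaches the Black-Scholes value for the same inputs, which is a handy sanity check when experimenting with the model.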


Information System


  • $\mathbb{G}$MIT 15.082J/6.855J/ESD.78J - Network Optimization by James Orlin, Fall 2010: Course Page
    • Prerequisites: 6.251J/15.081J Introduction to Mathematical Programming or a course on data structures.
    • Course Description: 15.082J/6.855J/ESD.78J is a graduate subject in the theory and practice of network flows and its extensions. Network flow problems form a subclass of linear programming problems with applications to transportation, logistics, manufacturing, computer science, project management, and finance, as well as a number of other domains. This subject will survey some of the applications of network flows and focus on key special cases of network flow problems including the following: the shortest path problem, the maximum flow problem, the minimum cost flow problem, and the multi-commodity flow problem. We will also consider other extensions of network flow problems.
  • $\mathbb{G}$MIT 6.231 Stochastic Control Theory by Dimitri Bertsekas, Fall 2015: Course Page, Special Videos
    • Prerequisites:
      • Solid knowledge of undergraduate probability, at the level of 6.041 Probabilistic Systems Analysis and Applied Probability, especially conditional distributions and expectations, and Markov chains.
      • Mathematical maturity and the ability to write down precise and rigorous arguments are also important. A class in analysis (e.g. 18.100C Real Analysis) will be helpful, although this prerequisite will not be strictly enforced.
    • Course Description: The course covers the basic models and solution techniques for problems of sequential decision making under uncertainty (stochastic control). We will consider optimal control of a dynamical system over both a finite and an infinite number of stages. This includes systems with finite or infinite state spaces, as well as perfectly or imperfectly observed systems. We will also discuss approximation methods for problems involving large state spaces. Applications of dynamic programming in a variety of fields will be covered in recitations.
  • $\mathbb{G}$Northwestern University, ESAM 3950 - Introduction to Complex Networks by Dirk Brockmann, Fall 2011: Course Page
    • Prerequisites: Calculus through Math 234, EA1-EA4 or Math 240 and Math 250 and a knowledge of programming, e.g. Matlab.
    • Course Description: The course provides an introduction to complex network theory and its applications in physics, biology, technology and social sciences. Basic graph theory and the statistical physics foundations as well as applications to real world networks will be covered. A hands-on approach to analytical and computational techniques for real world networks will be provided. Essential network models, e.g. small world networks, scale free networks, spatial and hierarchical networks will be discussed and methods to generate them with a computer will be covered. Different network visualization techniques and complex network tools will be explored as well. The course will cover three main branches of network science: 1.) Network structure, 2.) Dynamical processes on networks, and 3.) Network evolution.
  • $\mathbb{G}$Stanford CS224W: Social and Information Network Analysis by Jure Leskovec, Autumn 2014: Course Page, Videos
    • Students are expected to have the following background:
      • Knowledge of basic computer science principles at a level sufficient to write a reasonably non-trivial computer program. (e.g., CS107 or CS145 or equivalent are recommended).
      • Familiarity with the basic probability theory. (CS109 or Stat116 is sufficient but not necessary.)
      • Familiarity with the basic linear algebra (any one of Math 51, Math 103, Math 113, or CS 205 would be much more than necessary.)
    • Course description: World Wide Web, blogging platforms, instant messaging and Facebook can be characterized by the interplay between rich information content, the millions of individuals and organizations who create and use it, and the technology that supports it.
      The course will cover recent research on the structure and analysis of such large social and information networks and on models and algorithms that abstract their basic properties. Class will explore how to practically analyze large scale network data and how to reason about it through models for network structure and evolution.
      Topics include methods for link analysis and network community detection, diffusion and information propagation on the web, virus outbreak detection in networks, and connections with work in the social sciences and economics.
  • $\mathbb{UG}$Princeton - Bitcoin and Cryptocurrency Technologies by Arvind Narayanan: Course Page|Coursera, Book Page, Videos
    • Course Description: To really understand what is special about Bitcoin, we need to understand how it works at a technical level. We’ll address the important questions about Bitcoin, such as:
      • How does Bitcoin work? What makes Bitcoin different?
      • How secure are your Bitcoins?
      • How anonymous are Bitcoin users?
      • What determines the price of Bitcoins?
      • Can cryptocurrencies be regulated? What might the future hold?
    • After this course, you’ll know everything you need to be able to separate fact from fiction when reading claims about Bitcoin and other cryptocurrencies. You’ll have the conceptual foundations you need to engineer secure software that interacts with the Bitcoin network. And you’ll be able to integrate ideas from Bitcoin in your own projects.
  • Operations Research
  • Data mining/Text mining
  • Knowledge management
  • Social media/Social computing/Social network
  • Decision Support System
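The shortest path problem named in the 15.082J description above is a good first exercise for this section. Below is a minimal sketch of Dijkstra's algorithm using only the Python standard library. The function name `shortest_path_lengths` and the toy network are my own assumptions for illustration, and the algorithm as written requires nonnegative edge weights.

```python
import heapq


def shortest_path_lengths(graph, source):
    """Dijkstra's algorithm on a dict: node -> list of (neighbor, weight).

    Assumes all weights are nonnegative. Returns a dict of shortest
    distances from `source` to every reachable node.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical toy network, made up for this example.
toy_network = {
    "s": [("a", 2), ("b", 5)],
    "a": [("b", 1), ("t", 6)],
    "b": [("t", 2)],
    "t": [],
}
distances = shortest_path_lengths(toy_network, "s")
print(distances)
```

Here the cheapest route from `s` to `t` goes through `a` and then `b`, for a total length of 5.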

IS Professors on Data Science



  • Github
  • Hexo
  • Markdown
  • LaTeX


Learn Anything is an open-source website built by the community to learn anything with interactive maps: GitHub, Home Page

0. Similar Open-course lists include: mvillaloboz/The Open-Source Computer Science Degree, datasciencemasters/The Open-Source Data Science Masters, Compilation of resources found around the web connected with: Machine Learning, Deep Learning, Data Science in general
1. Quora|What’s the difference between Harvard Stat 110 (Probability) and MIT 6.041 (Probabilistic Systems Analysis and Applied Probability)?
2. Quora|Which is better at CMU: 10-601 or 10-701?
3. Zhihu|What is it like to study 10-701 Machine Learning at CMU?