Course syllabus: Machine Learning

The Master's module Machine Learning explores the principles of machine learning algorithms. Its main goal is to show how to state a practical problem correctly, in accordance with theoretical principles. The module explains which hypotheses to apply to a dataset, how to set error functions, how to select an optimal model, and how to analyze its features. It includes lectures on clustering, classification, regression, model selection, and multi-modeling, and a series of homework problems based on real-life projects.

Probabilistic nature of datasets. Data generation hypothesis. Conditional and marginal distributions. Exponential family.
- Practice: processing a small biomedical dataset.
Regression models. Maximum likelihood and error function. Bias-variance decomposition. Bayesian regression.
- Practice: energy consumption forecasting.
Classification models. Generative and discriminative models. Model parameters covariance. Laplace approximation.
- Practice: constructing a scoring model.
Feature selection algorithms. Add and Del strategies, group method of data handling. Analysis of variance. Regularization, Lasso, LARS. Multicollinearity tests.
- Practice: feature selection for immunological data.
Neural networks. Parameter and structure optimization. Regularization. Optimal brain surgeon. Bayesian networks.
- Practice: network structure selection for the red wine quality dataset.
Graphical models. Bayesian networks. Conditional independence. Bayesian inference. Learning the graph structure.
- Practice: image de-noising.
Kernel methods and support vector machines. Radial basis functions and networks. Maximum margin classifiers. Relevance vector machines. Privileged learning.
- Practice: document ranking.
Principal component analysis and continuous latent variables. Probabilistic PCA. Kernel PCA. Non-linear PCA. Manifold learning.
- Practice: constructing an integral indicator of ecological footprint.
Mixture models and Expectation-maximization. Mixture of Gaussians. EM for Bayesian regression.
- Practice: image segmentation.
Clustering. Metric learning. Kohonen maps. Topological mapping. Deformation energy functions.
- Practice: trajectory clustering.
Representation learning. Restricted Boltzmann machine. Autoencoder. Predictive sparse decomposition. Probabilistic and direct encoding models.
- Practice: human physical activity classification using deep learning.
Approximate inference. Variational linear and logistic regression. Variational mixture of Gaussians. Expectation propagation.
- Practice: comparison of parameter estimation methods.
Model selection. Model evidence. Coherent Bayesian inference. Minimum description length principle.
- Practice: econometric model selection.
Multi-modeling. AdaBoost. Tree-based voting. Mixture of models. Mixture of experts.
- Practice: sociological data modeling.

Review: clustering, regression and classification models. Model selection. Parameter and error analysis.
- Practice: creating an error analysis report.
Exam.

References

Christopher Bishop, 2006. Pattern Recognition and Machine Learning
David MacKay, 2003. Information Theory, Inference, and Learning Algorithms
David Barber, 2012. Bayesian Reasoning and Machine Learning

Supplementary

Peter Flach, 2013. Machine Learning: The Art and Science of Algorithms that Make Sense of Data
Trevor Hastie, Robert Tibshirani and Jerome Friedman, 2013. The Elements of Statistical Learning: Data Mining, Inference, and Prediction
Shai Shalev-Shwartz and Shai Ben-David, 2014. Understanding Machine Learning: From Theory to Algorithms

Prerequisites

Calculus, linear algebra, probability, statistics, Python (or MATLAB)