Course syllabus: Structure learning and forecasting
Part of the Research management course.
This lecture course is devoted to the problems of data modeling and forecasting, with an emphasis on automatic model generation and model selection. It surveys structure learning methods, discussing both theoretical aspects and a range of applications. The main goal of the course is to show how a quantitative modeling problem can be recognized within routine data analysis work and how to construct a model of optimal structure.
Prerequisites: discrete analysis, linear algebra, statistics, optimization.
- Problem statement in forecasting
  - Basic notation, basic sample structures, the problem statement for model selection: deterministic and statistical approaches (formulas below)
- Data generation hypothesis
  - The maximum likelihood principle, univariate and multivariate distribution inference, statistical analysis of model parameters (worked example below)
- Bayesian inference
  - Data generation hypotheses, the first and second levels of inference, an example of model comparison, the model generation and model selection flow, the model evidence and the Occam factor (formulas below)
- Structure learning
  - Model as a superposition, admissible superpositions, tree/DAG representation, the superposition identity matrix, the model structure discovery procedure (sketch below)
- Structure complexity
  - Notation and description of structure, complexity of trees and DAGs, distances between trees and between DAGs
- Statistical complexity
  - The minimum description length principle via Bayesian inference, entropy and complexity, the Akaike and Bayesian information criteria (formulas below)
- Parametric methods
  - Generalized linear and nonlinear parametric models, radial basis functions, neural networks, network superpositions for deep learning (RBF formula below)
- Non-parametric methods
  - Smoothing, kernels and regression models, spline approximation, empirical distribution function estimation (sketch below)
- Challenges of mixed-scale forecasting
  - Linear, interval, and ordinal-categorical scales and their algebraic structures, scale conversion, isotonic regression, conic representation, Pareto slicing and classification (sketch below)
- Time series analysis
  - Multivariate and multidimensional time series, stationarity and ergodicity, trend and fluctuations, heteroscedasticity, singular structure analysis, vector autoregression, local forecasting, self-modelling (VAR formula below)
- Residual analysis
  - General statistics of the residuals, dispersion analysis, correlation of the residuals, the Durbin-Watson criterion, bootstrapping the samples, the error function in the data space and in the parameter space, penalties on parameter values in linear models, the Lipschitz constant and the data generation hypothesis (formula below)
- Problem statement and optimization algorithms
  - Parameter estimation, model selection, multi-model selection, multicriteria optimization
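The sketches and formulas below illustrate selected items from the list above. First, a minimal formal version of the model selection problem statement, assuming a regression sample and a quadratic error; the symbols D, F, S and w are illustrative notation, not the course's fixed conventions.

```latex
% Sample and admissible model class (illustrative notation)
\[
\mathfrak{D} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{m}, \qquad
\mathcal{F} = \{ f(\mathbf{x}, \mathbf{w}) \mid \mathbf{w} \in \mathbb{W} \}.
\]
% Deterministic statement: choose the model that minimizes an error functional
\[
\hat{f} = \arg\min_{f \in \mathcal{F}} S(f \mid \mathfrak{D}), \qquad
S(f \mid \mathfrak{D}) = \min_{\mathbf{w} \in \mathbb{W}}
  \sum_{i=1}^{m} \bigl( y_i - f(\mathbf{x}_i, \mathbf{w}) \bigr)^2.
\]
% Statistical statement: choose the most probable model given the data
\[
\hat{f} = \arg\max_{f \in \mathcal{F}} p(f \mid \mathfrak{D}).
\]
```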
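For the maximum likelihood principle, a standard worked case: fitting a univariate normal distribution to an i.i.d. sample. This is the textbook derivation, included only as a reminder of the mechanics.

```latex
% Maximum likelihood estimate as the maximizer of the log-likelihood
\[
\hat{\theta} = \arg\max_{\theta} \ell(\theta), \qquad
\ell(\theta) = \sum_{i=1}^{m} \log p(x_i \mid \theta).
\]
% Univariate normal case, \theta = (\mu, \sigma^2): solving
% \partial\ell/\partial\mu = 0 and \partial\ell/\partial\sigma^2 = 0 gives
\[
\hat{\mu} = \frac{1}{m} \sum_{i=1}^{m} x_i, \qquad
\hat{\sigma}^2 = \frac{1}{m} \sum_{i=1}^{m} (x_i - \hat{\mu})^2.
\]
```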
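For the Bayesian inference topic, the two levels of inference and the factorization of the evidence into a best-fit likelihood and an Occam factor, in the standard MacKay-style presentation; w denotes parameters, f a model, and Delta w the posterior width.

```latex
% First level of inference: infer the parameters of a fixed model f
\[
p(\mathbf{w} \mid \mathfrak{D}, f) =
  \frac{p(\mathfrak{D} \mid \mathbf{w}, f)\, p(\mathbf{w} \mid f)}{p(\mathfrak{D} \mid f)}.
\]
% Second level of inference: compare models by their evidence
\[
p(f \mid \mathfrak{D}) \propto p(\mathfrak{D} \mid f)\, p(f), \qquad
p(\mathfrak{D} \mid f) = \int p(\mathfrak{D} \mid \mathbf{w}, f)\, p(\mathbf{w} \mid f)\, d\mathbf{w}.
\]
% Evidence as the best-fit likelihood times the Occam factor
\[
p(\mathfrak{D} \mid f) \approx
  p(\mathfrak{D} \mid \hat{\mathbf{w}}, f)\,
  \underbrace{p(\hat{\mathbf{w}} \mid f)\, \Delta\mathbf{w}}_{\text{Occam factor}}.
\]
```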
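For the structure learning topic, a minimal Python sketch of a model stored as a superposition tree of primitive functions and evaluated recursively. The primitive set, the Node class and the example model are hypothetical, chosen only to show the tree representation of a superposition, not the course's generation algorithm.

```python
import math

# Hypothetical primitive set: name -> (arity, implementation)
PRIMITIVES = {
    "add": (2, lambda a, b: a + b),
    "mul": (2, lambda a, b: a * b),
    "sin": (1, math.sin),
    "exp": (1, math.exp),
}

class Node:
    """A node of the superposition tree: a primitive applied to child subtrees,
    or a leaf holding the input variable x or a numeric constant."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def evaluate(self, x):
        if self.name == "x":
            return x
        if isinstance(self.name, float):
            return self.name
        arity, func = PRIMITIVES[self.name]
        # An admissible superposition supplies each primitive its full argument list
        assert arity == len(self.children), "inadmissible superposition"
        return func(*(child.evaluate(x) for child in self.children))

# Example model: f(x) = sin(x) + 0.5 * x, encoded as a tree
model = Node("add", [
    Node("sin", [Node("x")]),
    Node("mul", [Node(0.5), Node("x")]),
])
print(model.evaluate(1.0))  # sin(1) + 0.5
```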
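For the statistical complexity topic, the standard forms of the Akaike and Bayesian information criteria, with k the number of model parameters, m the sample size, and L-hat the maximized likelihood; lower values are preferred.

```latex
\[
\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad
\mathrm{BIC} = k\ln m - 2\ln\hat{L}.
\]
```

BIC penalizes each extra parameter more heavily than AIC once ln m > 2, so it tends to select simpler structures on larger samples.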
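For the parametric methods topic, the radial basis function model in its common Gaussian form; the centers c_j, widths sigma_j and weights w_j are the parameters to be estimated, and the number of basis functions n is a structural choice.

```latex
\[
f(\mathbf{x}, \mathbf{w}) = w_0 + \sum_{j=1}^{n} w_j
  \exp\!\left( -\frac{\lVert \mathbf{x} - \mathbf{c}_j \rVert^2}{2\sigma_j^2} \right).
\]
```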
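For the non-parametric methods topic, a small NumPy sketch of the Nadaraya-Watson kernel regression estimator with a Gaussian kernel. The toy data and the fixed bandwidth are illustrative; in practice the bandwidth would be chosen by cross-validation.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, bandwidth=0.5):
    """Nadaraya-Watson estimator: a locally weighted average of y_train,
    with Gaussian kernel weights centered at each query point."""
    # Pairwise kernel weights, shape (n_query, n_train)
    diffs = (x_query[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs ** 2)
    return (weights @ y_train) / weights.sum(axis=1)

# Toy usage: recover a smooth trend from noisy observations
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 6, 200))
y = np.sin(x) + 0.2 * rng.standard_normal(200)
x_grid = np.linspace(0, 6, 50)
y_smooth = nadaraya_watson(x, y, x_grid)
```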
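For the mixed-scale forecasting topic, isotonic regression fits the best monotone (order-preserving) function in the least-squares sense, which makes it a natural regression for ordinal scales. A minimal sketch using scikit-learn's IsotonicRegression on hypothetical toy data:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Noisy observations of a non-decreasing dependence
rng = np.random.default_rng(0)
x = np.arange(50, dtype=float)
y = np.log1p(x) + 0.3 * rng.standard_normal(50)

# Fit the best non-decreasing step function in the least-squares sense
iso = IsotonicRegression(increasing=True)
y_monotone = iso.fit_transform(x, y)

# The fitted values never decrease along x
assert np.all(np.diff(y_monotone) >= 0)
```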
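For the time series analysis topic, the vector autoregression model of order p for a d-dimensional series x_t:

```latex
\[
\mathbf{x}_t = \boldsymbol{\nu} + A_1 \mathbf{x}_{t-1} + \dots + A_p \mathbf{x}_{t-p}
  + \boldsymbol{\varepsilon}_t, \qquad
\boldsymbol{\varepsilon}_t \sim \mathcal{N}(\mathbf{0}, \Sigma),
\]
```

where the A_j are d-by-d coefficient matrices, typically estimated by least squares equation by equation; stationarity of the series is assumed for the usual asymptotic theory.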
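For the residual analysis topic, the Durbin-Watson criterion computed from the residual sequence e_1, ..., e_m:

```latex
\[
d = \frac{\sum_{t=2}^{m} (e_t - e_{t-1})^2}{\sum_{t=1}^{m} e_t^2}
  \approx 2\,(1 - \hat{\rho}_1),
\]
```

where rho-hat_1 is the lag-1 sample autocorrelation of the residuals: d near 2 indicates no first-order autocorrelation, d well below 2 indicates positive autocorrelation, and d well above 2 indicates negative autocorrelation.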