Course syllabus: Structure learning and forecasting
This lecture course is devoted to the problems of data modeling and forecasting, with an emphasis on automatic model generation and model selection. The course surveys structure learning methods and discusses both their theoretical aspects and various applications. Its main goal is to show how a quantitative model can be recognized among routine data analysis problems and how to construct a model of optimal structure.
Prerequisites: discrete analysis, linear algebra, statistics, optimization.
- Problem statement in forecasting
  - Basic notation, basic sample structures, and the problem statement for model selection: deterministic and statistical approaches
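As a minimal illustration of the deterministic statement (the notation here is mine and may differ from the lectures: $\mathcal{F}$ is the set of admissible models, $S$ an error function, $\mathfrak{D}$ the sample), model selection picks the model that minimizes the error on the given data:

$$
\hat f = \arg\min_{f \in \mathcal{F}} S(f \mid \mathfrak{D}),
\qquad
S(f \mid \mathfrak{D}) = \sum_{(\mathbf{x}_i, y_i) \in \mathfrak{D}} \bigl(y_i - f(\mathbf{x}_i)\bigr)^2 .
$$

In the statistical statement the error function is replaced by a likelihood or a posterior probability of the model given the data.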
- Data generation hypothesis
  - The maximum likelihood principle, univariate and multivariate distribution inference, statistical analysis of model parameters
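A standard worked example of the maximum likelihood principle, for an i.i.d. sample $x_1,\dots,x_m$ assumed to come from $\mathcal{N}(\mu, \sigma^2)$ (the choice of the normal model is illustrative):

$$
\ln L(\mu, \sigma^2) = -\frac{m}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{m}(x_i-\mu)^2,
\qquad
\hat\mu = \frac{1}{m}\sum_{i=1}^{m} x_i,
\quad
\hat\sigma^2 = \frac{1}{m}\sum_{i=1}^{m}(x_i-\hat\mu)^2 .
$$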
- Bayesian Inference
  - Data generation hypotheses, the first and second levels of inference, an example of model comparison, the model generation and model selection workflow, the model evidence and the Occam factor
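The two levels of inference can be summarized as follows (a sketch in the spirit of MacKay's treatment; $\mathbf{w}$ denotes model parameters, $M$ a model, $\mathfrak{D}$ the data, and the notation may differ from the lectures). First level, parameter inference:

$$
p(\mathbf{w} \mid \mathfrak{D}, M) = \frac{p(\mathfrak{D} \mid \mathbf{w}, M)\, p(\mathbf{w} \mid M)}{p(\mathfrak{D} \mid M)} .
$$

Second level, model comparison through the evidence $p(\mathfrak{D} \mid M)$, which for a single parameter with a flat prior of width $\sigma_{\mathbf{w}}$ factorizes approximately into a best-fit likelihood times an Occam factor:

$$
p(\mathfrak{D} \mid M) \approx p(\mathfrak{D} \mid \hat{\mathbf{w}}, M)\,
\underbrace{\frac{\sigma_{\mathbf{w}\mid\mathfrak{D}}}{\sigma_{\mathbf{w}}}}_{\text{Occam factor}} .
$$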
- Structure learning
  - Model as a superposition, admissible superpositions, tree/DAG representation, the superposition identity matrix, the model structure discovery procedure
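A minimal sketch of a model represented as a superposition of primitive functions in tree form (the primitive set, class names, and the example model below are illustrative, not the course's code):

```python
# A model is a superposition of primitive functions stored as a tree; evaluating
# the model means evaluating the tree, and a simple structural complexity is the
# number of nodes.
import math

PRIMITIVES = {
    "plus":  lambda a, b: a + b,
    "times": lambda a, b: a * b,
    "sin":   lambda a: math.sin(a),
}

class Node:
    def __init__(self, name, children=(), value=None):
        self.name = name            # primitive name, "x", or "const"
        self.children = list(children)
        self.value = value          # used only for constants

    def evaluate(self, x):
        if self.name == "x":
            return x
        if self.name == "const":
            return self.value
        args = [child.evaluate(x) for child in self.children]
        return PRIMITIVES[self.name](*args)

    def complexity(self):
        return 1 + sum(child.complexity() for child in self.children)

# f(x) = 0.5 * x + sin(x), an admissible superposition of the primitives above
model = Node("plus", [
    Node("times", [Node("const", value=0.5), Node("x")]),
    Node("sin", [Node("x")]),
])

print(model.evaluate(2.0))   # 0.5 * 2 + sin(2)
print(model.complexity())    # 6 nodes
```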
- Structure complexity
  - Notation and description of the structure, the complexity of the tree and DAG, the distance between trees and between DAGs
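One simple way to put numbers on these notions (a simplification of mine; the course may instead use, for example, a tree edit distance) is to describe a structure by its set of labeled edges, take the node or edge count as a complexity, and take the size of the symmetric difference of edge sets as a distance between two DAGs:

```python
# Illustrative sketch: DAGs described as sets of (parent_label, child_label) edges,
# distance measured as the number of edges present in one structure but not the other.
def dag_distance(edges_a, edges_b):
    return len(edges_a ^ edges_b)

g1 = {("plus", "times"), ("plus", "sin"), ("times", "x"), ("sin", "x")}
g2 = {("plus", "times"), ("plus", "cos"), ("times", "x"), ("cos", "x")}
print(dag_distance(g1, g2))  # 4: two edges removed, two edges added
```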
- Statistical complexity
  - The minimum description length principle via Bayesian inference, entropy and complexity, the Akaike and Bayesian information criteria
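For reference, the two information criteria mentioned here, with $k$ the number of model parameters, $m$ the sample size, and $\hat L$ the maximized likelihood:

$$
\mathrm{AIC} = 2k - 2\ln\hat L,
\qquad
\mathrm{BIC} = k\ln m - 2\ln\hat L .
$$

Both penalize complexity against goodness of fit, which is also the trade-off behind the minimum description length principle.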
- Parametric methods
  - Generalized linear and nonlinear parametric models, radial basis functions, neural networks, and network superpositions for deep learning
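A minimal sketch of one of these parametric families: a radial basis function model fitted by least squares (the centers, width, simulated data, and use of numpy are my illustrative choices, not prescribed settings):

```python
# Gaussian RBF regression: the model is linear in its weights, so the fit reduces
# to an ordinary least-squares problem on the design matrix of basis functions.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

centers = np.linspace(0.0, 1.0, 10)
width = 0.1

def design_matrix(x, centers, width):
    # One Gaussian basis function per center, plus a bias column.
    phi = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))
    return np.hstack([phi, np.ones((x.size, 1))])

Phi = design_matrix(x, centers, width)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # linear-in-parameters fit
y_hat = Phi @ w
print("residual norm:", np.linalg.norm(y - y_hat))
```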
- Non-parametric methods
  - Smoothing, kernels and regression models, spline approximation, empirical distribution function estimation
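A sketch of Nadaraya-Watson kernel regression, one common non-parametric smoother (the Gaussian kernel, bandwidth value, and simulated data are illustrative assumptions):

```python
# Nadaraya-Watson estimator: the prediction at a query point is a weighted
# average of observed responses, with weights decaying in distance to the query.
import numpy as np

def nadaraya_watson(x_query, x_train, y_train, bandwidth=0.1):
    w = np.exp(-((x_query[:, None] - x_train[None, :]) ** 2) / (2 * bandwidth ** 2))
    return (w @ y_train) / w.sum(axis=1)

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 80))
y = np.cos(3 * x) + 0.1 * rng.standard_normal(x.size)
grid = np.linspace(0.0, 1.0, 5)
print(nadaraya_watson(grid, x, y))
```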
- Challenges of mixed-scale forecasting
  - Linear, interval, ordinal-categorical scales and algebraic structures, scale conversion, isotonic regression, conic representation, Pareto slicing, and classification
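Isotonic regression, mentioned above for ordinal scales, can be computed with the pool-adjacent-violators algorithm; below is a textbook sketch under the squared loss (not necessarily the formulation used in the lectures):

```python
# Pool-adjacent-violators: keep blocks of (sum, count) and merge neighbouring
# blocks whenever their means violate the required monotone (non-decreasing) order.
def isotonic_fit(y):
    blocks = []
    for value in y:
        blocks.append([value, 1])
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted

print(isotonic_fit([1.0, 3.0, 2.0, 4.0, 3.0]))  # [1.0, 2.5, 2.5, 3.5, 3.5]
```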
- Time series analysis
  - Multivariate and multidimensional time series, stationarity and ergodicity, trend and fluctuations, heteroscedasticity, singular spectrum analysis, vector autoregression, local forecasting, self-modeling
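A minimal vector autoregression example: a VAR(1) model fitted by least squares on simulated data (the simulated process and the omission of an intercept are simplifications of mine):

```python
# VAR(1): x_t = A x_{t-1} + noise. Stacking the lagged observations turns the
# fit into a multivariate least-squares regression of x_t on x_{t-1}.
import numpy as np

rng = np.random.default_rng(2)
A_true = np.array([[0.6, 0.1],
                   [0.0, 0.5]])
T, d = 500, 2
x = np.zeros((T, d))
for t in range(1, T):
    x[t] = A_true @ x[t - 1] + 0.1 * rng.standard_normal(d)

X_prev, X_next = x[:-1], x[1:]
A_hat_T, *_ = np.linalg.lstsq(X_prev, X_next, rcond=None)  # solves X_prev @ A^T ≈ X_next
print(np.round(A_hat_T.T, 2))   # should be close to A_true
```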
- Residual analysis
  - General statistics of the residuals, analysis of variance, correlation of the residuals, the Durbin-Watson criterion, bootstrapping the sample, the error function in the data space and in the parameter space, penalties on parameter values in linear models, the Lipschitz constant and the data generation hypothesis
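As a concrete example of a residual statistic from this list, the Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation in the residuals (the simulated white-noise residuals below are illustrative):

```python
# Durbin-Watson: sum of squared successive differences of the residuals,
# normalized by the residual sum of squares.
import numpy as np

def durbin_watson(residuals):
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(3)
print(durbin_watson(rng.standard_normal(200)))  # close to 2 for white noise
```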
- Problem statement and optimization algorithms
  - Parameter estimation, model selection, multimodel selection, multicriteria optimization
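In the multicriteria setting, with several error or complexity criteria $S_1,\dots,S_K$ (the notation here is mine), a model $f_1$ dominates a model $f_2$ when

$$
S_k(f_1) \le S_k(f_2) \ \text{for all } k,
\quad \text{with strict inequality for at least one } k,
$$

and multimodel selection returns the set of non-dominated (Pareto-optimal) models rather than a single winner.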