{{#seo:
|title=Data Science Course syllabus
|titlemode=replace
|keywords=Data Science Course
|description=Below is listed the Data Science Course syllabus.
}}
Below are listed the course syllabi on Data Science topics.

===Course Syllabi===
#[[Course syllabus: Human-computer interfaces|Human-computer interfaces]]
#[[Course schedule|My first scientific paper]]
#[[Course syllabus: Bayesian model selection|Bayesian model selection]]
#[[Fundamental theorems|Fundamental theorems of Machine Learning]]
#[[Mathematical forecasting]]
#[[Course syllabus: Structure learning and forecasting|Structure learning and forecasting]]
#[[Course syllabus: Bayesian model selection and multimodeling|Bayesian multimodeling]]
#[[Course syllabus: Introduction to Machine Learning|Introduction to Machine Learning]]
#[[Course syllabus: Machine Learning|Machine Learning]]
#[[Course syllabus: Generative deep learning|Generative deep learning]]
#[[Course syllabus: Applied regression analysis|Applied regression analysis]]
#[[Course syllabus: Neural architecture search|Neural architecture search]]
#[[Course syllabus: Big data analysis|Big data analysis]]
#[[Course syllabus: Mathematics of decision making|Mathematics of decision making]]
#[[Course syllabus: Data Mining in Business Analytics|Data Mining in Business Analytics]]
#[[Course syllabus: Category theory for Machine Learning|Category theory for Machine Learning]]

===Course Syllabus: Applied Regression Analysis===
The lecture course is devoted to data modeling problems in regression analysis. The emphasis is on automatic model generation and optimal model structure selection. This lecture course has been delivered since 2006.

The lecture course consists of a theoretical part (40 hours) and a practical part (40 hours). The theory covers methods of regression analysis and their foundations. The practice consists of a series of algorithms to implement using the Matlab or Scilab software tools; several illustrative sketches are given after the topic list below.

Prerequisites: linear algebra, statistics, and programming skills. Knowledge of optimization methods is appreciated.

# Introduction to regression analysis
#* Terminology: approximation, interpolation, extrapolation, regression
#* Standard notation. Problem statement
#* What is a regression model?
#* The main problems of regression analysis
#* Linear regression and least squares (see the least-squares sketch after the topic list)
#* Introduction to Scilab/SAS
# Computational linear methods
#* Singular value decomposition
#* Properties of the SVD
#* Using the SVD: Fisher segmentation
#* Principal component analysis
#* Substitutions in the linear models
# Regularization in the linear methods
#* Spaces of the singular vectors
#* Matrix norms and conditioning
#* Regularization for the LS, SVD, and PCA
#* The weighted regression
#* Scales and the Pareto-slicing
#* The integral indicators and the expert estimations
#* Expert estimation concordance and regularization: linear and quadratic
# Group method of data handling (GMDH) and cross-validation
#* The GMDH principles
#* Cross-validation principles and overtraining (see the cross-validation sketch after the topic list)
#* External and internal criteria
#* Criteria of regularity, minimal bias, and forecasting ability
#* Linear combinations of the criteria
#* Criterion space and the Pareto-optimal front
# GMDH and model generation
#* The GMDH basic model and the Kolmogorov-Gabor polynomial
#* Substitution in the basic model
#* The GMDH algorithm and its termination
#* The multilayer, combinatorial, and genetic algorithms
# Non-linear parametric model generation
#* Problem statements and model representations
#* Four techniques of model generation
#* Symbolic regression and problems of inductive generation
#* Substitutions and the algebra of trees
#* Expert construction of the initial model
#* Interpretable models
# Residual analysis
#* General statistics of the residuals
#* Analysis of variance
#* Correlation of the residuals; the Durbin-Watson criterion (see the Durbin-Watson sketch after the topic list)
#* Bootstrap sampling
#* Error function in the data space and in the parameter space
#* Penalties on the parameter values of linear models
#* The Lipschitz constant and the data generation hypothesis
# Data generation hypothesis
#* Random variable distribution
#* Joint distribution
#* The maximum likelihood principle
#* Univariate and multivariate normal distribution inference
#* The simplest method to estimate the distribution for a given hypothesis
#* Statistical properties of the parameter estimates: consistency, efficiency, bias
#* Graphical analysis of the parameter estimates
# Coherent Bayesian Inference
#* The first level of the inference
#* The parameter distribution
#* Example of the finite parametric model comparison
#* Model generation and model selection flow
#* The model evidence
#* The posterior distribution and Occam factor
#* Example of the model selection process
# Parameter space analysis
#* Optimal brain surgery and the importance of model elements
#* Laplace approximation, one-dimensional and multidimensional
#* Integration in the parameter space
#* Estimation of the hyperparameters
#* Algorithms for Hessian matrix approximation
# Minimum description length
#* The MDL principle via Bayesian inference
#* Kolmogorov complexity
#* Entropy and complexity
#* Akaike information criterion (see the AIC/BIC sketch after the topic list)
#* Bayesian information criterion
#* Data complexity and model complexity
# Non-parametric regression
#* Data smoothing
#* Exponential smoothing (see the exponential smoothing sketch after the topic list)
#* Kernels and regression models
#* Spline approximation
#* Regression using radial basis functions
#* Regression using support vector machines
# Time series analysis
#* Examples of time series
#* Stationarity and ergodicity; trend and fluctuations
#* Heteroscedasticity
#* Singular spectrum analysis
#* Vector autoregression
# Numerical experiment organization
#* Requirements and expectations in the application field
#* Expert point of view and expert estimations
#* Data preprocessing organization
#* Choice of the model class and algorithms
#* Software architecture
#* Report preparation
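
The practice part of the course asks students to implement basic regression algorithms; the course names Matlab or Scilab as the tools, so the sketches below, written in Python with NumPy, are purely illustrative and not course material. The first sketch shows least squares solved through the singular value decomposition, with an optional Tikhonov (ridge) regularization term; the function name, the test data, and the value of alpha are assumptions made for the example.

<pre>
import numpy as np

def svd_least_squares(X, y, alpha=0.0):
    """Solve min_w ||X w - y||^2 + alpha ||w||^2 via the SVD of X.

    alpha = 0 gives ordinary least squares; alpha > 0 gives ridge
    (Tikhonov) regularization, which damps directions with small
    singular values and improves the conditioning of the problem.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Filter factors s_i / (s_i^2 + alpha) generalize the pseudo-inverse 1 / s_i.
    d = s / (s**2 + alpha)
    return Vt.T @ (d * (U.T @ y))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))                  # illustrative design matrix
    w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])  # illustrative "true" parameters
    y = X @ w_true + 0.1 * rng.normal(size=100)    # noisy responses
    print("OLS estimate:  ", svd_least_squares(X, y))
    print("Ridge estimate:", svd_least_squares(X, y, alpha=1.0))
</pre>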
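
A sketch of structure selection by cross-validation, as used in the GMDH topic: the sample is split into a training part that fits the parameters (the internal criterion) and a validation part that scores the model structure (the external criterion). The polynomial model class, the hold-out split, and the data are assumptions made for the example.

<pre>
import numpy as np

def holdout_select_degree(x, y, degrees, train_fraction=0.7, seed=0):
    """Select a polynomial degree by an external (validation) criterion."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_train = int(train_fraction * len(x))
    train, valid = idx[:n_train], idx[n_train:]

    scores = {}
    for d in degrees:
        coeffs = np.polyfit(x[train], y[train], d)        # internal criterion: least-squares fit
        residuals = y[valid] - np.polyval(coeffs, x[valid])
        scores[d] = np.mean(residuals**2)                  # external criterion: validation MSE
    return min(scores, key=scores.get), scores

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = np.linspace(-1, 1, 200)
    y = 1.0 + 2.0 * x - 1.5 * x**3 + 0.05 * rng.normal(size=x.size)  # illustrative data
    best, scores = holdout_select_degree(x, y, degrees=range(1, 9))
    print("validation MSE by degree:", scores)
    print("selected degree:", best)
</pre>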
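
A sketch of the Durbin-Watson criterion from the residual analysis topic: d = sum_t (e_t - e_(t-1))^2 / sum_t e_t^2, which lies between 0 and 4; values near 2 suggest uncorrelated residuals, while values toward 0 or 4 suggest positive or negative first-order autocorrelation. The residual series below are assumptions made for the example.

<pre>
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: squared first differences of the residuals
    divided by their sum of squares; ranges from 0 to 4."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e)**2) / np.sum(e**2)

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    white = rng.normal(size=500)                        # uncorrelated residuals
    correlated = 0.1 * np.cumsum(rng.normal(size=500))  # strongly autocorrelated residuals
    print("white noise:   ", round(float(durbin_watson(white)), 2))       # close to 2
    print("autocorrelated:", round(float(durbin_watson(correlated)), 2))  # close to 0
</pre>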
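
A sketch of the Akaike and Bayesian information criteria from the minimum description length topic. Assuming a Gaussian error model for a least-squares fit with n observations, k parameters, and residual sum of squares RSS, and dropping model-independent constants, AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln n; the polynomial models and the data are assumptions made for the example.

<pre>
import numpy as np

def aic_bic(residuals, n_params):
    """AIC and BIC for a least-squares fit under a Gaussian error model
    (additive constants that do not depend on the model are dropped)."""
    e = np.asarray(residuals, dtype=float)
    n = len(e)
    log_sigma2 = np.log(np.sum(e**2) / n)
    return n * log_sigma2 + 2 * n_params, n * log_sigma2 + n_params * np.log(n)

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    x = np.linspace(0, 1, 150)
    y = 2.0 - 3.0 * x + 4.0 * x**2 + 0.1 * rng.normal(size=x.size)  # illustrative data
    for degree in range(1, 6):
        coeffs = np.polyfit(x, y, degree)
        aic, bic = aic_bic(y - np.polyval(coeffs, x), n_params=degree + 1)
        print(f"degree {degree}: AIC = {aic:.1f}, BIC = {bic:.1f}")
</pre>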
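
A sketch of simple (single) exponential smoothing from the non-parametric regression topic: s_0 = y_0 and s_t = alpha * y_t + (1 - alpha) * s_(t-1), where alpha in (0, 1] controls how quickly old observations are forgotten. The series and the smoothing factor are assumptions made for the example.

<pre>
import numpy as np

def exponential_smoothing(y, alpha=0.3):
    """Simple exponential smoothing of a series y with factor alpha."""
    y = np.asarray(y, dtype=float)
    s = np.empty_like(y)
    s[0] = y[0]
    for t in range(1, len(y)):
        s[t] = alpha * y[t] + (1 - alpha) * s[t - 1]
    return s

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    t = np.arange(100)
    y = np.sin(t / 10.0) + 0.3 * rng.normal(size=t.size)  # noisy illustrative series
    print("last five smoothed values:", np.round(exponential_smoothing(y, alpha=0.2)[-5:], 3))
</pre>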

Grading:
* Exam: 80%
* Coursework: 10%
* Intermediate case test: 10%