Week 3
From Research management course
The goal is to understand the type of problem to state and solve.
P: Problem statement
Following the paradigm Idea\(\to\)Formula\(\to\)Code, state the problem of finding an optimal solution.
- Discuss the problem statement with your adviser.
- See the examples below and in the past projects.
- Discuss terminology and notation; see the [pdf] and [tex] files with notations and a useful style file.
- At the beginning of the Problem statement, write a general description of the problem.
- Describe the elements of your problem statement:
  - the sample set,
  - its origin, or its algebraic structure,
  - statistical hypotheses of data generation,
  - [conditions of measurements],
  - [restrictions on the sample set and its values],
  - your model in the class of models,
  - restrictions on the class of models,
  - the error function (and its inference), or a loss function, or a quality criterion,
  - the cross-validation procedure,
  - restrictions on the solutions,
  - external (industrial) quality criteria,
  - the optimization statement as \(\arg\min\).
- Define the main terms: what is called the model, the solution, and the algorithm.
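The elements listed above could be assembled as in the following minimal sketch for linear regression. Everything in it is an illustrative assumption rather than the course's recommended notation: the symbols \(\mathfrak{D}\), \(\mathbf{w}\), \(f\), \(S\), the Gaussian noise hypothesis, and the choice of a linear model.

```latex
% Illustrative sketch only; symbols and hypotheses are assumptions.
% Requires amsmath and amssymb for \mathfrak and \mathbb.
The sample set is
$\mathfrak{D} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{m}$,
where $\mathbf{x}_i \in \mathbb{R}^n$ and $y_i \in \mathbb{R}$.
The data generation hypothesis is
$y_i = f(\mathbf{w}, \mathbf{x}_i) + \varepsilon_i$
with i.i.d. noise $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$.
The model is the parametric family
$f(\mathbf{w}, \mathbf{x}) = \mathbf{w}^{\mathsf{T}} \mathbf{x}$,
$\mathbf{w} \in \mathbb{R}^n$.
The error function is the sum of squared errors
\[
  S(\mathbf{w} \mid \mathfrak{D})
  = \sum_{i=1}^{m} \bigl( y_i - f(\mathbf{w}, \mathbf{x}_i) \bigr)^2 .
\]
The optimization statement is
\[
  \hat{\mathbf{w}}
  = \arg\min_{\mathbf{w} \in \mathbb{R}^n} S(\mathbf{w} \mid \mathfrak{D}) .
\]
```

A full problem statement would also cover the remaining elements, such as the cross-validation procedure and the restrictions on the solutions.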
Note that:
- The model is a parametric family of functions that map the design space to the target space.
- The criterion (error function) is a function that is optimized to obtain an optimal solution (model parameters, a function).
- The algorithm transforms the solution space, usually iteratively.
- The method combines a model, a criterion, and an algorithm to produce a solution.
Check these definitions against an example:
- the regression model,
- the sum of squared errors,
- the Newton-Raphson algorithm,
- the method of least squares.
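The four items above can be told apart in code. The following is a minimal sketch, assuming a one-feature linear model with an intercept and a small noise-free toy sample; the function names and the data are hypothetical and chosen only for illustration.

```python
# Illustrative sketch; the names and the toy data are assumptions.

def model(w, x):
    """Model: the parametric family f(w, x) = w[0] + w[1] * x."""
    return w[0] + w[1] * x


def sse(w, xs, ys):
    """Criterion: the sum of squared errors over the sample set."""
    return sum((y - model(w, x)) ** 2 for x, y in zip(xs, ys))


def newton_raphson_step(w, xs, ys):
    """Algorithm: one Newton-Raphson update w - H^{-1} g for the SSE."""
    residuals = [y - model(w, x) for x, y in zip(xs, ys)]
    # Gradient of the SSE with respect to (w[0], w[1]).
    g0 = -2.0 * sum(residuals)
    g1 = -2.0 * sum(r * x for r, x in zip(residuals, xs))
    # Hessian of the SSE; it does not depend on w for a linear model.
    h00 = 2.0 * len(xs)
    h01 = 2.0 * sum(xs)
    h11 = 2.0 * sum(x * x for x in xs)
    det = h00 * h11 - h01 * h01
    # Newton direction d = H^{-1} g via the explicit 2x2 inverse.
    d0 = (h11 * g0 - h01 * g1) / det
    d1 = (-h01 * g0 + h00 * g1) / det
    return [w[0] - d0, w[1] - d1]


def least_squares(xs, ys, w_init=(0.0, 0.0), n_steps=1):
    """Method: combines the model, the criterion, and the algorithm."""
    w = list(w_init)
    for _ in range(n_steps):
        w = newton_raphson_step(w, xs, ys)
    return w  # the solution: the estimated model parameters


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # toy sample generated by y = 1 + 2x, no noise
w_hat = least_squares(xs, ys)
print(w_hat, sse(w_hat, xs, ys))
```

Because the sum of squared errors is quadratic in the parameters of a linear model, a single Newton-Raphson step already reaches the minimum here; for a nonlinear model the same loop would need several iterations and a stopping rule.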
Resources
- Slides for week 3.
- Video for week 3.
- Recommended notations, 2019: pdf and .tex with .sty
- Slides with a plan of Problem statement
- Examples of problem statements
- Katrutsa A.M., Strijov V.V. Stress test procedure for feature selection algorithms // Chemometrics and Intelligent Laboratory Systems, 2015, 142 : 172-183. article
- Katrutsa A.M., Strijov V.V. Comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria // Expert Systems with Applications, 2017 article
- Motrenko A., Strijov V., Weber G.-W. Bayesian sample size estimation for logistic regression // Journal of Computational and Applied Mathematics, 2014, 255 : 743-752. article
- Kulunchakov A.S., Strijov V.V. Generation of simple structured Information Retrieval functions by genetic algorithm without stagnation // Expert Systems with Applications, 2017, 85 : 221-230. article
- Ivkin N.P. Feature generation for classification and forecasting problems, MIPT, 2013 draft
- Notations for wiki Ru
- Basic notations, pdf
- Simple and useful notations
- Notations for Bayesian model selection, pdf