# Week 3


The goal of this week is to understand what type of problem you are going to state and solve.

## P: Problem statement

Following the paradigm Idea $$\to$$ Formula $$\to$$ Code, state the problem as a search for an optimal solution.

1. Discuss the problem statement with your adviser.
2. See the examples below and in the past projects.
3. Discuss terminology and notation; see the [pdf] and [tex] files with notations and a useful style file.
4. At the beginning of the Problem statement, write a general problem description.
5. Describe the elements of your problem statement:
   1. the sample set,
   2. its origin, or its algebraic structure,
   3. statistical hypotheses of data generation,
   4. [conditions of measurements],
   5. [restrictions of the sample set and its values],
   6. your model in the class of models,
   7. restrictions on the class of models,
   8. the error function (and its inference) or a loss function, or a quality criterion,
   9. cross-validation procedure,
   10. restrictions to the solutions,
   11. external (industrial) quality criteria,
   12. the optimization statement as $$\arg\min$$.
6. Define the main terms: what you call the model, the solution, and the algorithm.
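
As an illustration of how the elements above come together, here is a minimal sketch of a problem statement for linear regression. All symbols ($$\mathfrak{D}$$, $$\mathbf{x}_i$$, $$y_i$$, $$\mathbf{w}$$, $$f$$, $$S$$) are assumptions chosen for this example, not the recommended course notation.

The sample set:

$$\mathfrak{D} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{m}, \quad \mathbf{x}_i \in \mathbb{R}^n, \quad y_i \in \mathbb{R}.$$

The model and the error function:

$$f(\mathbf{w}, \mathbf{x}) = \mathbf{w}^{\mathsf{T}}\mathbf{x}, \qquad S(\mathbf{w}) = \sum_{i=1}^{m} \bigl(y_i - f(\mathbf{w}, \mathbf{x}_i)\bigr)^2.$$

The optimization statement:

$$\hat{\mathbf{w}} = \arg\min_{\mathbf{w} \in \mathbb{R}^n} S(\mathbf{w}).$$

A full problem statement would also cover the data generation hypotheses, the restrictions, and the cross-validation procedure from the list.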

Note that:

• The model is a parametric family of functions that map the design space to the target space.
• The criterion (error function) is the function to optimize in order to obtain an optimal solution (the model parameters, or a function).
• The algorithm transforms the solution space, usually iteratively.
• The method combines a model, a criterion, and an algorithm to produce a solution.

Check these definitions on an example:

• the regression model,
• the sum of squared errors,
• the Newton-Raphson algorithm,
• the method of least squares.
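
The four items above can be sketched in a few lines of code. This is a minimal illustration, not a reference implementation. It assumes a one-dimensional linear model with an intercept, and all function names (`model`, `sse`, `newton_raphson`, `least_squares`) and the toy data are made up for this example.

```python
# Model: a parametric family f(w, x) = w0 + w1 * x (linear regression).
def model(w, x):
    return w[0] + w[1] * x


# Criterion: the sum of squared errors over the sample.
def sse(w, xs, ys):
    return sum((y - model(w, x)) ** 2 for x, y in zip(xs, ys))


# Gradient of the criterion with respect to (w0, w1).
def gradient(w, xs, ys):
    residuals = [model(w, x) - y for x, y in zip(xs, ys)]
    return [2 * sum(residuals),
            2 * sum(r * x for r, x in zip(residuals, xs))]


# Hessian of the criterion; for a linear model it does not depend on w.
def hessian(xs):
    n, sx, sxx = len(xs), sum(xs), sum(x * x for x in xs)
    return [[2 * n, 2 * sx], [2 * sx, 2 * sxx]]


# Algorithm: Newton-Raphson iterations w <- w - H^{-1} g.
def newton_raphson(xs, ys, w=(0.0, 0.0), tol=1e-12, max_iter=20):
    w = list(w)
    (a, b), (c, d) = hessian(xs)
    det = a * d - b * c  # nonzero when xs has at least two distinct values
    for _ in range(max_iter):
        g0, g1 = gradient(w, xs, ys)
        if abs(g0) + abs(g1) < tol:
            break
        # Solve the 2x2 system H * step = g using the explicit inverse.
        w[0] -= (d * g0 - b * g1) / det
        w[1] -= (-c * g0 + a * g1) / det
    return w


# Method: least squares = linear model + SSE criterion + an algorithm.
def least_squares(xs, ys):
    return newton_raphson(xs, ys)


# Toy data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w_hat = least_squares(xs, ys)
print(w_hat, sse(w_hat, xs, ys))
```

Because the criterion is quadratic in the parameters, Newton-Raphson reaches the minimum in a single step here. With a nonlinear model the same loop would need several iterations and a Hessian recomputed at each step.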

## Resources

• Slides for week 3.
• Video for week 3.
• Recommended notations: pdf and .tex with .sty
• Slides with a plan of Problem statement
• Examples of problem statements
   1. Katrutsa A.M., Strijov V.V. Stress test procedure for feature selection algorithms // Chemometrics and Intelligent Laboratory Systems, 2015, 142 : 172-183. article
   2. Katrutsa A.M., Strijov V.V. Comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria // Expert Systems with Applications, 2017. article
   3. Motrenko A., Strijov V., Weber G.-W. Bayesian sample size estimation for logistic regression // Journal of Computational and Applied Mathematics, 2014, 255 : 743-752. article
   4. Kulunchakov A.S., Strijov V.V. Generation of simple structured Information Retrieval functions by genetic algorithm without stagnation // Expert Systems with Applications, 2017, 85 : 221-230. article
   5. Ivkin N.P. Feature generation for classification and forecasting problems, MIPT, 2013. draft
• Notations for wiki Ru
• Basic notations, pdf
• Simple and useful notations
• Notations for Bayesian model selection, pdf