# Week 3

From My first scientific paper

The goal is to understand **the type of problem** to state and solve.

## P: Problem statement

Following the paradigm Idea \(\to\) Formula \(\to\) Code, state the problem of finding an optimal solution.

- Discuss the problem statement with your adviser.
- See the examples below and in the past projects.
- Discuss terminology and notation; see the [pdf] and [tex] files with notations and a useful style file.
- At the beginning of the Problem statement section, write a general problem description.
- Describe the elements of your problem statement:
  - the sample set,
  - its origin, or its algebraic structure,
  - statistical hypotheses of data generation,
  - [conditions of measurements],
  - [restrictions on the sample set and its values],
  - your model in the class of models,
  - restrictions on the class of models,
  - the error function (and its inference) or a loss function, or a quality criterion,
  - the cross-validation procedure,
  - restrictions on the solutions,
  - external (industrial) quality criteria,
  - the optimization statement as \(\arg\min\).

- Define the main terms: what is called the model, the solution, and the algorithm.
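
As an illustration of the elements listed above, a problem statement for linear regression might be sketched as follows. The notation (\(\mathfrak{D}\), \(\mathbf{w}\), \(S\)) and the Gaussian noise hypothesis are assumptions chosen for this example, not a prescribed template.

\[
\mathfrak{D} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{m}, \qquad \mathbf{x}_i \in \mathbb{R}^{n}, \quad y_i \in \mathbb{R},
\]
\[
y_i = f(\mathbf{w}, \mathbf{x}_i) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2) \ \text{i.i.d.}, \qquad f(\mathbf{w}, \mathbf{x}) = \mathbf{w}^{\mathsf{T}}\mathbf{x},
\]
\[
S(\mathbf{w} \mid \mathfrak{D}) = \sum_{i=1}^{m} \bigl(y_i - f(\mathbf{w}, \mathbf{x}_i)\bigr)^2, \qquad \hat{\mathbf{w}} = \arg\min_{\mathbf{w} \in \mathbb{R}^{n}} S(\mathbf{w} \mid \mathfrak{D}).
\]

The first line defines the sample set, the second gives the data generation hypothesis and the model, and the third gives the error function and the optimization statement. Under the assumed Gaussian hypothesis, minimizing \(S\) is equivalent to maximizing the likelihood of the data.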

Note that:

- The **model** is a parametric family of functions to map design space to target space.
- The **criterion** (error function) is a function to optimize in order to obtain an optimal solution (model parameters, a function).
- The **algorithm** transforms solution space, usually iteratively.
- The **method** combines a model, a criterion, and an algorithm to produce a solution.

Check these definitions on an example:

- the regression *model*,
- the sum of squared *errors*,
- the Newton-Raphson *algorithm*,
- the *method* of least squares.
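
The four terms in this example can be separated in code. The following is a minimal sketch, assuming a linear regression model and synthetic data. The function names (`model`, `criterion`, `newton_raphson`, `least_squares`) and the data are hypothetical and chosen only for illustration.

```python
import numpy as np


def model(w, X):
    """Model: a parametric family of functions, here linear in the parameters w."""
    return X @ w


def criterion(w, X, y):
    """Criterion: the sum of squared errors between targets and model outputs."""
    residuals = y - model(w, X)
    return float(residuals @ residuals)


def newton_raphson(X, y, w0, n_iter=5):
    """Algorithm: Newton-Raphson iterations that drive the gradient of the criterion to zero."""
    w = w0.copy()
    hessian = 2.0 * X.T @ X  # constant, because the criterion is quadratic in w
    for _ in range(n_iter):
        gradient = -2.0 * X.T @ (y - model(w, X))
        w = w - np.linalg.solve(hessian, gradient)
    return w


def least_squares(X, y):
    """Method: combines the model, the criterion, and the algorithm to produce a solution."""
    return newton_raphson(X, y, w0=np.zeros(X.shape[1]))


# Synthetic sample set (an assumption for illustration only).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
w_true = np.array([1.0, -2.0])
y = X @ w_true + 0.1 * rng.normal(size=50)

w_hat = least_squares(X, y)  # the solution: estimated model parameters
```

Because the criterion is quadratic, Newton-Raphson reaches the minimum after the first iteration here; the loop is kept to show the iterative form that the algorithm takes for non-quadratic criteria.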

## Resources

- Slides for week 3.
- Video for week 3.
- Recommended notations: pdf and .tex with .sty
- Slides with a plan of Problem statement
- Examples of problem statements
- Katrutsa A.M., Strijov V.V. Stresstest procedure for feature selection algorithms // Chemometrics and Intelligent Laboratory Systems, 2015, 142 : 172-183 article
- Katrutsa A.M., Strijov V.V. Comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria // Expert Systems with Applications, 2017 article
- Motrenko A., Strijov V., Weber G.-W. Bayesian sample size estimation for logistic regression // Journal of Computational and Applied Mathematics, 2014, 255 : 743-752. article
- Kulunchakov A.S., Strijov V.V. Generation of simple structured Information Retrieval functions by genetic algorithm without stagnation // Expert Systems with Applications, 2017, 85 : 221-230. article
- Ivkin N.P. Feature generation for classification and forecasting problems, MIPT, 2013 draft

- Notations for wiki Ru
- Basic notations, pdf
- Simple and useful notations
- Notations for Bayesian model selection, pdf