Week 4
From Research management course
The goal is to get the simplest possible solution to your problem: a model and its parameters. Make the model fit the data with minimal effort.
X: Experiment planning
Plan your computational experiment.
- Discuss the experiment goal with your adviser and team.
- Put this goal in the Computational experiment section.
- Describe your basic data set, either a synthetic one or a simple real one:
  - give the title, the source, and the measurement setup (this is the technical description; the theoretical one belongs in the problem statement section),
  - write down the number of objects and features, and describe the general statistics,
  - for a synthetic data set, describe the generation model and its parameters (for example, independent uniform random sampling from a given interval).
- Describe the configuration of the algorithm run.
- Plan the whole experimental part.
- List the expected tables and figures:
  - make a short list and a long list; for each item,
    - describe the axes,
    - make a draft with a pencil.
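The data set description and the run configuration from the plan above can be sketched in code. This is a minimal illustration, not a prescribed setup: the `ExperimentConfig` fields, the linear-plus-noise target, and the ground-truth weights are all assumptions made for the example. It samples features independently and uniformly on a given interval and prints the statistics you would quote in the text.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical run configuration; every field is an illustrative assumption."""
    n_objects: int = 200      # number of objects (rows)
    n_features: int = 3       # number of features (columns)
    low: float = -1.0         # left end of the sampling interval
    high: float = 1.0         # right end of the sampling interval
    noise_std: float = 0.1    # standard deviation of the Gaussian noise in the target
    seed: int = 0             # fixed seed so the run is reproducible


def generate_dataset(cfg, weights):
    """Sample features i.i.d. uniformly on [low, high]; the target is linear plus noise."""
    rng = random.Random(cfg.seed)
    X = [[rng.uniform(cfg.low, cfg.high) for _ in range(cfg.n_features)]
         for _ in range(cfg.n_objects)]
    y = [sum(w * x for w, x in zip(weights, row)) + rng.gauss(0.0, cfg.noise_std)
         for row in X]
    return X, y


def describe(X, y):
    """Return the general statistics to quote in the data set description."""
    columns = list(zip(*X))
    return {
        "n_objects": len(X),
        "n_features": len(columns),
        "feature_mean": [statistics.fmean(c) for c in columns],
        "feature_std": [statistics.stdev(c) for c in columns],
        "target_mean": statistics.fmean(y),
        "target_std": statistics.stdev(y),
    }


cfg = ExperimentConfig()
true_weights = [1.0, -2.0, 0.5]  # assumed ground-truth parameters of the generator
X, y = generate_dataset(cfg, true_weights)
stats = describe(X, y)
for key, value in stats.items():
    print(key, value)
```

Keeping the generator parameters and the seed in one configuration object makes it easy to copy them into the report and to rerun the experiment later.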
R: Preliminary report
- Make sure that the obtained results are not in logical (sic!) contradiction with the goals of the computational experiment.
- Illustrate the obtained results with a preliminary plot. Ideally, this plot is hand-made: just draw it with a pencil on a piece of paper. See an example. For the final version, use this format.
- Write a mini-report on the results with
  - a short description of the figure: what the reader can see and what the consequences are,
  - the results in numbers, with comments on them.
- Put the report in the Computational experiment section.
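The "results in numbers" part of the mini-report can be produced with a few lines of code. The sketch below is only one possible layout. The function names are hypothetical, and the error values are placeholders invented for the example, not measured results. It aggregates errors over several runs into a mean and a standard deviation and prints a plain-text table.

```python
import statistics


def summarize(errors_by_model):
    """Return rows (model, mean error, std of error, number of runs)."""
    rows = []
    for name, errors in errors_by_model.items():
        mean = statistics.fmean(errors)
        # The sample standard deviation needs at least two runs.
        std = statistics.stdev(errors) if len(errors) > 1 else 0.0
        rows.append((name, mean, std, len(errors)))
    # Lowest mean error first, so the table reads top-down.
    return sorted(rows, key=lambda row: row[1])


def format_report(rows):
    """Render the rows as a plain-text table for the draft report."""
    lines = [f"{'model':<12}{'error (mean +/- std)':<24}{'runs':>4}"]
    for name, mean, std, n_runs in rows:
        cell = f"{mean:.3f} +/- {std:.3f}"
        lines.append(f"{name:<12}{cell:<24}{n_runs:>4}")
    return "\n".join(lines)


# Placeholder numbers for illustration only; replace them with your own runs.
placeholder_errors = {
    "baseline": [0.30, 0.34, 0.32],
    "candidate": [0.21, 0.25, 0.23],
}
rows = summarize(placeholder_errors)
report = format_report(rows)
print(report)
```

Reporting the spread next to the mean lets the reader judge whether a difference between models is larger than the run-to-run variation.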
B: Run basic code
Select the basic algorithm and run it using a simple data set.
- Run your basic algorithm:
  - select the simplest algorithm (together with your adviser) that (partially) solves the problem you set,
  - collect a synthetic data set or download a small, simple real-world data set,
  - upload your data to the repository (if the data size exceeds 5 MB or the data set consists of numerous files, discuss this with your adviser and team),
  - run the basic algorithm on the synthetic data set and estimate the error.
- Describe the basic algorithm, analyze its features, and list competing models. Here are examples of the description style:
  - The description refers to the name of a black-box model. It is advisable to cite the source where the contents of the black box are described in detail. The description specifies the structural parameters of the black box.
  - The description defines the model as a map from the feature (design) space to the space of target variables. Since the model has parameters, the description may refer to the algorithm that optimizes them as a black box.
  - The description gives the model and the algorithm for optimizing its parameters as pseudocode.
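As an illustration of the steps above, here is a minimal sketch of a basic run. It assumes a linear model as the "simplest algorithm", batch gradient descent as the parameter optimizer, and a synthetic data set with uniform features and Gaussian noise. All of these choices, including the learning rate, the number of steps, and the train/test split, are assumptions for the example. Your own basic algorithm should be chosen with your adviser.

```python
import random


def predict(w, x):
    """Linear model: maps a feature vector x to a scalar target, given parameters w."""
    return sum(wj * xj for wj, xj in zip(w, x))


def mse(w, X, y):
    """Mean squared error of the model on the sample (X, y)."""
    return sum((predict(w, x) - t) ** 2 for x, t in zip(X, y)) / len(X)


def fit(X, y, lr=0.1, n_steps=500):
    """Optimize the parameters by batch gradient descent on the MSE."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_steps):
        residuals = [predict(w, x) - t for x, t in zip(X, y)]
        grad = [2.0 / n * sum(r * x[j] for r, x in zip(residuals, X))
                for j in range(d)]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Synthetic data: features uniform on [-1, 1], linear target with Gaussian noise.
rng = random.Random(0)
true_w = [1.0, -2.0, 0.5]  # assumed ground-truth parameters
X = [[rng.uniform(-1.0, 1.0) for _ in true_w] for _ in range(200)]
y = [predict(true_w, x) + rng.gauss(0.0, 0.1) for x in X]

# Hold out the last quarter of the sample to estimate the error.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

w_hat = fit(X_train, y_train)
train_error = mse(w_hat, X_train, y_train)
test_error = mse(w_hat, X_test, y_test)
print(f"train MSE = {train_error:.4f}, test MSE = {test_error:.4f}")
```

Here `predict` is the model as a map from features to targets, and `fit` is the optimization algorithm. In a written description, either one could be replaced by a reference to a black box with its structural parameters listed.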
Resources
- Slides for week 4. Slides 2022.
- Video for week 4.
- See examples of the reports.
- Bakhteev O.Yu. Deep learning systems and tools, article
- Motrenko A.P. Improving classification quality, article
- Isachenko R.V. Dimensionality reduction in the decoding problem, article
- The goals of computational experiments: A. Grabovoy, V. Alekseev, A. Rogozina, I. Igashov, N. Uvarov
- Example of the measurement description: Bishop C.M. Pattern Recognition and Machine Learning, 2006, pp. 677-683.
- Top 8 Sources For Machine Learning Datasets
Homework
- Watch the video.
- Find the code that works.
- Write the goal of your computational experiment. A couple of sentences will help you focus your efforts.
- Write a draft of your desired report and draw a plot for the next step, the error analysis.
- Run the code on the simplest dataset.