Week 4
From Research management course
Latest revision as of 16:55, 19 February 2024
The goal is to get the simplest possible solution to your problem: a model and its parameters. So make the model fit the data with minimal effort.
X: Experiment planning
Plan your computational experiment.
- Discuss the experiment goal with your adviser and team.
- Put this goal in the section Computational experiment.
- Describe your basic data set, either a synthetic one or a simple real one:
- put in the text the title, source, and setup of the measurements (this is the technical description; the theoretical one belongs in the problem statement section),
- write down the number of objects and features, and describe general statistics,
- for a synthetic data set, describe the generation model and its parameters (for example, independent uniform random sampling on a given interval).
- Describe the configuration of the algorithm run.
- Plan the whole experimental part.
- List expected tables and figures:
- make a short list and a long list; for each item
- describe axes,
- make a draft with a pencil.
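The data set description asked for above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the linear law `y = 2x + 1`, the noise level, the interval, and the function names `make_synthetic_dataset` and `describe` are all assumptions made up for the example. It uses only the Python standard library.

```python
import random
import statistics


def make_synthetic_dataset(n_objects=200, low=0.0, high=1.0, noise_std=0.1, seed=0):
    """Sample x independently and uniformly on [low, high]; y = 2x + 1 + Gaussian noise.

    The linear law and all default values are hypothetical choices for illustration.
    """
    rng = random.Random(seed)  # fixed seed makes the data set reproducible
    xs = [rng.uniform(low, high) for _ in range(n_objects)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys


def describe(values):
    """General statistics of one column, ready to copy into the data description."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),
    }


xs, ys = make_synthetic_dataset()
print("objects:", len(xs), "features: 1")
print("x:", describe(xs))
print("y:", describe(ys))
```

Writing the generation model as a function with explicit parameters makes it easy to state those parameters in the text and to regenerate exactly the same data later.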
R: Preliminary report
- Make sure that the obtained results do not logically contradict (sic!) the goals of the computational experiment.
- Illustrate the obtained results with a preliminary plot. Ideally this plot is hand-made. Just draw it with a pencil on a piece of paper. See an example (http://www.machinelearning.ru/wiki/images/3/30/Likelihood_handdrawn.pdf). For the final version use this format (http://www.machinelearning.ru/wiki/index.php?title=JMLDA/Fig).
- Write a mini-report on the results with
- a short description of the figure: what the reader could see, what are the consequences,
- the results in numbers with comments on them,
- put the report in the section Computational experiment.
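One possible way to obtain "the results in numbers" for the mini-report is to repeat the run with several random seeds and report the mean error with its spread. The sketch below is only an illustration: `run_experiment` is a hypothetical stand-in for your own algorithm (here it simply predicts the sample mean of noisy observations of a constant), and the number of runs is an arbitrary choice.

```python
import random
import statistics


def run_experiment(seed, n_objects=100, noise_std=0.1):
    """Hypothetical stand-in for one run of your algorithm; returns its test MSE."""
    rng = random.Random(seed)
    target = 1.0  # assumed true value, known only because the data are synthetic
    train = [target + rng.gauss(0.0, noise_std) for _ in range(n_objects)]
    test = [target + rng.gauss(0.0, noise_std) for _ in range(n_objects)]
    prediction = statistics.fmean(train)  # the "model": predict the training mean
    return statistics.fmean([(prediction - y) ** 2 for y in test])


# Repeat the run with different seeds and summarize the error in one line.
errors = [run_experiment(seed) for seed in range(10)]
report_line = "test MSE over {} runs: {:.4f} +/- {:.4f}".format(
    len(errors), statistics.fmean(errors), statistics.stdev(errors)
)
print(report_line)
```

A line like this, followed by one or two sentences of comment, is usually enough for a preliminary report.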
B: Run basic code
Select the basic algorithm and run it using a simple data set.
- Run your basic algorithm:
- select the simplest algorithm (with your adviser) to (partially) solve the problem you set.
- Collect a synthetic data set or download a simple real-world data set of small size.
- Upload your data to the repository (in case the data size exceeds 5MB or the data set consists of numerous files, please discuss with your adviser and team).
- Run the basic algorithm on the synthetic data set, and estimate the error.
- Describe the basic algorithm, analyze its features, and list competitive models. Here are examples of the description style:
- The description refers to the name of some black-box model. It is advisable to cite the source where the contents of the black-box model are described in detail. The description specifies the structural parameters of the black box.
- The description defines the model as a map from the feature (design) space to the space of target variables. Since the model has parameters, the description may refer to the algorithm for optimizing these parameters as a black box.
- The description gives the model and the algorithm for optimizing its parameters in the form of pseudocode.
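The steps above (run the basic algorithm on a synthetic data set and estimate the error) can be sketched as follows. This is a minimal example under stated assumptions: the basic algorithm is taken to be one-dimensional least-squares linear regression, the data follow a made-up law `y = 2x + 1` plus noise, and the error is estimated on a held-out part of the sample. The names `fit_line` and `mse` are hypothetical; substitute the algorithm you chose with your adviser.

```python
import random


def fit_line(xs, ys):
    """Least-squares estimate of (slope, intercept) for the model f(x) = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(params, xs, ys):
    """Mean squared error of the fitted line on the given sample."""
    slope, intercept = params
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data set (assumed generation model: uniform x, linear law, Gaussian noise).
rng = random.Random(0)
xs = [rng.uniform(0.0, 1.0) for _ in range(200)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]

# Fit on the first 150 objects, estimate the error on the remaining 50.
split = 150
params = fit_line(xs[:split], ys[:split])
print("parameters (slope, intercept):", params)
print("train MSE:", mse(params, xs[:split], ys[:split]))
print("test MSE:", mse(params, xs[split:], ys[split:]))
```

Because the generation model is known, the estimated parameters can be compared with the true ones, which is a quick check that the code works before moving to real data.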
Resources
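- Slides for week 4 (http://www.machinelearning.ru/wiki/images/4/45/M1p_lect4.pdf). Slides 2022 (http://www.machinelearning.ru/wiki/images/c/c3/M1p2022lect4.pdf).
- Video for week 4 (https://youtu.be/8viZLYFfBsM).
- See examples of the reports:
- Bakhteev O.Yu. Systems and tools for deep learning, paper (http://strijov.com/papers/Bakhteev2016AWS.pdf)
- Motrenko A.P. Improving classification quality, paper (http://strijov.com/papers/MolybogMotrenko2017DimRed.pdf)
- Isachenko R.V. Dimensionality reduction in the decoding problem, paper (https://github.com/Intelligent-Systems-Phystech/2017-Isachenko-PLS/raw/master/doc/Isachenko2017PLS.pdf)
- The goals of computational experiments: A. Grabovoy (http://svn.code.sf.net/p/mlalgorithms/code/Group574/Grabovoy2018OptimalBrainDamage/doc/slides/Grabovoy2018OptimalBrainDamage.pdf), V. Alekseev (http://svn.code.sf.net/p/mlalgorithms/code/Group474/Alekseev2017IntraTextCoherence/doc/Alekseev2017Presentation.pdf), A. Rogozina (http://svn.code.sf.net/p/mlalgorithms/code/Group574/Rogozina2018StructurePredictionRNA/doc/slides/Rogozina2018RNAPredictionsSlides.pdf), I. Igashov (https://github.com/Intelligent-Systems-Phystech/Group594/raw/master/Igashov2018ProteinLigandComplexes/presentation/presentation.pdf), N. Uvarov (http://svn.code.sf.net/p/mlalgorithms/code/Group474/Uvarov2017DynamicGraphicalModels/slides/Uvarov2017DynamicGraphicalModels.pdf)
- Example of the measurement description: Bishop C.M. Pattern recognition and machine learning, 2006. Pp. 677-683 (http://www.machinelearning.ru/wiki/images/3/35/Old_Faithful_dataset_description.pdf).
- Top 8 Sources For Machine Learning Datasets (https://medium.datadriveninvestor.com/top-8-sources-for-machine-learning-and-analytics-datasets-5d2d94ada8ab)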
Homework
- Watch the video (https://www.youtube.com/watch?v=YnWVsjmZ2LI&list=PLk4h7dmY2eYE2Lp2ScMRSGDxLIbJr4vJ8&index=8).
- Find the code that works.
- Write the goal of your computational experiment. A couple of sentences will help you focus your efforts.
- Write a draft of your desired report and draw a plot for the next step, the error analysis.
- Run the code on the simplest dataset.