Course syllabus: Neural architecture search

{{#seo:
|title=Neural architecture search
|titlemode=replace
|keywords=Neural architecture search
|description=The course Neural Architecture Search includes twelve lectures with practical exercises. The first part of the course is devoted to the theory of architecture search. In the practical work, the architecture of a neural network of a given type is analyzed.
}}

Twelve lectures with practical exercises. The first part of the course is devoted to the theory of architecture search and ends with a technical application. In the practical work, the architecture of a neural network of a given type is analyzed.
  
 
# Overview of neural network types and architecture descriptions
# Genetic algorithms from GMDH to WANN (see the sketch after this list)
# Structure selection quality criteria
# A priori hypotheses for individual models, types of structural parameter distributions
# Structural parameter analysis
# Online learning and multi-armed bandits to generate structure
# Reinforcement learning to generate structure
# Transfer of knowledge between neural networks and optimization of structural parameters
# Random processes for generating models
# Generative adversarial networks and structure search
# Creation and rejection of structure
# Bilevel Bayesian selection and Metropolis-Hastings sampling
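
The sketch below, anchored to the genetic-algorithms lecture, is a purely illustrative toy and not course material: a (1+1) evolutionary search that mutates one candidate architecture and keeps the child when it scores no worse. The search space (hidden-layer sizes of a scikit-learn MLP), the mutation rule, and the fitness function are all assumptions chosen for brevity.

<syntaxhighlight lang="python">
# Illustrative toy, not course material: (1+1) evolutionary search
# over MLP hidden-layer sizes. Search space, mutation rule, and
# fitness function are assumptions chosen for brevity.
import random

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

random.seed(0)
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)


def fitness(hidden_layers):
    """Validation accuracy of an MLP with the given hidden-layer sizes."""
    model = MLPClassifier(hidden_layer_sizes=hidden_layers, max_iter=500,
                          random_state=0)
    model.fit(X_train, y_train)
    return model.score(X_val, y_val)


def mutate(hidden_layers):
    """Randomly widen, shrink, append, or drop one hidden layer."""
    layers = list(hidden_layers)
    op = random.choice(["widen", "shrink", "append", "drop"])
    i = random.randrange(len(layers))
    if op == "widen":
        layers[i] = min(2 * layers[i], 128)
    elif op == "shrink":
        layers[i] = max(layers[i] // 2, 2)
    elif op == "append":
        layers.append(random.choice([4, 8, 16]))
    elif len(layers) > 1:  # "drop", but never remove the last layer
        layers.pop(i)
    return tuple(layers)


best, best_score = (8,), fitness((8,))
for _ in range(10):  # a few generations: mutate, keep the child if no worse
    child = mutate(best)
    child_score = fitness(child)
    if child_score >= best_score:
        best, best_score = child, child_score
print("best architecture:", best, "validation accuracy:", best_score)
</syntaxhighlight>

Methods covered in the course, from GMDH to WANN, differ from this toy mainly in the search space they explore and in how candidate structures are scored.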
  
===Laboratory works===

The laboratory work is based on applying an architecture search method. In the first assignment, students evaluate an existing method; in the second, they propose and implement a method of their own. The report is a page of text with a formal description of the method, detailed enough to reconstruct the code, together with an error analysis (basic diagnostic criteria and representative cases). The interface to the model class is fixed and common to everyone, as are the datasets. Results are collected in shared tables, while the errors of each method are analyzed individually.
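
As a hypothetical illustration of what such a fixed interface might look like (every name below is an assumption made for this sketch, not the actual interface used in the course):

<syntaxhighlight lang="python">
# Hypothetical sketch of a fixed, shared interface for the laboratory
# works; all names are illustrative assumptions, not the actual
# interface of the course.
from typing import Protocol

import numpy as np


class ArchitectureSearchMethod(Protocol):
    def search(self, X: np.ndarray, y: np.ndarray) -> None:
        """Select an architecture and fit the resulting model on (X, y)."""

    def predict(self, X: np.ndarray) -> np.ndarray:
        """Predict with the model found by search()."""

    def describe(self) -> str:
        """Return a formal description of the selected architecture."""
</syntaxhighlight>

With one fixed interface and fixed datasets, a single evaluation script can run every student's method and write its scores into the shared results tables.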
  
===Computational experiment with the report===

Each student gives a short three-minute talk: in week 7 on the first laboratory work and in week 14 on the second.
  
===Grading===

The total is 10 points: two points for answering questions during classes and four points for each of the two laboratory works. It is not the accuracy of the approximation that is graded but the quality of the code and the error analysis.
