Course syllabus: Neural architecture search
Twelve lectures with practical exercises. The first part of each class is devoted to the theory of architecture search and ends with a technical application. In the practical part, the architecture of a neural network of a given type is analyzed.
- Overview of neural network types and architecture descriptions
- Genetic algorithms, from GMDH to WANN (a minimal code sketch follows this list)
- Structure selection quality criteria
- A priori hypotheses for individual models; types of distributions of structural parameters
- Structural parameter analysis
- Online learning and multi-armed bandits to generate structure
- Reinforcement learning to generate structure
- Transfer of knowledge between neural networks and optimization of structural parameters
- Random processes for generating models
- Generative adversarial networks and structure search
- Creation and rejection of structure
- Bilevel Bayesian Selection and Metropolis-Hastings Sampling
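To make the genetic-algorithms topic concrete, below is a minimal sketch of an evolutionary search over the hidden-layer widths of a multilayer perceptron. It is not the course's own method: the synthetic dataset, the scikit-learn MLPRegressor model class, the mutation operators, and the population size are all illustrative assumptions.

import random
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor

random.seed(0)
X, y = make_regression(n_samples=300, n_features=10, noise=0.1, random_state=0)

def fitness(arch):
    # Cross-validated negative MSE of the architecture: higher is better.
    model = MLPRegressor(hidden_layer_sizes=arch, max_iter=500, random_state=0)
    return cross_val_score(model, X, y, cv=3, scoring="neg_mean_squared_error").mean()

def mutate(arch):
    # Grow, shrink, or resize one hidden layer at random.
    arch = list(arch)
    op = random.choice(["add", "drop", "resize"])
    if op == "add":
        arch.insert(random.randrange(len(arch) + 1), random.choice([8, 16, 32, 64]))
    elif op == "drop" and len(arch) > 1:
        arch.pop(random.randrange(len(arch)))
    else:
        arch[random.randrange(len(arch))] = random.choice([8, 16, 32, 64])
    return tuple(arch)

# Evolve a small population of architectures for a few generations.
population = [(fitness(a), a) for a in [(16,), (32,), (16, 16)]]
for generation in range(10):
    parent = max(population)[1]                      # select the fittest architecture
    child = mutate(parent)
    population.append((fitness(child), child))
    population = sorted(population, reverse=True)[:3]  # keep the three fittest

print("best architecture:", max(population)[1])

The same loop structure carries over to the lecture material: only the fitness function (the structure selection quality criterion) and the mutation operators change from method to method.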
Laboratory works
The laboratory works are based on applying an architecture search method. In the first assignment, an existing method is evaluated; in the second, the student proposes and implements a method of their own. The report is one page of text with a formal description of the method, detailed enough to reconstruct the code, and an error analysis (basic diagnostic criteria and failure cases). The interface to the model class is fixed and common to all students, as are the datasets. There are shared tables of results and an individual error analysis for each method.
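The syllabus does not spell the common interface out, so the following is a hypothetical sketch of what it could look like; the class name and method signatures are illustrative assumptions, not the course's actual API.

from typing import Protocol

import numpy as np

class ArchitectureSearchMethod(Protocol):
    # Hypothetical common interface: every submission exposes the same three
    # calls, so the shared result tables can be filled in uniformly.

    def search(self, X: np.ndarray, y: np.ndarray) -> None:
        """Select a network structure on the training sample."""

    def fit(self, X: np.ndarray, y: np.ndarray) -> None:
        """Train the selected structure."""

    def predict(self, X: np.ndarray) -> np.ndarray:
        """Predict on new data for the common error analysis."""
        ...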
Computational experiment with report
Each student gives a short three-minute talk, in weeks 7 and 14, on the first and second laboratory works respectively.
Grading
Ten points in total: two points for answering questions during classes and four points for each of the two laboratory works. It is not the accuracy of the approximation that is evaluated but the quality of the code and the error analysis.