Difference between revisions of "Fundamental theorems"

From Research management course
Jump to: navigation, search
Line 20: Line 20:
 
So in our practice we mimic the first part of progression, and then we learn to discover patterns and formulate theorems. The theoretical talks give us series of good examples.
 
So in our practice we mimic the first part of progression, and then we learn to discover patterns and formulate theorems. The theoretical talks give us series of good examples.
  
 +
==Lecture topics==
 +
# Fundamental theorem of linear algebra [https://www.engineering.iastate.edu/~julied/classes/CE570/Notes/strangpaper.pdf S]
 +
# Singular values decomposition and spectral theorem [https://en.wikipedia.org/wiki/Spectral_theorem W]
 +
# Gauss–Markov-(Aitken) theorem [https://en.wikipedia.org/wiki/Gauss–Markov_theorem W]
 +
# Principal component analysis [https://en.wikipedia.org/wiki/Principal_component_analysis W]
 +
# Karhunen–Loève theorem [https://en.wikipedia.org/wiki/Karhunen–Loève_theorem W]
 +
# Kolmogorov–Arnold representation theorem [https://en.wikipedia.org/wiki/Kolmogorov–Arnold_representation_theorem W]
 +
# Universal approximation theorem by Cybenko [https://en.wikipedia.org/wiki/Universal_approximation_theorem W]
 +
# Deep neural network theorem
 +
# No free lunch theorem by Wolpert [https://en.wikipedia.org/wiki/No_free_lunch_theorem W]
 +
# RKHS by Aronszajn and Mercer's theorem [https://en.wikipedia.org/wiki/Mercer%27s_theorem W]
 +
# Representer theorem by Schölkopf, Herbrich, and Smola [https://en.wikipedia.org/wiki/Representer_theorem W]
 +
# Convolution theorem (FT, convolution, correlation with CNN examples) [https://en.wikipedia.org/wiki/Convolution_theorem W]
 +
# Fourier inversion theorem [https://en.wikipedia.org/wiki/Fourier_inversion_theorem W]
 +
# Wiener–Khinchin theorem about autocorrelation and spectral decomposition [https://en.wikipedia.org/wiki/Wiener–Khinchin_theorem W]
 +
# Parseval's theorem (and uniform, non-uniform convergence) [https://en.wikipedia.org/wiki/Parseval%27s_theorem W]
 +
# Probably approximately correct learning with the theorem about compression means learnability
 +
# Bernstein–von Mises theorem [https://en.wikipedia.org/wiki/Bernstein–von_Mises_theorem W]
 +
# Holland's schema theorem [https://en.wikipedia.org/wiki/Holland%27s_schema_theorem W]
 +
# Variational approximation
 +
# Convergence of random variables and Kloek's theorem [https://en.wikipedia.org/wiki/Big_O_in_probability_notation W]
 +
# Exponential family of distributions and Nelder's theorem
 +
# Multi-armed bandit theorem
 +
# Copulas and Sklar's theorem [https://en.wikipedia.org/wiki/Copula_(probability_theory) W]
 +
# Boosting theorem Freud, Shapire, 1996, 1995
 +
# Bootstrap theorem (statistical estimations): Ergodic theorem
 +
# SVM theorem
  
 
===Each lesson contains===
 
===Each lesson contains===
Line 63: Line 90:
 
** theorem formulation and poof scheme are hand-written
 
** theorem formulation and poof scheme are hand-written
 
** two random theorems from the list below, 10 min to write the text
 
** two random theorems from the list below, 10 min to write the text
 
==Lecture topics==
 
# Fundamental theorem of linear algebra [https://www.engineering.iastate.edu/~julied/classes/CE570/Notes/strangpaper.pdf S]
 
# Singular values decomposition and spectral theorem [https://en.wikipedia.org/wiki/Spectral_theorem W]
 
# Gauss–Markov-(Aitken) theorem [https://en.wikipedia.org/wiki/Gauss–Markov_theorem W]
 
# Principal component analysis [https://en.wikipedia.org/wiki/Principal_component_analysis W]
 
# Karhunen–Loève theorem [https://en.wikipedia.org/wiki/Karhunen–Loève_theorem W]
 
# Kolmogorov–Arnold representation theorem [https://en.wikipedia.org/wiki/Kolmogorov–Arnold_representation_theorem W]
 
# Universal approximation theorem by Cybenko [https://en.wikipedia.org/wiki/Universal_approximation_theorem W]
 
# Deep neural network theorem
 
# No free lunch theorem by Wolpert [https://en.wikipedia.org/wiki/No_free_lunch_theorem W]
 
# RKHS by Aronszajn and Mercer's theorem [https://en.wikipedia.org/wiki/Mercer%27s_theorem W]
 
# Representer theorem by Schölkopf, Herbrich, and Smola [https://en.wikipedia.org/wiki/Representer_theorem W]
 
# Convolution theorem (FT, convolution, correlation with CNN examples) [https://en.wikipedia.org/wiki/Convolution_theorem W]
 
# Fourier inversion theorem [https://en.wikipedia.org/wiki/Fourier_inversion_theorem W]
 
# Wiener–Khinchin theorem about autocorrelation and spectral decomposition [https://en.wikipedia.org/wiki/Wiener–Khinchin_theorem W]
 
# Parseval's theorem (and uniform, non-uniform convergence) [https://en.wikipedia.org/wiki/Parseval%27s_theorem W]
 
# Probably approximately correct learning with the theorem about compression means learnability
 
# Bernstein–von Mises theorem [https://en.wikipedia.org/wiki/Bernstein–von_Mises_theorem W]
 
# Holland's schema theorem [https://en.wikipedia.org/wiki/Holland%27s_schema_theorem W]
 
# Variational approximation
 
# Convergence of random variables and Kloek's theorem [https://en.wikipedia.org/wiki/Big_O_in_probability_notation W]
 
# Exponential family of distributions and Nelder's theorem
 
# Multi-armed bandit theorem
 
# Copulas and Sklar's theorem [https://en.wikipedia.org/wiki/Copula_(probability_theory) W]
 
# Boosting theorem Freud, Shapire, 1996, 1995
 
# Bootstrap theorem (statistical estimations): Ergodic theorem
 
# SVM theorem
 
  
 
==Theorem types==
 
==Theorem types==

Revision as of 19:53, 30 June 2021

Fundamental theorems of Machine learning (with proofs) BS-4 MIPT course

Motivation and syllabus

The goal of the course is to boost the quality of students bachelor and master thesis works; to make results of student scientific research well-founded. The course studies techniques and practice of theorem formulations and proofs in the field of machine learning.

Why one needs to convey an important message, a scientific result as a theorem?

  1. Theorem is a most important message in the field of research.
  2. Theorem present results in the language of mathematics by virtue of generality and rigor.
  3. Theorems are at the heart of mathematics, they also play a central role in its aesthetics.

Theorems present the message immediately and left reasoning after. The direct narration put the reasoning first and left the results after.

  • How direct narration transform to fast narration?
  • How to find, state and prove theorems in our work?

This course show both narration styles. It refers to our educational study and our work experience:

  1. Educational mimic progression
    • Definition \(\to\) (Axiom set) \(\to\) Theorem \(\to\) Proof \(\to\) Corollaries \(\to\) Examples \(\to\) Impact to applications
  2. Scientific discovery progression
    • Application problems \(\to\) Problem generalisations \(\to\) Useful algebraic platform \(\to\) Definitions \(\to\) Axiom set

So in our practice we mimic the first part of progression, and then we learn to discover patterns and formulate theorems. The theoretical talks give us series of good examples.

Lecture topics

  1. Fundamental theorem of linear algebra S
  2. Singular values decomposition and spectral theorem W
  3. Gauss–Markov-(Aitken) theorem W
  4. Principal component analysis W
  5. Karhunen–Loève theorem W
  6. Kolmogorov–Arnold representation theorem W
  7. Universal approximation theorem by Cybenko W
  8. Deep neural network theorem
  9. No free lunch theorem by Wolpert W
  10. RKHS by Aronszajn and Mercer's theorem W
  11. Representer theorem by Schölkopf, Herbrich, and Smola W
  12. Convolution theorem (FT, convolution, correlation with CNN examples) W
  13. Fourier inversion theorem W
  14. Wiener–Khinchin theorem about autocorrelation and spectral decomposition W
  15. Parseval's theorem (and uniform, non-uniform convergence) W
  16. Probably approximately correct learning with the theorem about compression means learnability
  17. Bernstein–von Mises theorem W
  18. Holland's schema theorem W
  19. Variational approximation
  20. Convergence of random variables and Kloek's theorem W
  21. Exponential family of distributions and Nelder's theorem
  22. Multi-armed bandit theorem
  23. Copulas and Sklar's theorem W
  24. Boosting theorem Freud, Shapire, 1996, 1995
  25. Bootstrap theorem (statistical estimations): Ergodic theorem
  26. SVM theorem

Each lesson contains

  1. A lecturer's talk on one of fundamental theorems (\(40' = 30' + 10'\) discussion)
  2. Two students' talks (each \(20' = 15' + 5'\) discussion)

Each student delivers two talks

  1. On a theorem, which is formulated in a paper from the list of student thesis work's references
  2. On a theorem, which is formulated and proved by the student

It is welcome to

  • Make variants of our own formulations and proofs
  • Re-formulate significant messages of researchers and formulate these messages as theorems

Plan of a talk

  1. Introduction: the main message briefly
  2. If necessary (it could be introduced during the talk)
    1. Axiom sets
    2. Definitions
    3. Algebraic structures
    4. Notations
  3. Theorem formulation and exact proof
    1. The author's variant of the proof could be ameliorated
  4. Corollaries
  5. Theorem significance and applications

Typography

  • As one (or two) text page example, template to download
  • Please
    • set the font size \(\geqslant 14\)pt
    • include plots, diagrams, freehand drawings

The organization

Scoring

  • Talks and text 0-4 points, according to comparison
  • Out-of-schedule drops a half
  • The exam 2 points: schemes of proof of various theorems
    • time-limit test (as Physics state exam) and discussion
    • theorem formulation and poof scheme are hand-written
    • two random theorems from the list below, 10 min to write the text

Theorem types

  • Uniqueness, existence
  • Universality
  • Convergence
  • Complexity
  • Properties of estimations
  • Bounds

Schedule

Spring semester 2021

Student talks

Speaker References Thesis work
Bishuk Anton 17.2 link 31.3 link
Weiser Kirill 17.2 link 31.3 link
Grebenkova Olga 24.2 link 7.4 link
Gunaev Ruslan 24.2 link 7.4 link
Zholobov Vladimir 3.3 link 14.4 link
Islamov Rustem 3.3 link 14.4 link
Pankratov Victor 10.3 link 21.4 link
Savelyev Nikolay 10.3 link 21.4 link
Filatov Andrey 10.3 link 21.4 link
Filippova Anastasia 17.3 link 28.4 link
Khar Alexandra 17.3 link 28.4 link
Khristolyubov Maxim 24.3 link 5.5 link
Shokorov Vyacheslav 24.3 link 5.5 link

Invited talks

Speaker Link
Strijov, Potanin 10.2 link
Mark Potanin 17.2 link
Mark Potanin 24.2 link
Andriy Grabovyi 10.3 link
Andriy Grabovyi 17.3 link
Andriy Grabovyi 24.3 link
Mark Potanin 31.3 link
Oleg Bakhteev 7.4 link
Philipp Nikitin 14.4 link
Andriy Grabovyi 21.4 link
Oleg Bakhteev 5.5 link
Potanin, Strijov 12.5 Discussion
Potanin, Aduenko, Bakhteev 26.5 Exam

Out of schedule

  1. Three works by Greg Yang arXiv:1910.12478, arXiv:2006.14548, arXiv:2009.10685 Youtube Rus
  2. Theorems on flows by Johann Brehmera and Kyle Cranmera arXiv:2003.13913v2

References

  1. Mathematical statistics by A.A. Borovkov, 1998
  2. Learning Theory from First Principles by Francis Bach, 2021
  3. Theoretical foundations of potential function method in pattern recognition by M. A. Aizerman, E. M. Braverman, and L. I. Rozonoer // Avtomatica i Telemekhanica, 1964. Vol. 25, pp. 917-936.

Proof techniques

  1. Proofs and Mathematical Reasoning by Agata Stefanowicz, 2014
  2. The nuts and bolts of proofs by Antonella Cupillari, 2013
  3. Theorems, Corollaries, Lemmas, and Methods of Proof by Richard J. Rossi, 1956
  4. Problem Books in Mathematics by P.R. Halmos (editor), 1990
  5. Les contre-exemples en mathématique par Bertrand Hauchecorne, 2007
  6. Kolmogorov and Mathematical Logic by Vladimir A. Uspensky // The Journal of Symbolic Logic, Vol. 57, No. 2 (Jun., 1992), 385-412.
  7. Что такое аксиоматический метод? В.А. Успенский, 2001
  8. Аксиоматический метод. Е.Е. Золин, 2015

Methodology

  1. Introduction to Metamathematics by Stephen Cole Kleene, 1950
  2. Science and Method by Henry Poincare, 1908
  3. A Summary of Scientific Method by Peter Kosso, 2011
  4. Being a Researcher: An Informatics Perspective by Carlo Ghezzi, 2020
  5. The definitive glossary of higher mathematical jargon by Math Vault, 2015
  6. The definitive guide to learning higher mathematics: 10 principles to mathematical transcendence by Math Vault, 2020
  7. List of mathematical jargon on Wikipedia
  8. Пикабу. Типичные методы доказательства, 2018 (если вы чувствуете, что несет не туда)