Difference between revisions of "Mathematical forecasting"

From Research management course
 
(11 intermediate revisions by the same user not shown)
Line 5: Line 5:
 
  |description=This course delivers model selection methods in machine learning and forecasting. The models are linear, tensor, deep neural networks, and neural differential equations.
 
 
  }}
 
  }}
==Collecting links to Fall 2024==
+
The additional part moved to [[Functional Data Analysis]]  
===General===
 
# Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems [https://arxiv.org/abs/2307.08423 arxiv 2023]
 
# Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning [https://www.cis.upenn.edu/~jean/math-deep.pdf upenn 2024]
 
# The Elements of Differentiable Programming [https://arxiv.org/abs/2403.14606 arxiv 2024]
 
 
 
===Prerequisites===
 
# Understanding Deep Learning ''by Simon J.D. Prince'' [https://udlbook.github.io/udlbook/ mit 2023]
 
# Deep Learning ''by C.M. Bishop and H. Bishop'' [https://www.bishopbook.com/ Springer 2024] (online version)
 
# A Geometric Approach to Differential Forms ''by David Bachman'' [https://arxiv.org/abs/math/0306194v1 arxiv 2013]
 
 
 
 
===1. Linear models===
 
# A Tutorial on Independent Component Analysis [https://arxiv.org/abs/1404.2986 arxiv, 2014]
 
# On the Stability of Multilinear Dynamical Systems [https://arxiv.org/abs/2105.01041 arxiv 2022]
 
# Tensor-based Regression Models and Applications ''by Ming Hou'' Thèse [https://core.ac.uk/download/pdf/442636056.pdf Uni-Laval 2017]
 
 
 
====Sp====
 
# Spherical Harmonics in p Dimensions [https://arxiv.org/abs/1205.3548 arxiv 2012]
 
# Physics of simple pendulum a case study of nonlinear dynamics [https://www.researchgate.net/publication/332766499_Physics_of_simple_pendulum_a_case_study_of_nonlinear_dynamics RG 2008]
 
 
 
====SSM====
 
# Missing Slice Recovery for Tensors Using a Low-rank Model in Embedded Space [https://arxiv.org/abs/1804.01736 arxiv 2018]
 
 
 
====SSM+generative====
 
# ('''FLOW tex source''') Masked Autoregressive Flow for Density Estimation [https://arxiv.org/abs/1705.07057 arxiv 2017]
 
 
 
===PINN===
 
# Three ways to solve partial differential equations with neural networks — A review [https://arxiv.org/abs/2102.11802 arxiv 2021]
 
# NeuPDE: Neural Network Based Ordinary and Partial Differential Equations for Modeling Time-Dependent Data [https://arxiv.org/abs/1908.03190 arxiv 2019]
 
# Physics-based deep learning [https://www.physicsbaseddeeplearning.org/intro-teaser.html code]
 
# PINN by Steve Brunton [https://www.youtube.com/watch?v=g-S0m2zcKUg&list=PLMrJAkhIeNNQ0BaKuBKY43k4xMo6NSbBa&index=3 yt]
 
 
 
===2. Riemannian models===
 
<!--No need to put CCM in this semester -->
 
 
 
 
 
==== SSA ====
 
#
 
 
 
 
 
 
 
==== Generative ====
 
# Riemannian Continuous Normalizing Flows [https://arxiv.org/abs/2006.10605 arxiv  2020]
 
 
 
===3. Neural ODE===
 
# Neural Spatio-Temporal Point Processes ''by Ricky Chen et al.'' [https://arxiv.org/abs/2011.04583 iclr 2021] (likelihood for time and space)
 
# Neural Ordinary Differential Equations ''by Ricky Chen et al.'' [https://arxiv.org/abs/1806.07366 arxiv 2018]
 
# Neural Controlled Differential Equations for Irregular Time Series ''by Patrick Kidger et al.'' [https://arxiv.org/abs/2005.08926 arxiv 2020] [https://github.com/patrick-kidger/NeuralCDE github]
 
# Diffusion Normalizing Flow [https://arxiv.org/pdf/2110.07579 arxiv 2021]
 
# Differentiable Programming for Differential Equations: A Review [https://arxiv.org/abs/2406.09699 arxiv 2024]
 
# (code tutorial) Deep Implicit Layers - Neural ODEs, Deep Equilibrium Models, and Beyond [https://implicit-layers-tutorial.org/ nips 2020]
 
# (code tutorial) [https://www.physicsbaseddeeplearning.org/overview-ns-forw.html  2021]
 
 
 
====CDE====
 
Neural CDE and tensors
 
https://ieeexplore.ieee.org/abstract/document/9979806
 
https://ieeexplore.ieee.org/abstract/document/9533771
 
 
 
=== 4. Graph and PDEs ===
 
# Fourier Neural Operator for Parametric Partial Differential Equations [https://arxiv.org/abs/2010.08895 arxiv 2020]
 
 
 
==Supplementary==
 
# Masked Attention is All You Need for Graphs [https://arxiv.org/abs/2402.10793 arxiv 2024]
 
 
 
===4. Neural SDE===
 
# Approximation of Stochastic Quasi-Periodic Responses of Limit Cycles in Non-Equilibrium Systems under Periodic Excitations and Weak Fluctuations [https://doi.org/10.3390/e19060280 mdpi entropy 2017] (great illustrations on the stochastic nature of a simple phase trajectory)
 
 
# Neural SDEs for Conditional Time Series Generation [https://arxiv.org/abs/2301.01315 arxiv 2023] code [https://github.com/pere98diaz/Neural-SDEs-for-Conditional-Time-Series-Generation-and-the-Signature-Wasserstein-1-metric github LSTM - CSig-WGAN]
 
# Neural SDEs as Infinite-Dimensional GANs [https://arxiv.org/pdf/2102.03657 2021]
 
# Efficient and Accurate Gradients for Neural SDEs ''by Patrick Kidger'' [https://arxiv.org/pdf/2105.13493 arxiv 2021] code [https://docs.kidger.site/diffrax/examples/neural_sde/ diffrax]
 
 
 
===5. PINN and Neural PDE===
 
# Process Model Inversion in the Data-Driven Engineering Context for Improved Parameter Sensitivities [https://www.mdpi.com/2227-9717/10/9/1764 mdpi processes 2022] ('''nice connection pictures''')
 
 
 
===6. Chains and homology===
 
# Operator Learning: Algorithms and Analysis [https://arxiv.org/pdf/2402.15715 arxiv 2024]
 
# Homotopy theory for beginners by J.M. Moeller [https://web.math.ku.dk/~moller/e01/algtopI/comments.pdf ku.dk 2015] (is it a pertinent link?)
 
 
 
====To research====
 
# Explorations in Homeomorphic Variational Auto-Encoding [https://arxiv.org/abs/1807.04689 arxiv 2018]
 
# Special Finite Elements for Dipole Modelling ''master thesis Bauer'' [https://www.sci.utah.edu/~wolters/PaperWolters/2012/BauerMaster.pdf 2011]
 
 
 
===Appendix===
 
# Neural Memory Networks [https://cs229.stanford.edu/proj2015/367_report.pdf stanford reports 2019]
 
# An Elementary Introduction to Information Geometry ''by Frank Nielsen'' [https://doi.org/10.3390/e22101100 mdpi entropy]
 
# The Many Faces of Information Geometry ''by Frank Nielsen'' [https://www.ams.org/journals/notices/202201/rnoti-p36.pdf ams 2022] (short version)
 
# Clifford Algebras and Dimensionality Reduction for Signal Separation ''by [https://www.math.uni-hamburg.de/home/guillemard/ M. Guillemard]''  [https://www.math.uni-hamburg.de/home/guillemard/papers/clifford7.pdf Uni-Hamburg 2010][https://www.math.uni-hamburg.de/home/guillemard/clifford/ code]
 
# Special Finite Elements for Dipole Modelling ''by Martin Bauer'' Master Thesis [https://www.sci.utah.edu/~wolters/PaperWolters/2012/BauerMaster.pdf Erlangen 2012] diff p-form must read
 
# Bayesian model selection for complex dynamic systems [https://www.nature.com/articles/s41467-018-04241-5 2018]
 
# Visualizing 3-Dimensional Manifolds ''by  Dugan J. Hammock'' [https://archive.bridgesmathart.org/2013/bridges2013-551.pdf 2013 umass]
 
# At the Interface of Algebra and Statistics by ''T-D. Bradley'' [https://arxiv.org/abs/2004.05631 arxiv 2020]
 
# Time Series Handbook by Borja, 2021 [https://github.com/phdinds-aim/time_series_handbook github]
 
  
 
==Motivation==
 
Line 104: Line 12:
 
The course joins two parts of the problem statements in Machine Learning. The first part comes from the structure of the measured data. The data come from Physics, Chemistry, and Biology and have intrinsic algebraic structures. These structures are parts of the theory that stands behind the measurement. The second part comes from errors in the measurement. The stochastic nature of errors requires statistical methods of analysis. So this course joins algebra and statistics. It is devoted to the problem of predictive model selection.
 
  
Mathematical forecasting methods play a crucial role in scientific research and industry. The distinction between forecasting and machine learning methods lies in the algebraic structures. We build forecasting models not only in vector spaces but also in vector fields. These fields include time and space and have a continuous nature. We propose a holistic approach to teaching this course: we must consider mathematical methods that combine continuous-time high-dimensional vector and tensor fields. We discuss linear, differential, and non-linear models. We introduce model ensembles to reveal both the source and the target space dependencies.
+
Mathematical forecasting methods play a crucial role in scientific research and industry. The distinction between forecasting and machine learning methods lies in the algebraic structures. We build forecasting models not only in vector spaces but also in vector fields. These fields include time and space and have a continuous nature. We propose a holistic approach to teaching this course: we must consider mathematical methods that combine continuous-time high-dimensional vector and tensor fields. We discuss linear, differential, and non-linear models. We introduce model ensembles to reveal the source and target space dependencies.
 +
 
 +
<!--
 +
My specialization is Mathematical Forecasting. I present a new view of this field of knowledge: forecasting problems deal not with vector spaces but with vector fields. The main subject is vector and tensor fields over time and space. The modeling data are spatial-time series: audio-video streams, brain signals and images, biomedical live measurements, wearable device sensor signals, and other signals in biology and physics. The practical applications for study and lab work are brain-computer interfaces, human motion, and human health monitoring. The course is organized into eight sections: autoregressive models, tensor decomposition, canonical correlation analysis, continuous-time analysis, dynamic systems, spatial-time alignment, metric learning, and diffusion-graphical models. Each section runs lab work with various practical applications in Python.
 +
-->
  
 
== Lectures ==  
 
Line 178: Line 90:
 
'''[https://bit.ly/3QAOYPd Current lab works, October 2022, are here]'''

Lab work contains a report in the ipynb or TeX format and a talk with a discussion
#Title and motivated abstract
+
# Title and motivated abstract
#Problem statement
+
# Problem statement
#Model, problem solution
+
# Model, problem solution
#Code, analysis, and illustrative plots
+
# Code, analysis, and illustrative plots
 
#References
 
Note: '''the model''' is the '''personal''' contribution. The infrastructure: data acquisition, data uploads, error functions, and plots are welcome to be created '''collectively''' and shared.
+
Note: '''the model''' is the '''personal''' contribution. The infrastructure: data acquisition, data uploads, error functions, and plots are welcome to be created '''collectively''' and shared.
 
===Topics of the lab works (Fall)===
 
 
*Autoregressive forecasting – Singular structure Analysis
 

Latest revision as of 23:14, 7 October 2024

The additional part moved to Functional Data Analysis

Motivation

This course delivers methods of model selection in machine learning and forecasting. The models are linear, tensor, deep neural networks, and neural differential equations. The modeling data are videos, audios, encephalograms, fMRIs, and other measurements in natural science. The practical examples are brain-computer interfaces, weather forecasting, and various spatial-time series forecasting. The lab works are organized as paper-with-code reports.

The course joins two parts of the problem statements in Machine Learning. The first part comes from the structure of the measured data. The data come from Physics, Chemistry, and Biology and have intrinsic algebraic structures. These structures are parts of the theory that stands behind the measurement. The second part comes from errors in the measurement. The stochastic nature of errors requires statistical methods of analysis. So this course joins algebra and statistics. It is devoted to the problem of predictive model selection.

Mathematical forecasting methods play a crucial role in scientific research and industry. The distinction between forecasting and machine learning methods lies in the algebraic structures. We build forecasting models not only in vector spaces but also in vector fields. These fields include time and space and have a continuous nature. We propose a holistic approach to teaching this course: we must consider mathematical methods that combine continuous-time high-dimensional vector and tensor fields. We discuss linear, differential, and non-linear models. We introduce model ensembles to reveal the source and target space dependencies.


Lectures

Main topics

  1. Autoregression and singular structure analysis
  2. Tensor decomposition and spatial-time models
  3. Signal decoding and multi-modeling
  4. Space alignment
  5. Convergent cross-mapping and dynamic systems
  6. Continuous-time forecasting and Neural ODEs

Fall semester

  1. Introduction
    • Semester overview, motivation, homework labs, exams
    • Time and space in forecasting application problems
    • Linear, neural, and memory forecasting models
  2. Phase space approximation
    • Singular spectrum analysis and forecasting
    • k-linear forms, Principal component analysis
    • Singular value decomposition
  3. Basic models
    • Cross-correlation
    • Stochastic processes, autoregression, GARCH
    • Non-parametric regression and kernels
    • Error functions, residue convolution model, and analysis
  4. Fourier transform
    • Discrete transforms, wavelet transform
    • Gabor transform and spectrogram
    • 2d transform, Gerchberg–Saxton algorithm
  5. Higher-order linear models
    • Tensors and Penrose notation
    • Tucker decomposition and alternating least squares
    • Higher-order singular value decomposition
  6. Neural models
    • Convolutions for time and space
    • Recursive, Hopfield, and memory models
    • Sequential models with attention
  7. Canonical correlation analysis
    • Projection to latent space
    • PLS as SVD, model optimization, and selection
    • Higher-order PLS
  8. Time and space alignment
    • Dynamic time warping
    • Dynamic barycenter averaging
    • Self-modeling regression
  9. Causality detection
    • Granger test
    • Convergent cross-mapping
    • Dynamic system and Takens' theorem
  10. Differential models
    • Residual neural networks
    • Neuro-ODE and its solution
    • Splines, Controlled neuro-ODE
  11. State-space representation
    • Linear differential models
    • Partial differential models
    • Memory models
  12. Forecasting and control
    • Control models
    • Controllability and feedback
    • Proportional integral derivative controller

Lab works

Current lab works, October 2022, are here

Lab work contains a report in the ipynb or TeX format and a talk with a discussion

  1. Title and motivated abstract
  2. Problem statement
  3. Model, problem solution
  4. Code, analysis, and illustrative plots
  5. References

Note: the model is the personal contribution. The infrastructure: data acquisition, data uploads, error functions, and plots are welcome to be created collectively and shared.

Topics of the lab works (Fall)

  • Autoregressive forecasting – Singular structure Analysis
  • Spatial-time forecasting – Tensor decomposition
  • Signal decoding – Projection to latent space
  • Continuous-time forecasting – Neural differential equations

Example of the lab report

  • Put here

Format of lab works

  1. Create a .ipynb or .py file Surname2022Lab in the folder.
  2. The report could also be in a .tex file.
  3. Find the format of your report above.
  4. The computational experiment contains a common part and an individual part.
  5. Common part:
    1. use four short sample sets: [airplane], [electricity], [accelerometer hand motion], [video hand motion],
    2. prepare the design matrix and a scalar/vector target for each time sample (in the form time, vecx, vecy),
    3. set the forecast horizon, plot the forecast, and estimate the error.
  6. Individual part:
    1. select a lab work and specify your model (you can adopt any code available),
    2. tune parameters, make your forecast according to the horizon,
    3. write the report.
  7. Error analysis is a part of the report:
    1. plot of the forecast,
    2. MAPE error (and your optimization error, if available) and its standard deviation,
    3. prove your model has the optimal structure: try various structure parameters.
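
The common part above (design matrix, vector target, forecast horizon, MAPE with its standard deviation) can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the synthetic series, the lag count `n_lags`, and the helper names `make_design`, `fit_linear`, `forecast`, and `mape` are assumptions introduced here, and a plain least-squares model stands in for whichever model the lab specifies.

```python
import numpy as np


def make_design(series, n_lags, horizon):
    """Build the design matrix X (lagged values) and the target matrix Y (next `horizon` values)."""
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - n_lags - horizon + 1
    X = np.stack([series[i:i + n_lags] for i in range(n_rows)])
    Y = np.stack([series[i + n_lags:i + n_lags + horizon] for i in range(n_rows)])
    return X, Y


def fit_linear(X, Y):
    """Least-squares fit of a linear model with an intercept column."""
    X1 = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W


def forecast(W, last_lags):
    """Predict the whole horizon at once from the most recent lags."""
    x1 = np.append(np.asarray(last_lags, dtype=float), 1.0)
    return x1 @ W


def mape(y_true, y_pred):
    """Mean and standard deviation of absolute percentage errors, in percent.

    Undefined when y_true contains zeros.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ape = 100.0 * np.abs((y_true - y_pred) / y_true)
    return ape.mean(), ape.std()


# Synthetic noisy periodic series standing in for a real sample set (illustrative only).
rng = np.random.default_rng(0)
t = np.arange(240)
series = 10.0 + np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(t.size)

horizon, n_lags = 12, 24          # the horizon equals the assumed fundamental period
history, future = series[:-horizon], series[-horizon:]

X, Y = make_design(history, n_lags, horizon)   # only historical data are used for fitting
W = fit_linear(X, Y)
y_hat = forecast(W, history[-n_lags:])

mape_mean, mape_std = mape(future, y_hat)
print(f"MAPE = {mape_mean:.2f}% +/- {mape_std:.2f}%")
```

Repeating the last five lines for several values of `n_lags` gives a simple way to compare structure parameters, as the error-analysis item asks.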

Details:

  1. time refers to each sample (in unix or any useful format),
  2. the horizon is an expected fundamental period,
  3. note that the historical time ends before the forecasting period begins; this means the forecast can use either the historical data or previously forecasted values (the historical data are not updated after the history ends),
  4. the forecasting protocol is in the paper, text, and slides by Nikita Uvarov.
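
Item 3 above can be illustrated with a recursive forecast, where each predicted value is fed back as an input because no new historical data arrive after the history ends. The sketch below is only an assumption-laden illustration and is not taken from the referenced protocol: the function name `recursive_forecast` and the fixed autoregressive coefficients are hypothetical.

```python
def recursive_forecast(history, coeffs, horizon):
    """Iterate a one-step autoregressive model over the horizon.

    coeffs[k] multiplies the value k + 1 steps back in time.
    """
    window = list(history[-len(coeffs):])
    predictions = []
    for _ in range(horizon):
        # Most recent value first, to match the order of the coefficients.
        y_next = sum(c * v for c, v in zip(coeffs, reversed(window)))
        predictions.append(y_next)
        # The history is frozen, so the forecast itself extends the window.
        window = window[1:] + [y_next]
    return predictions


history = [1.0, 2.0, 3.0, 4.0]
coeffs = [2.0, -1.0]  # y_t = 2 * y_{t-1} - y_{t-2}, which continues a linear trend
print(recursive_forecast(history, coeffs, horizon=3))  # prints [5.0, 6.0, 7.0]
```

The alternative mentioned in the same item, using only historical data, corresponds to predicting the whole horizon in one step from the last observed lags.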

Examples:

  1. Old format of the report
  2. Code and project
  3. Previous project from Sourceforge.net

Discussion and collaboration

Exam and grading

Four lab works within deadlines and the exam on topics with problems and discussion. Each lab gives 2pt, and the exam gives 2pt, so 2*4+2=10.

Terminology and notation

  1. Feature selection in Katrutsa A.M., Strijov V.V. 2017. A comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria // Expert Systems with Applications DOI
  2. Tensor decomposition in Motrenko A.P., Strijov V.V. 2018. Multi-way feature selection for ECoG-based brain-computer interface // Expert Systems with Applications DOI
  3. Signal decoding in Isachenko R.V., Strijov V.V. 2022. Quadratic programming feature selection for multicorrelated signal decoding with partial least squares // Expert Systems with Applications DOI
  4. Forecasting schedule and horizon in Uvarov N.D. et al. 2018. Selecting the Superpositioning of Models // Computational Mathematics and Cybernetics DOI

Topics

Fall

  • Energy forecasting example
  • Regression
  • Linear model
  • Model selection call
  • Forecasting protocol
  • Error functions
  • Singular spectrum analysis
  • SSA forecasting
  • Forecasting protocols and verification (before AR)
  • Autoregression
  • Singular value decomposition (PCA, AE, Kar-Lo)
  • QPFS model selection
  • Auto, cross-correlation, cointegration
  • Diagrams for ML and PLS
  • Projection to latent space and relation to PCA, canonical-correlation analysis
  • PLS-QPFS model selection
  • Higher-order SSA
  • Tensor decomposition
  • Tensor model selection
  • HOPLS
  • Granger causality test
  • Convergent cross mapping
  • HOCCM to invent
  • Takens' theorem
  • ResNet, Neural ODE
  • Adjoint and back-propagation
  • Flows and forecasting

Spring

  • State space models
  • S4, HiPPO, SaShiMi models
  • RNN, LSTM, attention, transformer models
  • Neural PDE, Lagrangian, Hamiltonian NNs
  • Directional regression
  • Harmonic functions
  • Phase extraction
  • Non-parametric regression and customer demand forecasting
  • Graph earth prediction
  • Convolutional models
  • Graph convolutions and spectrum
  • Fourier transform and phase retrieval problem
  • Radon transform and tomography reconstruction
  • Forward and inverse problems, kernel regularisation
  • Karhunen–Loeve theorem, FPCA
  • Parametric and non-parametric models
  • Reproducing kernel Hilbert space
  • Integral operators and Mercer's theorem
  • Convolution theorem
  • Graph convolution
  • Manifolds and local models
  • Statistics on Riemannian spaces
  • Statistics on stratified spaces

Appendix to Spring

  • Probabilistic diffusion and Graphs
  • Graph convolution, graph representation
  • Neural diffusion and PDEs, GRAND
  • Tensors and Ricci flow, and PDE
  • Riemannian, Ricci tensors
  • Differential forms
  • Metrics learning and SDP
  • Takens' ODE

Appendix-2 to Spring

  • ResNet, LSTM, etc
  • Neuro ODE, RK4
  • Controlled ODE, Visualization
  • BackProp
  • S4, memories
  • Graph convolution
  • Graph Laplacian
  • Differentiation of Graph Laplacian
  • Riemannian?? Py
  • GRAND
  • Neuro PDE, Galerkin
  • The inverse problem of brain signals
  • Laplacian, Hamiltonian NNs?

  • Does the dot product create a metric space? Only when the dot product is considered bilinear with at least the trivial metric tensor
  • Singular Spectrum, Phase Space Bing HAVOK!!!!
  • Linear models, SVD
  • Convolution / ARIMA - lags
  • State Space (+ Kalman, etc) + Control Theory
  • Tensor Decomposition (Tensor ARIMA with Tucker Decomposition)
  • Least Squares, Alternating Least Squares
  • Tensor convolution Generation and Decomposition
  • Feature Selection (lasso-Lars-style, QPFS, Genetic, Tensor Genetic)


Geometric algebra, differential geometry, fields, sheaves, etc…


Catch-up references

  1. Kolmogorov, A.N and Fomin, S.V.: Elements of the Theory of Functions and Functional Analysis, Dover Publications, 1999.
  2. David Bachman: A Geometric Approach to Differential Forms, Birkhäuser Boston, 2006.
  3. At the Interface of Algebra and Statistics by Tai-Danae Bradley, 2020
  4. Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators by Tailen Hsing, Randall Eubank, 2013