Difference between revisions of "Mathematical forecasting"

Revision as of 00:23, 24 June 2024

Collecting links to Fall 2024

General

Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems arxiv 2023
Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning upenn 2024
The Elements of Differentiable Programming arxiv 2024

Prerequisites

Understanding Deep Learning by Simon J.D. Prince mit 2023
Deep Learning by C.M. Bishop and H. Bishop Springer 2024 (online version)
A Geometric Approach to Differential Forms by David Bachman arxiv 2013

1. Linear models

A Tutorial on Independent Component Analysis arxiv, 2014
On the Stability of Multilinear Dynamical Systems arxiv 2022
Tensor-based Regression Models and Applications by Ming Hou Thèse Uni-Laval 2017

arXiv:1205.3548v1

Spherical Harmonics in p Dimensions arxiv 2012
Physics of simple pendulum a case study of nonlinear dynamics RG 2008

SSM

2. Riemannian models

SSA

3. Neural ODE

Neural Ordinary Differential Equations by Ricky T. Q. Chen et al. arxiv 2018
Neural Controlled Differential Equations for Irregular Time Series by Patrick Kidger et al. arxiv 2020 github
Diffusion Normalizing Flow arxiv 2021

4. Neural SDE

Approximation of Stochastic Quasi-Periodic Responses of Limit Cycles in Non-Equilibrium Systems under Periodic Excitations and Weak Fluctuations mdpi entropy 2017 (great illustrations on the stochastic nature of a simple phase trajectory)
Neural SDEs for Conditional Time Series Generation arxiv 2023 code github LSTM - CSig-WGAN
Neural SDEs as Infinite-Dimensional GANs 2021
Efficient and Accurate Gradients for Neural SDEs by Patrick Kidger arxiv 2021 code diffrax

5. PINN and Neural PDE

6. Chains and homology

Operator Learning: Algorithms and Analysis arxiv 2024
Homotopy theory for beginners by J.M. Moeller ku.dk 2015 (is it a pertinent link?)

Appendix

Neural Memory Networks stanford reports 2019
An Elementary Introduction to Information Geometry by Frank Nielsen mdpi entropy
The Many Faces of Information Geometry by Frank Nielsen ams 2022 (short version)
Clifford Algebras and Dimensionality Reduction for Signal Separation by M. Guillemard Uni-Hamburg 2010 code
Special Finite Elements for Dipole Modelling by Martin Bauer Master Thesis Erlangen 2012 (differential p-forms, must read)

Motivation

This course covers methods of model selection in machine learning and forecasting. The models are linear models, tensor models, deep neural networks, and neural differential equations. The data are video, audio, encephalograms, fMRI, and other measurements in natural science. The practical examples are brain-computer interfaces, weather forecasting, and various spatial-time series forecasting problems. The lab works are organized as paper-with-code reports.

The course joins two sides of problem statements in machine learning. The first side comes from the structure of the measured data. The data come from physics, chemistry, and biology and have intrinsic algebraic structures. These structures are part of the theory that stands behind the measurement. The second side comes from errors in the measurement. The stochastic nature of the errors requires statistical methods of analysis. So this course joins algebra and statistics. It is devoted to the problem of predictive model selection.

Mathematical forecasting methods play a crucial role in scientific research and industry. The distinction between forecasting and machine learning methods lies in the algebraic structures. We build forecasting models not only in vector spaces but also in vector fields. These fields include time and space and are continuous in nature. We take a holistic approach to teaching this course: we consider mathematical methods that combine continuous time with high-dimensional vector and tensor fields. We discuss linear, differential, and non-linear models. We introduce model ensembles to reveal dependencies in both the source and the target spaces.

Lectures

Main topics

Autoregression and singular structure analysis
Tensor decomposition and spatial-time models
Signal decoding and multi-modeling
Space alignment
Convergent cross-mapping and dynamic systems
Continuous-time forecasting and Neural ODEs

Fall semester

Introduction
- Semester overview, motivation, homework labs, exams
- Time and space in forecasting application problems
- Linear, neural, and memory forecasting models
Phase space approximation
- Singular spectrum analysis and forecasting
- k-linear forms, Principal component analysis
- Singular value decomposition
Basic models
- Cross-correlation
- Stochastic processes, autoregression, GARCH
- Non-parametric regression and kernels
- Error functions, residue convolution model, and analysis
Fourier transform
- Discrete transforms, wavelet transform
- Gabor transform and spectrogram
- 2d transform, Gerchberg–Saxton algorithm
Higher-order linear models
- Tensors and Penrose notation
- Tucker decomposition and alternating least squares
- Higher-order singular value decomposition
Neural models
- Convolutions for time and space
- Recurrent, Hopfield, and memory models
- Sequential models with attention
Canonical correlation analysis
- Projection to latent space
- PLS as SVD, model optimization, and selection
- Higher-order PLS
Time and space alignment
- Dynamic time warping
- Dynamic barycenter averaging
- Self-modeling regression
Causality detection
- Granger test
- Convergent cross-mapping
- Dynamic systems and Takens' theorem
Differential models
- Residual neural networks
- Neuro-ODE and its solution
- Splines, Controlled neuro-ODE
State-space representation
- Linear differential models
- Partial differential models
- Memory models
Forecasting and control
- Control models
- Controllability and feedback
- Proportional integral derivative controller

Lab works

Current lab works (October 2022) are here. A lab work consists of a report in .ipynb or TeX format and a talk with a discussion. The report contains:

Title and motivated abstract
Problem statement
Model, problem solution
Code, analysis, and illustrative plots
References

Note: the model is the personal contribution. The infrastructure: data acquisition, data uploads, error functions, and plots are welcome to be created collectively and shared.

Topics of the lab works (Fall)

Autoregressive forecasting – Singular structure Analysis
Spatial-time forecasting – Tensor decomposition
Signal decoding – Projection to latent space
Continuous-time forecasting – Neural differential equations
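
As an illustration of the first topic, the sketch below shows the decomposition step of singular spectrum analysis: embed the series into a trajectory (Hankel) matrix, truncate its SVD, and average over anti-diagonals to return to a series. It is a minimal sketch, not the course's reference implementation. The function name `ssa_reconstruct`, the window length, the rank, and the synthetic noisy sinusoid are all assumptions made for the example.

```python
import numpy as np

def ssa_reconstruct(series, window, rank):
    """Rank-limited SSA reconstruction of a 1-D series (illustrative helper)."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    k = n - window + 1
    # Trajectory (Hankel) matrix: traj[i, j] = series[i + j], shape (window, k).
    traj = np.lib.stride_tricks.sliding_window_view(series, window).T
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    # Keep the leading `rank` singular triples.
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    # Diagonal averaging: entry (i, j) contributes to sample i + j.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for i in range(window):
        recon[i:i + k] += approx[i]
        counts[i:i + k] += 1
    return recon / counts

# Synthetic data (an assumption): a sinusoid with additive Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(200)
clean = np.sin(2 * np.pi * t / 25)
noisy = clean + 0.3 * rng.standard_normal(len(t))

# A pure sinusoid has a rank-2 trajectory matrix, so rank=2 is used here.
denoised = ssa_reconstruct(noisy, window=40, rank=2)
print("noise MSE:", np.mean((noisy - clean) ** 2))
print("SSA MSE:  ", np.mean((denoised - clean) ** 2))
```

The window length and the rank are the structure parameters of this model; varying them is one way to approach the model-structure analysis requested in the report.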

Example of the lab report

Put here

Format of lab works

Create a .ipynb or .py file Surname2022Lab in the folder.
The report may also be a .tex file.
Find the format of your report above.
The computational experiment contains a common part and an individual part.
Common part:
1. use four short sample sets: [airplane], [electricity], [accelerometer hand motion], [video hand motion],
2. prepare the design matrix and a scalar/vector target for each time sample (in the form time, vecx, vecy),
3. set the forecast horizon, plot the forecast, and estimate the error.
Individual part:
1. select a lab work and specify your model (you can adopt any code available),
2. tune parameters, make your forecast according to the horizon,
3. write the report.
Error analysis is a part of the report:
plot of the forecast,
MAPE error (and your optimization error, if available) and its standard deviation,
show that your model has the optimal structure: try various structure parameters.

Details:

time refers to each sample (in unix or any useful format),
the horizon is an expected fundamental period,
note that the historical time ends before the forecasting period: within the horizon a model may use either the historical data or its own forecasts (the historical data are not updated after the history ends),
the forecasting protocol is in the paper, text, and slides by Nikita Uvarov.
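
The common part above can be sketched as follows. This is a minimal illustration under stated assumptions, not the protocol from the paper by Nikita Uvarov. The helper names (`make_design_matrix`, `fit_linear_ar`, `iterated_forecast`, `mape`), the lag count, and the synthetic periodic series that stands in for the sample sets are all hypothetical. The model is a plain least-squares autoregression, and the horizon equals one assumed fundamental period.

```python
import numpy as np

def make_design_matrix(series, n_lags):
    """Row t holds n_lags consecutive samples; the target is the next sample."""
    series = np.asarray(series, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(series, n_lags)
    return windows[:-1], series[n_lags:]

def fit_linear_ar(X, y):
    """Least-squares autoregression; the last weight is the intercept."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def iterated_forecast(history, w, horizon):
    """After the history ends, feed forecasts back in place of new data."""
    n_lags = len(w) - 1
    window = [float(v) for v in history[-n_lags:]]
    forecast = []
    for _ in range(horizon):
        nxt = float(np.dot(w[:-1], window) + w[-1])
        forecast.append(nxt)
        window = window[1:] + [nxt]
    return np.array(forecast)

def mape(y_true, y_pred):
    """Mean and standard deviation of the absolute percentage error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ape = np.abs((y_true - y_pred) / y_true)
    return ape.mean(), ape.std()

# Synthetic stand-in for a sample set (an assumption): offset sinusoid.
period = 20
t = np.arange(200)
series = 10.0 + np.sin(2 * np.pi * t / period)

horizon = period  # one fundamental period
history, future = series[:-horizon], series[-horizon:]

X, y = make_design_matrix(history, n_lags=4)
w = fit_linear_ar(X, y)
forecast = iterated_forecast(history, w, horizon)
mape_mean, mape_std = mape(future, forecast)
print(f"MAPE = {mape_mean:.2e} +/- {mape_std:.2e}")
```

The series is noise-free, so the linear recurrence fits it almost exactly; on real measurements the error is much larger and the lag count becomes a structure parameter to tune. MAPE is undefined where the target is zero, which is why the synthetic series is offset away from zero.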

Examples:

Discussion and collaboration

Exam and grading

Four lab works within deadlines and the exam on topics with problems and discussion. Each lab gives 2pt, and the exam gives 2pt, so 2*4+2=10.

Terminology and notation

Feature selection in Katrutsa A.M., Strijov V.V. 2017. A comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria // Expert Systems with Applications DOI
Tensor decomposition in Motrenko A.P., Strijov V.V. 2018. Multi-way feature selection for ECoG-based brain-computer interface // Expert Systems with Applications DOI
Signal decoding in Isachenko R.V., Strijov V.V. 2022. Quadratic programming feature selection for multicorrelated signal decoding with partial least squares // Expert Systems with Applications DOI
Forecasting schedule and horizon in Uvarov N.D. et al. 2018. Selecting the Superpositioning of Models // Computational Mathematics and Cybernetics DOI

Topics

Fall

Energy forecasting example
Regression
Linear model
Model selection call
Forecasting protocol
Error functions
Singular spectrum analysis
SSA forecasting
Forecasting protocols and verification (before AR)
Autoregression
Singular value decomposition (PCA, AE, Karhunen–Loeve)
QPFS model selection
Auto, cross-correlation, cointegration
Diagrams for ML and PLS
Projection to latent space and relation to PCA, canonical-correlation analysis
PLS-QPFS model selection
Higher-order SSA
Tensor decomposition
Tensor model selection
HOPLS
Granger causality test
Convergent cross mapping
HOCCM to invent
Takens' theorem
ResNet, Neural ODE
Adjoint and back-propagation
Flows and forecasting

Spring

State space models
S4, Hippo, SaShiMi models
RNN, LSTM, attention, transformer models
Neural PDE, Lagrangian and Hamiltonian NNs
Directional regression
Harmonic functions
Phase extraction
Non-parametric regression and customer demand forecasting
Graph earth prediction
Convolutional models
Graph convolutions and spectrum
Fourier transform and phase retrieval problem
Radon transform and tomography reconstruction
Forward and inverse problems, kernel regularisation
Karhunen–Loeve theorem, FPCA
Parametric and non-parametric models
Reproducing kernel Hilbert space
Integral operators and Mercer's theorem
Convolution theorem
Graph convolution
Manifolds and local models
Statistics on Riemannian spaces
Statistics on stratified spaces

Appendix to Spring

Probabilistic diffusion and Graphs
Graph convolution, graph representation
Neural diffusion and PDEs, GRAND
Tensors and Ricci flow, and PDE
Riemannian, Ricci tensors
Differential forms
Metrics learning and SDP
Takens' ODE

Appendix-2 to Spring

ResNet, LSTM, etc
Neuro ODE, RK4
Controlled ODE, Visualization
BackProp
S4, memories
Graph convolution
Graph Laplacian
Differentiation of Graph Laplacian
Riemannian?? Py
GRAND
Neuro PDE, Galerkin
The inverse problem of brain signals
Laplacian, Hamiltonian NNs?

Does dot product create a metric space? Only when the dot product is considered bilinear with at least the trivial metric tensor
Singular Spectrum, Phase Space Bing HAVOK!!!!
Linear models, SVD
Convolution / ARIMA - lags
State Space (+ Kalman, etc) + Control Theory
Tensor Decomposition (Tensor ARIMA with Tucker Decomposition)
Least Squares, Alternating Least Squares
Tensor convolution Generation and Decomposition
Feature Selection (lasso-Lars-style, QPFS, Genetic, Tensor Genetic)

Geometric Algebra, Differential geometry, Fields, Sheaves, etc…

Catch-up references

Kolmogorov, A.N and Fomin, S.V.: Elements of the Theory of Functions and Functional Analysis, Dover Publications, 1999.
David Bachman: A Geometric Approach to Differential Forms, Birkhäuser Boston, 2006.
At the Interface of Algebra and Statistics by Tai-Danae Bradley, 2020
Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators by Tailen Hsing, Randall Eubank, 2013

@@ Line 34: / Line 34: @@
+==== SSA ====
+#
 ===3. Neural ODE===
