Functional Data Analysis

Intelligent Data Analysis 2024

The statistical analysis of spatial time series requires additional methods of data analysis. First, we assume that time is continuous, model the state-space dynamics \(\frac{d\mathbf{x}}{dt}\), and use neural ordinary and stochastic differential equations. Second, we analyze multivariate and multidimensional time series using tensor representations and tensor analysis. Third, since the time series have significant cross-correlation, we model them in a Riemannian space. Fourth, medical time series are periodic; the base model is the pendulum, \(\frac{d^2x}{dt^2}=-c\sin{x}\), and we use physics-informed neural networks to approximate the data. Fifth, practical experiments involve multiple data sources, so we use canonical correlation analysis with a latent state space. This latent space aligns the source and target spaces and generates data on the source and target manifolds.
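
The pendulum base model \(\frac{d^2x}{dt^2}=-c\sin{x}\) can be explored numerically before any neural network is involved. The following is a minimal sketch, assuming a classical fourth-order Runge-Kutta integrator and illustrative values for \(c\), the step size, and the initial state; the function names are hypothetical and not part of the course materials.

```python
import math

def pendulum_rhs(state, c):
    """Right-hand side of the first-order system x' = v, v' = -c sin(x)."""
    x, v = state
    return (v, -c * math.sin(x))

def rk4_step(state, c, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(s, k, scale):
        return (s[0] + scale * k[0], s[1] + scale * k[1])
    k1 = pendulum_rhs(state, c)
    k2 = pendulum_rhs(shifted(state, k1, h / 2), c)
    k3 = pendulum_rhs(shifted(state, k2, h / 2), c)
    k4 = pendulum_rhs(shifted(state, k3, h), c)
    x = state[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    v = state[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return (x, v)

def simulate(x0, v0, c, h, steps):
    """Return the list of (x, v) states, including the initial one."""
    traj = [(x0, v0)]
    for _ in range(steps):
        traj.append(rk4_step(traj[-1], c, h))
    return traj

def energy(state, c):
    """Conserved quantity of the ideal pendulum: v^2 / 2 - c cos(x)."""
    x, v = state
    return 0.5 * v * v - c * math.cos(x)

# Illustrative (assumed) parameters: c = 1, amplitude 1 rad, 20 time units.
trajectory = simulate(x0=1.0, v0=0.0, c=1.0, h=0.01, steps=2000)
drift = max(abs(energy(s, 1.0) - energy(trajectory[0], 1.0)) for s in trajectory)
```

The energy drift is a convenient sanity check: for the noiseless model it should stay close to zero, so any systematic deviation in measured data points to the stochastic part of the signal.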

Applications

This area of machine learning applies to any field where measurements are continuous in time and space and are acquired from multimodal sources: climate modeling, neural interfaces, solid-state physics, electronics, fluid dynamics, and many more. We will carefully collect both the theory and its practice.

Your benefit

Your goal is to enhance your ability to convey messages to the reader in the language of applied mathematics. The main part of your MS thesis is the theoretical foundations of machine learning, where you present your own results supported by the necessary theory.

Structure of a seminar

The semester has 10 weeks, organized as five two-week cycles, one per homework.

Odd week: introduction to the topic and handout of a theme for the homework.
Even week: a discussion of the essay, collecting the list of improvements to each essay.
Odd week: a discussion of the improved essay, putting the essays into a joint structure.

Scoring

Each essay brings one point, and each improvement brings one point. If an essay is perfect and no improvement is required, it counts as two points (one plus one). The threshold for the binary pass/fail decision is seven points.

The homework

The course gives two credits, so it requires time. The result of each homework is a two-page essay that introduces the designated topic. It may be automatically generated or compiled from Wikipedia. The main requirement is that you take responsibility for every statement in your essay: each formula is yours.

The essay gives a comprehensive and rigorous answer to the topic question; illustrative plots are welcome. The result must be ready to compile into a joint manuscript after the even week, so please use the LaTeX template.

The style is that of set theory, algebra, analysis, and Bayesian statistics. Category theory and homotopy theory are welcome.

This course gives you two credits, that is, 76 hours over 10 weeks, so plan for at least 5 hours of weekly homework.

Templates and links

The course GitHub to download the homework essays
The Overleaf to compile the joint manuscript
The LaTeX template for an essay
The course chat to ask questions

Requirements for the text and the discussion

Comprehensive explanation of the method or the question we discuss
Only the principle, no experiments
Two-page text (more or less)
The reader is a second or third-year student
A picture is obligatory
A brief reference to some deep learning architecture is nevertheless welcome
The talk may use slides or the text itself
A list of references with DOIs
State how the text was generated
If you notice a gap, add a note about it (to raise as a question later)

Style remarks for the essays

The automatic generation of mediocre-quality texts has raised the requirements for the quality of new messages. It makes novelty rare and authorship more valued, but it also simplifies delivery. Since generating a textbook has become simple, we will use generative chats to train our skills in persuading the reader. The reader is our MS thesis defense committee.

Additional remarks for clarification. People have already invented everything necessary. Once, long ago, humanity developed very rapidly: not only the things surrounding people kept changing, but also the words they used. In those days there were many different names for a creative person: engineer, poet, scientist. And all of them were constantly inventing something new. But that was the childhood of humanity. Then it reached maturity. Creativity did not disappear, but it came down to choosing from what had already been created. Figuratively speaking, we no longer grow grapes. We send to the cellar for a bottle. The people who do this are called "sommeliers". (V. Pelevin)

Avoid this style (reserved for the seminar)

Table of homeworks

Over these ten weeks we discuss the following five topics:

Multimodal data
Continuous time and space models
Physics-informed models
Multilinear models
Riemannian spaces

Note that all these items shed light on stochastic-deterministic decomposition. So the questions include three parts:

deterministic model,
generative model,
stochastic-deterministic decomposition method.
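
One common way to write such a decomposition, given here only as an assumed reading of the three parts above, is an Itô stochastic differential equation. The drift \(f\) plays the role of the deterministic model and the diffusion term \(g\,d\mathbf{W}_t\) carries the stochastic part; the symbols \(f\), \(g\), and \(\mathbf{W}_t\) (a Wiener process) are introduced for illustration and do not come from the course text.

```latex
d\mathbf{x}_t = f(\mathbf{x}_t, t)\,dt + g(\mathbf{x}_t, t)\,d\mathbf{W}_t
```

Setting \(g \equiv 0\) recovers the ordinary differential equation \(\frac{d\mathbf{x}}{dt} = f(\mathbf{x}, t)\).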

See the questions below for your reference.

Multimodal data

First series of topics

Canonical Correlation Analysis
CCA in tensor representation
Kernel CCA in Hilbert and L2[a,b] spaces
CCA versus Cross-Attention Transformers
Generative CCA, diffusion, and flow
Comparative analysis of variants of CCA like PLS and others
Functional PCA
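
As a reference point for the topics above, classical linear CCA can be computed by whitening both views and taking a singular value decomposition of the whitened cross-covariance. The following is a minimal sketch, assuming NumPy, a small ridge term `reg` for numerical stability, and synthetic two-view data driven by one shared latent variable; the helper names and all numeric values are illustrative assumptions.

```python
import numpy as np

def inv_sqrt(sym):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(sym)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

def cca(X, Y, reg=1e-8):
    """Return projection matrices A, B and the canonical correlations rho."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Sample covariances; reg keeps them invertible (an assumption of this sketch).
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, rho, Vt = np.linalg.svd(Wx @ Cxy @ Wy, full_matrices=False)
    return Wx @ U, Wy @ Vt.T, rho

# Synthetic two-view data sharing a single latent variable z.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = z @ np.array([[1.0, -0.5, 0.3]]) + 0.1 * rng.normal(size=(500, 3))
Y = z @ np.array([[0.7, 0.2]]) + 0.1 * rng.normal(size=(500, 2))
A, B, rho = cca(X, Y)
```

In this synthetic setting the first canonical correlation is close to one, because both views observe the same latent variable, and the second is close to zero. The columns of `A` and `B` map each view into the shared latent space.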

Continuous models

Second series of topics

Neural ODE
Continuous state space models
Continuous normalizing flows
Adjoint method and continuous backpropagation
Neural Delay Differential Equations
Neural PDE
S4 and HiPPO models
Riemannian continuous models
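
The adjoint method from the list above can be illustrated on a toy problem. The sketch below is not a neural ODE: it assumes the scalar model \(\frac{dx}{dt} = \theta x\) and the loss \(L = x(T)^2\), chosen here only because the gradient \(\frac{dL}{d\theta} = 2T x_0^2 e^{2\theta T}\) is known in closed form. The function names and the Runge-Kutta integrator are hypothetical choices for illustration.

```python
import math

def rk4_step(f, t, y, h):
    """One classical Runge-Kutta step for y' = f(t, y), y a list of floats."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def integrate(f, y0, t0, t1, steps):
    """Integrate from t0 to t1; t1 < t0 gives integration backward in time."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y

def adjoint_gradient(theta, x0, T, steps=200):
    """Return (loss, dL/dtheta) for dx/dt = theta * x and L = x(T)^2."""
    # Forward pass: only the final state is kept.
    (xT,) = integrate(lambda t, y: [theta * y[0]], [x0], 0.0, T, steps)

    # Backward pass on the augmented state [x, a, g]:
    #   dx/dt = theta * x          (state is re-integrated, not stored)
    #   da/dt = -a * df/dx     = -a * theta
    #   dg/dt = -a * df/dtheta = -a * x
    def augmented(t, s):
        x, a, g = s
        return [theta * x, -a * theta, -a * x]

    # a(T) = dL/dx(T) = 2 x(T); g(T) = 0; g(0) is the gradient.
    _, _, grad = integrate(augmented, [xT, 2.0 * xT, 0.0], T, 0.0, steps)
    return xT ** 2, grad

loss, grad = adjoint_gradient(theta=0.3, x0=1.5, T=2.0)
exact = 2 * 2.0 * 1.5 ** 2 * math.exp(2 * 0.3 * 2.0)
```

The backward pass needs only the final state and the adjoint, which is why memory does not grow with the number of solver steps. In a neural ODE the same scheme applies with \(f\) replaced by a network and the partial derivatives supplied by automatic differentiation.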

Physics-Informed models

Third series of topics

PINNs as multimodels
Spherical harmonics in p dimensions (an IMU example is welcome)
PDE and Physics-Informed learning
Integral Transforms in Physics-Informed learning
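
For the pendulum model \(\frac{d^2x}{dt^2}=-c\sin{x}\) from the course description, a physics-informed loss is commonly written as a data term plus a penalty on the equation residual. The sketch below assumes a network \(x_\theta(t)\), observations \((t_i, x_i)\), collocation points \(\tau_j\), and a weight \(\lambda\); these symbols are introduced for illustration and do not come from the course text.

```latex
\mathcal{L}(\theta)
  = \frac{1}{N}\sum_{i=1}^{N}\bigl(x_\theta(t_i) - x_i\bigr)^2
  + \lambda\,\frac{1}{M}\sum_{j=1}^{M}
    \Bigl(\frac{d^2 x_\theta}{dt^2}(\tau_j) + c\sin x_\theta(\tau_j)\Bigr)^2
```

The second term vanishes exactly when \(x_\theta\) satisfies the pendulum equation at every collocation point.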

Multilinear models and topology

Fourth series of topics

Clifford or Geometric algebra in machine learning
Tensor models, tensor decomposition, and approximation (tensor PLS or CCA)
Machine learning models for tensors: field equations (Yang-Mills equations)
Machine learning models for theoretical physics (Maxwell’s equations, Navier-Stokes)
Persistent homology and dimensionality reduction (say, arXiv:2302.03447 with embedding delays)

General

Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems arxiv 2023
Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning upenn 2024
The Elements of Differentiable Programming arxiv 2024
The list from the previous year 2023.

Prerequisites

Understanding Deep Learning by Simon J.D. Prince mit 2023
Deep Learning: Foundations and Concepts by C.M. Bishop and H. Bishop Springer 2024 (online version)
A Geometric Approach to Differential Forms by David Bachman arxiv 2013
Advanced Calculus: A Geometric View by James J. Callahan pdf 2010, collection
Geometric Deep Learning by Michael M. Bronstein et al. arxiv 2021

Linear and bilinear models

A Tutorial on Independent Component Analysis arxiv, 2014
On the Stability of Multilinear Dynamical Systems arxiv 2022
Tensor-based Regression Models and Applications by Ming Hou Thèse Uni-Laval 2017

Tensor models

Tensor Canonical Correlation Analysis for Multi-view Dimension Reduction [1] (Semkin)

Spherical Harmonics

Spherical Harmonics in p Dimensions arxiv 2012
Physics of simple pendulum: a case study of nonlinear dynamics RG 2008
Time series forecasting using manifold learning, 2021 arxiv
Time-series forecasting using manifold learning, radial basis function interpolation, and geometric harmonics 2022 Chaos AIP

State Space Models

Missing Slice Recovery for Tensors Using a Low-rank Model in Embedded Space arxiv 2018

SSM Generative Models

Masked Autoregressive Flow for Density Estimation arxiv 2017

SSM+Riemann+Gaussian process regression

Time-series forecasting using manifold learning, radial basis function interpolation, and geometric harmonics by Ioannis G. Kevrekidis and Constantinos Siettos, 2022 pdf

Physics-Informed Neural Networks

Three ways to solve partial differential equations with neural networks — A review arxiv 2021
NeuPDE: Neural Network Based Ordinary and Partial Differential Equations for Modeling Time-Dependent Data arxiv 2019
Physics-based deep learning code
PINN by Steve Burton yt
Process Model Inversion in the Data-Driven Engineering Context for Improved Parameter Sensitivities mdpi processes 2022 (nice connection pictures)
Physics-based Deep Learning github
Integral Transforms in a Physics-Informed (Quantum) Neural Network setting arxiv 2022

Riemannian models

Riemannian Continuous Normalizing Flows arxiv 2020

Continuous time, Neural ODE

Neural Spatio-Temporal Point Processes by Ricky Chen et al. iclr 2021 (likelihood for time and space)
Neural Ordinary Differential Equations by Ricky Chen et al. arxiv 2018
Neural Controlled Differential Equations for Irregular Time Series by Patrick Kidger et al. arxiv 2020 github
Diffusion Normalizing Flow arxiv 2021
Differentiable Programming for Differential Equations: A Review arxiv 2024
(code tutorial) Deep Implicit Layers - Neural ODEs, Deep Equilibrium Models, and Beyond nips 2020
(code tutorial) 2021
Neural CDE and tensors IEEE, IEEE

Graph and PDEs

Fourier Neural Operator for Parametric Partial Differential Equations arxiv 2020
Masked Attention is All You Need for Graphs arxiv 2024

Neural SDE

Approximation of Stochastic Quasi-Periodic Responses of Limit Cycles in Non-Equilibrium Systems under Periodic Excitations and Weak Fluctuations mdpi entropy 2017 (great illustrations on the stochastic nature of a simple phase trajectory)
Neural SDEs for Conditional Time Series Generation arxiv 2023 code github LSTM - CSig-WGAN
Neural SDEs as Infinite-Dimensional GANs 2021
Efficient and Accurate Gradients for Neural SDEs by Patrick Kidger arxiv 2021 code diffrax

Chains and homology

Operator Learning: Algorithms and Analysis arxiv 2024
Homotopy theory for beginners by J.M. Moeller ku.dk 2015 (is it a pertinent link?)
Explorations in Homeomorphic Variational Auto-Encoding arxiv 2018
Special Finite Elements for Dipole Modelling master thesis Bauer 2011
Selecting embedding delays: An overview of embedding techniques and a new method using persistent homology arxiv 2023 (denis)

Appendix

Neural Memory Networks stanford reports 2019
An Elementary Introduction to Information Geometry by Frank Nielsen mdpi entropy
The Many Faces of Information Geometry by Frank Nielsen ams 2022 (short version)
Clifford Algebras and Dimensionality Reduction for Signal Separation by M. Guillemard Uni-Hamburg 2010 code
Special Finite Elements for Dipole Modelling by Martin Bauer Master Thesis Erlangen 2012 diff p-form must read
Bayesian model selection for complex dynamic systems 2018
Visualizing 3-Dimensional Manifolds by Dugan J. Hammock 2013 umass
At the Interface of Algebra and Statistics by T-D. Bradley arxiv 2020
Time Series Handbook by Borja, 2021 github
