Neural Networks Approximation Theory
Open questions:
- Can every [?]-Lipschitz function on a bounded domain be uniformly approximated by a shallow network? (This must hold for dimension-independent rates.)
- Is a heavy-tailed measure necessary for exponential depth separations?
- Is a [?] approximation rate achievable when either:
	- oscillations grow at a [?] rate with a fast-decaying measure (e.g. Gaussian), or
	- oscillations grow at a [?] rate with a heavy-tailed measure?
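The oscillation questions above refer to the standard depth-separation mechanism (cf. the Telgarsky papers below): each composition of a tent map doubles the number of linear pieces, while a one-hidden-layer ReLU net with n units has at most n+1 pieces, so matching depth with width costs exponentially. A minimal numerical sketch (grid size and function names are illustrative choices, not from any of the papers):

```python
import numpy as np

def tent(x):
    # Tent map on [0, 1] written as a width-2 ReLU layer:
    # tent(x) = 2*relu(x) - 4*relu(x - 1/2)
    relu = lambda z: np.maximum(z, 0.0)
    return 2 * relu(x) - 4 * relu(x - 0.5)

def compose(k, x):
    # k-fold composition = a depth-k, width-2 ReLU network
    for _ in range(k):
        x = tent(x)
    return x

def count_pieces(y):
    # Number of linear pieces = 1 + number of slope sign changes
    s = np.sign(np.diff(y))
    return 1 + int(np.sum(s[1:] != s[:-1]))

# Grid aligned with the dyadic breakpoints, so every sampled slope is exact
x = np.linspace(0.0, 1.0, (1 << 12) + 1)
for k in range(1, 6):
    print(f"depth {k}: {count_pieces(compose(k, x))} linear pieces")
```

The printed piece count doubles with each extra layer, which is exactly the oscillation growth that shallow networks cannot reproduce without exponential width.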
Current reads:
- Eldan & Shamir - The Power of Depth for Feedforward Neural Networks.pdf
- Bubeck & Sellke - A Universal Law of Robustness via Isoperimetry.pdf
- Hsu & Sanford - On the Approximation Power of Two-Layer Networks of Random ReLUs.pdf
- Safran & Eldan - Depth Separations in Neural Networks- What is Actually Being Separated?.pdf
- Safran & Reichman - Depth Separations in Neural Networks- Separating the Dimension from the Accuracy.pdf
- Venturi & Bruna - Depth separation beyond radial functions.pdf
Textbooks & Theses:
- Bach - Learning Theory from First Principles.pdf
- Foucart - Expressiveness of Shallow Networks.pdf
- Guhring - Approximation with Neural Networks from a Theoretical and Practical Perspective.pdf
- Motamed - Approximation Power of Deep Neural Networks An explanatory mathematical survey.pdf
- Petersen - Neural Network Theory.pdf
- Telgarsky - Deep learning theory lecture notes.pdf
- Telgarsky - Deep learning theory.pdf
- Venturi - Architectural properties of neural networks for function approximation.pdf
- Weinan & Ma - Towards a Mathematical Understanding of Neural Network-Based Machine Learning- what we know and what we don’t.pdf
Approximation Theory:
- Christensen - An Introduction to Frames and Riesz Bases.pdf
- Cohen, DeVore, Petrova - Optimal Stable Nonlinear Approximation.pdf
- DeVore - Optimal nonlinear approximation.pdf
- DeVore, Hanin & Petrova - Neural Network Approximation.pdf
- DeVore - Nonlinear Approximation.pdf
- Handbook on Neural Information Processing.pdf
Harmonic & Functional Analysis:
- Carl - Entropy Numbers, s-Numbers, and Eigenvalue Problems.pdf
- Dahlke & Kutyniok - THE UNCERTAINTY PRINCIPLE ASSOCIATED WITH THE CONTINUOUS SHEARLET TRANSFORM.pdf
- Folland - Real Analysis- Modern Techniques and Their Applications.pdf
- Katznelson - An Introduction to Harmonic Analysis.pdf
- Stein & Shakarchi - FUNCTIONAL ANALYSIS INTRODUCTION TO FURTHER Topics IN ANALYSIS.pdf
- Talbut - A SHORT TOUR OF HARMONIC ANALYSIS.pdf
- Xiao & He - Uncertainty Inequality for Radon Transform on the Heisenberg Group.pdf
Shallow networks:
Classical results
- Breiman - Hinging Hyperplanes for Regression, Classification, and Function Approximation.pdf
- Funahashi - On the Approximate Realization of Continuous Mappings by Neural Networks.pdf
- Girosi & Anzellotti - Convergence Rates of Approximation by Translates.pdf
- Jones - A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training.pdf
- Kůrková - Kolmogorov’s Theorem Is Relevant.pdf
- Barron - Universal Approximation Bounds for Superpositions of a Sigmoidal Function.pdf
- Cybenko - Approximation by Superpositions of a Sigmoidal Function.pdf
- Hornik - Multilayer Feedforward Networks are Universal Approximators.pdf
- Kůrková - Dimension-Independent Rates of Approximation by Neural Networks.pdf
Approximation of Lipschitz Functions
Infinite-width Shallow Networks/Neural Tangent Kernel (NTK)
- Bach - On the relationship between multivariate splines and infinitely-wide neural networks.pdf
- Ji & Telgarsky - Neural tangent kernels, transportation mappings, and universal approximation.pdf
- Ongie & Soudry - A FUNCTION SPACE VIEW OF BOUNDED NORM INFINITE WIDTH RELU NETS- THE MULTIVARIATE CASE.pdf
- Weinan - KOLMOGOROV WIDTH DECAY AND POOR APPROXIMATORS IN MACHINE LEARNING- SHALLOW NEURAL NETWORKS, RANDOM FEATURE MODELS AND NEURAL TANGENT KERNELS.pdf
- Yehudai & Shamir - On the Power and Limitations of Random Features for Understanding Neural Networks.pdf
Shallow Networks as Integral Transform/Ridge Functions and Radon Transform methods
- Candès - Harmonic Analysis of Neural Networks.pdf
- Carroll - CONSTRUCTION OF NEURAL NETS USING THE RADON TRANSFORM.pdf
- Ito - Representation of functions by superpositions of a step or sigmoid function and their applications to neural network theory.pdf
- Klusowski & Barron - Approximation by combinations of relu and squared relu ridge functions with ℓ1 and ℓ0 controls.pdf
- Kůrková - Integral Transforms Induced by Heaviside Perceptrons.pdf
- Maiorov - On Best Approximation by Ridge Functions.pdf
- Maiorov - Best approximation by ridge functions in Lp-spaces.pdf
- Siegel - APPROXIMATION RATES FOR SHALLOW RELUk NEURAL NETWORKS ON SOBOLEV SPACES VIA THE RADON TRANSFORM.pdf
- Sonoda - A unified Fourier slice method to derive ridgelet transform for a variety of depth-2 neural networks.pdf
- Unser - Ridges, Neural Networks, and the Radon Transform.pdf
Banach spaces of functions expressible by shallow networks
- Parhi & Nowak - Banach Space Representer Theorems for Neural Networks and Ridge Splines.pdf
- Spek - Duality for Neural Networks through Reproducing Kernel Banach Spaces.pdf
- Weinan & Ma - The Barron Space and the Flow-induced Function Spaces for Neural Network Models (Springer).pdf
Misc
- Bach - Breaking the Curse of Dimensionality with Convex Neural Networks.pdf
- Divol & Niles-Weed - Optimal transport map estimation in general function spaces.pdf
- Kainen, Kůrková & Vogt - Approximation by neural networks is not continuous.pdf
- Kůrková - Kolmogorov’s Theorem and Multilayer Neural Networks.pdf
- Kůrková - Limitations of Shallow Networks.pdf
- Kůrková - Representations of Highly-Varying Functions by One-Hidden-Layer Networks.pdf
- Maiorov & Meir - On the near optimality of the stochastic approximation of smooth functions by neural networks.pdf
- Siegel - High-order approximation rates for shallow neural networks with cosine and ReLUk activation functions.pdf
- Siegel - Sharp Bounds on the Approximation Rates, Metric Entropy, and n-Widths of Shallow Neural Networks.pdf
Depth separations:
- Amsel & Bruna - On the Benefits of Rank in Attention
- Bu - DEPTH-WIDTH TRADE-OFFS FOR NEURAL NETWORKS VIA TOPOLOGICAL ENTROPY.pdf
- Chatziafratis - Better Depth-Width Trade-offs for Neural Networks through the lens of Dynamical Systems.pdf
- Chatziafratis - Depth-Width Trade-offs for ReLU Networks via Sharkovsky’s Theorem.pdf
- Daniely - Depth Separation for Neural Networks.pdf
- Malach & Shalev-Shwartz - Is Deeper Better only when Shallow is Good?.pdf
- Parkinson & Shamir - Depth Separation in Norm-Bounded Infinite-Width Neural Networks.pdf
- Poggio - Why and When Can Deep – but Not Shallow – Networks Avoid the Curse of Dimensionality- a Review.pdf
- Rim, Venturi, Bruna, and Peherstorfer - DEPTH SEPARATION FOR REDUCED DEEP NETWORKS IN NONLINEAR MODEL REDUCTION- DISTILLING SHOCK WAVES IN NONLINEAR HYPERBOLIC PROBLEMS.pdf
- Safran & Shamir - Depth-Width Tradeoffs in Approximating Natural Functions with Neural Networks.pdf
- Sanford, Hsu, Telgarsky - Representational Strengths and Limitations of Transformers.pdf
- Telgarsky - Benefits of depth in neural networks.pdf
- Vardi & Shamir - Width is Less Important than Depth in ReLU Neural Networks.pdf
- Venturi & Bruna - Depth separation beyond radial functions.pdf
- Yarotsky - Error bounds for approximations with deep ReLU networks.pdf
- Yehudai, Shalev-Shwartz & Shamir - The Connection Between Approximation, Depth Separation and Learnability in Neural Networks.pdf
- Zweig & Bruna - Exponential Separations in Symmetric Neural Networks.pdf
- Eldan & Shamir - The Power of Depth for Feedforward Neural Networks.pdf
- Telgarsky - Representation Benefits of Deep Feedforward Networks.pdf
Deep networks:
Misc
- Daubechies, DeVore, Foucart, Hanin & Petrova - Nonlinear Approximation and Deep ReLU Networks.pdf
- Gonon - The necessity of depth for artificial neural networks to approximate certain classes of smooth and bounded functions without the curse of dimensionality.pdf
- Hanin - Complexity of Linear Regions in Deep Networks.pdf
- Hanin - Deep ReLU Networks Have Surprisingly Few Activation Patterns.pdf
- Hanin - UNIVERSAL FUNCTION APPROXIMATION BY DEEP NEURAL NETS WITH BOUNDED WIDTH AND RELU ACTIVATIONS.pdf
- Yarotsky - Optimal approximation of continuous functions by very deep ReLU networks.pdf
- Yarotsky - The phase diagram of approximation rates for deep neural networks.pdf
Banach spaces of functions expressible by deep networks
- Parhi & Nowak - What Kinds of Functions Do Deep Neural Networks Learn? Insights from Variational Spline Theory.pdf
- Weinan - ON THE BANACH SPACES ASSOCIATED WITH MULTI-LAYER RELU NETWORKS.pdf
Expressivity and learning:
- Bengio - Representation Learning- A Review and New Perspectives.pdf
- Chen, Rotskoff, Bruna & Vanden-Eijnden - A Dynamical Central Limit Theorem for Shallow Neural Networks.pdf
- Chizat & Bach - On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport.pdf
- Malach & Shalev-Shwartz - Is Deeper Better only when Shallow is Good?.pdf
- Rotskoff & Vanden-Eijnden - TRAINABILITY AND ACCURACY OF NEURAL NETWORKS- AN INTERACTING PARTICLE SYSTEM APPROACH.pdf
- Welper - Approximation Results for Gradient Descent trained Neural Networks.pdf
- Wojtowytsch & Weinan - Can Shallow Neural Networks Beat the Curse of Dimensionality? A mean field training perspective.pdf
Miscellaneous Topics & Views:
PDEs
Quantized/bounded networks:
Random approximations:
- Rahimi & Recht - Uniform Approximation of Functions with Random Bases.pdf
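A quick sketch of what the Rahimi & Recht line of work is about: draw random features, then train only the outer linear layer by least squares. Everything here (the target function, feature count, ReLU features instead of the paper's random bases, and the sampling laws) is an illustrative choice, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: a smooth function on [-1, 1]
f = lambda x: np.sin(3 * x)

# Random ReLU features phi_j(x) = relu(w_j * x + b_j); the inner
# weights are sampled once and frozen, only the outer layer is fit.
n_features = 200
w = rng.standard_normal(n_features)
b = rng.uniform(-1, 1, n_features)

def features(x):
    # Feature matrix of shape (len(x), n_features)
    return np.maximum(np.outer(x, w) + b, 0.0)

# Fit the outer linear layer by least squares
x_train = np.linspace(-1, 1, 500)
coef, *_ = np.linalg.lstsq(features(x_train), f(x_train), rcond=None)

# Uniform error on a held-out grid
x_test = np.linspace(-1, 1, 101)
err = np.max(np.abs(features(x_test) @ coef - f(x_test)))
print(f"uniform error with {n_features} random ReLU features: {err:.4f}")
```

The point of the uniform-approximation results is that, with high probability over the random draw, enough such features approximate a suitable target class uniformly, not just in mean square.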