
Interplay between depth and width for interpolation in neural ODEs.

Antonio Álvarez-López, Arselane Hadj Slimane, Enrique Zuazua
Published in: Neural Networks: The Official Journal of the International Neural Network Society (2024)
Neural ordinary differential equations (neural ODEs) have emerged as a natural tool for supervised learning from a control perspective, yet a complete understanding of the role played by their architecture remains elusive. In this work, we examine the interplay between the width p and the number of transitions between layers L (corresponding to a depth of L+1). Specifically, we construct explicit controls interpolating either a finite dataset D, comprising N pairs of points in R^d, or two probability measures within a Wasserstein error margin ε>0. Our findings reveal a balancing trade-off between p and L, with L scaling as 1+O(N/p) for data interpolation, and as 1+O(p^{-1}+(1+p)^{-1} ε^{-d}) for measures. In the high-dimensional and wide setting where d, p > N, our result can be refined to achieve L=0. This naturally raises the problem of data interpolation in the autonomous regime, characterized by L=0. We adopt two alternative approaches: either controlling in a probabilistic sense, or relaxing the target condition. In the first case, when p=N, we develop an inductive control strategy based on a separability assumption whose probability increases with d. In the second, we establish an explicit error decay rate with respect to p, which results from applying a universal approximation theorem to a custom-built Lipschitz vector field interpolating D.
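To make the architecture concrete, the following is a minimal sketch (not the authors' explicit construction) of the controlled system the abstract refers to: a neural ODE x'(t) = W(t) σ(A(t) x(t) + b(t)) in R^d with hidden width p and piecewise-constant controls that switch L times, i.e. depth L+1. The parameter values below are random placeholders; in the paper the controls are built explicitly so that the flow interpolates the dataset D.

    # Sketch only: neural ODE flow with width p and L transitions,
    # integrated by forward Euler. Parameters are placeholders.
    import numpy as np

    def flow(x0, params, T=1.0, steps_per_piece=100):
        """x0: (d,) initial point.
        params: list of L+1 tuples (W, A, b) with shapes
                W: (d, p), A: (p, d), b: (p,) -- one tuple per constant piece."""
        x = np.asarray(x0, dtype=float)
        dt = T / (len(params) * steps_per_piece)
        for W, A, b in params:                       # piecewise-constant controls
            for _ in range(steps_per_piece):
                x = x + dt * W @ np.maximum(A @ x + b, 0.0)   # ReLU vector field
        return x

    if __name__ == "__main__":
        d, p, L = 2, 3, 4                            # dimension, width, transitions
        rng = np.random.default_rng(0)
        params = [(rng.normal(size=(d, p)) / p,
                   rng.normal(size=(p, d)),
                   rng.normal(size=p)) for _ in range(L + 1)]
        print(flow(np.array([1.0, -0.5]), params))

The autonomous regime L=0 corresponds to a single tuple (W, A, b) held constant over the whole time horizon.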
Keyphrases
  • neural ordinary differential equations
  • supervised learning
  • interpolation
  • width and depth
  • Wasserstein distance
  • universal approximation