Institut für Regelungstechnik - Research
ALeSCo Project 1: Neural network training via persistence of excitation

Project leads:  Prof. Dr.-Ing. Matthias Müller, Dr. Victor Lopez
Year:  2025
Funding:  Deutsche Forschungsgemeinschaft (DFG) - 535860958
Duration:  2025 - 2029

Project Summary

As part of the research group “Active Learning for Dynamic Systems and Control (ALeSCo) - Data Informativity, Uncertainties and Guarantees”, this project develops and analyzes new methods for training neural networks (NNs) using concepts from systems and control theory. The primary goal is to use control-theoretic methods to guarantee generalization properties of a trained NN, in particular error bounds on the deviation between the output of the trained NN and that of an NN with ideal weights. To this end, recent advances in the stability analysis of optimization-based estimation methods (Moving Horizon Estimation, MHE) and data-based system representations built on nonlinear versions of Willems' Fundamental Lemma (FL) are exploited. In both cases, the training data must be sufficiently informative, so active learning strategies are devised to identify suitable data points. The use of MHE, a method for state estimation of nonlinear systems, constitutes the first approach of this project. For this purpose, the weight update together with the output equation of the NN is interpreted as a dynamic system.
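To illustrate this viewpoint, the sketch below treats the trainable weights of a small network as the state of a dynamic system and re-estimates them over a sliding window of recent data. It is a deliberately simplified stand-in for MHE, not the project's method: the network is a fixed-random-feature model so each window solve is a regularized least-squares problem, and the prior term plays the role of a crude arrival cost. All names and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random-feature "NN": fixed tanh hidden layer, trainable output weights w.
W1, b1 = rng.normal(size=(16, 1)), rng.normal(size=16)
phi = lambda u: np.tanh(W1 @ np.atleast_1d(u) + b1)

def mhe_step(w_prior, u_win, y_win, lam=1e-3):
    """One moving-horizon update: least squares over the current data window,
    regularized toward the previous estimate (a crude stand-in for the
    arrival cost that summarizes data outside the horizon)."""
    Phi = np.stack([phi(u) for u in u_win])        # (N, 16) regressor matrix
    A = Phi.T @ Phi + lam * np.eye(16)
    b = Phi.T @ np.asarray(y_win) + lam * w_prior
    return np.linalg.solve(A, b)

# Streaming data from an unknown map; inputs spread over [-2, 2] so the
# regressors stay informative (the analogue of persistence of excitation).
u = rng.uniform(-2, 2, 200)
y = np.sin(u)

w, H = np.zeros(16), 30
for k in range(H, 201, 10):                        # slide the horizon along the stream
    w = mhe_step(w, u[k - H:k], y[k - H:k])

err = max(abs(w @ phi(ui) - np.sin(ui)) for ui in np.linspace(-2, 2, 50))
```

Here the "state equation" is trivial (constant weights plus the window update); the project's actual MHE formulation and its robust stability analysis are considerably more general.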

The state estimation then provides weights for the NN that approximate the measured outputs. This requires appropriate modifications and extensions of existing MHE methods and of recently developed techniques for robust stability analysis in order to apply them to the training of NNs and to guarantee the desired generalization properties. The FL, in turn, provides a system representation with which any input/output trajectory of a dynamic system can be obtained from measurement data alone, without identifying a parametric model. Since the input/output mapping of an NN can also be interpreted as a dynamic system, similar ideas can be used to describe the NN using only training data, without explicitly identifying suitable weights. As a second approach in this project, new nonlinear extensions of the FL that exploit the known structure of the NN are developed for such a data-based representation.
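For intuition, the classical linear version of the FL can be sketched in a few lines: for a controllable LTI system, every length-L input/output trajectory lies in the column span of a Hankel matrix built from a single sufficiently exciting experiment. The nonlinear, NN-tailored extensions are what this project develops; the sketch below (with an arbitrarily chosen example system) shows only that linear baseline.

```python
import numpy as np

rng = np.random.default_rng(1)

# Example LTI system x+ = A x + B u, y = C x (arbitrary, controllable).
A = np.array([[0.8, 0.2], [0.0, 0.5]])
B = np.array([0.0, 1.0])
C = np.array([1.0, 0.0])

def simulate(u, x0=np.zeros(2)):
    x, ys = x0.copy(), []
    for uk in u:
        ys.append(C @ x)
        x = A @ x + B * uk
    return np.array(ys)

def hankel(traj, L):
    """Columns are length-L windows of the stacked (input, output) trajectory."""
    return np.column_stack([traj[i:i + L].reshape(-1)
                            for i in range(len(traj) - L + 1)])

# One sufficiently exciting experiment (random input, zero initial state).
ud = rng.normal(size=200)
H = hankel(np.column_stack([ud, simulate(ud)]), L=10)

# Fundamental Lemma: any new length-10 trajectory, even from a different
# initial state, is a linear combination of the Hankel columns.
u_new = rng.normal(size=10)
w_new = np.column_stack([u_new, simulate(u_new, x0=np.array([1.0, -0.5]))]).reshape(-1)
g, *_ = np.linalg.lstsq(H, w_new, rcond=None)
residual = np.linalg.norm(H @ g - w_new)   # near zero when the data is informative
```

The rank condition behind this (persistent excitation of the input of order L plus the state dimension) is exactly the kind of informativeness requirement the project generalizes to NN structures.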

Generalization properties are established for the developed training methods provided the training data fulfills certain informativeness conditions. This project therefore investigates suitable concepts, such as persistence of excitation and other informativeness conditions, for the two training approaches above and for different network architectures. In addition, active learning strategies are developed that fulfill such informativeness conditions effectively. To improve the practical applicability of these training methods, strategies for reducing the computational effort are investigated. Finally, the developed methods are evaluated on different benchmark systems.
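The classical notion of persistence of excitation has a concrete rank characterization: a scalar input sequence is persistently exciting of order L if and only if its depth-L Hankel matrix has full row rank. A minimal check (helper name is illustrative):

```python
import numpy as np

def is_persistently_exciting(u, order):
    """True iff the scalar sequence u is persistently exciting of the given
    order, i.e. its depth-`order` Hankel matrix has full row rank."""
    H = np.column_stack([u[i:i + order] for i in range(len(u) - order + 1)])
    return np.linalg.matrix_rank(H) == order

rng = np.random.default_rng(2)
rich = rng.normal(size=50)                 # random input: generically exciting
poor = np.ones(50)                         # constant input: rank-1 Hankel matrix
print(is_persistently_exciting(rich, 5))   # True
print(is_persistently_exciting(poor, 5))   # False
```

Active learning in this setting can be read as choosing the next inputs so that such a rank (or quantitative excitation) condition is met with as few samples as possible.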