Convergence of a multilayer perceptron to histogram Bayesian regression

Abstract

We consider the problem of improving the interpretability and consistency of Bayesian classifier solutions when empirical data are approximated by a multilayer perceptron. Histogram regression preserves transparency and a statistical interpretation but is limited by its memory requirements ($O(n)$) and weak scalability, whereas a multilayer perceptron provides a memory-efficient representation ($O(1)$) and high computational efficiency at the cost of limited interpretability. The focus is on a unary learning scheme, in which the training sample consists of examples from a single target class together with additional background points distributed uniformly over a compact subset of the feature space. This approach makes it possible to treat each class separately and to implement the failure mechanism outside the data support, which improves model reliability. It is proposed to regard the perceptron output as a consistent analogue of the histogram class interval induced by the linearity cells of the perceptron. It is proved that, under natural assumptions of regularity and controlled growth of the architecture, the output function of a multilayer perceptron is consistent and equivalent to a histogram estimator. Theoretical consistency is rigorously proved for the case of a fixed first layer, while numerical experiments confirm that the results apply to models in which all layers are trained. Thus the histogram interpretation provides statistical verification of the consistency of the perceptron approximation and adds credibility to classification solutions within the unary model.
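
The following Python sketch illustrates the kind of object the abstract refers to; it is not the authors' implementation. It builds a unary sample (target points labelled 1, uniform background points labelled 0), partitions the feature space by the activation patterns of a fixed, randomly drawn ReLU first layer, and stores the mean label in each cell as a histogram estimate. The Gaussian shape of the target class, the random choice of the first layer, the function names and the `default` value returned for cells that contain no training points are all assumptions made for illustration.

```python
import numpy as np


def make_unary_sample(n_target, n_background, low, high, rng):
    """Target-class points (label 1) plus uniform background points (label 0)."""
    low, high = np.asarray(low, float), np.asarray(high, float)
    # Assumption: the target class is a Gaussian blob at the centre of the box.
    target = rng.normal(loc=(low + high) / 2.0, scale=0.1, size=(n_target, low.size))
    background = rng.uniform(low, high, size=(n_background, low.size))
    x = np.vstack([target, background])
    y = np.concatenate([np.ones(n_target), np.zeros(n_background)])
    return x, y


def cell_ids(x, weights, biases):
    """One hashable key per point: the on/off pattern of the first-layer units."""
    pattern = (x @ weights + biases) > 0
    return [row.tobytes() for row in pattern]


def fit_cell_histogram(x, y, weights, biases):
    """Mean label in every linearity cell that contains training points."""
    sums, counts = {}, {}
    for key, label in zip(cell_ids(x, weights, biases), y):
        sums[key] = sums.get(key, 0.0) + label
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


def predict(x, table, weights, biases, default=0.0):
    """Look up the cell estimate; cells never seen in training get `default`."""
    return np.array([table.get(key, default) for key in cell_ids(x, weights, biases)])


rng = np.random.default_rng(0)
x, y = make_unary_sample(200, 200, [0.0, 0.0], [1.0, 1.0], rng)
weights = rng.normal(size=(2, 6))   # hypothetical fixed first layer: 6 hyperplanes
biases = rng.normal(size=6)
table = fit_cell_histogram(x, y, weights, biases)
scores = predict(x, table, weights, biases)
print(len(table), "non-empty cells; mean score", scores.mean())
```

The table grows with the number of occupied cells, whereas a perceptron trained on the same sample has a fixed number of parameters, which is the trade-off the abstract contrasts.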

About the authors

Nikita Eliseev

Ivannikov Institute for System Programming of the Russian Academy of Sciences

Email: neliseev@ispras.ru

Andrey Perminov

Ivannikov Institute for System Programming of the Russian Academy of Sciences

Email: perminov@ispras.ru
ORCID iD: 0000-0001-8047-0114

Denis Turdakov

Ivannikov Institute for System Programming of the Russian Academy of Sciences; Research Center of the Trusted Artificial Intelligence ISP RAS

Email: turdakov@ispras.ru
ORCID iD: 0000-0001-8745-0984

© Eliseev N.A., Perminov A.I., Turdakov D.Y., 2025
