Convergence of a multilayer perceptron to histogram Bayesian regression

Nikita Aleksandrovich Eliseev; Andrey Igorevich Perminov; Denis Yur'evich Turdakov

doi:10.4213/rm10273

Authors: Eliseev N.A.¹, Perminov A.I.¹, Turdakov D.Y.¹,²
Affiliations:
1. Ivannikov Institute for System Programming of the Russian Academy of Sciences
2. Research Center of the Trusted Artificial Intelligence ISP RAS
Issue: Vol. 80, No. 6 (2025)
Pages: 45-72
Section: Articles
URL: https://ogarev-online.ru/0042-1316/article/view/358699
DOI: https://doi.org/10.4213/rm10273
ID: 358699

Abstract

We consider the problem of enhancing the interpretability and consistency of Bayesian classifier solutions when empirical data are approximated by a multilayer perceptron. Histogram regression preserves transparency and statistical interpretation but is limited by its memory requirements ($O(n)$) and weak scalability, while a multilayer perceptron provides a memory-efficient representation ($O(1)$) and high computational efficiency at the cost of limited interpretability. The focus is on a unary learning scheme, in which the training sample consists of examples of a single target class together with additional background points distributed uniformly over a compact subset of the feature space. This approach makes it possible to treat each class separately and to implement a failure mechanism outside the data support, which enhances the reliability of the model. It is proposed to regard the perceptron output as a consistent analogue of the histogram class interval induced by the linearity cells of the perceptron. It is proved that, under natural assumptions of regularity and controlled growth of the architecture, the output function of a multilayer perceptron is consistent and equivalent to a histogram estimator. Theoretical consistency is rigorously proved in the case of a fixed first layer, while numerical experiments confirm the applicability of the results to models in which all layers are trained. Thus the histogram interpretation provides statistical verification of the consistency of the perceptron approximation and adds credibility to classification solutions in the framework of a unary model.
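
As a rough illustration of the histogram side of this comparison, the sketch below fits a histogram estimator on cells cut out by random hyperplanes, using a unary sample (one target class plus uniform background points). It is not the authors' construction: the function names, the Gaussian toy class, the number of hyperplanes, and the choice to return `None` for a cell with no training data (a stand-in for refusing to answer outside the data support) are all assumptions made for this example. The hyperplanes play the role that the linearity cells of a fixed first layer would play for a perceptron with piecewise linear activations.

```python
import random
from collections import defaultdict


def make_hyperplanes(k, dim, rng):
    """Draw k random hyperplanes (w, b), each passing through a random point of [0, 1]^dim."""
    planes = []
    for _ in range(k):
        w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        p = [rng.random() for _ in range(dim)]  # point of the cube lying on the plane
        b = -sum(wi * pi for wi, pi in zip(w, p))
        planes.append((w, b))
    return planes


def cell_of(x, planes):
    """Sign pattern of x: the side of each hyperplane on which x lies."""
    return tuple(sum(wi * xi for wi, xi in zip(w, x)) + b > 0 for w, b in planes)


def fit_histogram(points, labels, planes):
    """Count, for every occupied cell, all points and the target-class points."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_total, n_class]
    for x, y in zip(points, labels):
        entry = counts[cell_of(x, planes)]
        entry[0] += 1
        entry[1] += y
    return dict(counts)


def predict(x, counts, planes):
    """Class fraction in the cell of x, or None if the cell holds no training data."""
    entry = counts.get(cell_of(x, planes))
    if entry is None:
        return None  # assumed refusal rule for cells without data
    total, positive = entry
    return positive / total


def clip(v):
    return min(1.0, max(0.0, v))


rng = random.Random(0)
dim, n = 2, 2000

# Unary sample: target-class points (label 1) plus uniform background (label 0).
class_pts = [[clip(rng.gauss(0.5, 0.08)) for _ in range(dim)] for _ in range(n)]
background = [[rng.random() for _ in range(dim)] for _ in range(n)]
points = class_pts + background
labels = [1] * n + [0] * n

planes = make_hyperplanes(12, dim, rng)
hist = fit_histogram(points, labels, planes)

print("estimate near the class centre:", predict([0.5, 0.5], hist, planes))
print("estimate near a corner:", predict([0.05, 0.95], hist, planes))
```

The table `hist` stores one pair of counts per occupied cell, so its size depends on the data and the partition, whereas a perceptron with a fixed architecture stores only its weights. That difference is the memory contrast the abstract draws between the two representations.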

Keywords

Multilayer perceptron, histogram regressions, piecewise linear activation functions, Bayesian classifier, consistency, asymptotic equivalence, VC-dimension, random hyperplanes, unary classification
