The relationship between the Fermi–Dirac distribution and statistical distributions in languages
- Authors: Maslov V.P.
- Affiliations: National Research University Higher School of Economics
- Issue: Vol 101, No 3-4 (2017)
- Pages: 645-659
- Section: Article
- URL: https://ogarev-online.ru/0001-4346/article/view/150022
- DOI: https://doi.org/10.1134/S0001434617030221
- ID: 150022
Abstract
In this article, we study, from the mathematical point of view, the analogies between language and multi-particle systems in thermodynamics. We attempt to introduce an appropriate mathematical apparatus and the technical tools of statistical physics to the description of language. In particular, we apply the notions of number of degrees of freedom, Bose condensate, phase transition, and others to linguistic objects. On the basis of a statistical analysis of dictionaries and statistical distributions in languages, we conjecture that the transition from the semiotic communication system of the higher primates to human language can be described as a phase transition of the first kind. We show that the number of words appearing with frequency 1 in a corpus of texts is equal to the number of ones in the corresponding Fermi–Dirac distribution, while the high frequency of stop-words corresponds to the large number of particles in the Bose condensate when the number of degrees of freedom is less than two, provided there is a gap in the spectrum. The presented considerations are illustrated by examples from the Russian language. Some of the illustrative examples are untranslatable into English, and so they were replaced in translation by similar examples from the English language.
About the authors
V. Maslov
National Research University Higher School of Economics
Author for correspondence.
Email: v.p.maslov@mail.ru
Russian Federation, Moscow
