Expanding the Database of a Balanced Linguistic Corpus with Values from a Dictionary of Tonality (corpus experiment)
- Authors: Gorozhanov A.I. (1), Stepanova D.V. (2)
- Affiliations: (1) Moscow State Linguistic University; (2) Minsk State Linguistic University
- Issue: No 7(888) (2024)
- Pages: 29-35
- Section: Linguistics
- URL: https://ogarev-online.ru/2542-2197/article/view/299410
- ID: 299410
Abstract
This study develops and tests an algorithm for enriching a balanced dynamic linguistic corpus of more than 3 million tokens with connotative characteristics. The authors rely on original software developed at the Laboratory for Fundamental and Applied Issues of Virtual Education at Moscow State Linguistic University. The result is a working corpus whose individual fragments can be supplemented with data on the connotations of tokens and sentences.
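The abstract does not describe the authors' implementation, so the following is only a minimal sketch of the general idea: a token table gains a tonality column, the column is filled from a dictionary, and a sentence-level value is derived from its tokens. The table layout, the function names, the sample dictionary with its scores, and the use of a mean for sentence tonality are all assumptions made for illustration, not details taken from the article.

```python
import sqlite3

# Hypothetical tonality dictionary: lemma -> score in [-1, 1] (assumed values).
TONALITY = {"good": 0.8, "terrible": -0.9, "war": -0.6}


def build_corpus(rows):
    """Create an in-memory token table (sentence_id, position, lemma)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE tokens (sentence_id INTEGER, position INTEGER, lemma TEXT)"
    )
    db.executemany("INSERT INTO tokens VALUES (?, ?, ?)", rows)
    return db


def add_tonality(db, dictionary):
    """Add a nullable tonality column and fill it for lemmas in the dictionary."""
    db.execute("ALTER TABLE tokens ADD COLUMN tonality REAL")
    db.executemany(
        "UPDATE tokens SET tonality = ? WHERE lemma = ?",
        [(score, lemma) for lemma, score in dictionary.items()],
    )


def sentence_tonality(db):
    """Mean tonality per sentence; AVG skips tokens whose tonality is NULL."""
    cur = db.execute(
        "SELECT sentence_id, AVG(tonality) FROM tokens "
        "GROUP BY sentence_id ORDER BY sentence_id"
    )
    return dict(cur.fetchall())


# Toy lemmatized corpus of three sentences (assumed data).
rows = [
    (1, 0, "the"), (1, 1, "film"), (1, 2, "be"), (1, 3, "good"),
    (2, 0, "the"), (2, 1, "war"), (2, 2, "be"), (2, 3, "terrible"),
    (3, 0, "it"), (3, 1, "rain"),
]
db = build_corpus(rows)
add_tonality(db, TONALITY)
scores = sentence_tonality(db)
```

Tokens absent from the dictionary keep a NULL tonality, so a sentence with no dictionary hits gets no score rather than a misleading zero. A real system might weight tokens or handle negation differently.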
About the authors
Alexey Ivanovich Gorozhanov
Moscow State Linguistic University
Author for correspondence.
Email: a.gorozhanov@linguanet.ru
Doctor of Philology (Dr. habil.), Associate Prof., Professor in the Department of German Language Grammar and History, Faculty for German Language
Russian Federation
Darya Valeryevna Stepanova
Minsk State Linguistic University
Email: daryastepanova79@gmail.com
PhD (Philology), Associate Prof., Associate Professor in the Department of Theory and Practice of English Speech, Faculty for English Language