Aspects of creating a corporate question-and-answer system using generative pre-trained language models
- Authors: Golikov A.A.1, Akimov D.A.1, Romanovskii M.S.1, Trashchenkov S.V.1
-
Affiliations:
- Issue: No 12 (2023)
- Pages: 190-205
- Section: Articles
- URL: https://ogarev-online.ru/2409-8698/article/view/380044
- DOI: https://doi.org/10.25136/2409-8698.2023.12.69353
- EDN: https://elibrary.ru/FSTHRW
- ID: 380044
Cite item
Full Text
Abstract
About the authors
Aleksei Aleksandrovich Golikov
Email: ag@mastercr.ru
Dmitrii Andreevich Akimov
Email: akimovdmitry1@mail.ru
ORCID iD: 0009-0004-2800-4430
Maksim Sergeevich Romanovskii
Email: maksim.s.romanovskii@gmail.com
Sergei Viktorovich Trashchenkov
Email: trashchenkov@gmail.com
ORCID iD: 0000-0001-8786-8336
References
- Simmons R. F., Klein S., McConlogue K. Indexing and dependency logic for answering English questions // American Documentation. – 1964. – Т. 15. – №. 3. – С. 196-204.
- Luo M. et al. Choose your qa model wisely: A systematic study of generative and extractive readers for question answering // arXiv preprint arXiv:2203.07522. – 2022.
- Zhou C. et al. A comprehensive survey on pretrained foundation models: A history from bert to chatgpt // arXiv preprint arXiv:2302.09419. – 2023.
- Lewis P. et al. Retrieval-augmented generation for knowledge-intensive nlp tasks //Advances in Neural Information Processing Systems. – 2020. – Т. 33. – С. 9459-9474.
- Маслюхин С. М. Диалоговая система на основе устных разговоров с доступом к неструктурированной базе знаний // Научно-технический вестник информационных технологий, механики и оптики. – 2023. – Т. 23. – №. 1. – С. 88-95.
- Евсеев Д. А., Бурцев М. С. Использование графовых и текстовых баз знаний в диалоговом ассистенте DREAM // Труды Московского физико-технического института. – 2022. – Т. 14. – №. 3 (55). – С. 21-33.
- Su D. Generative Long-form Question Answering: Relevance, Faithfulness and Succinctness //arXiv preprint arXiv:2211.08386. – 2022.
- Kim M. Y. et al. Legal information retrieval and entailment based on bm25, transformer and semantic thesaurus methods // The Review of Socionetwork Strategies. – 2022. – Т. 16. – №. 1. – С. 157-174.
- Ke W. Alternatives to Classic BM25-IDF based on a New Information Theoretical Framework //2022 IEEE International Conference on Big Data (Big Data). – IEEE, 2022. – С. 36-44.
- Rodriguez P. L., Spirling A. Word embeddings: What works, what doesn’t, and how to tell the difference for applied research // The Journal of Politics. – 2022. – Т. 84. – №. 1. – С. 101-115.
- Жеребцова Ю. А., Чижик А. В. Сравнение моделей векторного представления текстов в задаче создания чат-бота // Вестник Новосибирского государственного университета. Серия: Лингвистика и межкультурная коммуникация. – 2020. – Т. 18. – №. 3. – С. 16-34.
- Digutsch J., Kosinski M. Overlap in meaning is a stronger predictor of semantic activation in GPT-3 than in humans //Scientific Reports. – 2023. – Т. 13. – №. 1. – С. 5035.
- Kamnis S. Generative pre-trained transformers (GPT) for surface engineering // Surface and Coatings Technology. – 2023. – С. 129680.
- Khadija M. A., Aziz A., Nurharjadmo W. Automating Information Retrieval from Faculty Guidelines: Designing a PDF-Driven Chatbot powered by OpenAI ChatGPT // 2023 International Conference on Computer, Control, Informatics and its Applications (IC3INA). – IEEE, 2023. – С. 394-399.
- Johnson J., Douze M., Jégou H. Billion-scale similarity search with gpus // IEEE Transactions on Big Data. – 2019. – Т. 7. – №. 3. – С. 535-547.
- Rajpurkar P. et al. Squad: 100,000+ questions for machine comprehension of text // arXiv preprint arXiv:1606.05250. – 2016.
- Bai Y., Wang D. Z. More than reading comprehension: A survey on datasets and metrics of textual question answering // arXiv preprint arXiv:2109.12264. – 2021.
Supplementary files

