Neural network classification of videos based on a small number of frames

Abstract

The article proposes a method for neural network classification of short videos. The classification problem is considered from the standpoint of reducing the number of operations required to categorize a video. The proposed solution uses a small number of frames (no more than 10) per video and performs classification with the lightest neural network architecture in the ResNet family of models. As part of the work, a custom training dataset was created, consisting of three classes: “animals”, “cars”, and “people”. A classification accuracy of 79% was achieved, a database of classified videos was built, and a GUI application was developed for interacting with the classifier and viewing the results.
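The abstract does not specify how the frames are chosen or how per-frame predictions are combined into a video-level label. A minimal sketch of one plausible pipeline is shown below, assuming evenly spaced frame sampling and a majority vote over per-frame class labels; both choices, and the function names, are illustrative assumptions rather than the authors' documented method.

```python
from collections import Counter

def sample_frame_indices(total_frames, max_frames=10):
    """Pick at most `max_frames` evenly spaced frame indices
    from a video of `total_frames` frames (assumed strategy)."""
    n = min(max_frames, total_frames)
    if n <= 0:
        return []
    step = total_frames / n
    return [int(i * step) for i in range(n)]

def aggregate_predictions(per_frame_labels):
    """Combine per-frame classifier outputs into one video label
    by simple majority vote (assumed aggregation rule)."""
    return Counter(per_frame_labels).most_common(1)[0][0]

# Example: a 300-frame clip yields 10 indices; each sampled frame
# would then be passed through a lightweight ResNet image classifier
# (e.g. ResNet-18 in torchvision) and the labels are voted on.
indices = sample_frame_indices(300)           # [0, 30, 60, ..., 270]
video_label = aggregate_predictions(
    ["cars", "cars", "animals", "cars"]
)                                             # "cars"
```

In practice the per-frame step would be a forward pass of a pretrained small ResNet over each sampled frame; capping the sample at 10 frames bounds the number of forward passes per video regardless of its length, which is the operation-count reduction the abstract describes.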

About the authors

Alexander Vladimirovich Smirnov

Ailamazyan Program Systems Institute of RAS

Email: asmirnov_1991@mail.ru

Dmitry Denisovich Parfenov

Admiral Makarov State University of Maritime and Inland Shipping

Email: parfecto@yandex.ru

Igor Petrovich Tishchenko

Ailamazyan Program Systems Institute of RAS

Email: igor.p.tishchenko@gmail.com



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
