Corpus of Privacy Policies for Web Services and Internet of Things Devices for Analyzing the Awareness of Personal Data Subjects
- Authors: Kuznetsov M.D., Novikova E.S.
- Affiliations: St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS)
- Issue: Vol 24, No 1 (2025)
- Pages: 163-192
- Section: Information security
- URL: https://ogarev-online.ru/2713-3192/article/view/278226
- DOI: https://doi.org/10.15622/ia.24.1.7
- ID: 278226
About the authors
M. D. Kuznetsov
St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS)
Email: mkuznetsov7991@gmail.com
14th Line V.O., 39
E. S. Novikova
St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS)
Email: novikova@comsec.spb.ru
14th Line V.O., 39
