Modeling and Analysis of Information Systems

Methods of implicit aspect detection in Russian publicism sentences

https://doi.org/10.18255/1818-1015-2024-3-226-239

Abstract

The paper compares the performance of several methods for automatic implicit aspect detection in Russian-language publicistic sentences. Implicit aspect detection is an auxiliary task in aspect-based sentiment analysis. The experiments were conducted on a corpus of sentences extracted from political campaign materials. The best results, with an F1-measure of up to 0.84, were obtained using Navec embeddings and classifiers based on the support vector machine method. Fairly high results, with an F1-measure of up to 0.77, were obtained using the bag-of-words model and a naive Bayes classifier. Other methods showed lower performance. The experiments also revealed that detection quality can differ significantly between aspects. It is highest for aspects associated with characteristic marker words, for example, “health care” and “holding elections”. More general aspects, such as “quality of governance”, are detected least reliably.
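
To make the bag-of-words and naive Bayes combination mentioned in the abstract more concrete, the following is a minimal sketch written with the Python standard library only. It is not the authors' implementation: the class name NaiveBayesBoW, the whitespace tokenizer, the add-one smoothing, and the four English training sentences are all assumptions invented for illustration, and the toy data does not come from the paper's corpus. The sketch also shows how a per-aspect F1-measure can be computed, which is the metric the abstract reports.

```python
import math
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return sentence.lower().split()


class NaiveBayesBoW:
    """Hypothetical multinomial naive Bayes over bag-of-words counts."""

    def fit(self, sentences, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(tokenize(sentence))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, sentence):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # log prior of the aspect
            # Add-one (Laplace) smoothing denominator for this aspect.
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(sentence):
                if word in self.vocab:  # skip words unseen in training
                    count = self.word_counts[label][word]
                    score += math.log((count + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def f1_score(gold, predicted, positive):
    """F1-measure for one aspect, treated as the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy examples; NOT taken from the paper's corpus.
TRAIN = [
    ("we will build new hospitals and hire more doctors", "health care"),
    ("free medicine for every clinic patient", "health care"),
    ("votes must be counted openly at every polling station", "holding elections"),
    ("observers will monitor the ballot count", "holding elections"),
]

model = NaiveBayesBoW().fit([s for s, _ in TRAIN], [a for _, a in TRAIN])
print(model.predict("more doctors for rural hospitals"))  # health care
```

In this toy setting the sentence is assigned to "health care" because words such as "doctors" and "hospitals" occur only in that aspect's training sentences. This mirrors the abstract's observation that aspects with characteristic marker words are the easiest to detect, while a general aspect with no such markers would give a count-based classifier little to work with.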

About the Authors

Anatoliy Y. Poletaev
P.G. Demidov Yaroslavl State University
Russian Federation


Ilya V. Paramonov
P.G. Demidov Yaroslavl State University
Russian Federation


Egor M. Kolupaev
P.G. Demidov Yaroslavl State University
Russian Federation




For citations:


Poletaev A.Y., Paramonov I.V., Kolupaev E.M. Methods of implicit aspect detection in Russian publicism sentences. Modeling and Analysis of Information Systems. 2024;31(3):226-239. (In Russ.) https://doi.org/10.18255/1818-1015-2024-3-226-239



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1818-1015 (Print)
ISSN 2313-5417 (Online)