NSU Vestnik. Series: Linguistics and Intercultural Communication

High-Level Semantic Interpretation of the Russian Static Models Structure

https://doi.org/10.25205/1818-7935-2023-21-1-67-82

Abstract

Since its introduction, the Word2vec vector space model has become a universal tool in both research and practical applications. Over time, however, it has become clear that new methods are needed for interpreting the placement of words in vector spaces: the existing ones rely on analogies or on clustering of the space. In recent years, an approach based on probing (analyzing how small changes to a model affect its output) has been actively developed. In this paper, we propose a new method for interpreting the arrangement of words in a vector space, applicable to the high-level interpretation of the space as a whole. The method identifies the main directions that select large groups of words (about a third of the model's vocabulary) and oppose them along certain semantic features, and it allows us to build a shallow hierarchy of such features. We conducted experiments on three models trained on different corpora: the Russian National Corpus, Araneum Russicum, and a collection of scientific articles from several subject domains. Only nouns from the models' vocabularies were used. The article gives an expert interpretation of the resulting division down to the third level of the hierarchy. The sets of selected features and their hierarchies differ from model to model, yet they have much in common. We found that the identified semantic features depend on the texts that make up the training corpus, on their subject domain, and on their style. The resulting division of words does not always agree with the common-sense distinctions used in ontology development. For example, one of the features that does coincide is the abstract or material nature of the object; however, at the upper levels of the models, words are divided into everyday versus specialized lexis, archaic lexis, proper names, and common nouns. The article provides examples of words included in the derived groups.
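
The abstract does not spell out how the "main directions" are computed, so the following is only a minimal illustrative sketch, not the authors' procedure. It assumes that a main direction of a group of words can be approximated by the first principal component of their vectors, and that each direction opposes the two tails of the projection (roughly a third of the words at each pole). The model file name and the part-of-speech tag format are hypothetical.

import numpy as np
from gensim.models import KeyedVectors
from sklearn.decomposition import PCA

# Hypothetical Word2vec model file with part-of-speech-tagged tokens ("..._NOUN").
vectors = KeyedVectors.load_word2vec_format("ruscorpora_upos.bin", binary=True)
nouns = [w for w in vectors.index_to_key if w.endswith("_NOUN")]

def split_by_main_direction(words, depth=0, max_depth=3, share=1/3):
    # Stop at the target depth or when a group is too small to split further.
    if depth == max_depth or len(words) < 100:
        return words
    matrix = np.stack([vectors[w] for w in words])
    # First principal component as a stand-in for the group's "main direction".
    direction = PCA(n_components=1).fit(matrix).components_[0]
    order = np.argsort(matrix @ direction)
    k = int(len(words) * share)  # roughly a third of the words at each pole
    low = [words[i] for i in order[:k]]
    high = [words[i] for i in order[-k:]]
    return {"negative pole": split_by_main_direction(low, depth + 1, max_depth, share),
            "positive pole": split_by_main_direction(high, depth + 1, max_depth, share)}

hierarchy = split_by_main_direction(nouns)

Recursing over the two poles in this way yields a shallow, three-level hierarchy of opposed word groups of the kind described above.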

About the Authors

O. A. Serikov
Moscow Institute of Physics and Technology; Artificial Intelligence Research Institute; Institute of Linguistics RAS; HSE University
Russian Federation

Oleg A. Serikov, researcher at HSE University, MIPT, AIRI, and the Laboratory for Study and Preservation of Minority Languages of the Institute of Linguistics RAS

Moscow



V. A. Geneeva
HSE University
Russian Federation

Veronika A. Geneeva, master's student

Moscow



A. A. Aksenova
JSC Sberbank
Russian Federation

Anna A. Aksenova, data analyst

Moscow



E. S. Klyshinskiy
HSE University
Russian Federation

Eduard S. Klyshinskiy, Assoc. Prof., PhD in CS, researcher 

Moscow




For citations:


Serikov O.A., Geneeva V.A., Aksenova A.A., Klyshinskiy E.S. High-Level Semantic Interpretation of the Russian Static Models Structure. NSU Vestnik. Series: Linguistics and Intercultural Communication. 2023;21(1):67-82. (In Russ.) https://doi.org/10.25205/1818-7935-2023-21-1-67-82



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1818-7935 (Print)