Towards semantic-rich word embeddings

Grzegorz Beringer; Mateusz Jabłoński; Piotr Januszewski; Andrzej Sobecki; Julian Szymański

Towards semantic-rich word embeddings

Grzegorz Beringer, Mateusz Jabłoński, Piotr Januszewski, Andrzej Sobecki, Julian Szymański

DOI: http://dx.doi.org/10.15439/2019F120

Citation: Proceedings of the 2019 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 18, pages 273–276 (2019)

Full text

Abstract. In recent years, word embeddings have been shown to improve the performance in NLP tasks such as syntactic parsing or sentiment analysis. While useful, they are problematic in representing ambiguous words with multiple meanings, since they keep a single representation for each word in the vocabulary. Constructing separate embeddings for meanings of ambiguous words could be useful for solving the Word Sense Disambiguation (WSD) task. In this work, we present how a word embeddings average- based method can be used to produce semantic-rich meaning embeddings, and how they can be improved with distance optimization techniques. We also open-source a WSD dataset that was created for the purpose of evaluating methods presented in this research.

References

R. Socher, J. Bauer, C. D. Manning, and N. Andrew Y., “Parsing with compositional vector grammars,” in Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 2013, pp. 455–465. [Online]. Available: http://aclweb.org/anthology/P13-1045
R. Socher, A. Perelygin, J. Wu, J. Chuang, C. D. Manning, A. Ng, and C. Potts, “Recursive deep models for semantic compositionality over a sentiment treebank,” in Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2013, pp. 1631–1642. [Online]. Available: http://aclweb.org/anthology/D13-1170
T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” CoRR, vol. abs/1301.3781, 2013. [Online]. Available: http://dblp.uni-trier.de/db/journals/corr/corr1301.html#abs-1301-3781
J. Pennington, R. Socher, and C. D. Manning, “Glove: Global vectors for word representation,” in In EMNLP, 2014.
S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman, “Indexing by latent semantic analysis,” Journal of the American Society for Information Science, vol. 41, no. 6, pp. 391–407, 1990.
P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, “Enriching word vectors with subword information,” arXiv preprint https://arxiv.org/abs/1607.04606, 2016.
Y. Bengio, R. Ducharme, P. Vincent, and C. Janvin, “A neural probabilistic language model,” J. Mach. Learn. Res., vol. 3, pp. 1137–1155, Mar. 2003. [Online]. Available: http://dl.acm.org/citation.cfm?id=944919.944966
M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, “Deep contextualized word representations,” feb 2018. [Online]. Available: http://arxiv.org/abs/1802.05365
J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” oct 2018. [Online]. Available: http://arxiv.org/abs/1810.04805
I. Iacobacci, M. T. Pilehvar, and R. Navigli, “Embeddings for word sense disambiguation: An evaluation study,” in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 2016, pp. 897–907. [Online]. Available: http://aclweb.org/anthology/P16-1085
M. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, “Deep contextualized word representations,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Association for Computational Linguistics, 2018, pp. 2227–2237. [Online]. Available: http://aclweb.org/anthology/N18-1202
A. Raganato, J. Camacho-Collados, and R. Navigli, “Word sense disambiguation: A unified evaluation framework and empirical comparison,” in Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers. Association for Computational Linguistics, 2017, pp. 99–110. [Online]. Available: http://aclweb.org/anthology/E17-1010
A. Raganato, C. Delli Bovi, and R. Navigli, “Neural sequence learning models for word sense disambiguation,” in Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2017, pp. 1156–1167. [Online]. Available: http://aclweb.org/anthology/D17-1120
O. Melamud, J. Goldberger, and I. Dagan, “context2vec: Learning generic context embedding with bidirectional lstm,” in Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning. Association for Computational Linguistics, 2016, pp. 51–61. [Online]. Available: http://aclweb.org/anthology/K16-1006