TY - JOUR
TI - Towards Learning Word Representation
AU - Wiercioch, Magdalena
T2 - Schedae Informaticae
AB - Continuous vector representations, as distributed representations of words, have gained a lot of attention in the Natural Language Processing (NLP) field. Although they are considered valuable methods for modeling both semantic and syntactic features, they can still be improved. For instance, an open issue is to develop strategies for introducing knowledge about the morphology of words. This is a core point in the case of dense languages, where many rare words appear, and of texts containing numerous metaphors or similes. In this paper, we extend a recent approach to representing word information. The underlying idea of our technique is to represent a word as a bag of syllable-based and letter-based n-grams. More specifically, we provide a vector representation for each extracted syllable-based and letter-based n-gram and concatenate them. Moreover, in contrast to the previous method, we accept n-grams of varied length n. Various experiments, such as word similarity ranking and sentiment analysis, show that our method is competitive with other state-of-the-art techniques and takes a step toward constructing more informative word representations.
VL - 2016
IS - Volume 25
PY - 2017
SN - 1732-3916
C1 - 2083-8476
SP - 103
EP - 115
DO - 10.4467/20838476SI.16.008.6189
UR - https://ejournals.eu/en/journal/schedae-informaticae/article/towards-learning-word-representation
KW - representation learning
KW - n-gram model
KW - NLP
ER -
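The abstract only sketches the method, so the following is a minimal, illustrative Python sketch of the kind of subword composition it describes: extracting letter- and syllable-based n-grams of varied length, mapping each n-gram to a vector, and combining the two views by concatenation. Everything concrete here is an assumption rather than the paper's procedure: the n-gram length range, the hashing trick used to index a shared embedding table, the crude vowel-group syllable split, and the averaging step are all stand-ins.

```python
import re
import zlib
import numpy as np

def letter_ngrams(word, n_min=3, n_max=6):
    """All letter n-grams of varied length n, with boundary markers.
    The length range 3..6 is an assumed choice, not the paper's."""
    padded = f"<{word}>"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]

def naive_syllables(word):
    """Crude vowel-group syllable split (a stand-in; the abstract does
    not specify the actual syllabification procedure)."""
    return re.findall(r"[^aeiouy]*[aeiouy]+(?:[^aeiouy]*$)?", word.lower()) or [word]

DIM, BUCKETS = 50, 2**16
rng = np.random.default_rng(0)
EMB = rng.standard_normal((BUCKETS, DIM))  # stand-in for learned n-gram embeddings

def bag_vector(grams):
    """Hash each n-gram to a row of the embedding table and average
    the rows, yielding one vector per bag of n-grams."""
    rows = [zlib.crc32(g.encode()) % BUCKETS for g in grams]
    return EMB[rows].mean(axis=0)

def word_vector(word):
    """Concatenate the letter-n-gram and syllable-n-gram summaries,
    mirroring the concatenation step described in the abstract."""
    return np.concatenate([bag_vector(letter_ngrams(word)),
                           bag_vector(naive_syllables(word))])

print(word_vector("morphology").shape)  # (100,)
```

In a trained model the random table `EMB` would be replaced by embeddings learned from data; the hashing trick is merely one common way to keep the n-gram vocabulary bounded.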