Words are the main phenomena that form a sentence. In terms of understanding the topics in a conversation, one can look at the important words in it. A good keyword extraction model effectively solves problems regarding:
if a text is relevant to several topics, these algorithms extract word-wise keywords but it is desired to represent keywords of topics separately. As a result of these, in ArKeywordExtractor, a hybrid model has been created which uses unsupervised learning and TF/IDF scores.
ArKeywordExtractor process steps:
Word embeddings or word vectorization is an NLP methodology used to find the similarity of words with each other, allowing words to be represented by vectors corresponding to real numbers.
FastText was used to find word vectors of texts in ArKeywordExtractor.