Word-embedding models for text analytics

Word2vec is an algorithm that maps the words in a text corpus to numeric vectors, collectively called a word embedding, that machine learning and deep learning algorithms can take as input. The vectors from a word embedding model can serve as features in classification and in a variety of other predictive tasks.
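One common way to turn word vectors into features for a classifier is to average the vectors of the words in a document. The Python sketch below illustrates that idea only; the three-dimensional embedding values and the `document_features` helper are hypothetical and are not part of any toolbox.

```python
from typing import Dict, List

# Hypothetical 3-dimensional embedding for illustration only;
# real embeddings usually have far more dimensions.
TOY_EMBEDDING: Dict[str, List[float]] = {
    "pump": [0.9, 0.1, 0.0],
    "failure": [0.1, 0.8, 0.1],
    "normal": [0.0, 0.2, 0.9],
}


def document_features(tokens: List[str], embedding: Dict[str, List[float]]) -> List[float]:
    """Average the vectors of the known tokens; unknown tokens are skipped."""
    dim = len(next(iter(embedding.values())))
    vectors = [embedding[t] for t in tokens if t in embedding]
    if not vectors:
        # No known words: fall back to a zero vector of the right length.
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# "detected" is not in the toy vocabulary, so only two vectors are averaged.
features = document_features(["pump", "failure", "detected"], TOY_EMBEDDING)
print(features)
```

Every document, whatever its length, becomes a vector of the same size, which is what most classifiers expect. Averaging discards word order, so it is a baseline rather than the only option.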

Word2vec takes a text corpus as input and returns one numerical vector for each word in the vocabulary. The vectors are learned from the contexts in which each word appears, so words used in similar contexts end up with similar vectors, and relationships between words are reflected in the geometry of the vector space. You can use the resulting vectors to transform raw text into a numeric representation that is suitable for data visualization, machine learning, and deep learning.
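To make "context" concrete: in the skip-gram variant of word2vec, training examples are pairs of a center word and a word that occurs within a small window around it. The sketch below shows only this pair-extraction step, not the training of the vectors themselves. The function name `skipgram_pairs`, the window size, and the sample sentence are assumptions made for illustration.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for every token and its neighbors within the window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# A one-sentence toy corpus, already tokenized by whitespace.
corpus = ["the pump failed after the bearing overheated".split()]
pairs = [p for sentence in corpus for p in skipgram_pairs(sentence, window=1)]
print(pairs[:4])
```

A model trained on such pairs adjusts the vectors so that a word's vector is predictive of the words around it, which is why words that share contexts drift toward each other.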

Text analytics is used to make predictions from unstructured text in applications ranging from computational finance to predictive maintenance. You can perform topic modeling, classification, and sentiment analysis in an efficient and scalable way by using pretrained or custom-trained word embeddings with machine learning or deep learning. Text Analytics Toolbox™ reads word embeddings produced by word2vec, GloVe, and fastText into a wordEmbedding object.
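Embeddings from these tools are commonly exchanged as plain text, with one word per line followed by its vector components. Text files written by word2vec and fastText usually start with a header line giving the vocabulary size and dimension, while GloVe files usually do not. The Python sketch below illustrates that layout and a cosine-similarity lookup. It is not how the wordEmbedding object is implemented, and the helper names and sample values are hypothetical.

```python
import io
import math
from typing import Dict, List, TextIO


def read_text_embedding(stream: TextIO) -> Dict[str, List[float]]:
    """Parse 'word v1 v2 ...' lines, skipping an optional '<vocab_size> <dim>' header."""
    embedding: Dict[str, List[float]] = {}
    for line_no, line in enumerate(stream):
        parts = line.rstrip().split(" ")
        # Heuristic: a first line made of exactly two integers is treated as a header.
        if line_no == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
            continue
        if len(parts) < 2:
            continue  # blank or malformed line
        embedding[parts[0]] = [float(x) for x in parts[1:]]
    return embedding


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical file contents with a word2vec-style header line.
SAMPLE = "3 2\nking 1.0 0.0\nqueen 0.9 0.1\napple 0.0 1.0\n"
emb = read_text_embedding(io.StringIO(SAMPLE))
print(cosine_similarity(emb["king"], emb["queen"]) > cosine_similarity(emb["king"], emb["apple"]))
```

With these toy values, "king" is closer to "queen" than to "apple", which is the kind of comparison that similarity search and clustering on real embeddings rely on.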

To learn more about using word2vec, see Text Analytics Toolbox and Deep Learning Toolbox™.

See also: data science, deep learning, Deep Learning Toolbox, Statistics and Machine Learning Toolbox, Predictive Maintenance Toolbox