Term Frequency–Inverse Document Frequency (tf-idf) matrix
[1] Barrios, Federico, Federico López, Luis Argerich, and Rosa Wachenchauzer. "Variations of the Similarity Function of TextRank for Automated Summarization." arXiv preprint arXiv:1602.03606 (2016).
bagOfNgrams | bagOfWords | encode | tokenizedDocument | topkngrams | topkwords