Does the text analytics toolbox allow users to test out-of-sample perplexity with LDA?
Stephen Bruestle
on 15 Oct 2018
Commented: Stephen Bruestle
on 30 Nov 2018
I want to split my data into two samples: one for training and one for testing. I then want to fit the LDA model on the training sample and compute the perplexity of the test sample under the fitted model. Is this possible with the Text Analytics Toolbox?
Accepted Answer
Christopher Creutzig
on 26 Nov 2018
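Yes. The second output of logp gives you the perplexity. Fit the model on a bag-of-words built from the training documents, then encode the test documents with that same bag-of-words before passing them to logp: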
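% Load the example text and split it into one string per sonnet.
txt = extractFileText('sonnets.txt');
sonnets = split(txt,[newline newline]);
sonnets = sonnets(5:2:end);            % drop the header and the sonnet titles
td = tokenizedDocument(sonnets);
% Train on the first 50 sonnets only.
bow = bagOfWords(td(1:50));
mdl = fitlda(bow,5,'Verbose',0);       % 5 topics
% Encode three held-out sonnets with the training vocabulary and score them.
[~,perpl] = logp(mdl, encode(bow,td(51:53)))
% perpl = 337.4999 (value from the original run; it may differ between runs)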
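The answer above uses a fixed split (first 50 sonnets for training, the next three for testing). For the random training/test split described in the question, a minimal sketch could look like the following. It assumes the same sonnets.txt example file; the 20% hold-out fraction, the seed, and the variable names (tdTrain, tdTest, perplTest, and so on) are illustrative choices, not something prescribed by the toolbox.

```matlab
% Sketch: random train/test split, fit LDA on the training documents only,
% then compute perplexity on the held-out documents.
rng(0);                                  % fix the seed so the split is repeatable

txt = extractFileText('sonnets.txt');
sonnets = split(txt,[newline newline]);
sonnets = sonnets(5:2:end);              % same preprocessing as in the answer
td = tokenizedDocument(sonnets);

% Hold out 20% of the documents (arbitrary fraction, chosen for illustration).
n = numel(td);
idx = randperm(n);
numTest = round(0.2*n);
tdTest  = td(idx(1:numTest));
tdTrain = td(idx(numTest+1:end));

% Build the vocabulary and fit the model from the training sample only.
bowTrain = bagOfWords(tdTrain);
mdl = fitlda(bowTrain,5,'Verbose',0);    % 5 topics, as in the answer

% Encode the test documents against the training vocabulary, then score them.
countsTest = encode(bowTrain,tdTest);
[~,perplTest]  = logp(mdl,countsTest);   % out-of-sample perplexity
[~,perplTrain] = logp(mdl,bowTrain);     % in-sample perplexity, for comparison
```

Because the test documents are encoded with the training bag-of-words, only words in the training vocabulary contribute to the test counts. Comparing perplTest with perplTrain gives a rough sense of how well the fitted topics generalize; the test value would normally be expected to be the higher of the two.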