MATLAB Answers


sonnetsCounts.mat file

Asked by Peter Mayhew on 23 Dec 2018
Latest activity: commented on by Walter Roberson on 26 Dec 2018
Does anyone know how the sonnetsCounts.mat file was created on the following MATLAB page:
Predict Top LDA Topics of Word Count Matrix
Load the example data. sonnetsCounts.mat contains a matrix of word counts and a corresponding vocabulary of preprocessed versions of Shakespeare's sonnets.
load sonnetsCounts.mat
size(counts)
ans = 1×2
154 3092
When I open the sonnetsCounts.mat file, it has the following data
val =
(1,1) 1
(106,1) 1
(131,1) 2
(154,1) 1
(1,2) 1
(143,2) 1
I presume the second column is the frequency of words, but I'm not sure whether the pair of numbers in the first column represents two words.


1 Answer

Answer by Walter Roberson on 24 Dec 2018
Edited by Walter Roberson on 24 Dec 2018
 Accepted Answer

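The counts variable is a sparse matrix, so only its nonzero entries are displayed, each as (row,column) value. Rows are sonnets and columns are unique words, so
(143,2) 1
means that sonnet #143, unique word #2, had a count of 1.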
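To see how this display maps to indices, here is a minimal sketch using a small made-up matrix. The name toyCounts and its values are illustrative assumptions, not taken from sonnetsCounts.mat.

```matlab
% Toy 3-by-2 count matrix (made-up values, NOT the sonnet data):
% rows are documents, columns are vocabulary words.
rows = [1 3 2];
cols = [1 1 2];
vals = [1 2 1];
toyCounts = sparse(rows, cols, vals, 3, 2);

% Displaying a sparse matrix lists only the nonzero entries,
% each as (row,column) value.
disp(toyCounts)

% Look up a single entry: document 3, word 1.
c = full(toyCounts(3, 1));

% Recover all (document, word, count) triplets at once.
[docIdx, wordIdx, n] = find(toyCounts);
```

The same indexing should work on the loaded data, for example counts(143,2), and full(counts) converts the whole matrix to an ordinary dense one if that is easier to inspect.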


Walter Roberson on 26 Dec 2018
No. It is the Counts property of the bag directly, not the result of encoding an additional document against the bag.
Peter Mayhew on 26 Dec 2018
OK, so if I understand correctly, I would run the following command:
bag = bagOfWords(documents);
Then check the counts property of the variable bag.
Walter Roberson on 26 Dec 2018
Counts with a capital C, but yes.
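As a rough sketch of that workflow, assuming Text Analytics Toolbox is installed: the two strings below are made-up stand-ins for the sonnet texts, and the thread does not say exactly which preprocessing steps were applied before sonnetsCounts.mat was saved, so this shows only the general shape.

```matlab
% Requires Text Analytics Toolbox.
% Two made-up strings stand in for the preprocessed sonnet texts.
str = ["the quick brown fox"; "the lazy dog and the fox"];
documents = tokenizedDocument(str);

% Build the bag-of-words model from the tokenized documents.
bag = bagOfWords(documents);

% Counts is a sparse numDocuments-by-numWords matrix;
% Vocabulary is the matching list of unique words.
counts = bag.Counts;
vocabulary = bag.Vocabulary;

size(counts)

% Count of the word "the" in document 2.
nThe = full(counts(2, vocabulary == "the"));
```

Column j of counts corresponds to vocabulary(j), which is how a column index such as 2 in (143,2) can be mapped back to a word.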
