Appending to the field of a structure array

Rick on 5 Jul 2014
Hello, I am trying to create a word index. I start off with an empty struct array with three fields: Word, Documents, and Locations. For now, ignore the latter two. I have a cell array of words:
Doc1 = {'Matlab','is','awesome'};
Note that other documents may contain the same words. I initialize my Index with this function:
function Index = InitializeIndex()
c10 = cell(1,0);
Index = struct('Word', c10, 'Documents', c10, 'Locations', c10);
I want to add the unique words into Index, so here is my function.
function Index = InsertDoc(Index, newDoc, DocNum)
% This function will be a struct array where each element corresponds to a
% unique word in a group of documents. In each element of the struct array
% the word is stored in the Word field, the document numbers that the word
% is contained is in the documents field, and the locations of the word in
% each document is in the Location field.
Index = {Index.Word};
for i = 1:numel(newDoc)
    % IndexWord is either empty or the word is not present in IndexWord
    if isempty(Index) || strcmpi(Index{i},newDoc(i))
        Index.Word{end+1} = newDoc(i);
    end
end
My problem is twofold. First, I am having difficulty with the condition that checks whether a word is unique in Index. How do I test that a word does not yet exist in Index, so that it is appended only in that case? Second, how do I actually append the word to the Word field of Index?

Answers (2)

Alfonso Nieto-Castanon on 5 Jul 2014
Edited: Alfonso Nieto-Castanon on 5 Jul 2014
Assuming that you want Index to be a single struct with fields Word/Documents/Locations (each field being a cell array), you could do something along these lines:
UniqueWordsInDoc = unique(newDoc); % unique words
in = ismember(UniqueWordsInDoc,Index.Word); % words already in Index
idx = numel(Index.Word)+(1:nnz(~in)); % new Index entries
Index.Word(idx) = UniqueWordsInDoc(~in); % adds new words
If, on the other hand, you want Index to be a struct array with fields Word/Documents/Locations (each field holding a string or vector), you could do something along these lines:
UniqueWordsInDoc = unique(newDoc); % unique words
in = ismember(UniqueWordsInDoc,{Index.Word}); % words already in Index
idx = numel(Index)+(1:nnz(~in)); % new Index entries
[Index(idx).Word] = deal(UniqueWordsInDoc{~in}); % adds new words
In the former case you initialize using:
Index = struct('Word',{{}});
while in the latter you would initialize Index using:
Index = struct('Word',{});
I hope this clarifies the indexing issues; they can be kind of tricky.
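For comparison, here is a minimal sketch that runs both layouts side by side on the same input. The sample document (with 'is' repeated) and the variable names S, A and u are made up for illustration; the statements themselves follow the snippets above.

```matlab
newDoc = {'Matlab','is','awesome','is'};   % sample document (made up; 'is' repeats)
u = unique(newDoc);                        % unique words in the document

% Case 1: one struct whose Word field is a cell array of words
S = struct('Word', {{}});                  % 1x1 struct, S.Word is an empty cell
in = ismember(u, S.Word);                  % words already stored in S.Word
idx = numel(S.Word) + (1:nnz(~in));        % positions for the new words
S.Word(idx) = u(~in);                      % store the new words in the cell array

% Case 2: struct array holding one word per element
A = struct('Word', {});                    % empty struct array
in = ismember(u, {A.Word});                % words already stored in A
idx = numel(A) + (1:nnz(~in));             % indices of the new elements
[A(idx).Word] = deal(u{~in});              % one new element per new word
```

In case 1 the words are read back with S.Word{k}; in case 2 with A(k).Word.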
EDIT1: added correction by Cedric
EDIT2: "concatenating" versions, something along these lines:
In case 1:
UniqueWordsInDoc = unique(newDoc); % unique words
in = ismember(UniqueWordsInDoc,Index.Word); % words already in Index
Index.Word = [Index.Word UniqueWordsInDoc(~in)]; % adds new words
In case 2:
UniqueWordsInDoc = unique(newDoc); % unique words
in = ismember(UniqueWordsInDoc,{Index.Word}); % words already in Index
Index = [Index cell2struct(UniqueWordsInDoc(~in),'Word')']; % adds new words
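One caveat with the case 2 concatenation: it assumes Index has only the Word field. With the three-field Index from the question's InitializeIndex, the concatenated pieces would have different fields, which MATLAB rejects. A possible variant (a sketch, not part of the original answer) builds the new elements with struct() so all three fields are present; the name newEntries is made up for illustration.

```matlab
c10 = cell(1,0);
Index = struct('Word', c10, 'Documents', c10, 'Locations', c10);  % as in InitializeIndex
newDoc = {'Matlab','is','awesome'};                               % sample document

UniqueWordsInDoc = unique(newDoc);               % unique words
in = ismember(UniqueWordsInDoc, {Index.Word});   % words already in Index
% A cell value yields one struct element per cell entry;
% the non-cell [] values are copied into every element.
newEntries = struct('Word', UniqueWordsInDoc(~in), 'Documents', [], 'Locations', []);
Index = [Index newEntries];                      % same fields, so concatenation works
```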
7 Comments
Rick on 5 Jul 2014
Here is my code right now:
function Index = InsertDoc(Index, newDoc, DocNum)
% This function will be a struct array where each element corresponds to a
% unique word in a group of documents. In each element of the struct array
% the word is stored in the Word field, the document numbers that the word
% is contained is in the documents field, and the locations of the word in
% each document is in the Location field.
for i = 1:numel(newDoc)
    % IndexWord is either empty or the word is not present in IndexWord
    if isempty(Index)|| strcmpi({Index.Word},newDoc{i})
        Index(end + 1).Word = newDoc{i};
    end
end
Here is my input
Doc1 = {'Matlab','is','awesome'};
E7 = InitializeIndex;
E7 = InsertDoc(E7,Doc1,1);
My output was not what I expected. I expected E7(2) to be 'is'.
EDU>>E7(1)
ans =
Word: 'Matlab'
Documents: []
Locations: []
EDU>> E7(2)
Index exceeds matrix dimensions.
Alfonso Nieto-Castanon on 5 Jul 2014
Edited: Alfonso Nieto-Castanon on 5 Jul 2014
Change
strcmpi({Index.Word},newDoc{i})
to
~any(strcmpi({Index.Word},newDoc{i}))
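The negation matters because strcmpi returns true where a stored word matches, so the original condition appended a word only when it was already present. Here is a minimal sketch of the corrected loop, written as a script rather than a function so that it runs on its own. The sample document, including the extra 'Is', is made up for illustration.

```matlab
c10 = cell(1,0);
Index = struct('Word', c10, 'Documents', c10, 'Locations', c10);  % as in InitializeIndex
newDoc = {'Matlab','is','awesome','Is'};   % sample input; 'Is' differs from 'is' only in case

for i = 1:numel(newDoc)
    % strcmpi gives one logical per stored word (case-insensitive);
    % append only when none of the stored words match
    if ~any(strcmpi({Index.Word}, newDoc{i}))
        Index(end+1).Word = newDoc{i};
    end
end
```

With an empty Index, {Index.Word} is an empty cell and any() of the empty result is false, so the separate isempty(Index) check is not needed in this form.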



the cyclist on 5 Jul 2014
For the first part, use the ismember() function. For the second part, if old_list is a cell array of words, you can append using
new_list = [old_list, {new_word}];
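A small sketch combining both suggestions, assuming the words are kept in a plain cell array of character vectors. The contents of old_list and new_word are hypothetical.

```matlab
old_list = {'Matlab','is'};      % hypothetical existing word list
new_word = 'awesome';            % hypothetical candidate word

% ismember treats a character vector as a single word when compared to a cell array
if ~ismember(new_word, old_list)
    old_list{end+1} = new_word;  % grows the cell array by one element
end
```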
3 Comments
the cyclist on 5 Jul 2014
Actually, I think I misunderstood what you meant. If you already had
Index(1).Word = 'cat';
then you can append with
Index(end+1).Word = 'dog';
Rick on 5 Jul 2014
Edited: Rick on 5 Jul 2014
So do you mean this? I got rid of Index = {Index.Word} because it was overwriting the struct array returned by my InitializeIndex function.
for i = 1:numel(newDoc)
    % IndexWord is either empty or the word is not present in IndexWord
    if isempty(Index)|| strcmpi({Index.Word},newDoc{i})
        Index(end + 1).Word = newDoc{i};
    end
end
I get the following problem. When I type Index(2).Word, I get 'Index exceeds matrix dimensions.'
