Query about SVM classifier
Hi, I have learnt that we need to give two kinds of parameters to the SVM, one for each class. I have a question. I referred to this link and will use it to explain my query.
In the above example, it says that the trainingLabels tell the classifier whether or not the digit belongs to a particular category. But I do not understand what the proportion of belonging and not-belonging samples should be. That is, if I have 5 images of the digit '1', and the trainingLabel for those is [1 1 1 1 1], should I also give 5 digits which are not the digit '1'? The trainingLabel would then be [1 1 1 1 1 0 0 0 0 0]. Is my understanding correct? If not, what percentage of the input should belong to the group, and what percentage should not belong to it?
Please clarify.
Answers (1)
Walter Roberson
on 11 Mar 2014
The labels are not percentages; they are category numbers that have no mathematical meaning. You could use (say) 39 as the trainingLabel for your digit '1', 54 for the letter 'i', 27 for the digit '2', and whatever other arbitrary values are convenient. So if the order of the samples was '1', '1', '1', '1', '1', 'i', '2', '2', 'i', 'i' then the trainingLabel would be [39 39 39 39 39 54 27 27 54 54]
Use anything consistent that is convenient. You could probably even use characters such as
trainingLabel = '11111i22ii'
as long as the vector has one position per sample and the labelling is consistent.
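As a rough sketch of the bookkeeping described above (the variable names sampleNames, classNames and classCodes are made up for illustration, and 39, 54, 27 are just the arbitrary codes from this answer), the label vector can be built from the sample order so that there is exactly one entry per sample:

```matlab
% Hypothetical sample order, matching the example in the answer above.
sampleNames = {'1','1','1','1','1','i','2','2','i','i'};

% Arbitrary but consistent category codes (no mathematical meaning).
classNames = {'1', 'i', '2'};
classCodes = [39 54 27];

% Look up which class each sample belongs to, then map it to its code.
[~, idx] = ismember(sampleNames, classNames);
trainingLabel = classCodes(idx).';   % column vector, one entry per sample

disp(trainingLabel.');
```

This only shows how labels line up with samples; it says nothing yet about how many distinct classes a particular training function accepts.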
3 Comments
Walter Roberson
on 11 Mar 2014
You are right, svmtrain() only accepts two distinct (non-error) values:
Grouping variable, which can be a categorical, numeric, or logical vector, a cell vector of strings, or a character matrix with each row representing a class label. Each element of Group specifies the group of the corresponding row of Training. Group should divide Training into two groups. Group has the same number of elements as there are rows in Training. svmtrain treats each NaN, empty string, or 'undefined' in Group as a missing value, and ignores the corresponding row of Training.
The trainingLabel should be a column vector, such as ('11111').' or ['1';'1';'1';'1';'1']
Yes, you absolutely need to train with the different classes present in the input.
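A minimal sketch of the two-group setup described here, assuming toy 2-D features in place of real image features (trainingData, sampleNames and the test points are invented for illustration). The svmtrain/svmclassify calls are guarded because those functions are only present in older Statistics Toolbox releases; in newer releases fitcsvm is the replacement.

```matlab
% Hypothetical sample order: five '1' samples, then five samples that are not '1'.
sampleNames = {'1','1','1','1','1','i','2','2','i','i'};

% Exactly two groups: true for the digit '1', false for everything else.
trainingLabel = strcmp(sampleNames, '1').';   % 10-by-1 logical column vector

% Made-up features, one row per sample (placeholders, not real image data).
trainingData = [1.0 1.1; 0.9 1.0; 1.1 0.9; 1.0 0.8; 0.8 1.0; ...
                3.0 3.1; 2.9 3.0; 3.1 2.9; 3.0 2.8; 2.8 3.0];

% Group must have as many elements as Training has rows.
assert(size(trainingData, 1) == numel(trainingLabel));

% Only call the legacy functions if they are available on this installation.
if exist('svmtrain', 'file') == 2 && exist('svmclassify', 'file') == 2
    svmStruct = svmtrain(trainingData, trainingLabel);
    predicted = svmclassify(svmStruct, [1.0 1.0; 3.0 3.0]);
    disp(predicted);
end
```

Both groups are present in the training set here, which is the point made above: a label vector containing only the '1' samples would give the classifier nothing to separate them from.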