# How to estimate K for K-means clustering

wisekily on 15 May 2016
Commented: Bashar Saad on 12 Jul 2019
I'm working on unsupervised classification (clustering), and I want to estimate K (the number of clusters) before starting the k-means algorithm.

### Accepted Answer

Walter Roberson on 16 May 2016
You will probably not find any code already implemented for this purpose.
The theoretical answer for the "best" number of clusters to use is "one cluster for every unique point", as that will always have the best possible fit.
If you do not wish to use one cluster for every unique point, you need some kind of penalty term that favors fewer clusters. I read through the theory paper on that a few years ago, and it was clear to me that the authors were setting the weights arbitrarily (though usefully for the kinds of clustering they were doing), and that there was no way to calculate what the weights should be without some knowledge of the range of cluster counts that would be appropriate for the physical system being examined. The theoretical algorithms were not suitable for "unsupervised learning", only for "supervised learning". The work we were doing at the time required unsupervised learning, so we had no way to determine what the proper number of clusters should be.
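
The point about fit always improving with more clusters can be seen in a small experiment. This is only a minimal sketch on synthetic data (the data, the seed, and the variable names `X`, `kmax`, and `wss` are assumptions for illustration, not part of the original answer). It records the total within-cluster sum of squared distances for increasing k, which typically keeps shrinking and so cannot pick k by itself without a penalty term.

```matlab
% Sketch: total within-cluster sum of squares for k = 1..kmax.
rng(0);                                  % fixed seed so the run is repeatable (assumption)
X = [randn(50,2); randn(50,2) + 5];      % synthetic data: two well-separated blobs
kmax = 10;                               % largest k to try (illustrative choice)
wss = zeros(kmax, 1);
for k = 1:kmax
    % sumd holds the within-cluster sums of point-to-centroid distances
    % (squared Euclidean by default)
    [~, ~, sumd] = kmeans(X, k, 'Replicates', 5);
    wss(k) = sum(sumd);
end
disp([(1:kmax)' wss]);                   % columns: k, total within-cluster sum
```

In the limit of one cluster per unique point, every point coincides with its own centroid and the total is zero, which is the "best possible fit" described above.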

### More Answers (3)

the cyclist on 15 May 2016
This is not really a MATLAB question, but rather a general data science question.
##### 4 Comments
Image Analyst on 15 May 2016
There are MATLAB functions for estimating the best k. I don't remember what they were - I'd have to look them up in the Machine Learning course notes.
wisekily on 15 May 2016


Image Analyst on 15 May 2016
The web page on kmeans explains how you can use silhouette() to determine the best number of clusters, k:
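
As a rough illustration of the silhouette approach, the sketch below clusters synthetic data for several values of k and keeps the k with the highest mean silhouette value. The data, the seed, and the names `X`, `kList`, `meanSil`, and `bestK` are assumptions made for this example; it is not the code from the documentation page.

```matlab
% Sketch: choose k by the largest mean silhouette value.
rng(1);                                          % fixed seed (assumption)
X = [randn(60,2);                                % blob around (0, 0)
     randn(60,2) + 6;                            % blob around (6, 6)
     [randn(60,1) + 6, randn(60,1) - 6]];        % blob around (6, -6)
kList = 2:6;                                     % candidate cluster counts (illustrative)
meanSil = zeros(size(kList));
for i = 1:numel(kList)
    idx = kmeans(X, kList(i), 'Replicates', 5);  % cluster index for every row of X
    s = silhouette(X, idx);                      % one silhouette value per point, in [-1, 1]
    meanSil(i) = mean(s);
end
[~, best] = max(meanSil);
bestK = kList(best);
fprintf('k with the highest mean silhouette: %d\n', bestK);
```

A mean silhouette close to 1 suggests compact, well-separated clusters. The result still depends on the candidate range in `kList`, so this is a heuristic rather than a guarantee.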
##### 3 Comments
Walter Roberson on 16 May 2016
the cyclist on 16 May 2016
Which is also the same link that I pointed you to earlier. So, uh, now you have 3 of the top 10 contributors to this forum telling you consistently the same thing.


kira on 2 May 2019
Old question, but I just found a way myself by looking at the MATLAB documentation:
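klist = 2:n;  % candidate numbers of clusters to try; n must be defined beforehand
myfunc = @(X,K) kmeans(X, K);  % clustering function: returns cluster indices for K clusters
% Score every K in klist with the Calinski-Harabasz criterion (net.IW{1} is the data matrix here)
eva = evalclusters(net.IW{1}, myfunc, 'CalinskiHarabasz', 'KList', klist)
classes = kmeans(net.IW{1}, eva.OptimalK);  % final clustering with the best-scoring K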
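
For readers without the `net` object, here is a self-contained variant of the same idea. It is a sketch under stated assumptions: the synthetic matrix `X` stands in for `net.IW{1}`, the candidate range `kList` is arbitrary, and the built-in `'kmeans'` option of `evalclusters` replaces the function handle used above.

```matlab
% Sketch: pick k with evalclusters and the Calinski-Harabasz criterion.
rng(2);                                               % fixed seed (assumption)
X = [randn(50,2); randn(50,2) + 5; randn(50,2) - 5];  % synthetic data, one row per observation
kList = 2:6;                                          % candidate cluster counts (illustrative)
eva = evalclusters(X, 'kmeans', 'CalinskiHarabasz', 'KList', kList);
disp(eva.CriterionValues);                            % one criterion value per candidate k
labels = kmeans(X, eva.OptimalK);                     % final clustering with the suggested k
fprintf('Suggested number of clusters: %d\n', eva.OptimalK);
```

`evalclusters` only compares the values listed in `kList`, so the suggested k is the best among the candidates you supplied, not an absolute answer.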
##### 1 Comment
Bashar Saad on 12 Jul 2019
Could you help me, please? The code is not clear.

