How to manually set K-means centroids when classifying an image

Hello World (wasn't that what the books told you to print way back when you started doing HTML?...)
I am exploring the kmeans function in MATLAB to classify an RGB image into three classes. I would like to force kmeans to use specific centroid locations. As I understand from the documentation, I should use the 'start' option, but I cannot figure out how to set it correctly. In the images, I want to separate blue sky from water and land. Let's say that I find the sky to have an average RGB value of [120,130,190], water [110,150,150] and land [120,140,120]. Could any of you give an example of how to force kmeans to use these centroids? Thank you in advance for any input!

Accepted Answer

Shashank Prasanna on 27 Mar 2014
If your data matrix X is n-by-p and you want to cluster the data into 3 clusters, then each centroid is 1-by-p. You can stack the centroids for the 3 clusters into a single 3-by-p matrix and pass it to kmeans as the starting centroids.
C = [120,130,190;110,150,150;120,140,120];
I am assuming here that your matrix X is n-by-3.
This is explained in the kmeans documentation, under the 'start' option.
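As a minimal sketch of the idea above, the following passes the three centroids from the question to kmeans through the 'Start' option. The synthetic image, its 20-by-30 size, the noise level, and the variable names trueClass, img, labels and Cfinal are assumptions made for illustration; kmeans requires the Statistics and Machine Learning Toolbox.

```matlab
% Starting centroids from the question, one row per class: sky, water, land.
C = [120,130,190; 110,150,150; 120,140,120];

% Synthetic stand-in for a real RGB image (assumption: 20-by-30 pixels).
rng(0);                                  % make the noise reproducible
trueClass = randi(3, 20, 30);            % hypothetical class of each pixel
img = uint8(reshape(C(trueClass(:), :), 20, 30, 3) + 5*randn(20, 30, 3));

% kmeans expects one observation per row, so flatten the image to n-by-3.
X = double(reshape(img, [], 3));

% 'Start' takes a k-by-p matrix of initial centroids (here 3-by-3).
[idx, Cfinal] = kmeans(X, 3, 'Start', C);

% Put the cluster indices back into the shape of the image.
labels = reshape(idx, size(img, 1), size(img, 2));
```

Note that C only sets where the iteration begins. kmeans still updates the centroids, so Cfinal will generally differ from C.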

More Answers (2)

Tom Lane on 29 Mar 2014
If your goal is to specify the centroids in advance, and not just have kmeans start with them and adjust them as things go along, then I think you don't want to use kmeans at all. Just use pdist2, find the closest centroid for each point, and classify into the cluster defined by the closest centroid.
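A small sketch of this nearest-centroid approach, assuming the centroid values from the question and a few made-up pixel rows in X (hypothetical data, not from a real image). pdist2 is part of the Statistics and Machine Learning Toolbox.

```matlab
% Fixed centroids from the question, one row per class: sky, water, land.
C = [120,130,190; 110,150,150; 120,140,120];

% Hypothetical pixels, one RGB triple per row.
X = [118 131 188;    % near the sky centroid
     111 149 152;    % near the water centroid
     121 139 119;    % near the land centroid
     119 141 125];   % also nearest to land

D = pdist2(X, C);            % n-by-3 matrix of Euclidean distances
[~, idx] = min(D, [], 2);    % row index of the closest centroid per pixel
```

For an actual image, X could be built with double(reshape(img, [], 3)) and idx reshaped back to the image height and width.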
  2 Comments
Andreas Westergaard on 29 Mar 2014
Hi Tom. Thanks a lot for your input. I get your point. The reason I wanted to set initial centroids was to let the classification discover when one of the classes is not present in a given image, and thus have the algorithm define one class fewer.
Image Analyst on 29 Mar 2014
That is the main reason that automatic thresholds are not always robust. If you have to find something that can range anywhere from 0% to 100% of an image, then thresholds that are picked automatically, or clustering that forces you to pick a certain number of clusters, are not robust. They will fail if you don't have the proper number of pixels in the image belonging to those classes. For most or all of my color classification applications I use fixed values to determine the class. I use a training set to determine where the classes will be, and once I decide on them, they are fixed for all images. That way I can get area fractions for all color classes, whether a class is absent, covers 100% of the image, or is somewhere in between. If your image had only one real cluster and you told kmeans to find 4 clusters, it would still return 4, chopping that single cluster into 4 pieces when it should actually be only one cluster. If the image really did contain 3 other "real" colors, it would find them all accurately.
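To illustrate the fixed-class idea, here is a hedged sketch that computes area fractions when one class is missing from the image. The class centers reuse the RGB values from the question as a stand-in for values learned from a training set; the tiny 4-by-6 test image and the names classNames, best and areaFraction are assumptions. It uses only base MATLAB and relies on implicit expansion (R2016b or later).

```matlab
% Fixed class centers (values from the question; in practice they would
% come from a training set).
C = [120 130 190;    % sky
     110 150 150;    % water
     120 140 120];   % land
classNames = {'sky', 'water', 'land'};

% Hypothetical test image containing only sky and land, no water.
img = zeros(4, 6, 3, 'uint8');
img(1:2, :, :) = repmat(reshape(uint8(C(1, :)), 1, 1, 3), 2, 6);  % top: sky
img(3:4, :, :) = repmat(reshape(uint8(C(3, :)), 1, 1, 3), 2, 6);  % bottom: land

X = double(reshape(img, [], 3));       % one row per pixel
idx = zeros(size(X, 1), 1);            % class index per pixel
best = inf(size(X, 1), 1);             % smallest squared distance so far
for k = 1:size(C, 1)
    d = sum((X - C(k, :)).^2, 2);      % squared distance to center k
    closer = d < best;
    idx(closer) = k;
    best(closer) = d(closer);
end

% Area fraction per class; a class that is absent simply gets 0.
areaFraction = accumarray(idx, 1, [size(C, 1), 1]) / numel(idx);
```

Because the centers never move, the missing water class is reported as a fraction of 0 instead of being invented by splitting the sky or land pixels.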



Image Analyst on 27 Mar 2014
Please mark the Answer as accepted if that's what you were looking for. Thanks.
  3 Comments
Image Analyst on 27 Mar 2014
Why don't you just manually segment these things? kmeans is appropriate if you always have the same number of color classes but they move around in color space from image to image. If you have known classes, say you know you'll always have clouds, sky, water, sand, and grass, then it's best to define those regions in color space and segment according to them. What are you going to do if you have 5 classes like I said, and you tell it there are only 3 classes (sky, water, land)? It will fail.
Perhaps you'd like to use this approach (I haven't tried it):
Andreas Westergaard on 29 Mar 2014
Edited: Andreas Westergaard on 29 Mar 2014
Hi "Image Analyst", I tried the segmentation you suggested and it looks promising. I will pursue it a bit more. My initial idea of using kmeans was because I need to process images under different light conditions. Thank you again for your valuable input. By the way, I tried to accept your answer as well, but apparently I am only allowed to accept one answer. I gave a vote instead...
