What happens if I use the CLUSTERDATA function with the 'ward' method and the 'cosine' measure in Statistics Toolbox 8.1 (R2012b)?

The MATLAB documentation for the CLUSTERDATA function says that the 'ward' method is defined for Euclidean distances only. However, if I run CLUSTERDATA with the 'ward' and 'cosine' options, I get a warning but better results than with the Euclidean distance.

Accepted Answer

In 'ward' linkage, the distance between two clusters is defined as a weighted version of the Euclidean distance between the centroids of the two clusters. Our documentation page shows the formula used for 'ward' linkage (<http://www.mathworks.com/help/stats/linkage.html>).
It can be shown that, when the 'euclidean' distance is used, the 'ward' linkage method forms each new cluster by merging the two clusters whose merger leads to the smallest possible increase in the total sum of squared observation-to-centroid distances.
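
For reference, the 'ward' formula and the sum-of-squares property can be written out as follows. The symbols are chosen here for illustration: n_r and n_s are the numbers of observations in clusters r and s, and \bar{x}_r and \bar{x}_s are their centroids.

$$
d(r,s) = \sqrt{\frac{2\, n_r n_s}{n_r + n_s}}\;\lVert \bar{x}_r - \bar{x}_s \rVert_2
$$

Merging r and s increases the total within-cluster sum of squares by

$$
\Delta(r,s) = \frac{n_r n_s}{n_r + n_s}\,\lVert \bar{x}_r - \bar{x}_s \rVert_2^2 = \tfrac{1}{2}\, d(r,s)^2 ,
$$

so choosing the pair with the smallest d(r,s) is the same as choosing the pair with the smallest increase.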
When the distance is set to 'cosine' with the 'ward' linkage option, the 'cosine' distance replaces the 'Euclidean' distance in the formula shown in our documentation. However, the result then has no straightforward interpretation, because the sum-of-squares property described above is only established for Euclidean distances.
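
The following is a minimal sketch, not taken from the original question, of how the two cases can be compared. The data matrix X is random illustrative data, and the variable names and the choice of 3 clusters are assumptions. The first part checks the sum-of-squares property numerically for the Euclidean case. The second part spells out the pdist, linkage, cluster pipeline that CLUSTERDATA wraps, using the 'cosine' distance.

```matlab
% Illustrative data only (not from the question): 30 observations, 4 features.
rng(0);                                   % repeatable random numbers
X = rand(30, 4);

% Euclidean case: for each merge, (height^2)/2 is the increase in the
% within-cluster sum of squares, so the increases over all merges should
% add up to the total sum of squares about the overall mean.
Zeuc     = linkage(X, 'ward', 'euclidean');
ssMerges = sum(Zeuc(:, 3).^2) / 2;
Xc       = bsxfun(@minus, X, mean(X, 1)); % center the data
ssTotal  = sum(Xc(:).^2);
fprintf('Euclidean: sum over merges = %.6f, total SS = %.6f\n', ssMerges, ssTotal);

% Cosine case from the question, written as explicit steps.
% A warning may appear here, as described in the question.
Ycos = pdist(X, 'cosine');                % pairwise cosine distances
Zcos = linkage(Ycos, 'ward');             % 'ward' update applied to them
T    = cluster(Zcos, 'maxclust', 3);      % cut the tree into 3 clusters
```

The merge heights in Zcos(:, 3) are still produced by the 'ward' formula, but they are no longer tied to an increase in a sum of squared distances to centroids, so they are best compared only with each other.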
