Similarity of histograms: interpretation of cosine and jaccard similarities with "pdist2"
Sim
on 26 Jun 2023
Edited: Sai Teja G
on 10 Oct 2023
I would like to assess the similarity between two "bin counts" (that I previously derived through the "histcounts" function), by using the "pdist2" function:
% input
bin_counts_a = [689 430 311 135 66 67 99 23 37 19 8 4 3 4 1 3 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1];
bin_counts_b = [569 402 200 166 262 90 50 16 33 12 6 35 49 4 12 8 8 2 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 1];
% visualise the two "bin counts" vectors as bars:
bar(1:length(bin_counts_a),[bin_counts_a;bin_counts_b])
% calculation of similarities
cosine_similarity = 1 - pdist2(bin_counts_a,bin_counts_b,'cosine')
jaccard_similarity = 1 - pdist2(bin_counts_a,bin_counts_b,'jaccard')
If the cosine similarity is close to 1, which means the two vectors are similar, shouldn't the jaccard similarity be closer to 1 as well?
2 Comments
Dyuman Joshi
on 26 Jun 2023
"If the cosine similarity is close to 1, which means the two vectors are similar, shouldn't the jaccard similarity be closer to 1 as well?"
No, because the two similarities are defined differently. Cosine similarity is not the same as Jaccard similarity, so a high value of one does not imply a high value of the other.
You can check out the definitions in the More About section of the pdist2 documentation page.
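To illustrate how far the two measures can diverge, here is a minimal sketch on two short, made-up vectors (x and y are hypothetical, not the data from the question). It computes both similarities by hand, assuming the definitions given in the pdist2 documentation, so it does not need the toolbox:

```matlab
% Hypothetical vectors, chosen only for illustration
x = [4 2 0 1];
y = [8 4 0 2];   % y = 2*x, so y points in the same direction as x

% Cosine similarity: dot product divided by the product of the Euclidean norms
cos_sim = dot(x, y) / (norm(x) * norm(y));

% Jaccard similarity in the sense of pdist2's 'jaccard' option (assumed from
% the documentation): among coordinates where at least one vector is nonzero,
% the fraction whose values are exactly equal
nz = (x ~= 0) | (y ~= 0);
jac_sim = sum(x(nz) == y(nz)) / sum(nz);

fprintf('cosine similarity:  %.4f\n', cos_sim);
fprintf('jaccard similarity: %.4f\n', jac_sim);
```

Because y is just a scaled copy of x, the cosine similarity is 1 (up to rounding), while no nonzero coordinate has identical values, so the Jaccard similarity is 0.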
Accepted Answer
Sai Teja G
on 14 Aug 2023
Edited: Sai Teja G
on 10 Oct 2023
Hi Sim,
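I see that you are comparing two vectors using the ‘cosine’ and ‘jaccard’ distances.
They measure different things. In ‘pdist2()’, the ‘jaccard’ distance is the fraction of coordinates, among those where at least one of the two vectors is nonzero, whose values differ, so a bin counts as a match only if both counts are exactly equal. The ‘cosine’ distance is one minus the cosine of the angle between the two vectors, so it depends on the overall shape of the histograms and is dominated by the largest bins. Your two histograms have a similar shape but very few bins with identical counts, which is why the cosine similarity is high while the Jaccard similarity is low.
Please refer to the documentation of ‘pdist2()’ for more information on distance metrics like ‘jaccard’ and ‘cosine’.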
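As a rough check on the bin counts from the question, the sketch below recomputes both similarities by hand instead of calling pdist2. The variable names either_nonzero, equal_counts, jaccard_by_hand and cosine_by_hand are my own, and the Jaccard formula is an assumption based on the definition in the pdist2 documentation:

```matlab
% Bin counts copied from the question
bin_counts_a = [689 430 311 135 66 67 99 23 37 19 8 4 3 4 1 3 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1];
bin_counts_b = [569 402 200 166 262 90 50 16 33 12 6 35 49 4 12 8 8 2 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 1];

% Bins where at least one of the two histograms has a nonzero count
either_nonzero = (bin_counts_a ~= 0) | (bin_counts_b ~= 0);

% Of those bins, the ones where both counts are exactly equal
equal_counts = (bin_counts_a == bin_counts_b) & either_nonzero;

% Jaccard similarity: fraction of the "either nonzero" bins that match exactly
jaccard_by_hand = sum(equal_counts) / sum(either_nonzero);

% Cosine similarity: normalised dot product of the two count vectors
cosine_by_hand = dot(bin_counts_a, bin_counts_b) / ...
    (norm(bin_counts_a) * norm(bin_counts_b));

fprintf('matching bins: %d of %d\n', sum(equal_counts), sum(either_nonzero));
fprintf('jaccard similarity: %.4f\n', jaccard_by_hand);
fprintf('cosine similarity:  %.4f\n', cosine_by_hand);
```

For these vectors only 2 of the 26 bins with a nonzero count in either histogram hold identical values, so the Jaccard similarity stays below 0.1 even though the cosine similarity is above 0.9. If the aim is to compare histogram shapes, an exact-match measure like this is probably not the one you want.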
Hope it helps!