K-means clustering implementation on physiological signals data

I need guidance on using k-means to cluster physiological signal data. I have 20 signal files; each has 14 features, but the number of rows differs from file to file (27x14, 22x14, etc.). I want to feed all of these files to k-means at once to generate the clusters. But it always produces a wrong-looking scatter plot: as the plot shows, a large amount of data sits at 0, which looks wrong to me. I zoomed in on that region in a second plot to show it. I tried reshaping my data but still cannot get the right clusters. I am attaching the code I am using, the scatter plots, and the data file. Please tell me how to do this, or whether there is other code I should use. Looking forward to hearing from you. Thanks.
cd 'D:\Research\Classification\final'
labels1=dir('*.csv');
figure
IMF_U = [];
for j =1: length(labels1)
sw = readmatrix(labels1(j).name);
% sw =reshape(sw,[],2);
IMF_U=[IMF_U;sw];
end
IMF_U =reshape(IMF_U,[],2);
[idx,C] = kmeans(IMF_U,3);
gscatter(IMF_U(:,1),IMF_U(:,2),idx,'bgm')
hold on
plot(C(:,1),C(:,2),'kx')
hold off
legend('Cluster 1','Cluster 2','Cluster 3','Centroid')
3 Comments
John D'Errico on 21 Feb 2023
You appear to have some information in your data set that is all zero. And of course, kmeans found it. What would you expect? If you want better help, then you need to provide the data. Attach it as @Image Analyst showed.
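One way to check the comment above before clustering is to look for features or measurements that are exactly zero, and for features whose spread is so small that they merely look like zero on a plot. This is only a minimal sketch: the matrix `X` is a made-up stand-in for the stacked data, not the poster's file.

```matlab
% Hypothetical toy matrix: 3 measurements (rows) x 3 features (columns).
X = [ 0.5  0  1e-4
     -0.2  0  2e-4
      0.9  0  0    ];

allZeroCols = all(X == 0, 1);              % true where a whole feature is exactly zero
allZeroRows = all(X == 0, 2);              % true where a whole measurement is exactly zero
colRange = max(X, [], 1) - min(X, [], 1);  % per-feature spread; tiny values plot as "zero"

disp(allZeroCols);
disp(colRange);
```

A feature with a tiny range is not necessarily bad data; it may simply be on a much smaller scale than the others.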
Muhammad Hammad Malik on 21 Feb 2023
Thank you for the response. I have attached the data file I am using. It also contains negative values and values near zero. Kindly have a look. Thanks.


Accepted Answer

Image Analyst on 21 Feb 2023
I made some changes but what I can't figure out is why you reshape your data from 14 columns into only 2 columns. Why are you doing that?
% Optional initialization steps
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 18;
% cd 'D:\Research\Classification\final'
fileList = dir('feat*.csv');
figure('Name', 'kmeans data');
IMF_U = [];
for k = 1 : length(fileList)
% Get data for this file only.
thisFilesData = readmatrix(fileList(k).name);
% sw =reshape(sw,[],2);
IMF_U = [IMF_U; thisFilesData];
end
% Reshape from 14 columns into 2 columns. WHY?????
IMF_U =reshape(IMF_U,[],2);
% Specify the number of classes we want to force the data into.
numberOfClasses = 3;
% Do the kmeans clustering.
[classAssignments, classCentroids] = kmeans(IMF_U, numberOfClasses);
% Plot the 2 columns.
gscatter(IMF_U(:,1),IMF_U(:,2), classAssignments, 'bgm', '.', 30)
hold on
grid on;
% Plot class centroids with a big black X.
plot(classCentroids(:,1), classCentroids(:,2),'kx', 'LineWidth', 2, 'MarkerSize', 20)
hold off
legend('In Cluster 1','In Cluster 2','In Cluster 3','Class Centroids')
fprintf('All Done!\n');
Other than that the results look reasonable.
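To see what the reshape in question does, it may help to try it on a tiny matrix. MATLAB's `reshape` fills the output in column-major order, so collapsing N-by-4 into 2 columns stacks features 1 and 2 into the first column and features 3 and 4 into the second. By the same rule, an N-by-14 matrix becomes 7N-by-2, with each row pairing feature k with feature k+7. The matrix `A` below is a made-up example, not the poster's data.

```matlab
% Hypothetical toy matrix: 3 measurements (rows) x 4 features (columns).
% The tens digit marks the feature, the units digit marks the measurement.
A = [ 1 11 21 31
      2 12 22 32
      3 13 23 33 ];

B = reshape(A, [], 2);   % column-major: walks down column 1, then column 2, ...
disp(B);
% Column 1 of B is features 1 and 2 stacked; column 2 is features 3 and 4 stacked.
% Each row of B is therefore a (feature k, feature k+2) pair, not a measurement.
```

If the goal is to cluster measurements on all of their features, one option is to skip the reshape and pass the stacked N-by-14 matrix to `kmeans` directly, then choose two columns (or a projection) only for plotting.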
5 Comments
Image Analyst on 22 Feb 2023
Features (columns) 1, 3, 4, and 12 (especially those) have values close to zero. But that doesn't matter. It can still classify the measurements (rows) into 3 classes. If you look at the class assignments:
>> classAssignments
classAssignments =
1
1
1
1
1
1
1
1
1
1
1
1
3
3
3
3
3
3
1
1
2
you'll see that every row got classified as some class/cluster, and all 3 classes are represented. It's impossible to visualize a 14-dimensional space to see the clusters, but, guaranteed, they are in clusters (because you forced it to find 3 clusters). If you just scatter 2 of those features against each other, you may or may not see clustering. Just because you have clusters when considering all 14 features does NOT mean that you will notice the clusters in 2-dimensional space where you plot only a pair of features against each other. You might or might not see clusters, or it might depend on the two or three features you choose to show in the scatter plot.
Does that make sense?
By the way, you can see the last row is the only one in class 2 because that row looks absolutely nothing like the rows above it, so it's its own class.
There is something different between rows assigned to class 1 and those assigned to class 3 but it's difficult to see exactly what is the main reason because there are 14 features and we can't plot in 14-dimensional space to visualize it.
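A rough sketch of how one might cluster on all 14 features and still get a 2-D picture is given below. Several things here are assumptions and not part of the answer above: the data are random stand-ins for `IMF_U`, the features are standardized with `zscore` so that near-zero features are not swamped by larger ones, and the plot uses the first two principal components from `pca` as one possible projection. All functions used come from the Statistics and Machine Learning Toolbox, the same toolbox as `kmeans` and `gscatter`.

```matlab
rng(0);                                   % reproducible toy data
% Hypothetical stand-in for the stacked 21 x 14 feature matrix.
X = [randn(12,14); randn(6,14) + 4; randn(3,14) - 4];

Z = zscore(X);                            % assumption: put every feature on a comparable scale
numberOfClasses = 3;
[classAssignments, classCentroids] = kmeans(Z, numberOfClasses, 'Replicates', 5);

% Project to 2-D for plotting only; the clustering itself used all 14 features.
[~, score] = pca(Z);
figure('Name', 'kmeans on all features, PCA view');
gscatter(score(:,1), score(:,2), classAssignments);
xlabel('PC 1'); ylabel('PC 2'); grid on;

% Rank features by how far apart the centroids of clusters 1 and 3 are (z-score units).
centroidGap = abs(classCentroids(1,:) - classCentroids(3,:));
[~, featureOrder] = sort(centroidGap, 'descend');
disp(featureOrder(1:3));                  % the three features that differ most
```

The cluster numbers returned by `kmeans` are arbitrary labels, so "cluster 1" and "cluster 3" here need not match the labels in the listing above. The centroid comparison is only a heuristic for spotting which features drive the split between two clusters.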
Muhammad Hammad Malik on 22 Feb 2023
Alright, thanks a lot for the detailed explanation. I got your point.
