How to split a column's elements into two vectors based on labels?

Question

0 votes

lung1.mat

I attached part of a lung dataset (32x57). Its last column holds the labels (1 or 2). I want to split each column into two vectors based on the labels:

F(i).normal: a vector holding the column's elements with label 1,

F(i).tumor: a vector holding the column's elements with label 2.

I attached my MATLAB code.

It is meant to collect each column's elements into these vectors, but it does not seem to work correctly. I would be very grateful for your opinion.

close all;
clc
load lung.mat
       F=lung; 
       [n,m]=size(F);
   
for i=1: m                    
    s1=0;  s2=0;                 
    for j=1: n                 
        if (F(j,m)==1)     
           for z=1:s1
            F(i).normal(z)=F(j,i);  
            s1=s1+1;   
           end
            
        else         
            for x=1:s2
            F(i).tumor(x)=F(j,i); 
            s2=s2+1;   
            end
        end
    end
end

Answer 1

Image Analyst on 27 Dec 2018

1 vote

You didn't attach lung.mat. But is this what you want:

% Create sample data.
data = randi(9, 32, 57); % Random integers in the range 1-9.
data(:, end) = randi(2, 32, 1); % Last column is 1 or 2 ONLY.
% Find out what rows are labeled 1 and 2
% by looking in the last column.
rowsLabeled1 = data(:, end) == 1;
rowsLabeled2 = data(:, end) == 2;
% Extract rows labeled 1 and 2 into their own matrices.
data1 = data(rowsLabeled1, :);
data2 = data(rowsLabeled2, :);
% You can get vectors from each column by extracting it into a new variable
% e.g. to get 2 vectors for column 5, do
col51 = data1(:, 5); % Get col 5 with label 1.
col52 = data2(:, 5); % Get col 5 with label 2.
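
The same logical indexing could also fill the F(i).normal and F(i).tumor fields that the question asks for. This is only a minimal sketch on random stand-in data; keeping the result in a separate struct array F, instead of reusing the name of the data matrix, is an assumption about what is wanted.

```matlab
% Stand-in for the lung matrix (assumption: last column holds labels 1 or 2).
rng(0);                              % Fixed seed so the sketch is repeatable.
data = randi(9, 32, 57);
data(:, end) = randi(2, 32, 1);

labels = data(:, end);
numFeatures = size(data, 2) - 1;     % Every column except the label column.

% Preallocate a 1-by-numFeatures struct array with empty fields.
F = struct('normal', cell(1, numFeatures), 'tumor', cell(1, numFeatures));
for i = 1 : numFeatures
    F(i).normal = data(labels == 1, i);   % Elements of column i with label 1.
    F(i).tumor  = data(labels == 2, i);   % Elements of column i with label 2.
end
```

No inner loop over rows or manual counters are needed, because the logical mask selects all matching rows of a column at once.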

14 comments (12 older comments not shown)

Image Analyst on 29 Dec 2018

kNearestNeighbor.m

You already know how to use pdist2, and you can plot all those distances, and even get a histogram of them. If you want to split into two zones, you can use graythresh(), imbinarize() or kmeans(), though like before I think that makes little to no sense. You still haven't explained why. Anyway, you should use a fixed threshold for consistency. Using an automatic threshold that varies depending on how many points are class 1 or class 2 is not good for comparing data sets. What if the distances were normally distributed? What does that mean? The numbers are uniformly distributed??? What if the distances had two clusters? What does that mean? That the measurements were in two tight clusters? It seems that by having the data for that measurement already labeled that someone has already somehow thresholded something, and it's probably the values themselves rather than the distance between them. But go ahead and do it and show us the values and the histograms, and the distance values and the distance value histogram and we can see if the distance histogram gives any additional insight.

It would be easy for you to make up data sets that range from clustered to uniformly distributed and compute the distances in each case. For example, in my K Nearest Neighbor demo, I create two classes, each with a spread, and a separation between the two classes. Though it's in 2-D for 2 variables. You could actually just make two classes in 1-D simply by using rand() and randn() and setting the mean and spread for each class.
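
As a rough illustration of that last suggestion, two 1-D classes could be generated as below. The sizes, means, and spread are made-up values, and treating "spread" as the standard deviation for randn() and as the total width for rand() is an assumption.

```matlab
rng(0);                       % Fixed seed so the sketch is repeatable.
numClass1 = 120;              % Hypothetical class sizes.
numClass2 = 80;
meanClass1 = 25;              % Hypothetical class means.
meanClass2 = 75;
spread = 10;                  % Hypothetical spread, shared by both classes.

% Normally distributed classes: spread acts as the standard deviation.
class1Normal = meanClass1 + spread * randn(numClass1, 1);
class2Normal = meanClass2 + spread * randn(numClass2, 1);

% Uniformly distributed classes: spread is the total width, centered on the mean.
class1Uniform = meanClass1 + spread * (rand(numClass1, 1) - 0.5);
class2Uniform = meanClass2 + spread * (rand(numClass2, 1) - 0.5);
```

Shrinking the gap between the means or enlarging the spread makes the classes overlap, which is one way to move from clearly clustered data toward data that is hard to separate.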

Image Analyst on 29 Dec 2018

OK, I programmed a simple Monte Carlo simulation for you with uniform, non-overlapping distributions for the two classes. It is attached. You can see the measurement values, the distance values, and the histogram of the distance values. I think you can do a lot of your experimentation and discovery of insights just by trying different distributions in a Monte Carlo fashion. For example, maybe the distribution of distances is the convolution of the two class measurement distributions. What do you think?

% Program to do a Monte Carlo simulation of measurements between two classes of patients.
clc;    % Clear the command window.
close all;  % Close all figures (except those of imtool.)
imtool close all;  % Close all imtool figures if you have the Image Processing Toolbox.
clear;  % Erase all existing variables. Or clearvars if you want.
workspace;  % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 16;
% Specify parameters.
numClass1 = 120; % Number of measurements in class 1.
numClass2 = 80; % Number of measurements in class 2.
meanClass1 = 25;
meanClass2 = 75;
spread1 = 25;
spread2 = 25;
% Generate measurements: uniform values centered on each class mean,
% with a total width equal to the spread.
class1Values = meanClass1 + spread1 * (rand(numClass1, 1) - 0.5);
class2Values = meanClass2 + spread2 * (rand(numClass2, 1) - 0.5);
% Plot measurements
subplot(2, 2, 1);
plot(class1Values, 'b*', 'MarkerSize', 10, 'LineWidth', 2);
hold on;
plot(class2Values, 'r*', 'MarkerSize', 10, 'LineWidth', 2);
xlabel('Measurement Number', 'FontSize', fontSize);
ylabel('Measurement Value', 'FontSize', fontSize);
title('Measurement Value for Every Patient', 'FontSize', fontSize);
grid on;
legend1 = sprintf('%d in Class 1', numClass1);
legend2 = sprintf('%d in Class 2', numClass2);
legend(legend1, legend2, 'location', 'east');
% Enlarge figure to full screen.
set(gcf, 'Units', 'Normalized', 'OuterPosition', [0, 0.04, 1, 0.96]);
drawnow;
% Compute distances of every point to every other point.
set1 = [zeros(length(class1Values), 1), class1Values];
set2 = [zeros(length(class2Values), 1), class2Values];
distances = pdist2(set1, set2);
subplot(2, 2, 2);
bar(distances(:)); % One bar per class 1 / class 2 pair.
grid on;
title('Distances between Class 1 Points and Class 2 Points', 'FontSize', fontSize);
xlabel('Pair Number', 'FontSize', fontSize);
ylabel('Distance between pair', 'FontSize', fontSize);
% Show histogram of distances.
subplot(2, 2, 3:4);
histogram(distances);
grid on;
caption = sprintf('Histogram of %d Distances between Class 1 Points and Class 2 Points', numel(distances));
title(caption, 'FontSize', fontSize);
xlabel('Distance', 'FontSize', fontSize);
ylabel('Count', 'FontSize', fontSize);
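
One way to probe the convolution idea without any toolbox is sketched below. It assumes uniform classes centered on 25 and 75 with a total width of 25, and it uses implicit expansion (R2016b or later) in place of pdist2. When the classes do not overlap, every distance is simply a class 2 value minus a class 1 value, so the distance distribution is the convolution of the class 2 distribution with the mirrored class 1 distribution. For two uniform classes of equal width that is a triangle.

```matlab
rng(0);                                               % Repeatable sketch.
numClass1 = 120;
numClass2 = 80;
class1Values = 25 + 25 * (rand(numClass1, 1) - 0.5);  % Uniform on [12.5, 37.5].
class2Values = 75 + 25 * (rand(numClass2, 1) - 0.5);  % Uniform on [62.5, 87.5].

% All pairwise 1-D distances: a numClass1-by-numClass2 matrix.
distances = abs(class1Values - class2Values');

% With no overlap, distance = class 2 value - class 1 value, so the mean
% distance equals the difference of the two class means.
meanDistance = mean(distances(:));
differenceOfMeans = mean(class2Values) - mean(class1Values);
fprintf('Mean distance = %.4f, difference of means = %.4f\n', ...
    meanDistance, differenceOfMeans);
```

Calling histogram(distances(:)) on this matrix should show the roughly triangular shape, peaking near the difference of the class means.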

Answer 2

Cris LaPierre on 27 Dec 2018

0 votes

Your data is not attached, so there is nothing to test, but have you looked into using a table and the functions findgroups and splitapply? See some examples here.
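
A minimal sketch of that approach, using a random stand-in matrix rather than the real data and working on the matrix directly instead of a table (both are assumptions):

```matlab
% Stand-in for the lung matrix (assumption: last column holds labels 1 or 2).
rng(0);
data = randi(9, 32, 57);
data(:, end) = randi(2, 32, 1);

% findgroups maps each distinct label to a group number 1..numGroups.
[groupNumbers, groupLabels] = findgroups(data(:, end));

% Split column 5 by label: one cell per group holding that group's values.
col5ByGroup = splitapply(@(x) {x}, data(:, 5), groupNumbers);

% Apply a summary function per group, here the mean of column 5.
col5MeanByGroup = splitapply(@mean, data(:, 5), groupNumbers);
```

Here col5ByGroup{k} holds the column 5 values whose label equals groupLabels(k).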

1 comment

phdcomputer Eng on 27 Dec 2018

Many thanks.

I attached part of the data (lung1.mat).

In the code below, I use the pdist2 function to compute the distance between two column vectors with the Jaccard measure.

I typed this at the command line to see the distance result:

pdist2(data(:,2),data(:,2),'jaccard');

but I get an error:

Undefined function or variable 'data'.

I would be grateful for your opinion.

close all;
clc
load lung.mat
data=lung; 
[n,m]=size(data);
rowslabled1=data(:,m)==1;   
rowslabled2=data(:,m)==2;   
data1=data(rowslabled1,:);  
data2=data(rowslabled2,:);      
for i=1: m                       
   data1(:,i);      
   data2(:,i);
   d=pdist2(data(:,i),data(:,i),'jaccard');
end
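
The error message itself means that no variable named data existed in the workspace when the command was typed, so the script has to run (and the load has to succeed) first. Beyond that, here is a hedged sketch of one way the loop could keep its results. It uses random stand-in data, and comparing each feature column with the next one is only an assumption about the intent. pdist2 (Statistics and Machine Learning Toolbox) treats each row as one observation, so the columns are transposed to get a single distance per pair.

```matlab
% Stand-in for the lung matrix; with the real file, load it first so that
% 'data' exists in the workspace before pdist2 is called.
rng(0);
data = randi(9, 32, 57);
data(:, end) = randi(2, 32, 1);

numFeatures = size(data, 2) - 1;     % Every column except the label column.

% Preallocate so each result is kept instead of being overwritten.
d = zeros(1, numFeatures - 1);
for i = 1 : numFeatures - 1
    % Transpose: each column becomes one row, i.e. one observation.
    d(i) = pdist2(data(:, i)', data(:, i + 1)', 'jaccard');
end
```

Without the transposes, pdist2 would return a 32-by-32 matrix of distances between the individual elements of the two columns.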
