Inserting blank columns at specific points

Question

Jacob Barrett-Newton on 22 Feb 2018

Direct link to this question: https://de.mathworks.com/matlabcentral/answers/384176-inserting-blank-columns-at-specific-points

Commented: Pawel Jastrzebski on 12 Mar 2018

I have a set of data that I want to take the means of. The first set worked fine because it has 22 values: 11 items with 2 repeats each. The issue is that my second set has 29 values from 10 items: 9 items have 3 repeats and, annoyingly, 1 item has only 2 repeats.

for j = 3:16
  for i = 1:21
    MeanR{j,i} = mean(Numbers(j,i:(i+1)));
  end
end
MeanR(:,2:2:end) = [];

I am not sure whether there is an easier way. I wanted the mean values of columns 1 & 2, 3 & 4, etc., but this loop takes the means of 1 & 2, 2 & 3, etc., so I just deleted every second result. That approach will not work for my second set. I tried to insert an empty column so that I could use the same method, but I cannot seem to do so. Any ideas?

I have attached the data file that I am using. I use xlsread to extract the data into numbers and text, and then work out the means from that.

3 Comments

Pawel Jastrzebski on 22 Feb 2018

Could you attach a screenshot/file or at least a description of how your data is organised?

Is your data stored in a column vector?
Or in a tabular way, i.e. rows = 'items' and columns = number of repeats?

Lastly, how do you get your data into MATLAB? Do you import it as a matrix? As a table?

It will be easier for the community to address your problem if you provide the above details.

Jacob Barrett-Newton on 22 Feb 2018

I have attached the data that I am using. I import it using xlsread to get Numbers and Text, and then calculate the means from that. I want the mean of every 2 consecutive numbers in the first case, and of every 3 consecutive numbers in the second case. To do that for the second case, I need to insert a blank column. I suppose I could go into the xls file and do it manually, but is there a way to do it in MATLAB?

Answer 1

Pawel Jastrzebski on 22 Feb 2018

Direct link to this answer: https://de.mathworks.com/matlabcentral/answers/384176-inserting-blank-columns-at-specific-points#answer_306547

% import data to Matlab as a Table
T_data = readtable('Data_of_interest.csv');
% get rid of meaningless columns/rows
T_data(1,:)      = [];
T_data(:,53:end) = [];
% 1st column seems to be a description so use it as such
T_data.Properties.RowNames = T_data.(1);
T_data.(1) = [];
% Split the data into
% - SET1: first 22 columns
% - SET2: the rest
T_set1 = T_data(:,1:22);
T_set2 = T_data(:,23:end);
% Let's focus now on 'T_set2', as it has an irregular
% number of repeats
% One way to deal with the problem is to count
% the unique tests and their repeats.
% Your column names have a fixed format.
% Make sure you only keep the relevant bit of the name, that is:
% 'x201X_XX_XX' = 11 CHARS
% If you lose the suffix, you'll be able to determine:
% 1) how many unique names you have
% 2) what is the number of their repeats
% 3) their location within the table
% get the column names
columNames = T_set2.Properties.VariableNames;
% transpose them for easier human readability
columNames = columNames'
% remove the suffix
for i=1:length(columNames)
    % before truncation
    columNames(i)
    columNames{i}(11+1:end) = [];
    %after truncation
    columNames(i)
end
% find unique names (logical vector)
uniqueNameCounter = false(length(columNames),1);
uniqueNameCounter(1) = true;
for i=2:length(columNames)
    % strcmp is safe even if two names differ in length
    if strcmp(columNames{i}, columNames{i-1})
        uniqueNameCounter(i) = false;
    else
        uniqueNameCounter(i) = true;
    end
end
% now use this logical vector to find the indexes
% of your unique tests
allindexes = 1:size(T_set2,2);
uniqueIndexes = allindexes(uniqueNameCounter);
% now use 'uniqueIndexes' to get the number
% of the repeats per test
noOfrepeats = uniqueIndexes(2:end)-uniqueIndexes(1:end-1)
% notice that you have 10 unique names but 9 'noOfrepeats'
% the 10th number of repeats is produced in the following way:
noOfrepeats(end+1) = size(T_set2,2) - uniqueIndexes(end)+1
% Now that you know where your unique names are
% and the number of repeats per unique name,
% you can write code that accesses these data
% and applies whatever calculation you need,
% e.g. the MEAN
% preallocate memory for your calculation
tempMat = zeros(size(T_data,1),length(uniqueIndexes));
% calculate MEAN
for i = 1:length(uniqueIndexes)
    % show selected fragment of the table
    T_set2(:,uniqueIndexes(i):uniqueIndexes(i)+noOfrepeats(i)-1)
    % row-wise mean over the repeats of this test
    aveVal = mean(T_set2{:,uniqueIndexes(i):uniqueIndexes(i)+noOfrepeats(i)-1},2);
    tempMat(:,i) = aveVal
end
% Store MEAN results to a table
T_AveVal = array2table(tempMat);
% new column names
newNames = cell(1, length(uniqueIndexes));
for i=1:length(uniqueIndexes)
    newNames{i}= [columNames{uniqueIndexes(i)} '_Mean']
end
T_AveVal.Properties.VariableNames = newNames;
% Display MEAN
T_AveVal
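
The name-counting steps above could probably be condensed with unique and accumarray. The following is only a sketch under assumptions: the column names in `names` are made up to match the 'x201X_XX_XX' format plus a suffix, and the variables `prefix`, `firstIdx` and `groupIdx` are illustrative names that do not appear in the code above.

```matlab
% Hypothetical column names: an 11-character date prefix plus a repeat suffix
names = {'x2017_01_05_1', 'x2017_01_05_2', 'x2017_01_05_3', ...
         'x2017_02_10_1', 'x2017_02_10_2'};

% Keep only the first 11 characters of every name
prefix = cellfun(@(s) s(1:11), names, 'UniformOutput', false);

% 'stable' keeps the names in their order of first appearance
[uniqueNames, firstIdx, groupIdx] = unique(prefix, 'stable');

% Count how many columns belong to each unique name
noOfRepeats = accumarray(groupIdx(:), 1)';
```

Here `firstIdx` plays the role of `uniqueIndexes` and `noOfRepeats` the role of `noOfrepeats`. Unlike the neighbour comparison in the loop, unique also groups equal names that are not adjacent, which may or may not be what you want.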

4 Comments

Jacob Barrett-Newton on 10 Mar 2018

Hi,

I was just wondering if you could help me again. I have been plotting the data. For the first 22 values it works fine, as there are 2 repeats each, but for the second set of 29 values I have the same issue with plotting as I had with calculating the means. I have managed to do it one way, where I basically plot all the values separately, but it's a pretty long loop and I'm not even sure it works properly. Any ideas?

Pawel Jastrzebski on 12 Mar 2018

How are you plotting the data? What type of plot do you use or want to use?
Could you post your code so far?


Answer 2

Jan on 22 Feb 2018

Direct link to this answer: https://de.mathworks.com/matlabcentral/answers/384176-inserting-blank-columns-at-specific-points#answer_306514

What about:

[s1, s2] = size(Numbers);
MeanR    = squeeze(mean(reshape(Numbers, s1, 2, s2/2), 2));

This works only if s2 is even, but at least no loop is required.
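
A small worked example may make the reshape approach easier to follow. The matrix below is made up (2 rows, 2 items with 2 repeats each), and the explicit check on the column count is an addition for illustration:

```matlab
% Hypothetical data: columns 1-2 belong to item 1, columns 3-4 to item 2
Numbers = [1 3 5 7;
           2 4 6 8];

[s1, s2] = size(Numbers);
assert(mod(s2, 2) == 0, 'The number of columns must be even.');

% Each page of the 3-D array holds the 2 repeats of one item
MeanR = squeeze(mean(reshape(Numbers, s1, 2, s2/2), 2));
```

For this input MeanR is [2 6; 3 7]: one row per original row and one column per item.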

Your loop could be modified:

for j = 3:16
  for i = 1:2:21  % Instead of 1:21
    MeanR{j, (i+1)/2} = mean(Numbers(j, i:(i+1)));
  end
end

How do you determine "repeats"? Maybe splitapply is much easier for your purpose.

3 Comments

Rik on 22 Feb 2018

Zeros will affect the mean, so if you are going to pad the array, pad it with NaN values and use 'omitnan' in your call to mean.
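
As a rough sketch of that idea, a NaN column can be inserted by concatenating around the insertion point. The data and the position `pos` below are invented for illustration; here the last item has only 2 repeats, so the blank column goes at the end:

```matlab
% Hypothetical data: item 1 has 3 repeats, item 2 has only 2
Numbers = [1 2 3 10 20;
           4 5 6 30 50];
pos = 5;   % assumed: insert the blank column after this column

% Build the padded matrix from the left part, a NaN column, and the right part
Padded = [Numbers(:, 1:pos), NaN(size(Numbers, 1), 1), Numbers(:, pos+1:end)];

% Mean over every block of 3 columns, ignoring the NaN padding
[s1, s2] = size(Padded);
MeanR = squeeze(mean(reshape(Padded, s1, 3, s2/3), 2, 'omitnan'));
```

With this input MeanR is [2 15; 5 40], so the second item is averaged over its 2 real values only.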

Jan on 22 Feb 2018

@Jacob: I asked how the "repeats" are recognized, because knowing this would make it possible to post a matching suggestion with splitapply or accumarray. But I can invent an example:

x     = rand(1, 20) > 0.6;
x(1)  = true;           % Group numbers must start at 1 for splitapply
group = cumsum(x);      % Perhaps [1,1,2,3,3,3,4,5,5,5,...]
data  = rand(1, 20);    % Any data
meanData = splitapply(@mean, data, group)

Now meanData contains the mean value over all elements of data belonging to the same value in group. This handles the number of elements per group automatically, and it is cleaner than inserting dummy data only to ignore it later.
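
Applied to a matrix with a known number of repeats per item, the same grouping idea might look like the sketch below. The data and the `repeats` vector are assumptions, and a plain loop over the groups is used here instead of splitapply to keep the row-wise mean explicit:

```matlab
% Hypothetical data: 3 items with 3, 3 and 2 repeats (8 columns in total)
Numbers = [1 2 3 10 20 30 5 7;
           4 5 6 40 50 60 1 3];
repeats = [3 3 2];                              % assumed repeats per item

% Group number for every column: [1 1 1 2 2 2 3 3]
group = repelem(1:numel(repeats), repeats);

% Row-wise mean over the columns of each group
MeanR = zeros(size(Numbers, 1), numel(repeats));
for k = 1:numel(repeats)
    MeanR(:, k) = mean(Numbers(:, group == k), 2);
end
```

For this input MeanR is [2 20 6; 5 50 2], and no dummy column is needed for the item with only 2 repeats.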


Answer 3

Pawel Jastrzebski on 23 Feb 2018

Direct link to this answer: https://de.mathworks.com/matlabcentral/answers/384176-inserting-blank-columns-at-specific-points#answer_306681

Edited: Pawel Jastrzebski on 23 Feb 2018

Jacob Barrett-Newton, thank you.

In general, if you post a specific question and provide all the information, the community is more likely to help you out.
With regard to my code, I consider myself a beginner as well, and there are probably plenty of ways it can be improved. Jan Simon's suggestion is one for starters.
I think that if you swap the 'mean' calculation for the 'standard deviation', the code will still work. However, from a statistical point of view, is a standard deviation computed from 2 or 3 data points going to be a relevant measure for you?
Lastly, this code only works because you have a unique date for each test. If you were to run multiple tests on the same day with multiple repeats, you would either have to think of a different file naming convention or tweak the part of the code that counts the unique names.
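
To illustrate the swap from mean to standard deviation, here is a minimal sketch on a made-up block of repeats for a single test. The variable `block` stands in for the table fragment selected inside the loop:

```matlab
% Hypothetical fragment: 2 rows, 3 repeats of one test
block = [1 2 3;
         4 6 8];

aveVal = mean(block, 2);      % row-wise mean
stdVal = std(block, 0, 2);    % row-wise sample standard deviation (N-1 weighting)
```

Here aveVal is [2; 6] and stdVal is [1; 2]. With so few repeats per test, the standard deviation is a very rough estimate.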