Inserting blank columns at specific points

Jacob Barrett-Newton on 22 Feb 2018
Commented: Pawel Jastrzebski on 12 Mar 2018
I have a set of data that I want to take the means of. The first set worked fine because it has 22 values: 11 items with 2 repeats each. The issue I am now having is that my second set has 29 values: 10 items, of which 9 have 3 repeats and, annoyingly, 1 has only 2 repeats.
for j = 3:16
    for i = 1:21
        MeanR{j,i} = mean(Numbers(j,i:(i+1)));
    end
end
MeanR(:,2:2:end) = [];
I am not sure whether there is an easier way. I wanted the mean values of 1&2, 3&4, etc., but this loop takes the means of 1&2, 2&3, etc., so I just deleted every second column. That approach will not work for my second set. I tried to insert an empty column so that I could use the same method, but I cannot get it to work. Any ideas?
I have attached the data file that I am using. I use xlsread to extract the data into numbers and text, then work out the means from that.
  3 Comments
Pawel Jastrzebski on 22 Feb 2018
Could you attach a screenshot/file or at least a description of how your data is organised?
  • Is your data stored in a column vector?
  • Or in a tabular way i.e. rows = 'items' - columns = no of repeats?
Lastly, how do you get your data into Matlab? Do you import it as a matrix? Table?
It will be easier for the community to address your problem if you provide the above details.
Jacob Barrett-Newton on 22 Feb 2018
I have attached the data that I am using. I import it using xlsread to get Numbers and Text, and then calculate the means from that. I want the mean of every 2 consecutive numbers in the first case, and of every 3 consecutive numbers in the second case. To do that for the second case I need to insert a blank column. I suppose I could go into the xls file and do it manually, but is there a way to do it in MATLAB?


Accepted Answer

Pawel Jastrzebski on 22 Feb 2018
% import data to Matlab as a Table
T_data = readtable('Data_of_interest.csv');
% get rid of meaningless columns/rows
T_data(1,:) = [];
T_data(:,53:end) = [];
% 1st column seems to be a description so use it as such
T_data.Properties.RowNames = T_data.(1);
T_data.(1) = [];
% Split the data into
% - SET1: first 22 columns
% - SET2: the rest
T_set1 = T_data(:,1:22);
T_set2 = T_data(:,23:end);
% Let's focus now on 'T_set2' as it has an irregular
% number of repeats
% One way to deal with the problem is to count
% the unique tests and their repeats.
% Your column names have a fixed format.
% Make sure you only keep the relevant bit of the name, that is:
% 'x201X_XX_XX' = 11 CHARS
% If you lose the suffix, you'll be able to determine:
% 1) how many unique names you have
% 2) what is the number of their repeats
% 3) their location within the table
% get the column names
columNames = T_set2.Properties.VariableNames;
% transpose them for easier human readability
columNames = columNames'
% remove the suffix
for i = 1:length(columNames)
    % before truncation
    columNames(i)
    columNames{i}(11+1:end) = [];
    % after truncation
    columNames(i)
end
% find unique names (logical vector)
uniqueNameCounter = false(length(columNames),1);
uniqueNameCounter(1) = true;
for i = 2:length(columNames)
    % strcmp is safe even if two names differ in length
    if strcmp(columNames{i}, columNames{i-1})
        uniqueNameCounter(i) = false;
    else
        uniqueNameCounter(i) = true;
    end
end
% now use this logical vector to find the indexes
% of your unique tests
allindexes = 1:size(T_set2,2);
uniqueIndexes = allindexes(uniqueNameCounter);
% now use 'uniqueIndexes' to get the number
% of the repeats per test
noOfrepeats = uniqueIndexes(2:end)-uniqueIndexes(1:end-1)
% notice that you have 10 unique names but only 9 entries in 'noOfrepeats'
% the 10th number of repeats is obtained in the following way:
noOfrepeats(end+1) = size(T_set2,2) - uniqueIndexes(end)+1
% Now that you know where your unique names are
% and the number of repeats per unique name,
% you can write code that accesses these data
% and applies whatever calculation you need,
% e.g. the MEAN
% preallocate memory for your calculation
tempMat = zeros(size(T_data,1),length(uniqueIndexes));
% calculate MEAN
for i = 1:length(uniqueIndexes)
    % columns belonging to the i-th unique test
    cols = uniqueIndexes(i):uniqueIndexes(i)+noOfrepeats(i)-1;
    % show selected fragment of the table
    T_set2(:,cols)
    % row-wise mean over the repeats of this test
    aveVal = mean(T_set2{:,cols},2);
    tempMat(:,i) = aveVal
end
% Store MEAN results to a table
T_AveVal = array2table(tempMat);
% new column names
newNames = cell(1, length(uniqueIndexes));
for i = 1:length(uniqueIndexes)
    newNames{i} = [columNames{uniqueIndexes(i)} '_Mean']
end
T_AveVal.Properties.VariableNames = newNames;
% Display MEAN
T_AveVal
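The grouping idea in this answer can also be sketched more compactly. This is only an illustrative sketch, not the answer's code: the column names, the suffix format and the numbers below are made up, and it assumes, as in the answer, that the first 11 characters of each column name identify the test.

```matlab
% Hypothetical column names: 11-character test ID plus a repeat suffix
names = {'x2017_01_05_1', 'x2017_01_05_2', 'x2017_01_05_3', ...
         'x2017_02_10_1', 'x2017_02_10_2'};
% Hypothetical data: one row per measurement, one column per name
data = [1 2 3 10 20;
        4 5 6 30 40];

% Keep only the 11-character prefix that identifies the test
prefix = cellfun(@(s) s(1:11), names, 'UniformOutput', false);

% 'stable' keeps the tests in their original order;
% group(k) is the index of the test that column k belongs to
[uniqueNames, ~, group] = unique(prefix, 'stable');

% Row-wise mean over all columns of each test, however many repeats it has
means = zeros(size(data, 1), numel(uniqueNames));
for k = 1:numel(uniqueNames)
    means(:, k) = mean(data(:, group == k), 2);
end
```

Because the grouping comes from the names rather than from fixed positions, a test with 2 repeats and a test with 3 repeats are handled by the same loop.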
  4 Comments
Jacob Barrett-Newton on 10 Mar 2018
Hi,
I was wondering if you could help me again. I have been plotting the data. For the first 22 values it works fine because there are 2 repeats each, but for the second set of 29 values I have the same issue with plotting as I had with calculating the means. I have managed to do it one way, where I basically plot all the values separately, but it's a pretty long loop and I'm not even sure it works properly. Any ideas?
Pawel Jastrzebski on 12 Mar 2018
  • How are you plotting the data? What type of plot do you use or want to use?
  • Could you post your code so far?


More Answers (2)

Jan on 22 Feb 2018
What about:
[s1, s2] = size(Numbers);
MeanR = squeeze(mean(reshape(Numbers, s1, 2, s2/2), 2));
This works only if s2 is even, but at least no loop is required.
Your loop could be modified:
for j = 3:16
    for i = 1:2:21 % Instead of 1:21
        MeanR{j, (i+1)/2} = mean(Numbers(j, i:(i+1)));
    end
end
How do you determine "repeats"? Maybe splitapply is much easier for your purpose.
  3 Comments
Rik on 22 Feb 2018
Zeros will affect the mean, so if you are going to pad the array you should pad it with NaN values and use 'omitnan' in your call to mean.
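As a minimal sketch of the NaN padding described here, assuming made-up data in which the middle item has only 2 repeats (the values, the variable `k` and the group size of 3 are illustrative, not taken from the attached file):

```matlab
% Hypothetical data: item 1 has 3 repeats, item 2 only 2, item 3 has 3
Numbers = [1 2 3 4  5  6  7  8;
           2 4 6 8 10 12 14 16];

k = 5;                          % insert the blank column after column 5
nRows = size(Numbers, 1);
padded = [Numbers(:, 1:k), NaN(nRows, 1), Numbers(:, k+1:end)];

% Every item now occupies exactly 3 columns, so the columns can be
% reshaped into blocks of 3 and averaged while ignoring the NaN padding
nItems = size(padded, 2) / 3;
MeanR = squeeze(mean(reshape(padded, nRows, 3, nItems), 2, 'omitnan'));
```

With these numbers the padded item is averaged over its 2 real values only, which padding with zeros would not do.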
Jan on 22 Feb 2018
@Jacob: I asked how the "repeats" are recognized, because knowing that would let me post a matching suggestion with splitapply or accumarray. But I can invent an example:
x = rand(1, 20) > 0.6;
group = 1 + cumsum(x); % Perhaps [1,1,2,3,3,3,4,5,5,5,...]
data = rand(1, 20); % Any data
meanData = splitapply(@mean, data, group)
Now meanData contains the mean value over all elements of data belonging to the same value in group. This considers the number of elements automatically and this is cleaner than inserting dummy data only to ignore it later.
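If the number of repeats per item is already known, the group vector can be built directly. The following is a sketch under that assumption; the repeat counts and data are invented, and `repelem` is used here as one possible way to build the groups, not something stated in the thread:

```matlab
% Hypothetical repeat counts: two items with 3 repeats, one with 2
noOfrepeats = [3; 3; 2];
% One group number per column of the data: [1;1;1;2;2;2;3;3]
group = repelem((1:numel(noOfrepeats)).', noOfrepeats);

% Hypothetical data: rows are measurements, columns are repeats
Numbers = [1 2 3 4  5  6  7  8;
           2 4 6 8 10 12 14 16];

% splitapply groups along rows, so transpose, average each block of rows
% with mean(x, 1), and transpose the result back
MeanR = splitapply(@(x) mean(x, 1), Numbers.', group).';
```

Passing the dimension explicitly in `mean(x, 1)` matters: if a group ever contained a single repeat, a plain `mean` would average across that one row instead of down the columns.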



Pawel Jastrzebski on 23 Feb 2018
Edited: Pawel Jastrzebski on 23 Feb 2018
  • In general, if you post a specific question and provide all the information, the community is more likely to help you out.
  • Regarding my code, I consider myself a beginner as well and there are probably plenty of ways it can be improved. Jan Simon's suggestion is one for starters.
  • I think that if you swap the 'mean' calculation for the 'standard deviation', the code will still work. However, from a statistical point of view, is a standard deviation from 2 or 3 data points going to be a relevant measure for you?
  • Lastly, this code only works because you have a unique date for each test. If you were to run multiple tests on the same day with multiple repeats, you would either have to think of a different file naming convention or tweak the bit of the code that counts the unique names.
  3 Comments
Pawel Jastrzebski on 26 Feb 2018
Check out the documentation for MEAN and STD. For STD you need to specify 3 inputs (data, weight, dimension), so the code will be more like this:
stdval = std(T_set2{:,uniqueIndexes(i):uniqueIndexes(i)+noOfrepeats(i)-1},0,2);
Jan on 26 Feb 2018
Or:
stdData = splitapply(@std, data, group)
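To illustrate the difference in argument order between the two functions, here is a small sketch with made-up numbers (the matrix `X` is hypothetical):

```matlab
% Hypothetical data: 2 measurements (rows), 3 repeats each (columns)
X = [1 2 3;
     2 4 6];

rowMean = mean(X, 2);     % for mean, the dimension is the 2nd argument
rowStd  = std(X, 0, 2);   % for std, the 2nd argument is the weight
                          % (0 = default N-1 normalisation) and the
                          % dimension is the 3rd argument
```

Calling `std(X, 2)` would not select the dimension: the 2 would be interpreted as the weight argument, which is why the extra 0 is needed.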

