Indexing geochemical data arrays with different numbers of elements
Emily
on 26 Apr 2023
Commented: Star Strider
on 28 Apr 2023
I have a table of data collected from an instrument that makes 6 measurements for each sample, so at the end of the analysis I'm left with a CSV file containing 6 rows of data for every sample. For example, if I analyze 100 samples, the CSV file has 600 rows. I have written code to process the data, and it uses only the last three measurements for each sample (the rows where the array 'injection' is 4-6). Here's how I read the data and create the arrays:
T = readtable("data", 'VariableNamingRule','preserve');
%define the variables
line = table2array(T(:,1));
d18O_raw = table2array(T(:,4));
port = table2array(T(:,2));
injection = table2array(T(:,3));
%select the last three measurements
%use only injections 4-6 for each sample
line = line(injection>3);
d18O_raw = d18O_raw(injection>3);
port = port(injection>3);
injection = injection(injection>3);
I average the three measurements for each sample, so I am left with one value per variable per sample. Importantly, I also subsample the "port" variable, which identifies each sample, so that I can match each variable with the corresponding sample later.
d18O_raw = reshape(d18O_raw, [3, numel(line)/3]);
average_d18O = transpose(mean(d18O_raw));
port_reshaped = port(1:3:end,:);
Here's where my issue arises. Sometimes the machine has an error and measures a sample only 5 times instead of 6. In the sample data included, the first sample has been measured only 5 times, but in theory this could happen at any point in the analysis. Currently I have to fix the file manually (or change my code) whenever a sample has only 5 measurements. I want my code to handle a sample that has EITHER 5 or 6 measurements: always skip the first 3 measurements, automatically select the remaining 2 or 3, average them, and index the corresponding port whether 2 or 3 measurements were used.
My current way of handling this issue is clunky and doesn't make the script easy to share with others, which is the goal.
Thank you in advance for your help.
2 Comments
Siddharth Bhutiya
on 27 Apr 2023
Star Strider has already answered the question below, but I'll mention one thing. For lines of code like the following:
line = table2array(T(:,1));
a simpler way to extract the entire variable is to use dot indexing:
line = T.Line;
% OR
line = T.(1);
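To make the equivalence concrete, here is a minimal sketch on a small made-up table. The column names (Line, Port, Injection, d18O) and the values are assumptions for illustration only; the real file is read with 'VariableNamingRule','preserve', so its names may not be valid identifiers, in which case the parenthesized form T.("name") is needed instead of T.name.

```matlab
% Hypothetical stand-in for the real file; names and values are assumptions.
T = table([1;2;3], [10;10;11], [4;5;6], [-5.1;-5.3;-4.9], ...
    'VariableNames', {'Line','Port','Injection','d18O'});

a = table2array(T(:,1));   % approach used in the question
b = T.Line;                % dot indexing by variable name
c = T.(1);                 % dot indexing by variable position
d = T{:,1};                % brace indexing also returns the column contents

same = isequal(a, b, c, d) % true when all four return the same column
```

Dot indexing by name tends to be the most readable choice, while the positional form keeps working if the header text in the file changes.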
Accepted Answer
Star Strider
on 26 Apr 2023
Edited: Star Strider
on 26 Apr 2023
This can be done in a relatively straightforward way by first separating the sub-matrices into individual cells using the accumarray function, and then using the cellfun function to calculate the mean of rows (4:end) of column 4, where ‘end’ (the number of rows in each sub-matrix) can be any length.
T = readtable("data", 'VariableNamingRule','preserve')
%define the variables
line = table2array(T(:,1));
d18O_raw = table2array(T(:,4));
port = table2array(T(:,2));
injection = table2array(T(:,3));
% %select the last three measurements
% %use only injections 4-6 for each sample
% line = line(injection>3);
% d18O_raw = d18O_raw(injection>3);
% port = port(injection>3);
% injection = injection(injection>3);
[G,ID] = findgroups(T.Port); % Use 'Port' To Define The Groups
A = accumarray(G, T{:,1}, [], @(x){T{x,:}}) % Accumulate Sub-Matrices According To 'G'
A{1} % Display Intermediate Results (Optional)
A{end} % Display Intermediate Results (Optional)
Outc = cellfun(@(x)mean(x(4:end,4)), A, 'Unif',0) % Calculate The 'mean' Of Rows 4:end In Each Sub-Matrix
Outn = cell2mat(Outc) % Convert The 'cell' Array To A Numeric Array
% The 'Check' Variable Can Be Deleted, Since It Simply Shows How The Code Works, And Checks The Results
Check = [mean(A{1}(4:end,4)) mean(A{2}(4:end,4)) mean(A{3}(4:end,4)) mean(A{4}(4:end,4)) mean(A{5}(4:end,4)) mean(A{6}(4:end,4)) mean(A{7}(4:end,4)) mean(A{8}(4:end,4)) mean(A{9}(4:end,4))].'
EDIT (26 Apr 2023 at 21:48)
Changed the second accumarray argument to choose the correct data. (I did not catch that earlier.)
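As a possible alternative to the cell-array route, the same idea (drop injections 1-3, then average whatever is left for each port) can be sketched with a logical mask and splitapply. This is only a sketch under stated assumptions: the table below is synthetic, the column names Port, Injection and d18O are made up for illustration, each sample is assumed to have its own port value, and the injection counter is assumed to restart at 1 for every sample.

```matlab
% Synthetic stand-in for the instrument file (names and values are assumptions).
% Port 1 has only 5 injections; port 2 has the full 6.
Port      = [1;1;1;1;1; 2;2;2;2;2;2];
Injection = [1;2;3;4;5; 1;2;3;4;5;6];
d18O      = [-9;-8;-7;-6;-4; -3;-3;-3;-2;-1;0];
T = table(Port, Injection, d18O);

keep = T.Injection > 3;          % always skip the first three injections
Tk   = T(keep, :);               % 2 or 3 rows remain per sample

[G, port_id] = findgroups(Tk.Port);               % one group per port, sorted by port value
average_d18O = splitapply(@mean,  Tk.d18O, G);    % mean of the remaining injections
n_used       = splitapply(@numel, Tk.d18O, G);    % number of injections behind each mean

result = table(port_id, n_used, average_d18O)
```

Keeping n_used next to each mean makes it easy to spot which samples were averaged over 2 injections rather than 3. Note that findgroups orders the groups by port value, which may differ from the order in which the samples appear in the file.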