How to output specific rows from tables depending on values within the table?
15 Ansichten (letzte 30 Tage)
Ältere Kommentare anzeigen
I have a table of variable I have pulled form an excel spread sheet (the actual file is 45 columns X 2000 rows). But this gives the idea of what I am trying to achieve.
I would like to find each separate participants (identified by their 'ID') maximum jump height for that 'season' of testing and remove the other rows.
The Table I'm working with is like this (but extended):
[Season] [TrialNo] [ID] [AgeGroup] [bodyMass] [JumpHeight] [Force] [FlighTime] [LandingForce]
'Pre' 1 0001 U14 40 35 685 0.3 1100
'Pre' 2 0001 U14 40 32 630 0.25 1200
'Pre' 1 0002 U14 42 40 750 0.42 1000
'Pre' 2 0002 U14 42 36 700 0.4 1300
'Pre' 1 0003 U14 45 32 610 0.3 1111
'Pre' 2 0003 U14 45 28 600 0.3 1600
'Post' 1 0001 U14 40 35 685 0.3 1100
'Post' 2 0001 U14 40 32 630 0.25 1200
'Post' 1 0002 U14 42 40 750 0.42 1000
'Post' 2 0002 U14 42 36 700 0.4 1300
'Post' 1 0003 U14 45 32 610 0.3 1111
'Post' 2 0003 U14 45 28 600 0.3 1600
What I aim to end up with is something more like
[Season] [TrialNo] [ID] [AgeGroup] [bodyMass] [JumpHeight] [Force] [FlighTime] [LandingForce]
'Pre' 1 0001 U14 40 35 685 0.3 1100
'Pre' 1 0002 U14 42 40 750 0.42 1000
'Pre' 1 0003 U14 45 32 610 0.3 1111
'Post' 1 0001 U14 40 35 685 0.3 1100
'Post' 1 0002 U14 42 40 750 0.42 1000
'Post' 1 0003 U14 45 32 610 0.3 1111
2 Kommentare
Paolo
am 5 Jun. 2018
Bearbeitet: Paolo
am 5 Jun. 2018
JumpHeight seems to be already sorted for the ID for every pair or rows, so if that's always the case you could just delete every other row.
Alternatively, if its not always sorted, you could sort every two rows by JumpHeight.
Could you attach a sample spreadsheet?
Antworten (3)
Razvan Carbunescu
am 5 Jun. 2018
Bearbeitet: Razvan Carbunescu
am 5 Jun. 2018
>> GT = groupsummary (T,{'Season','ID'},'max','JumpHeight')
GT =
6×4 table
Season ID GroupCount max_JumpHeight
______ __ __________ ______________
'Post' 1 2 35
'Post' 2 2 40
'Post' 3 2 32
'Pre' 1 2 35
'Pre' 2 2 40
'Pre' 3 2 32
If you want to get all the row information or on an earlier release can use findgroups / splitapply workflow
idx = findgroups(T.Season,T.ID);
GT = splitapply(@maxidx,T,idx);
GT.Properties.VariableNames = T.Properties.VariableNames
function T = maxidx(varargin)
[~,i] = max(varargin{6});
tmpvarargout = cellfun(@(x) x(i,:),varargin,'UniformOutput',false);
T = table(tmpvarargout{:});
end
Sample Output
GT =
6×9 table
Season TrialNo ID AgeGroup bodyMass JumpHeight Force FlighTime LandingForce
______ _______ __ ________ ________ __________ _____ _________ ____________
'Post' 1 1 14 40 35 685 0.3 1100
'Post' 1 2 14 42 40 750 0.42 1000
'Post' 1 3 14 45 32 610 0.3 1111
'Pre' 1 1 14 40 35 685 0.3 1100
'Pre' 1 2 14 42 40 750 0.42 1000
'Pre' 1 3 14 45 32 610 0.3 1111
Edit: Script assumes JumpHeight is 6th column in table, might have to modify for correct position
6 Kommentare
Razvan Carbunescu
am 7 Jun. 2018
I had missed the fact that you're on R2014a. findgroups/splitapply were introduced in R2016b.
I think the way to try to get it in R2014a is to use sortrows and unique with the rows flag to find the indexing to the first sorted highest value.
ST = sortrows(T,{'Season' 'ID' 'JumpHeight'},{'ascend' 'ascend' 'descend'});
% taking advantage here of the fact that ST is sorted by JumpHeight and unique returns first element
[~,idx] = unique([double(categorical(ST.Season)) ST.ID],'rows');
GT = ST(idx,:)
Are Mjaavatten
am 5 Jun. 2018
Bearbeitet: Are Mjaavatten
am 11 Jun. 2018
I am a little uncertain about the type of data structure you use. For completeness I therefore entered your data in an Excel workbook that I read using readtable.
If there are always exactly two trials per ID and season:
T0 = readtable('Byrne.xlsx');
rows = [];
for i = 1:2:size(T0,1)-1
[~,j] = max(T0.JumpHeight(i:i+1));
rows = [rows;i+j-1];
end
T2 = T0(rows,:);
If the number of trials may vary:
T0 = sortrows(T0,'ID');
T0 = sortrows(T0,'Season','descend');
J = [find(diff([0;T0.ID])~=0);size(T0,1)]; % Indices for each ID change
rows = [];
for i = 1:length(J)-1
[~,j] = max(T0.JumpHeight(J(i):J(i+1)-1));
rows = [rows;J(i)+j-1];
end
T2 = T0(rows,:);
0 Kommentare
Peter Perkins
am 3 Jul. 2018
In more recent versions of MATLAB there are several ways to do this. In R2014a, do a grouped varfun, using @max as the function to apply, ID and Season as the grouping Variables, and JumpHight as the InputVariable.
0 Kommentare
Siehe auch
Kategorien
Mehr zu Data Distribution Plots finden Sie in Help Center und File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!