Get indices of any quantile of a column

Question

A. Goeh on 26 Aug 2016

Direct link to this question

https://de.mathworks.com/matlabcentral/answers/300999-get-indeces-of-any-quantile-of-a-column

Commented: Image Analyst on 27 Aug 2016

Hello everybody,

As of now I'm trying to sort a large (101x1168) matrix. I always sort by the first column, on which the following three columns depend. I want to be able to get the indices of, for example, the top 10 percent of the values, or of the values between the 0.3 and 0.4 quantiles of the first column, so that I can address them with a function. So far I have used several sortrows() calls, but that takes a long time to run. It is important to know that the length of the columns may vary (some of the columns have more NaNs than others), so it would be great to have a function that ignores NaNs (maybe a combination of quantile() and find()?).

Here an example of what I need:

Col. 1   Col. 2   Col. 3   Col. 4
15       18       12       32
14       23       19       12
10       7        18       12
9        34       12       13
11       19       3        17

I now want to know the index and the value of the top 20% of the values in the first column. In this case that would be index 1 and value 15. If implemented correctly, I would be able to get a vector output with all the data.

Any help is truly appreciated! Many thanks and kind regards, A.Goe

Answer 1

Image Analyst on 26 Aug 2016

Direct link to this answer

https://de.mathworks.com/matlabcentral/answers/300999-get-indeces-of-any-quantile-of-a-column#answer_232869

If you have the Statistics and Machine Learning Toolbox, there is prctile(). Would that help?

Y = prctile(X,p) returns percentiles of the values in a data vector or matrix X for the percentages p in the interval [0,100]. If X is a vector, then Y is a scalar or a vector with the same length as the number of percentiles required (length(p)). Y(i) contains the p(i) percentile.

If X is a matrix, then Y is a row vector or a matrix, where the number of rows of Y is equal to the number of percentiles required (length(p)). The ith row of Y contains the p(i) percentiles of each column of X.

For multidimensional arrays, prctile operates along the first nonsingleton dimension of X.
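As a rough sketch of how prctile() might be combined with find() for this question: the thresholds come from prctile(), and comparing them against the unsorted column yields indices into the original data. The variable names (data, col1, idxTop, idxBand, rowsTop) and the extra row containing a NaN are assumptions made for illustration only. prctile() treats NaN as missing, and a comparison with NaN is always false, so NaN rows drop out on their own.

```matlab
% Sample data from the question, plus one hypothetical row with a NaN
data = [15 18 12 32;
        14 23 19 12;
        10  7 18 12;
         9 34 12 13;
        11 19  3 17;
       NaN  5  6  7];
col1 = data(:, 1);

% Top 20 %: every value at or above the 80th percentile
thrTop = prctile(col1, 80);
idxTop = find(col1 >= thrTop);   % row indices in the unsorted data
valTop = col1(idxTop);           % the corresponding values

% Values between the 0.3 and 0.4 quantiles
thrBand = prctile(col1, [30 40]);
idxBand = find(col1 >= thrBand(1) & col1 <= thrBand(2));

% Rows of the dependent columns that belong to the top 20 %
rowsTop = data(idxTop, 2:4);
```

Because the comparison includes every value tied with the threshold, the number of selected rows can be slightly larger than the nominal 20 %.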

A. Goeh on 27 Aug 2016

Hello, first of all thank you for your answer. I tried prctile(); the problem is that the results don't necessarily have to be values that can be found in the original dataset, so I can't search for the indices of the results... I'm thinking about splitting the vector (column) into pieces of equal length and searching for the first and last index, although not very successfully so far, to be honest.
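The idea of splitting the column into equal-length pieces could be sketched with the second output of sort(), which maps sorted positions back to row indices of the original column. This is only an illustration under assumptions: the variable names, the example bounds qLo and qHi, and the floor/ceil rule for turning quantiles into positions are choices made here, not something stated in the thread.

```matlab
col1 = [15; 14; 10; 9; 11; NaN];   % question's first column plus a hypothetical NaN
qLo = 0.8;                         % lower quantile bound (assumed example)
qHi = 1.0;                         % upper quantile bound (assumed example)

[sortedVals, order] = sort(col1, 'ascend');  % ascending sort places NaNs last
n = nnz(~isnan(sortedVals));                 % number of valid (non-NaN) values

% Convert the quantile bounds to positions within the sorted, NaN-free part
firstPos = floor(qLo * n) + 1;
lastPos  = min(n, ceil(qHi * n));

idx  = order(firstPos:lastPos);    % row indices in the original column
vals = col1(idx);                  % values taken directly from the data
```

Since the column is sorted only once, the same order vector can be reused for several quantile bands (for example qLo = 0.3, qHi = 0.4), and data(idx, 2:4) would then pick the matching rows of the dependent columns. Bounds whose product with n lands very close to an integer may be affected by floating-point rounding.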

Image Analyst on 27 Aug 2016

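If the values must be in your data, then you can use cumsum() on the count of valid values to create the empirical cdf, then use find to locate the cutoff. Untested code:

[col1, order] = sort(data(:, 1), 'ascend'); % NaNs are sorted to the end
valid = ~isnan(col1);
cdf = cumsum(valid); % Count valid values up to each position
cdf = cdf/cdf(end); % Normalize to get the empirical cdf
% Find the first sorted position in the top 20 %
first = find(cdf > 0.8, 1, 'first');
dataValue = col1(first:nnz(valid)); % Values in the top 20 %
index = order(first:nnz(valid)); % Their row indices in data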
