Split large vector into smaller vectors based on upper and lower limits

Hi all, I'm relatively new to MATLAB and coding in general, so I would love some help with the following problem.
I have a vector containing 6 million data points representing wind speeds ranging from 0 to above 30 m/s. There is also another vector with the corresponding datetimes.
For a job, when the wind speed reaches 20 m/s, I need to find the time taken for the wind speed to reach 30 m/s. So I would like to split the large vector into separate vectors where the first value is 20 and the final value is 30. These occurrences happen several times throughout the data, so I want to create an individual vector each time the wind speed exceeds 20 and reaches 30.
As an additional condition, the wind speed data is very messy, and there are sometimes single outliers which may be above or below the thresholds given above. Therefore, I would like the code to create the first point when there are 2 consecutive points above 20 m/s, and create the final point when there are 2 consecutive points above 30 m/s.
Any help would be greatly appreciated, thanks!

5 Comments

Once you have started a run of > 20, what should be considered to end the run?
18 25 25 25 19 19 25 30 30
If the answer is that two in a row below 20 ends the run, then the 30's would not be counted: you defined a run as starting with at least two consecutive values > 20, and if the 19 19 ends the previous run, then the 25 after that would only be a single value > 20.
If I understand your question correctly, the end of the run should be two consecutive values > 30.
So for example, if the vector is [8,10,18,21,23,...,32,31,...], then the run starts at 21,23 (2 consecutive values > 20) and ends at 32,31. I do not want the start and the end of the run to use the same threshold: once the run has started at 2 consecutive values > 20, it does not matter if there are values below 20, as the run continues until there are 2 consecutive values > 30. I would like each of these occurrences over the whole data range to be put in a separate vector.
I would then be able to look up the corresponding datetime values in the other vector, and therefore measure the time duration of each new vector.
I hope this clears it up better, thanks!
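One possible reading of the rule described above, as a minimal sketch in plain MATLAB. The sample data is made up, the variable names are illustrative, and two things are assumptions rather than anything settled in this thread: a run is taken to end at the first of the two consecutive samples above 30, and nothing ever abandons a run once it has started.

```matlab
% Hypothetical sample data (m/s), not the asker's real measurements
speed = [8 10 18 21 23 19 25 32 31 10 10 21 26 33 33];
speed = speed(:).';          % force a row vector

lo = 20;                     % start threshold
hi = 30;                     % end threshold

% True at k when both sample k and sample k+1 exceed the threshold
above2 = @(v, th) [v(1:end-1) > th & v(2:end) > th, false];
startHere = above2(speed, lo);
endHere   = above2(speed, hi);

runs  = {};                  % one cell per run
idx   = zeros(0, 2);         % [first last] index of each run
inRun = false;
for k = 1:numel(speed)
    if ~inRun && startHere(k) && ~endHere(k)
        inRun = true;
        first = k;                        % first of two consecutive samples > 20
    elseif inRun && endHere(k)
        runs{end+1}   = speed(first:k);   % assumption: end at first of two samples > 30
        idx(end+1, :) = [first k];
        inRun = false;
    end
end
```

The rows of idx can be used to index the datetime vector directly. Note that this sketch does not filter out a run that begins while the wind is still falling back from above 30.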
18 25 25 25 19 19 10 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 26 30 30
It seems clear to me that by the time of the 0's (at latest) the burst > 20 has ended, and that anything after that is a different stretch. If it ended with (say) 0 0 0 0 ... 0 24 26 30 30 then it would be obvious that the 24 26 would be the start of a run and the 30's should go with that 24 26, not all the way back to the original 25's.
So we need rules for when a run of > 20 is considered to end. You have already said there may be outliers and you implied single outliers are not considered to end the run... so should 2 values < 20 in a row be considered to end a run ?
Hi Walter,
I've attached a quick sketch which aims to show the rules for the start and end of each run. I hope this helps explain what I am aiming for.
Thanks!
Hello,
my 2 cents:
Why not first smooth your data a bit (using smoothdata),
then loop through the data and search for these conditions:
1/ the start point is when 2 consecutive samples are in the band 20 +/- 1 (I need a certain tolerance here)
2/ the stop point is when 2 consecutive samples are in the band 30 +/- 1 (I need a certain tolerance here)
3/ this portion of data is valid if it contains at most 1 outlier (or some other quantity, to be decided). The reason for smoothing the data is to get rid of the single-outlier problem.
If you could share a portion of the data, I could try my logic.
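The smoothing-plus-band idea above might look roughly like this. It is only a sketch under assumptions: the data is invented, a 3-point moving median is one choice of smoothdata method among several, and the tolerance of 1 m/s is simply the value suggested in the comment. Condition 3/ (counting remaining outliers) is not implemented.

```matlab
% Hypothetical sample data (m/s) with two single-sample outliers (35 and 12)
speed = [10 12 35 14 16 20 21 25 12 27 30 31 33];

% A 3-point moving median suppresses isolated spikes
s = smoothdata(speed, 'movmedian', 3);

tol = 1;                               % tolerance suggested in the comment
inStartBand = abs(s - 20) <= tol;      % samples within 20 +/- tol
inStopBand  = abs(s - 30) <= tol;      % samples within 30 +/- tol

% True at k when samples k and k+1 are both inside the band
pairStart = inStartBand(1:end-1) & inStartBand(2:end);
pairStop  = inStopBand(1:end-1)  & inStopBand(2:end);

% First start pair, then the first stop pair after it
startIdx = find(pairStart, 1);
k        = 1:numel(pairStop);
stopIdx  = find(pairStop & (k > startIdx), 1);

segment = speed(startIdx:stopIdx);     % raw samples between start and stop
```

A band test like this only fires if two smoothed samples actually land inside the band, so data that jumps across 20 or 30 in a single step would be missed; a plain threshold comparison avoids that.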


Accepted Answer

Matt J on 12 Oct 2021
Edited: Matt J on 12 Oct 2021
Using this File Exchange submission,
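% Example wind speeds (m/s)
vector = [8,29,18,21,23,18,19,25,32,31,10,10,21,26,33,33];
% Bin each sample: 1 = at or below 20, 2 = above 20 up to and including 30, 3 = above 30
D = discretize(vector,[-inf,20,30,inf],'IncludedEdge','right');
D = medfilt1(D,3); % outlier removal (medfilt1 requires the Signal Processing Toolbox)
% Relabel bin-1 samples with the most recent bin 2 or 3 label, so dips below 20 do not break a run
x = (D<2);
D(x) = interp1(find(~x),D(~x),find(x),'previous');
% groupFcn and groupTrue come from the File Exchange submission
Vectors = groupFcn(@(x) {x}, vector, groupTrue(D==2));
Vectors{:}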
ans = 1×5
21 23 18 24 25
ans = 1×2
21 26

6 Comments

"I need to find the time taken for the wind speed to reach 30 m/s"
You don't really need to split the vector into separate vectors if all you want to do is measure their lengths. That can be done more quickly with,
D=discretize(vector,[-inf,20,30,inf],'IncludedEdge','right');
D=medfilt1(D,3); %outlier removal
x=(D<2);
D(x)=interp1(find(~x),D(~x),find(x),'previous');
[~,~,runlengths]=groupLims( groupTrue(D==2) , 1 );
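For readers who do not have the File Exchange functions, run lengths and durations can also be obtained from a logical mask with diff and find. This is a plain-MATLAB stand-in, not a description of what groupLims does internally; the mask and the one-minute timestamps below are hypothetical placeholders for D==2 and the real datetime vector.

```matlab
% Hypothetical mask, standing in for (D==2) from the code above
mask = logical([0 0 1 1 1 1 1 1 0 0 0 0 1 1 0 0]);

% Hypothetical timestamps, one sample per minute
t = datetime(2021,10,12,0,0,0) + minutes(0:numel(mask)-1);

% Rising and falling edges of the mask mark where each run starts and stops
d      = diff([false mask false]);
starts = find(d == 1);          % index of the first sample of each run
stops  = find(d == -1) - 1;     % index of the last sample of each run

runlengths = stops - starts + 1;        % number of samples per run
durations  = t(stops) - t(starts);      % elapsed time per run (duration array)
```

Working from timestamps rather than sample counts gives correct durations even when the sampling interval is not constant.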
Hi Matt, I've given your functions a go and they work pretty closely to what I want to do. I'd just like to ask how I would reject the vectors that have data outside of the specified boundaries? I'd like to produce vectors that only show the relevant data between the boundaries.
Additionally, is there a way to remove vectors which are within the boundaries, but contain descending data (representing the wind slowing down from 30 to 20 m/s)?
Thank you very much.
"I'd just like to ask how I would reject the vectors that have data outside of the specified boundaries?"
You seem to be implying that that's not already the case.
"Additionally, is there a way to remove vectors which are within the boundaries, but contain descending data"
That would depend on your definition of descending. Your data is never purely monotonic as far as I can tell. Perhaps something like this:
for i=1:numel(Vectors)
    if Vectors{i}(end)<Vectors{i}(1)
        Vectors{i}=[];
    end
end
Vectors=Vectors(~cellfun('isempty',Vectors));
An easy way to define a vector as descending in my case would be if the first value of the vector is larger than the final value.
Then my example should be what you need.
Thanks Matt, works perfectly! Very much appreciated!
