How can I identify a pattern of occurrences over multiple days?

3 views (last 30 days)
Peter on 12 Apr 2012
Hello all,
I am attempting to write a script that will look for a pattern of event occurrences over multiple days of data. Seems like it should be simple enough, yet I am scratching my head.
I want to identify spans of time of a minimum of five days where the event occurred on at least 5/7 of the days. For example, if the event occurred on all five weekdays, then did not occur over the weekend, then occurred again on the next 3 weekdays, I would want to return an index of all 10 of those days. A week later (perhaps after some random occurrences in between) if the event occurred for 3 days in a row, skipped a day, then occurred on the 5th day, then I would want a separate set of indices for this pattern.
The input: an array containing the date of each event as a whole-number datenum, e.g.:
dates= [734841 734842 734843 734844 734845 734848 734849 734850 734859 734860 734861 734863]
The output: a structure containing the indices of the members of each separate pattern, e.g.:
patternStructure(1).index = [1 2 3 4 5 6 7 8 9 10]
patternStructure(2).index = [20 21 22 23 24]
Thanks,
Peter

Answers (1)

Geoff on 13 Apr 2012
Well, what you could say is that an event is in the required set if the date of the current event minus the date four events earlier is less than 7 days, because then five events fall within a seven-day window. That is:
in = (dates(5:end) - dates(1:end-4)) < 7;
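To see what this comparison produces, here is a minimal sketch that applies it to the sample dates from the question. It assumes dates is sorted ascending with no repeated days; the name span is introduced only for illustration.

```matlab
% Sample data from the question (whole-number datenums, sorted ascending).
dates = [734841 734842 734843 734844 734845 734848 734849 734850 ...
         734859 734860 734861 734863];

% Days elapsed between each event and the event four positions earlier.
% 'span' is an illustrative name, not part of the original answer.
span = dates(5:end) - dates(1:end-4);   % [4 6 6 6 14 12 12 13]

% in(k) is true when events k..k+4 (five events) fit inside a 7-day window.
in = span < 7;                          % [1 1 1 1 0 0 0 0]
```

Each true entry in(k) marks the first of five events that satisfy the 5-out-of-7 rule.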
Here, too, you can exploit regexp: convert the logical vector to a string of '0' and '1' characters, then find the start and end indices of each run of ones:
[s,e] = regexp( char(in+'0'), '1+', 'start', 'end' );
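If you would rather not go through a string, the same run boundaries can also be found with diff. This is a swapped-in alternative, not the method used in this answer; a minimal sketch, assuming in is a logical row vector (the value below is just an example flag vector with two runs):

```matlab
% Example flag vector with two runs of ones (illustrative value only).
in = logical([1 1 1 1 0 0 0 0 1]);

% Pad with zeros so runs touching either end still produce both edges.
d = diff([0, in, 0]);

s = find(d == 1);        % first flagged position of each run
e = find(d == -1) - 1;   % last flagged position of each run
```

For this example vector, s and e take the same values the regexp call returns.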
And then, since each flagged position k covers events k through k+4, extend each end index by 4 to construct an array of indices:
patternStructure = arrayfun( @(n) struct('index', s(n):e(n)+4), 1:numel(s) );
But now, these are indices into dates, and not actual date ranges. Your expected result is a little strange, given your data.
See, the indices for the first detected pattern in your dates array are 1:8, not 1:10, but dates(8)-dates(1)+1 is indeed 10. This is the only range in your supplied data that fits the requirement. For testing, I added:
dates(end+1) = 734864;
That gave a 5-out-of-6 pattern at indices 9:13.
Anyway, this code will detect your patterns, and it's up to you what you want to do with the indices after that =)
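For reference, the steps can be put together into one runnable sketch that also records the first and last date of each pattern, in case date ranges are what you are after. It uses the sample data plus the extra test date, and builds the struct array in a plain loop instead of arrayfun. The fields firstDate and lastDate are illustrative additions, not part of the code above, and the sketch assumes dates is sorted ascending with no repeated days.

```matlab
% Sample data from the question plus the extra test date (734864).
dates = [734841 734842 734843 734844 734845 734848 734849 734850 ...
         734859 734860 734861 734863 734864];

% Flag every window of five consecutive events that spans fewer than 7 days.
in = (dates(5:end) - dates(1:end-4)) < 7;

% Locate runs of flagged windows; each run is one pattern.
[s, e] = regexp(char(in + '0'), '1+', 'start', 'end');

% Build the result in a loop; the last flagged window reaches 4 events further.
% 'firstDate' and 'lastDate' are illustrative fields (assumptions).
patternStructure = struct('index', {}, 'firstDate', {}, 'lastDate', {});
for n = 1:numel(s)
    idx = s(n):(e(n) + 4);
    patternStructure(n).index     = idx;
    patternStructure(n).firstDate = dates(idx(1));
    patternStructure(n).lastDate  = dates(idx(end));
end
```

With this input the loop yields two patterns, at indices 1:8 and 9:13. Passing firstDate or lastDate to datestr displays the value as a calendar date.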
