Find a row of repeated values?
Hi, first off I am very new to Matlab. I asked a similar question yesterday but I don't think I was specific enough so I'm going to try again...
I am analyzing data that comes from a truck, such as engine speed, vehicle speed, etc. The data is collected randomly throughout the day for about 3 hours (10,000 seconds). I am trying to write a script that will detect any repetition in the data that lasts for 60 seconds or more, because if that happens, there is something wrong with the sensors in the truck.
For example, if for engine speed I had 10,000 points ranging from 0-2100, and for 60 seconds in a row the data was stuck on 1,000, how would I write a script to detect this and say there is an error?
I appreciate any help I can get! Thanks
Accepted Answer
Evan
on 3 Jul 2013
Edited: Evan
on 3 Jul 2013
Hi. Sorry for not getting back to your comment on my answer yesterday. Here is how I would do it:
First, some random data for my example:
data = 2100*rand(1,10000); %random dataset
Next, I'll make a few sections of data repetition:
data(1,50:120) = 79.356; %set some data to constant value
data(1,200:210) = 81.220; %set some data to constant value
data(1,400:520) = 1445.201; %set some data to constant value
data(1,900:948) = 0.113; %set some data to constant value
Now do the differencing. Runs of zeros in the differenced data are potential problem areas. The ~ (logical NOT) operator converts the result to a logical vector: wherever diff returned zero (no change) we get "true," and everywhere else we get "false." Because diff returns one fewer element than its input, we now have a 9,999-element logical vector with sections of ones and zeros, and the ones are repetitions.
datarep = ~diff(data);
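To see what the differencing step produces, here is a small sketch on a made-up eight-free toy vector of six values (the numbers are only for illustration and are not from the truck data):

```matlab
x = [5 5 5 7 8 8];      % toy data: a run of three 5s and a run of two 8s
d = diff(x);            % [0 0 2 1 0], one element shorter than x
rep = ~d;               % logical [1 1 0 0 1], true where x(i+1) == x(i)
```

Note that a stretch of n identical samples shows up as n-1 ones, because each one marks a pair of neighbouring samples.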
Now here is where I search for the runs of ones. Like I said, there are definitely other ways of doing this, including using a for loop, but I find this to be the most compact and simple way I've come across. I'll split it up into steps instead of jamming it all together like I did yesterday.
First, turn your differenced vector into a string:
datarepstr = num2str(datarep); %convert to string
Turning a vector into a string puts spaces between the numbers, so we'll use a "regular expression replace" function to get rid of them and leave us just the ones and zeros. The function finds every ' ' in our string and replaces it with ''.
s = regexprep(datarepstr,' ',''); %remove spaces
Now we want to find where all the ones are in the string, as well as how long each section of ones is. regexp searches our string for all cases where there are one or more consecutive ones, or '1+'. Our expression should find four different sections of ones (because that's how many runs of repetition I added). "ids" is the start index of each section and "runs" is the section itself, pulled out from the string.
[ids, runs] = regexp(s,'1+','start','match'); %find all runs and the point where they start
These values are returned in cell arrays. cellfun is a function that performs another function (in this case, length) on each cell of an array. It's like looping over each element but more compact. l should have four elements telling how long each run is.
l = cellfun('length',runs); %find the length of each run
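As a quick check of the string steps, here is a minimal sketch that runs them on a toy vector small enough to verify by hand (the values in x are made up for illustration):

```matlab
x = [5 5 5 7 8 8 8 8];                          % toy data: three 5s, one 7, four 8s
s = regexprep(num2str(~diff(x)),' ','');        % '1100111'
[ids, runs] = regexp(s,'1+','start','match');   % ids = [1 5], runs = {'11','111'}
l = cellfun('length',runs);                     % l = [2 3]
```

Each start index in ids refers to the position in the differenced vector, which is also the index of the first repeated sample in x.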
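Now we have everything we need in order to check our potential problem runs for ones that cross the line. It all depends on the frequency of your sampling. If it's one data point every second, we'll see if any of our lengths are greater than sixty. If it's every half second, we'll look for >120. And so on. Keep in mind that a run of n ones in the differenced vector corresponds to n+1 identical samples, so use >= instead of > if a stretch of exactly 60 seconds should also be flagged.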
if any(l > 60) %if any run is longer than 60, display message
    disp('Error')
end
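If the sampling rate is not one point per second, the threshold can be derived instead of hard-coded. This is only a sketch: the names fs, stuckSeconds and minRun are hypothetical, the rate of 2 samples per second is an assumption, and the run lengths in l are made-up stand-ins for the cellfun output.

```matlab
fs = 2;                        % assumed sampling rate in samples per second
stuckSeconds = 60;             % duration that counts as a stuck sensor
minRun = stuckSeconds * fs;    % threshold on run length in the differenced vector
l = [20 96 140 240];           % made-up run lengths for illustration
isStuck = any(l > minRun);     % true here: 140 and 240 both exceed 120
if isStuck
    disp('Error')
end
```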
Of course, you may want more info than that in your message. You may also want to stop execution of your program, in which case you would call error instead of disp. You may also want to report which elements are the problematic repetitions, and you can do that, because you have the lengths of the runs in l and the indices of where each run starts in ids.
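One possible way to report the problematic stretches from ids and l is sketched below. The message format and the variable names bad, first and last are my own choices for illustration, and the example data is shortened to 1,000 points with two planted runs.

```matlab
data = 2100*rand(1,1000);        % example data, as in the walkthrough
data(1,50:120) = 79.356;         % 71 identical samples (long run)
data(1,200:210) = 81.220;        % 11 identical samples (short run)
[ids, runs] = regexp(regexprep(num2str(~diff(data)),' ',''),'1+','start','match');
l = cellfun('length',runs);
bad = find(l > 60);              % which runs exceed the threshold
for k = bad
    first = ids(k);              % first sample of the repeated stretch
    last = ids(k) + l(k);        % last sample: l ones span l+1 samples
    fprintf('Value %g repeats from sample %d to %d\n', data(first), first, last);
end
```

Replacing fprintf with error would stop execution at the first offending run instead of listing them all.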
Finally, here's the whole thing, now in a very compact form:
[ids, runs] = regexp(regexprep(num2str(~diff(data)),' ',''),'1+','start','match');
l = cellfun('length',runs);
if any(l > 60)
    disp('Error')
end
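For comparison, here is a sketch of one of the "other ways" that skips the string conversion. This is a different technique from the regexp approach above, not a restatement of it: it pads the repetition vector with zeros and differences it again to find where runs start and stop. The variable edges is a name I made up, and only two runs are planted in the example data.

```matlab
data = 2100*rand(1,10000);         % random dataset
data(1,50:120) = 79.356;           % 71 identical samples
data(1,400:520) = 1445.201;        % 121 identical samples
rep = ~diff(data);                 % true where a sample equals its predecessor
edges = diff([0 rep 0]);           % +1 where a run starts, -1 just past its end
ids = find(edges == 1);            % start index of each run
l = find(edges == -1) - ids;       % length of each run
if any(l > 60)
    disp('Error')
end
```

The outputs ids and l have the same meaning as in the string version, so the threshold check is unchanged.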