How can I isolate data from a large input file?

I have a number of large data files with approximately 8,000,000 rows and 10 columns. The data is taken from a train and monitors various inputs over a number of days. The 10th column indicates the direction of the train: 1 and -1 for the two directions of travel, and 0 when the train is at a standstill.
Each time the train changes direction I would like to be able to create a new variable that stores all the following data until the next direction change.
I am able to do this manually, by examining the data and finding the index where a direction change is indicated, i.e. 1 becomes -1. I would like to make a process that could automate this.
Any help would be greatly appreciated.

1 Comment

As usual, a short, meaningful example would reveal the important details. Neither the meaning of the variables (MATLAB does not care whether this is a train, a price or a temperature) nor the fact that it is the 10th column matters. So perhaps your question could be simplified to:
x = [1 1 1 0 1 0 0 -1 -1 1 0 -1 0 -1 0 1]
How can I find indices of changes from -1 to +1 and vice versa ignoring the zeros?


Accepted Answer

dpb on 15 Oct 2013
Edited: dpb on 15 Oct 2013

1 vote

I suggest not creating a new variable for each section but indexing into the existing one instead.
That scheme is easy to work with.
To find the direction changes, use
ixdir=find(abs(diff(x(:,10)))==2)+1; % first row of each new direction
The first direction section is then 1:ixdir(1)-1, the second is ixdir(1):ixdir(2)-1, and so on. Processing those in sequence is quite easy with the indices, without needing separate variables.
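
As a rough sketch of how those indices might be used, the snippet below walks through the sections of a small made-up direction vector that contains no standstill zeros. The names d, starts, stops and segLen are illustrative and not from the original post; for the real data, d would be x(:,10).

```matlab
% Hypothetical sample standing in for x(:,10); assumes no standstill zeros.
d = [1 1 1 -1 -1 1 1 -1].';

% First row of each new direction section (a step of +/-2 marks a reversal).
ixdir = find(abs(diff(d)) == 2) + 1;

% Section k runs from starts(k) to stops(k), inclusive.
starts = [1; ixdir];
stops  = [ixdir - 1; numel(d)];

segLen = zeros(numel(starts), 1);
for k = 1:numel(starts)
    rows = starts(k):stops(k);   % row indices of section k
    segLen(k) = numel(rows);     % placeholder for the real per-section processing
end
```

Inside the loop, x(rows,:) would give all ten columns for one section, so no per-section variables are needed.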

5 Comments

Rob on 15 Oct 2013
Thank you, dpb, for the reply. I have used the code but am only getting an empty matrix from the find(abs(diff......) . I will keep playing around with the code and see what I can produce. If you have any other pointers, that would be great.
Many thanks
dpb on 15 Oct 2013
Edited: dpb on 15 Oct 2013
Yeah, my bad... I wrote the code assuming the +/-1 values sit directly next to each other, forgetting about the zeros when the train is standing still (and I presume that even a plain reversal of direction would probably produce one or more zeros as well).
One question before an actual solution: do you want to keep the standing-still data with the preceding direction data, before the next moving section, or discard the stationary data?
Rob on 17 Oct 2013
At the moment I have made two versions of my data, one with all of the standing-still 0's removed and one with them left in. It would be better to use the version with the standing-still data in, but I can probably perform the same calculations without them.
Thanks again
Rob on 17 Oct 2013
With the 0's removed, the original solution works fine!
dpb on 17 Oct 2013
Yeah, that was the basis I was working on...
There has to be a way with the zeros included that is also pretty concise, but at the moment the neatest "trick" eludes me. I'm thinking that if you substitute +/-1 for each zero based on the preceding sign, then the above works as well; I just haven't got a one-liner for the substitution yet.


More Answers (3)

sixwwwwww on 15 Oct 2013

1 vote

Dear Rob, here is the solution to your problem:
A = [0 0 0 0 0 0 1 0 0 0 0 0 0 -1 0 0 0 0 0 0 1 0 0 -1 0 0 1];
indx = [1 find(A)];                  % start of the data plus every nonzero entry
B = cell(1, length(indx) - 1);       % one cell per section
for i = 1:length(indx) - 1
    B{i} = A(indx(i):indx(i + 1));   % from one marker to the next, inclusive
end
Now replace A with your 10th column and it should work fine. This assumes that 1 and -1 alternate, separated by zeros, as in the vector A. I hope it helps. Good luck!

1 Comment

dpb on 15 Oct 2013
The difficulty here is that, if I understand correctly, every moving sample is +/-1 regardless of direction, not just the first sample of a move.


Jan on 17 Oct 2013

1 vote

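You can first replace each zero with the preceding nonzero value:
x = [1 1 1 0 1 0 0 -1 -1 1 0 -1 0 -1 0 1];
idx = (x ~= 0);          % true where the train is moving
x2 = x(idx);             % the nonzero values only
xf = x2(cumsum(idx));    % each zero takes the last nonzero value before it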
Now strfind can look for [1, -1] and [-1, 1] in xf, or you can use diff(xf) and search there.
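
To make the diff(xf) route concrete, here is a minimal sketch on the same sample vector. The name chg is an illustrative choice, and the sketch assumes the record starts with a nonzero value.

```matlab
x = [1 1 1 0 1 0 0 -1 -1 1 0 -1 0 -1 0 1];

% Forward-fill the zeros. Assumes x(1) is nonzero; otherwise cumsum(idx)
% starts at 0 and the indexing below fails.
idx = (x ~= 0);
x2  = x(idx);
xf  = x2(cumsum(idx));

% A nonzero step in xf marks a reversal; +1 gives the first sample
% of the new direction.
chg = find(diff(xf) ~= 0) + 1;
```

For this sample chg is [8 10 12 16]. Because the zeros are filled forward, each standstill stays attached to the direction that preceded it.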

dpb on 18 Oct 2013

0 votes

It finally came to me!!! :)
Actually, I was looking at it wrong: to find the beginning of a movement you don't care which direction the move is in, only that it's a change from stopped.
Hence, the index you want is
idx=find(diff(abs(v))==1)+1; % all the points of start from stop
The direction is
sign(v(idx))
where v is the direction column in your data, of course.
This finds the starts embedded within the data; if the train is already moving at the beginning of the record, that first section is discarded as an incomplete record. If you want that one too, prepend a zero to the v vector before doing the diff() and then drop the +1 correction.
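
A small sketch of both variants, run on the sample vector from the comment under the question. The names dirn and idx0 are illustrative only.

```matlab
v = [1 1 1 0 1 0 0 -1 -1 1 0 -1 0 -1 0 1];

% First moving sample after each stop.
idx  = find(diff(abs(v)) == 1) + 1;
dirn = sign(v(idx));               % direction of each of those moves

% Variant with a prepended zero: also counts a record that starts
% while the train is already moving, and needs no +1 correction.
idx0 = find(diff([0, abs(v)]) == 1);
```

Here idx is [5 8 12 14 16] with directions [1 -1 -1 -1 1], and idx0 additionally contains 1. The jump from -1 to 1 at sample 10 is not among them, since no zero precedes it.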

2 Comments

Rob on 18 Oct 2013
This looks very interesting. I will give it a whirl and be sure to let you know how it goes! Thanks again
dpb on 18 Oct 2013
Edited: dpb on 18 Oct 2013
OK, one other caveat: it does require at least one "stopped" measurement between reversals of direction, because the above doesn't find the +/-2 points. I presumed a direct reversal isn't possible given the sample frequency compared with how quickly the train can actually reverse. If it is possible, "or" abs(diff(...))==2 with the above before find() and you'll have both. Note that this test has to use the signed data, because a direct reversal disappears once abs() has been applied.
That is, specifically,
find(diff([0 abs(v)])==1 | [0 abs(diff(v))==2])
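
The sketch below applies that combined test to the sample vector and then, as an extra step that is not part of the answer above, keeps only the starts whose direction differs from the previous one, so the zeros stay in the data. The names starts, dirs, keep, dirStarts, stops and segs are all illustrative.

```matlab
v = [1 1 1 0 1 0 0 -1 -1 1 0 -1 0 -1 0 1];

% Starts from standstill OR direct reversals (step of +/-2 in the signed data).
starts = find(diff([0, abs(v)]) == 1 | [false, abs(diff(v)) == 2]);

% Assumed extra step: drop restarts in the same direction as the previous move.
dirs      = v(starts);
keep      = [true, diff(dirs) ~= 0];
dirStarts = starts(keep);

% Each direction section runs up to the sample before the next one starts.
stops = [dirStarts(2:end) - 1, numel(v)];
segs  = cell(1, numel(dirStarts));
for k = 1:numel(dirStarts)
    segs{k} = v(dirStarts(k):stops(k));
end
```

For this sample starts is [1 5 8 10 12 14 16] and dirStarts is [1 8 10 12 16]. With the real data, indexing x(dirStarts(k):stops(k),:) would return every column for section k.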


Asked: Rob on 15 Oct 2013

Commented: dpb on 18 Oct 2013
