I have a vector of N numbers with a downward/upward/downward trend (it comes from a data set):

A=[4 3 1 0 1 2 5....]

I want to fill in the missing values so that consecutive elements differ by 1, for example:

B=[4 3 2 1 0 2 3 4 5 ...]

How can I obtain vector B from vector A?

arich82
on 10 Nov 2015

Edited: arich82
on 10 Nov 2015

I'm assuming that there's a typo in your expected result, and that B should be

B=[4 3 2 1 0 1 2 3 4 5 ...]

A simple (but ugly) for loop can do this, e.g.

A = [4 3 1 0 1 2 5];

B = A(1);

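for k = 2:numel(A)

    d = diff(A(k - 1:k)); % step from the previous element to the current one

    if abs(d) <= 1

        B(end + 1) = A(k); % adjacent or repeated value: nothing to fill in

    else

        B = [B(1:end - 1), B(end):sign(d):A(k)]; % fill the gap in unit steps

    end

end % for k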

Note that the above is terribly inefficient if numel(A) is large, since B is growing inside the loop at each iteration.
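
For what it's worth, the growth can be avoided by computing the final length up front. The following is only a sketch of that idea, not a tuned implementation; it assumes A is an integer-valued row vector, and the names n, pos and seg are purely illustrative.

```matlab
A = [4 3 1 0 1 2 5];

d = diff(A);

% Each step contributes abs(d) new elements (or one, for a repeated value).
n = sum(max(abs(d), 1)) + 1;

B = zeros(1, n); % preallocate instead of growing B inside the loop
B(1) = A(1);
pos = 1;         % index of the last element of B filled so far

for k = 1:numel(d)
    if d(k) == 0
        seg = A(k + 1);                                 % repeated value, kept as-is
    else
        seg = (A(k) + sign(d(k))):sign(d(k)):A(k + 1);  % unit steps up to A(k + 1)
    end
    B(pos + (1:numel(seg))) = seg;
    pos = pos + numel(seg);
end % for k
```

For the example above this should give the same B as the simple loop, i.e. [4 3 2 1 0 1 2 3 4 5].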

If your data sets are large, or if you'll be running this numerous times, there are potentially more efficient (though also more obscure) answers. For example, a variation on run-length encoding:

A = [4 3 1 0 1 2 5];

d = diff(A);

val = sign(d);

len = abs(d);

ind = [0, cumsum(len)] + 1; % add one for 1-based indexing; note: A == B(ind);

n = ind(end); % note: numel(B) == sum(abs(diff(A))) + 1 == n;

mask = false(1, n-1);

mask(ind(1:end-1)) = true; % ind(end) == numel(B), not start of new phase

diffB = val(cumsum(mask)); % cumsum(mask) gives the rle phase number, i.e. index into val

B = A(1) + cumsum([0, diffB]);

Note that this method is NOT robust against repeated values in A, i.e. it fails if any(diff(A) == 0). It also assumes that A is integer-valued, since the differences are used as run lengths. A workaround for the repeats would be to sanitize A first, e.g. A = A([true, abs(diff(A)) > 0]), but this would obviously throw away data if the repeats are significant.
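
As an aside, if repelem is available (it was added in R2015a, as far as I know), the step sequence can be built more directly. This is a sketch under the same integer-valued assumption. Because a run of length zero contributes nothing, repeated values in A are silently collapsed, which has the same effect as the sanitizing workaround above.

```matlab
A = [4 4 3 1 1 0 2]; % example with repeated values

d = diff(A);

% Repeat each step direction abs(d) times; zero-length runs simply vanish.
diffB = repelem(sign(d), abs(d));

B = A(1) + cumsum([0, diffB]);
% B is [4 3 2 1 0 1 2]
```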

Please accept this answer if it works for you, or let me know in the comments if I've missed something.

arich82
on 13 Nov 2015

Not sure why I couldn't open the .fig file you posted, but I see the .jpg from your other post here.

I now understand how the two questions are related. For posterity, I'll post two addenda here, and post a full answer in the other question, linked above.

To open the .fig file, I first renamed it to curve.mat, and loaded it. This allowed me to load the x and y data using the following:

data = load('curve.mat');

x = data.hgS_070000.children(1).children.properties.XData;

y = data.hgS_070000.children(1).children.properties.YData;

From the x data, I see that you do indeed need a method which tolerates repeated values in A, i.e. when any(diff(A) == 0) returns true. A simple modification of the run-length encoding scheme does this:

diffx = diff(x);

val = sign(diffx);

len = abs(diffx) + 1; % add one so that diff == 0 still counts as a new phase; the extra one is subtracted again in ind for nonzero steps

ind = [0, cumsum(len) - cumsum(abs(val))] + 1; % add one for 1-based indexing; note: x == X(ind);

n = ind(end); % note: numel(X) == sum(max(abs(diff(x)), 1)) + 1 == n;

mask = false(1, n-1);

mask(ind(1:end-1)) = true; % ind(end) == numel(X), not start of new phase

diffX = val(cumsum(mask)); % cumsum(mask) gives the rle phase number, i.e. index into val

X = x(1) + cumsum([0, diffX]);

K = 1:numel(X);

Y = interp1(K(ind), y, K);

I haven't had a chance to thoroughly vet this, so there might be an error. The idea is that X now changes by at most 1 between consecutive samples (though it is still non-monotonic), so we can interpolate against the index instead, i.e. treat X and Y as parameterized by K, which is monotonic. I have to run now, but I'll post a full solution in the other thread tomorrow.
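
To make the idea concrete, here is the same scheme run on a small made-up data set. The x and y values below are hypothetical stand-ins for the curve data, chosen so that x contains a repeated value, and x is again assumed to be an integer-valued row vector.

```matlab
x = [4 3 1 1 0 2];       % hypothetical data with a repeated value
y = [10 20 40 50 60 80]; % hypothetical values paired with x

diffx = diff(x);
val = sign(diffx);
len = abs(diffx) + 1;                          % diff == 0 still gets a phase of length 1
ind = [0, cumsum(len) - cumsum(abs(val))] + 1; % positions of the original samples in X
n = ind(end);

mask = false(1, n - 1);
mask(ind(1:end - 1)) = true;  % start of each phase
diffX = val(cumsum(mask));    % step direction at every position
X = x(1) + cumsum([0, diffX]);

K = 1:numel(X);
Y = interp1(K(ind), y, K);    % linear interpolation against the index

% X is [4 3 2 1 1 0 1 2]
% Y is [10 20 30 40 50 60 70 80]
```

The original samples are recovered at X(ind) and Y(ind), and the filled-in points get linearly interpolated y values.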

I hope this helps.
