How can I extract time series data from cell matrix and put all the data together in one matrix with corresponding date and fund ID? See Example here

5 Ansichten (letzte 30 Tage)
I am a new user of Matlab so not aware of the terminology used here. I am working on Mutual Funds analysis.
I have data in a cell named ndata with each cell having 3 columns: Date, Fund-ID, and Returns. Total Unique dates are 465, and total unique Funds are 305. The problem is that most funds start and end on different dates.
The dimensions of ndata are {1,305} with each cell having data of a different Fund. Each cell has dimensions (n,3) where n varies. First column of every cell i.e. ndata{1,i}(:,1) contains Dates. Now I would like to merge all the cells in one matrix, with Dates in first Column, and the Returns in subsequent columns, with every column representing a different Fund, and every row a different date. The Return should correspond to the unique date and unique Fund.
This Example might help explain my problem better:
ndata=cell(1,3);
ndata{1,1} = [1, 101, 4; 2, 101, 4.5; 3, 101, 3];
ndata{1,2} = [2, 202, 1; 3, 202, 1.1; 4, 202, 1.2; 5, 202, 3.2;];
ndata{1,3} = [3, 303, 4.1; 4 303, 4.9];
Funds = [101 202 303];
Dates = [1;2;3;4;5];
I would like to see something like: Returns:
1st Row is: Col1=Date Col2= 101 Col3= 202 Col4= 303
and then
[1, 4, NaN, NaN; 2, 4.5, 1, NaN; 3, 3, 1.1, 4.1; 4, NaN, 1.2, 4.9; 5, NaN, 3.2, NaN]
The code I tried for this kind of example is:
Returns = NaN * ones(5+1, 3+1);
Returns(2:end,1) =Dates; %from the date vector which is same as 1st column of answer I'm looking for i.e. Return Matrix above
Returns(1,2:end) =Funds; %from the FundID vector
for i = 1:3
for j = 2:6
for k=1:size(ndata{1,i},1);
if any(Returns(j,1)==(ndata{1,i}(:,1)));
Returns(j:size(ndata{1,i},1)+j-1,i+1) = (ndata{1,i}(:,3));
end
end
end
end
Can anyone help. I'd really appreciate.
  2 Kommentare
Jan
Jan am 3 Jul. 2013
@Sunny: This is your first question in this forum and you have formatted the text and the code properly, added a useful piece of code and all required information. This is a nice example for an efficient posting and therefore I've voted for it: This is my question of the week! Thanks.

Melden Sie sich an, um zu kommentieren.

Akzeptierte Antwort

Sunny
Sunny am 9 Jul. 2013
Bearbeitet: Sunny am 9 Jul. 2013
Hi all;
I tried a bunch of codes, and I came up with a very simple loop which does what I wanted it to do.
Returns=nan(size(Dates,1)+1,size(Funds,2)+1); %Main Matrix I wanted to construct with Dates in 1st Column and FundID in 1st Row
Returns(1,2:end)=Funds;
Returns(2:end,1)=Dates;
rets=nan(size(Dates,1),size(Funds,2)); %Just contains Returns in front of their dates and FundID
for k=2:size(Dates,1)+1;
for i =1:size(Funds,2);
for j = 1:size(ndata{:,i},1);
if ndata{:,i}(j,1)==Dates(k-1,1)
rets(k-1,i)=ndata{:,i}(j,3);
end
end
end
end
Returns(2:end,2:end)=rets;
I'm sure there are many other ways to achieve this as well, but this one works for me. If you have a better, faster way, please share.
Thanks
Sunny

Weitere Antworten (3)

Guru
Guru am 3 Jul. 2013
Well, using the knowledge that the number of funds is the size of ndata and the rows for each fund that has data is based on the Dates within each cell, we can actually preallocate a NaN array and simplify this process a bit.
% Preallocate a double array Returns of all NaNs and then fill in Dates and Funds
Returns = NaN(length(Dates)+1, length(ndata)+1);
Returns(2:end,1) = Dates;
Returns(1,2:end) = Funds
% Use the Dates we have in each cell of ndata to assign the Fund data
for n = 2:length(ndata)+1
% Extract the dates that we have for that Fund
D = ndata{n-1}(:,1);
Returns(D,n) = ndata{n-1}(:,3);
end
HTH!
  1 Kommentar
Sunny
Sunny am 3 Jul. 2013
Bearbeitet: Sunny am 3 Jul. 2013
Thanks for the reply Guru. I tried your code and it gives me a matrix similar to what I was getting with another code i.e.:
[NaN, 101 , 202, 303; 1, 4, NaN, NaN; 2, 4, 1, NaN; 3, 4, 1, 4.1; 4, 4.5, 1, 4.1; 5, 3, 1, 4.9; 0, 0, 1.1, 0; 0, 0, 1.2, 0; 0, 0, 3.2, 0];
Instead, I am looking for a code that returns:
[NaN, 101 , 202, 303; 1, 4, NaN, NaN; 2, 4.5, 1, NaN; 3, 3, 1.1, 4.1; 4, NaN, 1.2, 4.9; 5, NaN, 3.2, NaN];
The problem with your code is that the observations are not aligned correctly with the Dates, i.e. not in same rows as the Date they were recorded on. Although it is aligned with the correct Fund, i.e in same column as the Fund. Please let me know if anyone can help. Thanks in advance.

Melden Sie sich an, um zu kommentieren.


dpb
dpb am 4 Jul. 2013
Bearbeitet: dpb am 8 Jul. 2013
CORRECTION -- THE ISMEMBER search wasn't right in initial, sorry. The following corrected version produces the desired results...
A "deadahead" solution...can undoubtedly be more fully vectorized w/ cellfun and friends of similar thought but didn't have time to think it out...
MATL
r=sortrows(vertcat(d{:}),[1 2]); % make single array by date, stock
s=splitvec(r(:,2),'value','bracket'); % Bruno's function @ FEX
u1=unique(r(:,1); u2=unique(r(:,2); % find how many each
res=nan(length(u1))+1,length(u2))+1); % preallocate
res(1,2:end)=u2; % load the IDs on first row
res(2:end,1)=u1; % and the times in first column
for i=1:length(s)
[i1,i2]=dealvec(s(i,:));
[~,ib]=ismember(r(i1:i2,2),u2);
res(i+1,ib+1)=r(i1:i2,3)';
end
res =
NaN 101 202 303
1.0000 4.0000 NaN NaN
2.0000 4.5000 1.0000 NaN
3.0000 3.0000 1.1000 4.1000
4.0000 NaN 1.2000 4.9000
5.0000 NaN 3.2000 NaN
>>
...
NB: I renamed Bruno's function here to splitvec() instead of keeping CamelCase as Matlab doesn't use it and I have too short a memory retention span to recall what is/isn't if were to try to...suit yourself on that--just be aware that above won't work until you do one or t'other--i.e., either change the call to match Bruno's capitalization or rename to all lowercase to match my spelling.
Basically just pulls a subset vector from the list of dates separated by those dates and then finds their location in the columns that are preloaded.
I also used a little utility routine of mine called splitvec that is convenient when a short variable name is less typing than a repeated indexing expression.
MATL
function varargout = dealvec(x)
varargout = num2cell(x);
If you don't want to use it, it's equivalent is s1=s(i,1); s2=s(i,2);
Sorry didn't catch the indexing snafu initially; I just assumed I was good... :)

dpb
dpb am 8 Jul. 2013
OK, I recast as a function and tested so for a single shorter reporting I'll post it as a new answer even though is same algorithm...
MATL
>> % raw data...
>> ndata{:}
ans =
1.0000 101.0000 4.0000
2.0000 101.0000 4.5000
3.0000 101.0000 3.0000
ans =
2.0000 202.0000 1.0000
3.0000 202.0000 1.1000
4.0000 202.0000 1.2000
5.0000 202.0000 3.2000
ans =
3.0000 303.0000 4.1000
4.0000 303.0000 4.9000
>> % function results...
>> collectdata(ndata)
ans =
NaN 101.0000 202.0000 303.0000
1.0000 4.0000 NaN NaN
2.0000 4.5000 1.0000 NaN
3.0000 3.0000 1.1000 4.1000
4.0000 NaN 1.2000 4.9000
5.0000 NaN 3.2000 NaN
>>
>> % function code
function res=collectdata(d)
% Given cell array of nx3 per cell of date, stockID, return as
% an array collecting all returns by date and stockID with missing
% values as NaN
% splitvec is Bruno Luong's FEX submission splitVec
% dealvec is dpbozarth's utility routine
% function varargout = dealvec(x)
% varargout = num2cell(x);
r=sortrows(vertcat(d{:}),[1 2]); % make single array by date, stock
s=splitvec(r(:,1),'value','bracket'); % Bruno's function @ FEX
u1=unique(r(:,1)); u2=unique(r(:,2)); % find how many each
res=nan(length(u1)+1,length(u2)+1); % preallocate
res(1,2:end)=u2; % load the IDs on first row
res(2:end,1)=u1; % and the times in first column
for i=1:length(s)
[i1,i2]=dealvec(s(i,:));
[~,ib]=ismember(r(i1:i2,2),res(1,:));
res(i+1,ib)=r(i1:i2,3)';
end
end
  1 Kommentar
Sunny
Sunny am 9 Jul. 2013
Bearbeitet: Sunny am 9 Jul. 2013
I'm sorry but I'm having a hard time understanding your code and running it. As I mentioned, I am pretty new to MATLAB, so not sure how to run the functions.
I copied Bruno's function, and saved it as a function. Then I ran your code, but nothing. Could you please explain how to run your code, perhaps by keeping the same variable names as used above in the main question.
Thanks
Sunny

Melden Sie sich an, um zu kommentieren.

Kategorien

Mehr zu MATLAB finden Sie in Help Center und File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by