Multivariate regression betas stock returns
Emily Read
on 5 Jul 2019
Commented: Emily Read
on 11 Jul 2019
I have a matrix of stock returns, 4280 (rows) by 7379 (columns).
I need to regress the 1st to 15th cell in column 3 (1,3) against the 1st to 15th cells in columns 1 and 2 (1,1), (1,2).
Then the 2nd to 16th cell in column 3 (2,3) against the 2nd to 16th cells in columns 1 and 2 (2,1) and (2,2) etc. So like a rolling window.
The columns remain constant, but each variable window (i:i+14) goes down by one row.
Once the whole column is done, it needs to go back up to the 4th column, but continue regressing against columns 1 and 2. Then the 5th column against columns 1 and 2, etc.
This is the code I have so far:
Nwin = 15; % set window for rolling regression
betasMkt = ret.*NaN; betasMkt(:,1) = ret(:,1); % initialize empty matrix for Market betas
betasDisp = ret.*NaN; betasDisp(:,1) = ret(:,1); % initialize empty matrix for CSAD betas
for j = 1:size(ret,2)-2
  for i = 1:size(ret,1)-Nwin
    vars = [ret(i:i+Nwin-1,j+2) ret(i:i+Nwin-1,1) ret(i:i+Nwin-1,2)];
    vars = vars(~isnan(vars(:,1)),:);
    vars = vars(~isnan(vars(:,2)),:);
    vars = vars(~isnan(vars(:,3)),:);
    if size(vars,1) > 15
      y = vars(:,1);
      x = [ones(rows(y),1) vars(:,2:3)];
      b = regress(y,x);
      betasMkt(i+Nwin,j+2) = b(2,1); % market beta
      betasDisp(i+Nwin,j+2) = b(3,1); % dispersion beta
    end
  end
end
However, the betasMkt and betasDisp outputs are both just column 1 of 'ret'. It doesn't seem as if any regression has been performed.
Could anyone see where I am going wrong please?
I desperately need to figure this out soon.
Thank you for your time
4 Comments
dpb
on 6 Jul 2019
Edited: dpb
on 6 Jul 2019
From my reading of the doc, rows() is just size() specifically for a SQL fetch object; it doesn't do anything different other than return a number. In your use it simply sets the upper bound of the for loop; it is the for loop and the index in there that actually "does something". I think it's immaterial which you use, unless size() can't read the object returned by a SQL query.
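As a small sketch of that point, size(y,1) gives the row count directly and can stand in for rows(y) when building the design matrix. The vector y and the second regressor here are made-up values, not the poster's data.

```matlab
y = [0.01; -0.02; 0.03];   % hypothetical return vector
n = size(y,1);             % number of rows, using base size()
x = [ones(n,1) (1:n)'];    % intercept column plus one made-up regressor
```

Either call only supplies a number to ones(); the regression itself is unaffected.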
Accepted Answer
dpb
on 7 Jul 2019
Edited: dpb
on 7 Jul 2019
OK, I'll just add this as a new Answer since the preceding one is getting rather long. Try this out; it seems to work OK here on the sample dataset.
NB: The earlier version had a logic error (you should have gotten an out-of-bounds indexing error from it, which would be why you only had one column: it died at the end of the first j loop). The row indices were initialized outside both loops rather than just before the i loop, so they never started over after the first column had been processed.
ret=xlsread('MultivariateExample.xlsx'); % read the data, shorten variable names
[r,c]=size(ret); % size of the data r(ows), c(olumns)
nWin=15; % window length
nOK=10; % a window needs more than this many finite observations
betasMkt=nan(r-nWin,c-2); % initialize NaN matrix for Market betas
betasDisp=betasMkt; % same size for the dispersion betas
for j=1:c-2
  i1=1; % initialize index variables each column pass
  i2=nWin;
  for i=1:r-nWin
    vars=ret(i1:i2,[j+2 1:2]);
    vars(any(isnan(vars),2),:)=[]; % keep only rows where all observations are finite
    nobs=size(vars,1); % how many were ok
    if nobs>nOK % enough observations to fit?
      b=regress(vars(:,1),[ones(nobs,1) vars(:,2:3)]);
      betasMkt(i,j)=b(2); % coefficient on column 1 (market)
      betasDisp(i,j)=b(3); % coefficient on column 2 (dispersion)
    end
    i1=i1+1; i2=i2+1; % increment counters on every pass, whether or not a fit was made
  end
end
NB:
I indexed the output arrays by 1:c-2 and 1:r-nWin, so the results for the actual third column are the first set of coefficients. There's not much point in keeping two columns of NaN simply to account for a constant column offset.
If you do need to know which actual set of indices built each set of coefficients, I'd keep another auxiliary variable (the value of i1 would be appropriate) along with the coefficients. You could also keep nobs as a second. Those two would be good candidates for the first two columns, and you could then go back to a full-column array size. :)
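A minimal sketch of that bookkeeping, under stated assumptions: the toy matrix below is made up so that the true coefficients are known, the names firstRow and nUsed are hypothetical, and backslash least squares stands in for regress so the snippet runs without the Statistics Toolbox.

```matlab
% Hypothetical toy data: columns 1 and 2 are the regressors, columns 3+ the stocks
r = 30; c = 4; nWin = 15; nOK = 10;
t = (1:r)';
ret = [sin(t) cos(t/3) zeros(r,c-2)];
ret(:,3) = 0.5*ret(:,1) + 2*ret(:,2);     % exact linear relation, so betas are known
ret(:,4) = -1*ret(:,1) + 0.25*ret(:,2);
ret(5,3) = NaN;                           % one missing observation

nRun      = r - nWin;                     % same window count as the loop above
betasMkt  = nan(nRun, c-2);
betasDisp = nan(nRun, c-2);
firstRow  = (1:nRun)';                    % starting row of each window (the i1 value)
nUsed     = zeros(nRun, c-2);             % finite observations behind each fit

for j = 1:c-2
    for i = 1:nRun
        vars = ret(i:i+nWin-1, [j+2 1 2]);
        vars(any(isnan(vars),2), :) = []; % drop rows with any NaN
        nobs = size(vars,1);
        nUsed(i,j) = nobs;                % record the count even if no fit is made
        if nobs > nOK
            % least squares via backslash; an assumption replacing regress()
            b = [ones(nobs,1) vars(:,2:3)] \ vars(:,1);
            betasMkt(i,j)  = b(2);
            betasDisp(i,j) = b(3);
        end
    end
end
```

With firstRow and nUsed kept next to the betas, any coefficient can be traced back to rows firstRow(i):firstRow(i)+nWin-1 and to the number of observations it was fitted on.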
3 Comments
dpb
on 7 Jul 2019
What's this "my degree" stuff? I expect a share!!! <VBG>
Kidding aside, glad to help...teaching is the fun of it...hopefully a few coding tidbits will stick and will help going forward.
More Answers (1)
dpb
on 6 Jul 2019
Just saw something overlooked last night...
if size(vars,1) > 15
I think there's your problem: you only calculate a regression if you have more than 15 observations, but your window size is 15, so that can never happen. If the idea is to compute only when there are no missing values, either
if size(vars,1)==Nwin
...
or, back where you check for missing values, do something like
for j=...
  for i = 1:size(ret,1)-Nwin
    vars = [ret(i:i+Nwin-1,j+2) ret(i:i+Nwin-1,1) ret(i:i+Nwin-1,2)];
    if any(isnan(vars(:))), continue, end % skip to next set if any missing values
    ...
While your indexing by rows works, my preferred way to reduce visual clutter for such is something like
i1=1; % initialize indexing variables
i2=Nwin;
for j=...
  for i=1:size(ret,1)-Nwin
    vars=ret(i1:i2,[j+2 1:2]);
    i1=i1+1; i2=i2+1; % increment counters before any 'continue' so the window always advances
    if any(isnan(vars(:))), continue, end % skip to next set if any missing values
    y=vars(:,1);
    x=[ones(rows(y),1) vars(:,2:3)];
    b=regress(y,x);
    betasMkt(i+Nwin,j+2)=b(2,1);
    betasDisp(i+Nwin,j+2)=b(3,1);
  end
end
This just increments each index by one. There is no difference in the actual calculation, but it is simpler to look at than indices computed every time.
See if fixing the count test doesn't solve your problem, though...
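To see why the count test can never pass, here is a small sketch on made-up data (a 20-by-3 matrix of integers with one NaN, not the poster's returns). The names nKept, tooStrict and fullOnly are only for illustration.

```matlab
% Hypothetical toy data: 20 rows, 3 columns, one missing value
ret  = reshape(1:60, 20, 3);
ret(4,3) = NaN;
Nwin = 15;

vars = ret(1:Nwin, [3 1 2]);        % first window: y, then the two regressors
vars(any(isnan(vars),2), :) = [];   % drop rows with any NaN

nKept     = size(vars,1);           % can never exceed Nwin
tooStrict = nKept > 15;             % the original test: always false for Nwin = 15
fullOnly  = nKept == Nwin;          % true only when the window has no missing rows
```

A window of Nwin rows can only lose rows when NaNs are removed, so any threshold has to be at most Nwin for the regression branch to be reachable.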
3 Comments
dpb
on 7 Jul 2019
"They should each produce a matrix the same size as that of the returns (minus 15 rows for the size 15 window). Can you spot anything in my code that might be causing only 1 column to arise? I'd be surprised if there were only 3068 windows that worked."
Well, you changed i2 to 20 from nWin=15 so you've cut the size down there.
Without the full data set can't test for how many cases might be missing but you could put some logic in to count for those.
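One way such counting logic might look, as a sketch on made-up data (a 40-by-3 integer matrix with a single NaN in the dependent column). The names nMissing, nIncomplete and nComplete are hypothetical.

```matlab
% Hypothetical toy data standing in for the returns matrix
nWin = 15;
ret  = reshape(1:120, 40, 3);
ret(10,3) = NaN;                          % one missing value in the dependent column

r        = size(ret,1);
nWindows = r - nWin;                      % same loop bound as used in the thread
nMissing = zeros(nWindows,1);             % rows lost to NaN in each window

for i = 1:nWindows
    w = ret(i:i+nWin-1, [3 1 2]);
    nMissing(i) = sum(any(isnan(w),2));   % rows with at least one NaN
end

nIncomplete = sum(nMissing > 0);          % windows touched by missing data
nComplete   = sum(nMissing == 0);         % windows with all nWin rows usable
```

Comparing nComplete with the number of non-NaN betas in a column would show whether windows are being dropped for missing data or for some other reason.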