Multivariate regression betas stock returns

Emily Read on 5 Jul 2019
Commented: Emily Read on 11 Jul 2019
I have a matrix of stock returns, 4280 (rows) by 7379 (columns).
I need to regress the 1st to 15th cell in column 3 (1,3) against the 1st to 15th cells in columns 1 and 2 (1,1), (1,2).
Then the 2nd to 16th cell in column 3 (2,3) against the 2nd to 16th cells in columns 1 and 2 (2,1) and (2,2) etc. So like a rolling window.
The columns remain constant, but each variable window (i:i+14) goes down by one row.
Once the whole column is done, it needs to go back up to the top and move on to the 4th column, still regressing against columns 1 and 2. Then the 5th column against columns 1 and 2, etc.
This is the code I have so far:
Nwin = 15; % set window for rolling regression
betasMkt = ret.*NaN; betasMkt(:,1) = ret (:,1); % initialize empty matrix for Market betas
betasDisp = ret.*NaN; betasDisp(:,1) = ret (:,1); % initialize empty matrix for CSAD betas
for j = 1:size(ret,2)-2
    for i = 1:size(ret,1)-Nwin
        vars = [ret(i:i+Nwin-1,j+2) ret(i:i+Nwin-1,1) ret(i:i+Nwin-1,2)];
        vars = vars(~isnan(vars(:,1)),:);
        vars = vars(~isnan(vars(:,2)),:);
        vars = vars(~isnan(vars(:,3)),:);
        if size(vars,1) > 15
            y = vars(:,1);
            x = [ones(rows(y),1) vars(:,2:3)];
            b = regress(y,x);
            betasMkt(i+Nwin,j+2) = b(2,1); % market beta
            betasDisp(i+Nwin,j+2) = b(3,1); % dispersion beta
        end
    end
end
However, the betasMkt and betasDisp outputs are both just column 1 of 'ret'. It doesn't seem as if any regression has been performed.
Could anyone see where I am going wrong please?
I desperately need to figure this out soon.
Thank you for your time
4 Comments
Emily Read on 6 Jul 2019
Edited: dpb on 6 Jul 2019
Hi,
Thank you for your help, I'll change that part of the code now.
I can't attach the whole file since it's 4 GB; however, I have attached a sample of the first 7 columns and 78 rows.
I realised that this part of the code:
for j = 1:size(ret,2)-2
for i = 1:size(ret,1)-Nwin
Should be:
for j = 1:rows(ret')-2
for i = 1:rows(ret)-Nwin
So that it goes through the rows, doesn't just count them as the size function does.
However, this is no longer compatible with the 2018 version. Is there a way that I could adapt this 'rows' function so it is compatible with the 2018 version, rather than having to download the 2015 version?
Thanks so much for your help it is very much appreciated.
dpb on 6 Jul 2019
Edited: dpb on 6 Jul 2019
From my reading of the doc, rows() is just size() for a SQL fetch object: it does nothing other than return a number. In your code it simply sets the upper bound of the for loop; it is the for loop and the indexing inside it that actually "do something". I think it is immaterial which you use, unless size() can't read the object returned by a SQL query.
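To illustrate the point about the loop bound, here is a minimal sketch on a small made-up matrix (the 6-by-4 `ret` below is a hypothetical stand-in, not the real data). It shows that `size(ret,1)` and `size(ret,2)` return plain numbers, and that `size(ret',1)` is the same number as `size(ret,2)`, so either form gives the same upper bound for a for loop.

```matlab
ret = magic(6);            % hypothetical stand-in for the returns matrix
ret = ret(:,1:4);          % trim to 6 rows by 4 columns
nRows = size(ret,1);       % number of rows, a plain scalar
nCols = size(ret,2);       % number of columns
nColsT = size(ret',1);     % rows of the transpose equal the columns of ret
count = 0;
for i = 1:nRows            % the loop, not size(), is what visits each row
    count = count + 1;
end
```

Since the bound is only a number, replacing rows(...) with size(...,1) should not change what the loops do.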


Accepted Answer

dpb on 7 Jul 2019
Edited: dpb on 7 Jul 2019
OK, I'll just add this as a new Answer since the preceding one is getting rather long. Try this out; it seems to work OK here on the sample dataset.
NB: The one logic error in the earlier version was that the row-index initialization was outside both loops rather than just before the i loop, so the indices did not start over after the first column had been done. You should have gotten an out-of-bounds indexing error from it, which would be why you only had one column: it died at the end of the first j loop.
ret=xlsread('MultivariateExample.xlsx'); % read the data
[r,c]=size(ret);            % size of the data r(ows), c(olumns)
nWin=15;                    % window length
nOK=10;                     % need more than this many complete observations
betasMkt=nan(r-nWin,c-2);   % initialize NaN matrix for Market betas
betasDisp=betasMkt;         % same size for the dispersion betas
for j=1:c-2
    i1=1;                   % initialize index variables each column pass
    i2=nWin;
    for i=1:r-nWin
        vars=ret(i1:i2,[j+2 1:2]);
        vars(any(isnan(vars),2),:)=[];  % drop rows with any NaN
        nobs=size(vars,1);              % how many were ok
        if nobs>nOK                     % regress only with more than nOK rows
            b=regress(vars(:,1),[ones(nobs,1) vars(:,2:3)]);
            betasMkt(i,j)=b(2);         % column 1 regressor, labeled market in the question
            betasDisp(i,j)=b(3);        % column 2 regressor, labeled dispersion in the question
        end
        i1=i1+1; i2=i2+1;   % advance the window every pass, regression or not
    end
end
NB:
I indexed the output arrays by 1:c-2 and 1:r-nWin, so the results for the actual third column are the first set of coefficients. There's not much point in keeping two columns of NaN simply to account for a constant column offset.
If you do need to know which rows built each set of coefficients, I'd keep an auxiliary variable (the value of i1 would be appropriate) along with the coefficients. You could also keep nobs as a second. Those two would be good candidates for the first two columns, and you could then go back to a full-column array size. :)
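A minimal sketch of that bookkeeping idea, on made-up data for a single stock. Everything here is an assumption for illustration: the series `x` and `y`, the output array `out`, and its column layout. It also uses the backslash operator for the least-squares fit in place of regress, so it runs without the Statistics Toolbox. Because the fake returns are noise-free, the fitted slopes should come back as 0.5 and -2 up to rounding.

```matlab
rng(0);                              % reproducible made-up data
r = 30; nWin = 15; nOK = 10;
x = randn(r,2);                      % hypothetical market and dispersion series
y = 0.5*x(:,1) - 2*x(:,2);           % one noise-free "stock" with known betas
y(5) = NaN;                          % one missing return
nW = r - nWin;                       % number of windows, same bound as the loops above
out = nan(nW,4);                     % columns: start row, nobs, beta 1, beta 2
for i = 1:nW
    win = [y(i:i+nWin-1) x(i:i+nWin-1,:)];
    win(any(isnan(win),2),:) = [];   % drop rows with a missing value
    nobs = size(win,1);
    out(i,1:2) = [i nobs];           % record the bookkeeping for every window
    if nobs > nOK
        b = [ones(nobs,1) win(:,2:3)] \ win(:,1);   % least squares, stands in for regress
        out(i,3:4) = b(2:3).';
    end
end
```

Keeping the start row and nobs next to the coefficients means each fitted row can be traced back to the data that produced it, including windows where no regression was run.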
3 Comments
dpb on 7 Jul 2019
What's this "my degree" stuff? I expect a share!!! <VBG>
Kidding aside, glad to help...teaching is the fun of it...hopefully a few coding tidbits will stick and will help going forward.
Emily Read on 11 Jul 2019
Hahah! Thanks, you really have helped me so much. I can't thank you enough.
It is people like you who really make learning technical things like this possible.


More Answers (1)

dpb on 6 Jul 2019
Just saw something overlooked last night...
if size(vars,1) > 15
I think there's your problem: you only calculate a regression if you have more than 15 observations, but your window size is 15, so that can never happen. If the idea is to compute only when there are no missing values, either
if size(vars,1)==Nwin
...
or, back where you're checking for missing values, do something like
for j=...
    for i = 1:size(ret,1)-Nwin
        vars = [ret(i:i+Nwin-1,j+2) ret(i:i+Nwin-1,1) ret(i:i+Nwin-1,2)];
        if any(isnan(vars(:))), continue, end % skip to next set if any missing values
        ...
While your indexing by rows works, my preferred way to reduce visual clutter is something like
for j=...
    i1=1; % reset the window to the top of each column
    i2=Nwin;
    for i=1:size(ret,1)-Nwin
        vars=ret(i1:i2,[j+2 1:2]);
        i1=i1+1; i2=i2+1; % advance the window before any skip
        if any(isnan(vars(:))), continue, end % skip to next set if any missing values
        y= vars(:,1);
        x= [ones(size(y,1),1) vars(:,2:3)];
        b= regress(y,x);
        betasMkt(i+Nwin,j+2) = b(2,1);
        betasDisp(i+Nwin,j+2) = b(3,1);
    end
end
This just increments each index by one; there is no difference in the actual calculation, but it is simpler to look at than the computed indices every time.
See if fixing the count test doesn't solve your problem, though...
3 Comments
dpb on 7 Jul 2019
I'll try to look more closely and at the data file later on tonight...
dpb on 7 Jul 2019
"They should each produce a matrix the same size as that of the returns (minus 15 rows for the size 15 window). Can you spot anything in my code that might be causing only 1 column to arise? I'd be surprised if there were only 3068 windows that worked."
Well, you changed i2 to 20 from nWin=15 so you've cut the size down there.
Without the full data set I can't test how many cases might be missing, but you could put in some logic to count those.
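One way such counting logic might look, as a rough sketch on a tiny made-up matrix (the 8-by-3 `ret`, the window length of 3, and the counter name `nMissing` are all assumptions for illustration):

```matlab
ret = [(1:8).', (8:-1:1).', [1 2 NaN 4 5 6 7 8].'];   % hypothetical 8-by-3 sample
nWin = 3;                                 % short window for the example
nW = size(ret,1) - nWin;                  % same loop bound as the code in this thread
nMissing = 0;                             % windows containing at least one NaN
for i = 1:nW
    vars = ret(i:i+nWin-1,:);
    if any(isnan(vars(:)))
        nMissing = nMissing + 1;          % count it instead of silently skipping
    end
end
fprintf('%d of %d windows had missing values\n', nMissing, nW);
```

Run on the full data, a counter like this would show whether missing values really explain a small number of results.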
