Forcing robustfit to return an intercept

5 Ansichten (letzte 30 Tage)
Rob Campbell
Rob Campbell am 11 Feb. 2025
Kommentiert: Star Strider am 11 Feb. 2025
Hi,
I have datasets where I am plotting variance as a function of mean. The datasets have substantial heteroscedasticity but don't really have high leverage outliers, as they are symmetric and n is large. I have Python code which uses HuberRegressor from sklearn.linear_model to fit a straight line. This works and I attach example below (from Python).
I am seeking to replicate this fit in MATLAB but have failed to get an intercept. I began with robustfit with the huber parameter. Whilst I get a line, the intercept is returned as zero with an SE of zero. So it is refusing to fit this parameter. In Python I get an intercept value. I tried various ways to get an intercept in MATLAB but all failed:
  1. Adding a column of zeros to X and disabling the intercept in the CLI options of robustfit.
  2. All the different fit types in robustfit.
  3. Weighting the points in way similar to that in my short Python function. (https://github.com/datajoint/anscombe-numcodecs/blob/main/src/anscombe_numcodecs/estimate.py around line 63)
  4. I tried fitlm both with and without the robust fit option.
All of the above result in a zero intercept. The only thing that gives me an intercept value is centering the X values around their mean. However, the fit fails badly on some datasets when I do this so I would like to avoid it. I have had no fit failures on the same data in Python.
I suspect what is happening is that the extreme heteroscedasticity makes the in SE of the intercept value very unreliable and MATLAB is somehow then choosing to avoid returning a value for it and sets it to zero. I don't really care about the SE and significance, but I do need the intercept because I go on to use that value. Any idea how to force MATLAB to return it?
  3 Kommentare
Rob Campbell
Rob Campbell am 11 Feb. 2025
load dataForFit
coefs = robustfit(dataForFit.X, dataForFit.Y,@(r) dataForFit.weights, [])
coefs = 2×1
0 151.0423
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>
plot(dataForFit.X,dataForFit.Y,'.k'); hold on; y = coefs(1) + dataForFit.X*coefs(2); plot(dataForFit.X,y,'-r')
Rob Campbell
Rob Campbell am 11 Feb. 2025
Above is a typical example. Note coef for the intercept is zero. Slope is reasonable but the offset is not.

Melden Sie sich an, um zu kommentieren.

Akzeptierte Antwort

Star Strider
Star Strider am 11 Feb. 2025
Note that ‘Y’ is single precision. MATLAB defaults to the lowest precision, so casting it as double produces a non-zero intercept —
LD = load('dataForFit.mat')
LD = struct with fields:
dataForFit: [1x1 struct]
dataForFit = LD.dataForFit
dataForFit = struct with fields:
X: [8030x1 double] Y: [8030x1 single] weights: [8030x1 double]
% X = dataForFit.X;
% Y = double(dataForFit.Y);
% W = dataForFit.weights;
format longG
coefs = robustfit(dataForFit.X, double(dataForFit.Y) ,@(r) dataForFit.weights, [])
coefs = 2×1
1.0e+00 * -99521.2254128697 167.982076516062
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>
.
  2 Kommentare
Rob Campbell
Rob Campbell am 11 Feb. 2025
Oh, wow. Thanks. It wouldn't have occured to me that was the problem since singles are already such high precision.
Star Strider
Star Strider am 11 Feb. 2025
As always, my pleasure!
MATLAB has strict rules about mixing different precision values.. This is useful, however it can sometimes catch the unwary (as it’s caught me more than once, the reason I now always look).

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by