Broken stick regression and find change point

17 Ansichten (letzte 30 Tage)
Joton
Joton am 25 Okt. 2017
Kommentiert: Joton am 27 Okt. 2017
Hi!
I've got two vectors, one with data on outside temperature, and the other one is heat supply for a building. I can scatter plot the vectors to get a visual of how the heat supply increases with lower temperature (Celsius). For higher temperature heat is only needed for water and therefore there is no longer a dependency between increasing temperature and lowered heat demand. As can be seen in the picture below
I would like to find the change point, (the temperature where building heating is starting to be needed) and perform a segmented regression (broken stick), one line for the temperature dependent part, and one line for the independent. Is there a way to do that in for example the curve fitting app? Or how do i write it in code?
I know about the function Findchangepts, but it only works with 1 vector, not 2, and not with a matrix.
  3 Kommentare
Joton
Joton am 26 Okt. 2017
Bearbeitet: Joton am 26 Okt. 2017
Yes, it can be interpreted that way, there could be a slight slope but that is more due to random variations in usage of warm water than physical factors. So, in the figure above, there should be a line with a negative slope between -15 (Celsius) to somewhere around 3-5(Celsius). And from there to +15 (Celsius) a line with no slope.
Joton
Joton am 26 Okt. 2017
Bearbeitet: Joton am 26 Okt. 2017
The important part though, is the slope before the breaking point and the position of the breaking point. As the line above the breaking point is of no larger interest, a slight slope there doesn´t matter.

Melden Sie sich an, um zu kommentieren.

Akzeptierte Antwort

John D'Errico
John D'Errico am 26 Okt. 2017
Bearbeitet: John D'Errico am 26 Okt. 2017
Since you provide no data, I'll make some up.
plot(xdata,ydata,'o')
grid on
Not quite the same as yours, since yours is clearly not uniform variance. I will assume there is an unknown break location, and employ a model where the function is constant above the break, and continuous but not differentiable at the break.
Two small helper functions...
coefs = @(xbreak) [ones(numel(xdata),1),min(xdata,xbreak)-xbreak]\ydata;
errfun = @(xbreak) norm([ones(numel(xdata),1),min(xdata,xbreak)-xbreak]*coefs(xbreak) - ydata);
They assume that xdata and ydata are column vectors of the same length, residing in your workspace.
Solve for the break point:
xbreak = fminbnd(errfun,0,1)
xbreak =
0.59438654958197
Not bad, since the real function that I created had a break at 0.60.
C = coefs(xbreak)
ans =
-0.462584832269373
-1.32149120536521
Above the break point, the function is constant, at -0.4625...
Below the break point, the slope is estimated at -1.32.
So, the line below the break point can be written as
y = (x - xbreak)*C(2) + C(1)
Above the break point, it is simply
y = C(1)
  3 Kommentare
John D'Errico
John D'Errico am 27 Okt. 2017
It is a linear regression for ALL of the values, with an optimization in the middle, to find the break location. Because the function above the break was modeled as a constant one, I could get tricky, and solve it in a non-obvious way.
Effectively, this should have been modeled as a piecewise linear spline. Had I done it that way, there would have been more terms, and it would have looked more complex. But knowledge of the constant nature above the break allowed me to simplify things greatly using the min function.
As far as plotting goes, I would plot it as three points.
plot([min(x),xbreak,max(x)],[(min(x)-xbreak)*C(2)+C(1),C(1),C(1)],'r-')
For evaluation purposes, I would just use a piecewise function. An old FEX utility exists for such piecewise function evaluation, but the simplest solution is to just use a function like this:
yfun = @(x) (min(x,xbreak) - xbreak)*C(2) + C(1);
that works for any value of x.
Finally, I would point out that the fit that you got is a little high in the constant portion of the curve. That is because if you look in the lower part of that constant segment, there is a LOT of data that tends to draw the curve up a bit.
Remember that a simple regression ONLY assumes the data has noise in the Y direction. But really, this curve looks more like a curve where you want to model as if the noise is projected to the curve orthogonally. That is a significantly more difficult problem to solve, especially when the breakpoint is unknown.
Joton
Joton am 27 Okt. 2017
Thank you for good answers and explanations!
/Anton

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (0)

Kategorien

Mehr zu Linear and Nonlinear Regression finden Sie in Help Center und File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by