From the polyfit documentation page: "[p,S,mu] = polyfit(x,y,n) performs centering and scaling to improve the numerical properties of both the polynomial and the fitting algorithm. This syntax additionally returns mu, which is a two-element vector with centering and scaling values. mu(1) is mean(x), and mu(2) is std(x). Using these values, polyfit centers x at zero and scales it to have unit standard deviation, x̂ = (x - x̄)/σ_x."

If you call polyfit with three outputs, p is not a polynomial in x. It is a polynomial in the centered and scaled variable xhat = (x - mu(1))/mu(2).

y=[5,6,10,20,28,33,34,36,42];
p = polyfit(xdata, y, 1);
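Let's look at p symbolically.
syms x xhat              % raw variable and centered/scaled variable (Symbolic Math Toolbox)
psym = poly2sym(p, x);   % p as a symbolic polynomial in x
polynomialInX = vpa(psym, 5)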
polynomialInX =
Now let's look at the polynomial in the centered and scaled variable xhat.
[p, ~, mu] = polyfit(xdata, y, 1);
polynomialInXhat = vpa(poly2sym(p, xhat), 5)
polynomialInXhat =
These look different. But what happens if we substitute the expression for xhat into polynomialInXhat?
vpa(subs(polynomialInXhat, xhat, (x-mu(1))/mu(2)), 5)
ans =
That looks the same as polynomialInX. What if we evaluate both polynomials, polynomialInX at the unscaled x data and polynomialInXhat at the scaled x data?
valueUnscaled = vpa(subs(polynomialInX, x, xdata), 5)
valueUnscaled =
valueScaled = vpa(subs(polynomialInXhat, xhat, (xdata-mu(1))./mu(2)), 5)
valueScaled =
The difference doesn't matter much for a first-degree polynomial and the small-magnitude x data you're using. But suppose you needed to take the fourth power of a year, as you would when fitting a fourth-degree polynomial to the census population data?
load census   % sample data shipped with MATLAB; defines cdate and pop
pUnscaled = polyfit(cdate, pop, 4)
Warning: Polynomial is badly conditioned. Add points with distinct X values, reduce the degree of the polynomial, or try centering and scaling as described in HELP POLYFIT.
pUnscaled =
1.0e+00 *
4.75430030603743e-08 -0.000355569614612858 1.00320581128586 -1264.43935834017 600203.36463964
That leading coefficient is tiny because you're working with very large numbers when you raise a year like 2020 to the fourth power. That's why you receive a warning.
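To put a number on that scale mismatch, here is a rough sketch. The polyfit documentation describes the fit as a least-squares problem built on a Vandermonde matrix, so this builds that matrix by hand for the raw years. It is my own illustration, not polyfit's internal code, and the variable names V, columnMax, and conditionNumber are just ones I picked.

```matlab
load census                      % sample data shipped with MATLAB; defines cdate and pop
n = 4;                           % polynomial degree used in this example
V = cdate .^ (n:-1:0);           % columns are cdate.^4, cdate.^3, ..., cdate.^0 (implicit expansion, R2016b or later)
columnMax = max(abs(V))          % largest entry in each column, from cdate.^4 down to the column of ones
conditionNumber = cond(V)        % 2-norm condition number of the raw Vandermonde matrix
```

The first column holds numbers on the order of 1e13 while the last column is all ones, and that spread is what drives the condition number up.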
But if you'd centered and scaled the years in cdate first:
[pScaled, ~, mu] = polyfit(cdate, pop, 4)
pScaled =
0.704706162785502 0.92102307075127 23.4706157176829 73.8597813280959 62.2285498913524
mu =
1.0e+00 *
1890
62.0483682299543
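If you want to convince yourself what mu holds, a quick check along these lines should do it. This is only a sketch that assumes the same census data; muByHand and relativeDifference are names I made up for the comparison.

```matlab
load census                                % defines cdate and pop
[~, ~, mu] = polyfit(cdate, pop, 4);       % only the centering/scaling values are needed here
muByHand = [mean(cdate); std(cdate)]       % what the documentation says mu(1) and mu(2) are
relativeDifference = max(abs(mu(:) - muByHand) ./ abs(muByHand))
```

A relativeDifference at or near zero confirms that mu(1) is the mean of the years and mu(2) is their standard deviation.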
Now you're taking powers of numbers on the order of:
normalizedYears = normalize(cdate, 'center', mu(1), 'scale', mu(2))
normalizedYears =
-1.61164592805076
-1.45048133524568
-1.28931674244061
-1.12815214963553
-0.966987556830456
-0.80582296402538
-0.644658371220304
-0.483493778415228
-0.322329185610152
-0.161164592805076
and those numbers aren't nearly as large.
I'd much rather work with 6.7 (roughly 1.61^4) than 16649664160000 (which is 2020^4), and with a leading coefficient near 0.7 rather than one near 4.8e-8.
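One practical consequence: when you evaluate a fit obtained with the three-output syntax, the x values have to be centered and scaled the same way. As a minimal sketch (again assuming the census data, with fitViaMu, fitByHand, and maxDifference as illustrative names), you can either pass mu to polyval as its fourth input or do the transformation yourself:

```matlab
load census                                    % defines cdate and pop
[pScaled, ~, mu] = polyfit(cdate, pop, 4);     % coefficients are in xhat = (cdate - mu(1))/mu(2)

% Option 1: let polyval apply the centering and scaling.
fitViaMu = polyval(pScaled, cdate, [], mu);

% Option 2: center and scale by hand, then evaluate as an ordinary polynomial.
xhat = (cdate - mu(1)) ./ mu(2);
fitByHand = polyval(pScaled, xhat);

% The two results should agree to within round-off.
maxDifference = max(abs(fitViaMu - fitByHand))
```

Calling polyval(pScaled, cdate) without mu would evaluate the scaled coefficients at the raw years, which is not the fitted curve.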