goodnessOfFit

Goodness of fit between test and reference data for analysis and validation of identified models

Syntax

fit = goodnessOfFit(x,xref,cost_func)

Description

goodnessOfFit returns fit values that represent the error norm between test and reference data sets. If you want to compare and visualize simulated model output with measurement data, see also compare.

fit = goodnessOfFit(x,xref,cost_func) returns the goodness of fit between the test data x and the reference data xref using the cost function cost_func. fit is a quantitative representation of the closeness of x to xref. To perform multiple test-to-reference fit comparisons, you can specify x and xref as cell arrays of equal size that contain multiple test and reference data sets. With cell array inputs, fit returns an array of fit values.

Examples

Calculate Goodness of Fit Between Estimated and Measured Data

Find the goodness of fit between measured output data and the simulated output of an estimated model.

Obtain the measured output.

load iddata1 z1
yref = z1.y;

z1 is an iddata object containing measured input-output data. z1.y is the measured output.

Estimate a second-order transfer function model and simulate the model output y_est.

sys = tfest(z1,2);            % second-order transfer function (two poles)
y_est = sim(sys,z1(:,[],:));  % z1(:,[],:) keeps only the input channels of z1

Calculate the goodness of fit, or error norm, between the measured and estimated outputs. Specify the normalized root mean squared error (NRMSE) as the cost function.

cost_func = 'NRMSE';
y = y_est.y;
fit = goodnessOfFit(y,yref,cost_func)

fit = 
0.2943

Alternatively, you can use compare to calculate the fit. compare uses the NRMSE cost function, and expresses the fit percentage using the one's complement of the error norm. The fit relationship between compare and goodnessOfFit is therefore ${fit}_{compare} = (1 - {fit}_{gof}) * 100$ . A compare result of 100% is equivalent to a goodnessOfFit result of 0.

Specify an initial condition of zero to match the initial condition that goodnessOfFit assumes.

opt = compareOptions('InitialCondition','z');
compare(z1,sys,opt);

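The fit results are equivalent. Applying the relationship above, the goodnessOfFit value of 0.2943 corresponds to a compare fit of (1 - 0.2943)*100 = 70.57%.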

Goodness of Fit for Multiple Data Sets

Find the goodness of fit between measured and estimated outputs for two models.

Obtain the input-output measurements z2 from iddata2. Copy the measured output into reference output yref.

load iddata2 z2
yref = z2.y;

Estimate second-order and fourth-order transfer function models using z2.

sys2 = tfest(z2,2);
sys4 = tfest(z2,4);

Simulate both systems to get estimated outputs.

y_sim2 = sim(sys2,z2(:,[],:));
y2 = y_sim2.y;
y_sim4 = sim(sys4,z2(:,[],:));
y4 = y_sim4.y;

Create cell arrays from the reference and estimated outputs. The reference data set is the same for both model comparisons, so create identical reference cells.

yrefc = {yref yref};
yc = {y2 y4};

Compute fit values for the three cost functions.

fit_nrmse = goodnessOfFit(yc,yrefc,'NRMSE')

fit_nrmse = 1×2

    0.1429    0.1342

fit_nmse = goodnessOfFit(yc,yrefc,'NMSE')

fit_nmse = 1×2

    0.0204    0.0180

fit_mse = goodnessOfFit(yc,yrefc,'MSE')

fit_mse = 1×2

    1.0811    0.9540

A fit value of 0 indicates a perfect fit between reference and estimated outputs. The fit value rises as fit goodness decreases. For all three cost functions, the fourth-order model produces a better fit than the second-order model.
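
As a quick consistency check on the values above: by the cost function definitions, the NMSE value for a channel is the square of the corresponding NRMSE value. The following sketch uses only the rounded values printed in this example, so it can confirm agreement to display precision only. The variable max_diff is illustrative.

```matlab
% Rounded fit values copied from the output above
fit_nrmse = [0.1429 0.1342];
fit_nmse  = [0.0204 0.0180];

% NMSE should equal NRMSE squared, up to rounding of the displayed digits
max_diff = max(abs(fit_nrmse.^2 - fit_nmse));
values_agree = max_diff < 1e-4;
disp(values_agree)
```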

Input Arguments

`x` — Data to test
matrix (default) | cell array

Data to test, specified as a matrix or cell array.

For a single test data set, specify an N_s-by-N matrix, where N_s is the number of samples and N is the number of channels. You must specify cost_func as 'NRMSE' or 'NMSE' to use multiple-channel data.
For multiple test data sets, specify a cell array of length N_d, where N_d is the number of test-to-reference pairs and each cell contains one data matrix.

x must not contain any NaN or Inf values.

`xref` — Reference data
matrix (default) | cell array

Reference data with which to compare x, specified as a matrix or cell array.

For a single reference data set, specify an N_s-by-N matrix, where N_s is the number of samples and N is the number of channels. xref must be the same size as x. You must specify cost_func as 'NRMSE' or 'NMSE' to use multiple-channel data.
For multiple reference data sets, specify a cell array of length N_d, where N_d is the number of test-to-reference pairs and each cell contains one reference data matrix. As with the individual data matrices, the cell array sizes for x and xref must match. The ith element of fit corresponds to the pair formed by the ith cells of x and xref.

xref must not contain any NaN or Inf values.
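
The requirements on x and xref can be checked before calling goodnessOfFit. The following is a minimal sketch using base MATLAB functions only. The data values are made up, the variable names (sizes_match, all_finite, inputs_ok) are illustrative, and the check assumes matrix inputs rather than cell arrays.

```matlab
% Made-up single-channel data for illustration (N_s = 4, N = 1)
xref = [1; 2; 3; 4];
x    = [1.1; 1.9; 3.2; 3.8];

% x and xref must have the same size (N_s-by-N)
sizes_match = isequal(size(x), size(xref));

% Neither input may contain NaN or Inf values
all_finite = all(isfinite(x(:))) && all(isfinite(xref(:)));

inputs_ok = sizes_match && all_finite;
```

For cell array inputs, the same two checks would apply to each pair of cells, in addition to the requirement that both cell arrays have the same length.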

`cost_func` — Cost function
`'MSE'` | `'NRMSE'` | `'NMSE'`

Cost function to determine goodness of fit, specified as one of the following values. In the equations, the value fit applies to a single pairing of test and reference data sets.

Value	Description	Equation	Notes
`'MSE'`	Mean squared error	$\mathit{fit} = \frac{\lVert x - \mathit{xref} \rVert^{2}}{N_s}$, where $N_s$ is the number of samples and $\lVert \cdot \rVert$ indicates the 2-norm of a vector.	fit is a scalar.
`'NRMSE'`	Normalized root mean squared error	$\mathit{fit}(i) = \frac{\lVert \mathit{xref}(:,i) - x(:,i) \rVert}{\lVert \mathit{xref}(:,i) - \mathrm{mean}(\mathit{xref}(:,i)) \rVert}$, where $\lVert \cdot \rVert$ indicates the 2-norm of a vector and i = 1,...,N, where N is the number of channels.	fit is a row vector of length N. `'NRMSE'` is the cost function used by `compare`.
`'NMSE'`	Normalized mean squared error	$\mathit{fit}(i) = \frac{\lVert \mathit{xref}(:,i) - x(:,i) \rVert^{2}}{\lVert \mathit{xref}(:,i) - \mathrm{mean}(\mathit{xref}(:,i)) \rVert^{2}}$	fit is a row vector of length N.
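
The three cost functions can be reproduced with base MATLAB operations, which may help when checking a result by hand. The following is a sketch of the equations in the table, not the toolbox implementation. The data values are made up, and the variable names (err, dev, and so on) are illustrative.

```matlab
% Made-up single-channel data: N_s = 4 samples, N = 1 channel
xref = [1; 2; 3; 4];
x    = [1; 2; 3; 6];
Ns   = size(xref, 1);

% MSE: squared 2-norm of the error divided by the number of samples
fit_mse = norm(x - xref)^2 / Ns;

% Column-wise 2-norms, one value per channel.
% For N > 1 the subtraction of the column means relies on implicit
% expansion (MATLAB R2016b or later).
err = sqrt(sum((xref - x).^2, 1));              % ||xref(:,i) - x(:,i)||
dev = sqrt(sum((xref - mean(xref, 1)).^2, 1));  % ||xref(:,i) - mean(xref(:,i))||

fit_nrmse = err ./ dev;    % normalized root mean squared error
fit_nmse  = fit_nrmse.^2;  % normalized mean squared error
```

Writing NMSE as the square of NRMSE follows directly from the two equations, since both the numerator and the denominator are squared.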

Output Arguments

`fit` — Goodness of fit
scalar | row vector | cell array

Goodness of fit between test and reference data set pairs, returned as a scalar, a row vector, or a cell array.

For a single test and reference data set pair, fit is returned as a scalar or row vector.
- If cost_func is 'MSE', then fit is a scalar.
- If cost_func is 'NRMSE' or 'NMSE', then fit is a row vector of length N, where N is the number of channels.
For multiple test and reference data set pairs, where x and xref are cell arrays of length N_d, fit is returned as a vector or a matrix.
- If cost_func is 'MSE', then fit is a row vector of length N_d.
- If cost_func is 'NRMSE' or 'NMSE', then fit is a matrix of size N-by-N_d, where N is the number of channels (data columns) and N_d is the number of test-to-reference pairs.
Each element of fit contains the goodness of fit value for the corresponding test and reference data pair.

Possible values for individual fit elements depend on the selection of cost_func.

If cost_func is 'MSE', each fit value is a nonnegative scalar that grows with the error between test and reference data. A fit value of 0 indicates a perfect match between test and reference data.
If cost_func is 'NRMSE' or 'NMSE', fit values vary between 0 and Inf.
- 0: Perfect fit to reference data (zero error)
- 1: x is no better than a straight line (the mean of xref) at matching xref
- Inf: Bad fit
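
The landmark values 0 and 1 can be illustrated by evaluating the NRMSE equation directly. This is a sketch with made-up data and a hand-written anonymous function (nrmse is an illustrative name, not a toolbox function). It also shows that a test signal that matches worse than the mean of xref gives a value above 1.

```matlab
% Made-up reference data (single channel)
xref = [1; 2; 3; 4];

% NRMSE equation for one channel, written as an anonymous function
nrmse = @(x) norm(xref - x) / norm(xref - mean(xref));

fit_perfect = nrmse(xref);                           % x matches xref exactly
fit_mean    = nrmse(mean(xref) * ones(size(xref)));  % x is the constant mean of xref
fit_poor    = nrmse(-xref);                          % x has the wrong sign
```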

Version History

Introduced in R2012a

R2020a: `goodnessOfFit`: Fit result represents the error norm for all three cost functions, with a value of zero indicating a perfect fit

goodnessOfFit now returns the error norm E as the fit value for all three cost functions (MSE, NRMSE, and NMSE). Previously, goodnessOfFit returned the one's complement of the error norm, 1-E, for fit values that used the NRMSE or NMSE cost functions. This change allows consistent fit-value interpretation across the three cost functions, with the ideal fit value of zero representing a perfect fit.

Previously computed NRMSE and NMSE fit values are the one's complements of the fit values computed with the current software. Similarly, the NRMSE fit value is now the one's complement of the fit used in the percentage value that compare computes. For example, if the previous goodnessOfFit fit value was 0.8, the current fit value is 0.2. A goodnessOfFit fit value of 0.2 is equivalent to a compare fit percentage of 80%.
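
If stored results predate R2020a, the conversion described above is a one-line calculation. The following sketch reuses the example values from the text. The variable names are illustrative, and the conversion applies only to NRMSE and NMSE values, since MSE values did not change.

```matlab
% Legacy NRMSE fit value computed before R2020a (example value from the text)
fit_old = 0.8;

% Current convention: the error norm itself, the complement of the old value
fit_new = 1 - fit_old;

% Equivalent compare fit percentage (applies to NRMSE values)
fit_compare = (1 - fit_new) * 100;
```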
