summary
(Not Recommended) Print summary of dataset array
The dataset data type is not recommended. To work with heterogeneous data, use the MATLAB® table data type instead. See the MATLAB table documentation for more information.
Syntax
summary(A)
s = summary(A)
Description
summary(A) prints a summary of a dataset array and
the variables that it contains.
s = summary(A) returns a scalar structure s that
contains a summary of the dataset array A and the variables that
A contains. For more information on the fields in s,
see Output Arguments.
Summary information depends on the type of the variables in the dataset array:
For numeric variables, summary computes a five-number summary of the data, giving the minimum, the first quartile, the median, the third quartile, and the maximum.
For logical variables, summary counts the number of trues and falses in the data.
For categorical variables, summary counts the number of observations at each level.
Output Arguments
The following list describes the fields in the structure s:
Description - A character array containing the dataset description.
Variables - A structure array with one element for each dataset variable in A. Each element has the following fields:
    Name - A character vector containing the name of the variable.
    Description - A character vector containing the variable's description.
    Units - A character vector containing the variable's units.
    Size - A numeric vector containing the size of the variable.
    Class - A character vector containing the class of the variable.
    Data - A scalar structure containing the following fields.
        For numeric variables:
            Probabilities - A numeric vector containing the probabilities [0.0 .25 .50 .75 1.0] and NaN (if any are present in the corresponding dataset variable).
            Quantiles - A numeric vector containing the values that correspond to 'Probabilities' for the corresponding dataset variable, and a count of NaNs (if any are present).
        For logical variables:
            Values - The logical vector [true false].
            Counts - A numeric vector of counts for each logical value.
        For categorical variables:
            Levels - A cell array containing the labels for each level of the corresponding dataset variable.
            Counts - A numeric vector of counts for each level.
'Data' is empty if the variable is not numeric, categorical, or logical. If a dataset variable has more than one column, then the corresponding 'Quantiles' or 'Counts' field is a matrix or an array.
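The fields listed above can be read with ordinary dot indexing. The following is a minimal sketch, assuming Statistics and Machine Learning Toolbox is installed and that the fisheriris sample data is on the path; the variable names data and s are chosen here for illustration only.

```matlab
% Build a small dataset array (same data as the first example below).
load fisheriris
species = nominal(species);
data = dataset(species,meas);

% Request the summary as a structure instead of printing it.
s = summary(data);

% Variables is a structure array with one element per dataset variable.
firstName = s.Variables(1).Name;          % name of the first variable
firstClass = s.Variables(1).Class;        % class of the first variable

% For a categorical variable, Data holds Levels and Counts.
levels = s.Variables(1).Data.Levels;
counts = s.Variables(1).Data.Counts;

% For a numeric variable, Data holds Probabilities and Quantiles.
probs = s.Variables(2).Data.Probabilities;
quants = s.Variables(2).Data.Quantiles;   % one column per column of meas
```

Capturing the output in s suppresses the printed summary, which can be convenient when the quantiles or counts feed later computations.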
Examples
Summarize Fisher's iris data:
load fisheriris
species = nominal(species);
data = dataset(species,meas);
summary(data)
species: [150x1 nominal]
    setosa    versicolor    virginica
    50        50            50
meas: [150x4 double]
    min       4.3000    2         1         0.1000
    1st Q     5.1000    2.8000    1.6000    0.3000
    median    5.8000    3         4.3500    1.3000
    3rd Q     6.4000    3.3000    5.1000    1.8000
    max       7.9000    4.4000    6.9000    2.5000

Summarize the data in hospital.mat:
load hospital
summary(hospital)
Dataset array created from the data file hospital.dat.
The first column of the file ("id") is used for observation
names. Other columns ("sex" and "smoke") have been
converted from their original coded values into categorical
and logical variables. Two sets of columns ("sys" and
"dia", "trial1" through "trial4") have been combined into
single variables with multivariate observations. Column
headers have been replaced with more descriptive variable
names. Units have been added where appropriate.
LastName: [100x1 cell array of character vectors]
Sex: [100x1 nominal]
    Female    Male
    53        47
Age: [100x1 double, Units = Yrs]
    min    1st Q    median    3rd Q    max
    25     32       39        44       50
Weight: [100x1 double, Units = Lbs]
    min    1st Q       median      3rd Q       max
    111    130.5000    142.5000    180.5000    202
Smoker: [100x1 logical]
    true    false
    34      66
BloodPressure: [100x2 double, Units = mm Hg]
Systolic/Diastolic
    min       109         68
    1st Q     117.5000    77.5000
    median    122         81.5000
    3rd Q     127.5000    89
    max       138         99
Trials: [100x1 cell, Units = Counts]
From zero to four measurement trials performed
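Because the dataset data type is not recommended, the same kind of summary can be obtained from a table. The following is a minimal sketch, assuming Statistics and Machine Learning Toolbox is installed (for nominal, dataset, and dataset2table); the variable names data and T are chosen here for illustration only.

```matlab
% Recreate the iris dataset array from the first example.
load fisheriris
species = nominal(species);
data = dataset(species,meas);

% Convert the dataset array to a table.
T = dataset2table(data);

% Print a summary of the table and its variables.
summary(T)
```

The printed layout of the table summary differs from the dataset summary shown above.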