summary

Description

summary(A) displays a summary of the input data A, including its properties and statistics.

summary(A,dim) operates along dimension dim. For example, you can summarize each row in a matrix A using summary(A,2).

summary(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input combinations in the previous syntaxes. For example, summary(A,Statistics="std") includes only the standard deviation of the input data A.

s = summary(___) returns a structure that contains a summary of the input data.

Examples

Create a matrix of type double, and display a summary of the matrix that includes the default statistics for each matrix column.

A = rand(5,3);
summary(A)
A: 5x3 double

    NumMissing           0             0             0  
    Min             0.1270        0.0975        0.1576  
    Median          0.8147        0.5469        0.8003  
    Max             0.9134        0.9649        0.9706  
    Mean            0.6786        0.5691        0.6742  
    Std             0.3285        0.3921        0.3487  

Display a summary that includes statistics for each matrix row.

summary(A,2)
A: 5x3 double

    NumMissing      Min      Median       Max       Mean        Std  

        0         0.09754    0.15761    0.81472    0.35663    0.39786
        0          0.2785    0.90579    0.97059    0.71829    0.38225
        0         0.12699    0.54688    0.95717    0.54368     0.4151
        0         0.48538    0.91338    0.95751    0.78542    0.26078
        0         0.63236    0.80028    0.96489    0.79918    0.16627

Create a categorical vector containing three categories.

A = categorical(["A";"B";"C";"A";"C"])
A = 5x1 categorical
     A 
     B 
     C 
     A 
     C 

Display a summary of the vector that includes the number of occurrences of each category.

summary(A)
A: 5x1 categorical

     A                2 
     B                1 
     C                2 
     <undefined>      0 

Create a matrix of type double, and display a summary of the matrix that includes the variance and sum of each matrix column in addition to the default statistics.

A = rand(5,3);
summary(A,Statistics=["default" "var" "sum"])
A: 5x3 double

    NumMissing           0             0             0  
    Min             0.1270        0.0975        0.1576  
    Median          0.8147        0.5469        0.8003  
    Max             0.9134        0.9649        0.9706  
    Mean            0.6786        0.5691        0.6742  
    Std             0.3285        0.3921        0.3487  
    Var             0.1079        0.1537        0.1216  
    Sum             3.3932        2.8453        3.3710  

Create a table with four variables of different data types.

num = rand(6,1);
num2 = single(rand(6,1));
cat = categorical(["a";"a";"b";"a";"b";"c"]);
dt = datetime(2016:2021,1,1)';
T = table(num,num2,cat,dt)
T=6×4 table
      num       num2      cat        dt     
    _______    _______    ___    ___________

    0.81472     0.2785     a     01-Jan-2016
    0.90579    0.54688     a     01-Jan-2017
    0.12699    0.95751     b     01-Jan-2018
    0.91338    0.96489     a     01-Jan-2019
    0.63236    0.15761     b     01-Jan-2020
    0.09754    0.97059     c     01-Jan-2021

Display a summary of the table.

summary(T)
T: 6x4 table

Variables:

    num: double
    num2: single
    cat: categorical (3 categories)
    dt: datetime

Statistics for applicable variables:

            NumMissing          Min                   Median                   Max                    Mean                    Std      

    num         0                0.0975                      0.7235             0.9134                      0.5818             0.3776  
    num2        0                0.1576                      0.7522             0.9706                      0.6460             0.3708  
    cat         0                                                                                                                      
    dt          0           01-Jan-2016        02-Jul-2018 12:00:00        01-Jan-2021        02-Jul-2018 12:00:00        16401:17:23  

Load a table of data from the provided file.

load T

Display a summary of the table with additional table and variable metadata, including custom metadata. Omit statistics from the summary.

summary(T,Detail="high",Statistics="none")
T: 100x4 table

Description: Simulated patient data

Variables:

    Status: categorical
        Instrument:  [1x1 cell]
    Age: double (Yrs)
        Instrument:  height rod
    Smoker: logical
        Instrument:  [1x1 cell]
    BloodPressure: 2-column double (mm Hg)
        Description:  Systolic/Diastolic
        Instrument:  bloodp pressure cuff

The summary includes the metadata properties that describe the table and its variables. Access the properties.

T.Properties
ans = 
  TableProperties with properties:

             Description: 'Simulated patient data'
                UserData: []
          DimensionNames: {'Row'  'Variables'}
           VariableNames: {'Status'  'Age'  'Smoker'  'BloodPressure'}
           VariableTypes: ["categorical"    "double"    "logical"    "double"]
    VariableDescriptions: {''  ''  ''  'Systolic/Diastolic'}
           VariableUnits: {''  'Yrs'  ''  'mm Hg'}
      VariableContinuity: []
                RowNames: {100x1 cell}

   Custom Properties (access using t.Properties.CustomProperties.<name>):
              Instrument: {''  'height rod'  ''  'bloodp pressure cuff'}

Create a timetable.

MeasurementTime = datetime(["2024-01-01";"2024-02-01";"2024-03-01"]);
Temp = [37;39;42];
TT = timetable(MeasurementTime,Temp)
TT=3×1 timetable
    MeasurementTime    Temp
    _______________    ____

    01-Jan-2024         37 
    01-Feb-2024         39 
    01-Mar-2024         42 

Return a summary of the timetable.

s = summary(TT)
s = struct with fields:
    MeasurementTime: [1x1 struct]
               Temp: [1x1 struct]

The MeasurementTime field of the structure contains a summary of the row times.

s.MeasurementTime
ans = struct with fields:
          Size: [3 1]
          Type: 'datetime'
      TimeZone: ''
    SampleRate: NaN
     StartTime: 01-Jan-2024
    NumMissing: 0
           Min: 01-Jan-2024
        Median: 01-Feb-2024
           Max: 01-Mar-2024
          Mean: 31-Jan-2024 08:00:00
           Std: 720:07:59
      TimeStep: 1mo

The Temp field of the structure contains a summary of the Temp variable. Access the median.

s.Temp.Median
ans = 
39

Input Arguments

Input data A, specified as an array, table, or timetable.

Operating dimension dim for array input, specified as a positive integer scalar, a vector of positive integers, or "all". If you do not specify dim, then the default is the first array dimension whose size does not equal 1.

If the input array is categorical, then dim must be a scalar.

Consider an input matrix, A:

  • summary(A,1) displays statistics for each column of A.

  • summary(A,2) displays statistics for each row of A.

Specifying dim is not supported when the input data is a table or timetable.
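As a brief illustration of the dim argument, the following sketch summarizes a small matrix along different dimensions. The matrix values are arbitrary, and the field names Min and Max on the returned structure are an assumption that mirrors the labels in the displayed summary.

```matlab
% Illustrative 2-by-3 matrix; the values are arbitrary.
A = [1 2 3; 4 5 6];

summary(A,1)        % statistics for each column (the default for a matrix)
summary(A,2)        % statistics for each row
summary(A,"all")    % statistics over all elements of A

% Return the summary over all elements as a structure.
% Assumption: the statistic fields are named after the displayed labels.
s = summary(A,"all");
isequal(s.Min,min(A,[],"all"))
isequal(s.Max,max(A,[],"all"))
```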

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: summary(A,Statistics="allstats")

Level of detail to display for table or timetable input data (the Detail name-value argument), specified as one of these values:

  • "low" — Provide a concise summary. Display the variable name, type, unit, and description for each table variable.

  • "high" — Provide a verbose summary. Display all table and variable metadata in addition to details in "low". For categorical variables, "high" also displays the categories and counts.

summary accesses metadata that describes a table and its variables through the Properties property of the table.

The Detail name-value argument does not configure the summary when you return the summary as a scalar structure. The summary structure always includes all table and variable metadata.

Example: summary(A,Detail="high") displays table and variable metadata in addition to the variable names, types, units, and descriptions.
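The following sketch shows one way to compare the two Detail settings. The table, its variable names, and its metadata are hypothetical and exist only for illustration.

```matlab
% Hypothetical table; variable names, units, and descriptions are made up.
Height = [1.70; 1.82; 1.65];
Weight = [68; 80; 59];
T = table(Height,Weight);
T.Properties.Description = "Illustrative measurements";
T.Properties.VariableUnits = ["m" "kg"];
T.Properties.VariableDescriptions = ["Standing height" "Body mass"];

summary(T,Detail="low")     % concise display
summary(T,Detail="high")    % adds all table and variable metadata

% Detail configures only the display, so the structure outputs
% are expected to match regardless of the Detail value.
sLow = summary(T,Detail="low");
sHigh = summary(T,Detail="high");
isequal(sLow,sHigh)
```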

Statistics to compute (the Statistics name-value argument), specified as one or more of the following values. For table and timetable data, the specified statistics are computed for all applicable variables, including row times for timetable data.

For the "default" value, the statistics to compute depend on the data type of the input data.

    Data Type                                          Statistics to Compute
    _______________________________________________    ___________________________________________________

    double, single, duration, datetime, other types    "nummissing", "min", "median", "max", "mean", "std"
    Integer                                            "nummissing", "min", "median", "max", "mean"
    logical                                            "counts"
    Non-ordinal categorical                            "counts", "nummissing"
    Ordinal categorical                                "counts", "nummissing", "min", "median", "max"
    string, char, cell array of character vectors      "nummissing"

To compute a different set of statistics, you can specify one or more of these values. To specify multiple statistics, list the options in a string array or cell array.

    Statistic       Description
    ____________    _________________________________________

    "nummissing"    Number of missing elements
    "min"           Minimum
    "median"        Median
    "max"           Maximum
    "q1"            First quartile or 25th percentile
    "q3"            Third quartile or 75th percentile
    "mean"          Mean
    "std"           Standard deviation
    "var"           Variance
    "mode"          Mode
    "range"         Maximum minus minimum
    "sum"           Sum
    "numunique"     Number of distinct nonmissing elements
    "nnz"           Number of nonzero and nonmissing elements
    "counts"        Number of occurrences of each category
    "allstats"      All statistics previously listed
    "none"          No statistics
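As a rough sketch of how these values combine, the following code requests several statistics for an arbitrary matrix and cross-checks two of them against built-in functions. The field names Range and Sum on the returned structure are an assumption that mirrors the displayed labels.

```matlab
% Illustrative data; the values are arbitrary.
A = [2 4; 4 8; 6 12; 8 16];

% Request specific statistics by listing them in a string array.
summary(A,Statistics=["q1" "median" "q3" "range"])

% Combine the defaults with extra statistics, or request everything.
summary(A,Statistics=["default" "sum"])
summary(A,Statistics="allstats")

% The defaults depend on the data type: integer input omits "std".
summary(int8(A))

% Cross-check two statistics against built-in functions.
% Assumption: the structure fields are named Range and Sum.
s = summary(A,Statistics=["range" "sum"]);
isequal(s.Range,max(A) - min(A))
isequal(s.Sum,sum(A))
```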

You can also specify Statistics as a function handle that must:

  • Accept one input data argument.

  • Return one output that is scalar or has the same size as the input data in all dimensions except for a size of 1 along the first dimension.

  • For table or timetable input data, operate along each variable separately.

When summary computes a statistic:

  • If the function encounters an error, the summary does not include that statistic.

  • If the function encounters missing values, it omits those values from the computation, with the exception of the "nummissing" statistic. To include missing values, use a function handle, such as @sum instead of "sum".
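The difference in missing-value handling can be sketched with a small vector that contains one NaN. The built-in calls at the end reproduce the two behaviors outside of summary; the variable names are illustrative.

```matlab
% Illustrative vector with one missing value.
A = [1; NaN; 3];

summary(A,Statistics=["nummissing" "sum"])   % "sum" omits the NaN
summary(A,Statistics=@sum)                   % @sum includes the NaN

% Equivalent computations with built-in functions.
sumOmit = sum(A,"omitnan")      % 4
sumIncl = sum(A)                % NaN
numMiss = nnz(ismissing(A))     % 1
```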

Example: summary(A,Statistics=["mean" "var" "mode"]) computes the mean, variance, and mode.

Example: summary(A,Statistics={"default",myFun1}) computes the result of myFun1 in addition to the default statistics.
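A function handle such as myFun1 might look like the following sketch. The midrange statistic here is a hypothetical choice made for illustration. It accepts one input and returns one row per call, so its output has a size of 1 along the first dimension.

```matlab
% Hypothetical custom statistic: midrange of each column.
myFun1 = @(x) (max(x,[],1) + min(x,[],1))/2;

% Illustrative data; the values are arbitrary.
A = [1 10; 3 20; 5 60];

myFun1(A)                                   % [3 35]
summary(A,Statistics={"default",myFun1})    % defaults plus the custom statistic
```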

Table or timetable variables to summarize (the DataVariables name-value argument), specified using one of these indexing schemes.

Variables in the table or timetable not specified by the DataVariables name-value argument are not included in the summary.

Indexing scheme: Variable names

  Values to specify:

  • A string scalar or character vector

  • A string array or cell array of character vectors

  • A pattern object

  Examples:

  • "A" or 'A' — A variable named A

  • ["A" "B"] or {'A','B'} — Two variables named A and B

  • "Var"+digitsPattern(1) — Variables named "Var" followed by a single digit

Indexing scheme: Variable index

  Values to specify:

  • An index number that refers to the location of a variable in the table

  • A vector of numbers

  • A logical vector. Typically, this vector is the same length as the number of variables, but you can omit trailing 0 (false) values.

  Examples:

  • 3 — The third variable from the table

  • [2 3] — The second and third variables from the table

  • [false false true] — The third variable

Indexing scheme: Function handle

  Values to specify:

  • A function handle that takes a table variable as input and returns a logical scalar

  Examples:

  • @isnumeric — All the variables containing numeric values

Indexing scheme: Variable type

  Values to specify:

  • A vartype subscript that selects variables of a specified type

  Examples:

  • vartype("numeric") — All the variables containing numeric values

Example: summary(A,DataVariables=["Var1" "Var2" "Var4"]) displays a summary of Var1, Var2, and Var4.
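The indexing schemes can be sketched on a small table as follows. The table and its variable names Var1, Var2, and Var3 are hypothetical.

```matlab
% Hypothetical table with one numeric, one string, and one logical variable.
Var1 = [1; 2; 3];
Var2 = ["a"; "b"; "c"];
Var3 = [true; false; true];
T = table(Var1,Var2,Var3);

summary(T,DataVariables="Var1")                   % by name
summary(T,DataVariables=[1 3])                    % by index
summary(T,DataVariables=@isnumeric)               % by function handle
summary(T,DataVariables=vartype("numeric"))       % by variable type
summary(T,DataVariables="Var"+digitsPattern(1))   % by pattern

% Variables that are not selected are left out of the summary structure.
s = summary(T,DataVariables=["Var1" "Var3"]);
fieldnames(s)
```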

Output Arguments

Summary of input data s, returned as a scalar structure.

  • If the input data is a table or timetable, then each field in s contains a summary of one of the variables. If A is a timetable, s also contains a field with the summary of the row times.

  • If the input data is an array, then each field in s contains a property or statistic.
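For table input, one possible way to work with the returned structure is to loop over its fields, as in this sketch. The table and its variable names x and y are hypothetical. The Median field follows the s.Temp.Median access shown in the timetable example.

```matlab
% Hypothetical table with two numeric variables.
T = table([1;2;3],[10;20;30],VariableNames=["x" "y"]);
s = summary(T);

% Each field of s holds the summary of one table variable.
vars = fieldnames(s);
for k = 1:numel(vars)
    fprintf("%s: median = %g\n",vars{k},s.(vars{k}).Median);
end
```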

Version History

Introduced in R2013b
