delimitedTextImportOptions

Import options object for delimited text

Description

A DelimitedTextImportOptions object enables you to specify how MATLAB® imports tabular data from delimited text files. The object contains properties that control the data import process, including the handling of errors and missing data.

Creation

You can create a DelimitedTextImportOptions object using either the detectImportOptions function or the delimitedTextImportOptions function (described here):

Use detectImportOptions to detect and populate the import properties based on the contents of the delimited text file specified in filename.
```
opts = detectImportOptions(filename);
```
Use delimitedTextImportOptions to define the import properties based on your import requirements.

Syntax

opts = delimitedTextImportOptions

opts = delimitedTextImportOptions('NumVariables',numVars)

opts = delimitedTextImportOptions(___,Name,Value)

Description

opts = delimitedTextImportOptions creates a DelimitedTextImportOptions object with one variable.

opts = delimitedTextImportOptions('NumVariables',numVars) creates the object with the number of variables specified in numVars.

opts = delimitedTextImportOptions(___,Name,Value) specifies additional properties for the DelimitedTextImportOptions object using one or more name-value pair arguments.
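
As a minimal sketch of the syntaxes above, the following creates an options object with a chosen number of variables and then combines NumVariables with other name-value pairs. The specific values (three variables, a comma delimiter, data starting on line 2) are illustrative assumptions, not requirements.

```matlab
% Create an options object with three variables.
% With no names specified, the variables are named Var1, Var2, Var3.
opts = delimitedTextImportOptions('NumVariables',3);
disp(opts.VariableNames)

% Combine NumVariables with additional name-value pairs.
opts = delimitedTextImportOptions('NumVariables',3, ...
                                  'Delimiter',',', ...
                                  'DataLines',2);

% A scalar DataLines value n is stored as [n Inf].
disp(opts.DataLines)
```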

Input Arguments

`numVars` — Number of variables
positive scalar integer

Number of variables, specified as a positive scalar integer.

Properties

Variable Properties

`VariableNames` — Variable names
cell array of character vectors | string array

Variable names, specified as a cell array of character vectors or string array. The VariableNames property contains the names to use when importing variables.

If the data contains N variables, but no variable names are specified, then the VariableNames property contains {'Var1','Var2',...,'VarN'}.

To support invalid MATLAB identifiers as variable names, such as variable names containing spaces and non-ASCII characters, set the value of VariableNamingRule to 'preserve'.

Example: opts.VariableNames returns the current (detected) variable names.

Example: opts.VariableNames(3) = {'Height'} changes the name of the third variable to Height.

Data Types: char | string | cell

`VariableNamingRule` — Flag to preserve variable names
`"modify"` (default) | `"preserve"`

Flag to preserve variable names, specified as either "modify" or "preserve".

"modify" — Convert invalid variable names (as determined by the isvarname function) to valid MATLAB identifiers.
"preserve" — Preserve variable names that are not valid MATLAB identifiers such as variable names that include spaces and non-ASCII characters.

Starting in R2019b, variable names and row names can include any characters, including spaces and non-ASCII characters. Also, they can start with any characters, not just letters. Variable and row names do not have to be valid MATLAB identifiers (as determined by the isvarname function). To preserve these variable names and row names, set the value of VariableNamingRule to "preserve". Variable names are not refreshed when the value of VariableNamingRule is changed from "modify" to "preserve".

Data Types: char | string
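
The following sketch illustrates the "preserve" setting. The variable names used here are hypothetical and were chosen only because they are not valid MATLAB identifiers. The rule is set before the names are assigned, because names are not refreshed when the rule changes.

```matlab
opts = delimitedTextImportOptions('NumVariables',2);

% Allow names that are not valid MATLAB identifiers.
opts.VariableNamingRule = 'preserve';

% Hypothetical names containing spaces and parentheses.
opts.VariableNames = {'Last Name','Height (in)'};

disp(opts.VariableNames)
disp(isvarname('Last Name'))   % false: not a valid identifier
```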

`VariableTypes` — Data types of variables
cell array of character vectors | string array

Data types of variables, specified as a cell array of character vectors or a string array containing a set of valid data type names. The VariableTypes property designates the data types to use when importing variables.

To update the VariableTypes property, use the setvartype function.

Example: opts.VariableTypes returns the current variable data types.

Example: opts = setvartype(opts,'Height',{'double'}) changes the data type of the variable Height to double.
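
A short sketch of updating types with setvartype follows. The variable names LastName, Age, and Height are assumptions made for illustration.

```matlab
opts = delimitedTextImportOptions('NumVariables',3, ...
    'VariableNames',{'LastName','Age','Height'});

% Change several variables to the same type in one call.
opts = setvartype(opts,{'Age','Height'},'double');

% Change a single variable.
opts = setvartype(opts,'LastName','string');

disp(opts.VariableTypes)
```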

`SelectedVariableNames` — Subset of variables to import
character vector | string scalar | cell array of character vectors | string array | array of numeric indices

Subset of variables to import, specified as a character vector, string scalar, cell array of character vectors, string array or an array of numeric indices.

SelectedVariableNames must be a subset of names contained in the VariableNames property. By default, SelectedVariableNames contains all the variable names from the VariableNames property, which means that all variables are imported.

Use the SelectedVariableNames property to import only the variables of interest. Specify a subset of variables using the SelectedVariableNames property and use readtable to import only that subset.

To support invalid MATLAB identifiers as variable names, such as variable names containing spaces and non-ASCII characters, set the value of VariableNamingRule to 'preserve'.

Example: opts.SelectedVariableNames = {'Height','LastName'} selects only two variables, Height and LastName, for the import operation.

Example: opts.SelectedVariableNames = [1 5] selects only two variables, the first variable and the fifth variable, for the import operation.

Example: T = readtable(filename,opts) returns a table containing only the variables specified in the SelectedVariableNames property of the opts object.
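
To make the subset selection concrete, this sketch writes a small temporary file and imports only two of its three columns. The file contents and variable names are made up for the example.

```matlab
% Write a small comma-delimited file to a temporary location.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'LastName,Age,Height\nSmith,38,71\nJones,40,67\n');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',3, ...
    'VariableNames',{'LastName','Age','Height'}, ...
    'VariableTypes',{'char','double','double'}, ...
    'Delimiter',',', ...
    'DataLines',2);          % skip the header line

% Import only the variables of interest.
opts.SelectedVariableNames = {'LastName','Height'};
T = readtable(fname,opts);
disp(T)

delete(fname);               % clean up the temporary file
```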

`VariableOptions` — Type specific variable import options
array of variable import options objects

Type specific variable import options, returned as an array of variable import options objects. The array contains an object corresponding to each variable specified in the VariableNames property. Each object in the array contains properties that support the importing of data with a specific data type.

Variable options support these data types: numeric, text, logical, datetime, or categorical.

To query the current (or detected) options for a variable, use the getvaropts function.

To set and customize options for a variable, use the setvaropts function.

Example: opts.VariableOptions returns a collection of VariableImportOptions objects, one corresponding to each variable in the data.

Example: getvaropts(opts,'Height') returns the VariableImportOptions object for the Height variable.

Example: opts = setvaropts(opts,'Height','FillValue',0) sets the FillValue property for the variable Height to 0.
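
The query-and-set workflow described above might look like the following sketch. The variable names and the fill value of 0 are illustrative assumptions.

```matlab
opts = delimitedTextImportOptions('NumVariables',2, ...
    'VariableNames',{'LastName','Height'}, ...
    'VariableTypes',{'char','double'});

% Set the fill value used for the Height variable.
opts = setvaropts(opts,'Height','FillValue',0);

% Query the options object for that variable and inspect the result.
heightOpts = getvaropts(opts,'Height');
disp(heightOpts.FillValue)
```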

Location Properties

`DataLines` — Data location
positive scalar integer | array of positive scalar integers

Data location, specified as a positive scalar integer or an N-by-2 array of positive scalar integers. Specify DataLines using one of these forms.

Specify as	Description
`n`	Specify the first line that contains the data. Specifying the value using `n` sets the value of the `DataLines` property to `[n inf]`. The importing function reads all rows between `n` and the end-of-file. `n` must be a positive integer.
`[n1 n2]`	Specify the line range that contains the data. `n1` is the first line that contains the data and `n2` is the last line that contains the data. Values in the array `[n1 n2]` must be positive integers and `n2` must be greater than `n1`.
`[n1 n2; n3 n4;...]`	Specify multiple line ranges to read with an `N`-by-2 array containing `N` different line ranges. A valid array of multiple line ranges must: (1) specify line ranges in increasing order, that is, the first line range specified in the array appears in the file before the other line ranges; and (2) contain only nonoverlapping line ranges. When specifying multiple line ranges, use `Inf` only when specifying the end of the last line range in the array. For example, `[1 3; 5 6; 8 Inf]`.

Example: opts.DataLines = 5 sets the DataLines property to the value [5 inf]. Read all rows of data starting from row 5 to the end-of-file.

Example: opts.DataLines = [2 6] sets the property to read lines 2 through 6.

Example: opts.DataLines = [1 3; 5 6; 8 inf] sets the property to read rows 1, 2, 3, 5, 6, and all rows from 8 to the end-of-file.
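
The sketch below shows both the scalar form and the multiple-range form of DataLines. It reads a temporary ten-line file whose contents (the integers 1 through 10, one per line) are an assumption chosen so that the rows read are easy to identify.

```matlab
% Write the numbers 1 through 10, one per line.
fname = [tempname '.txt'];
fid = fopen(fname,'w');
fprintf(fid,'%d\n',1:10);
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',1, ...
    'VariableTypes',{'double'});

% Scalar form: stored as [5 Inf].
opts.DataLines = 5;
scalarForm = opts.DataLines;
disp(scalarForm)

% Multiple ranges: increasing, nonoverlapping, Inf only in the last range.
opts.DataLines = [1 3; 5 6; 8 Inf];
T = readtable(fname,opts);
disp(T.Var1')

delete(fname);   % clean up the temporary file
```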

`RowNamesColumn` — Row names location
`0` (default) | positive scalar integer

Row names location, specified as 0 or a positive scalar integer. The RowNamesColumn property specifies the location of the column containing the row names.

If RowNamesColumn is specified as 0, then do not import the row names. Otherwise, import the row names from the specified column.

Example: opts.RowNamesColumn = 2;

`VariableNamesLine` — Variable names location
`0` (default) | positive scalar integer

Variable names location, specified as 0 or a positive scalar integer. The VariableNamesLine property specifies the line number where variable names are located.

If VariableNamesLine is specified as 0, then do not import the variable names. Otherwise, import the variable names from the specified line.

Example: opts.VariableNamesLine = 6;

`VariableDescriptionsLine` — Variable description location
`0` (default) | positive scalar integer

Variable description location, specified as 0 or a positive scalar integer. The VariableDescriptionsLine property specifies the line number where variable descriptions are located.

If VariableDescriptionsLine is specified as 0, then do not import the variable descriptions. Otherwise, import the variable descriptions from the specified line.

Example: opts.VariableDescriptionsLine = 7;

`VariableUnitsLine` — Variable units location
`0` (default) | positive scalar integer

Variable units location, specified as 0 or a positive scalar integer. The VariableUnitsLine property specifies the line number where variable units are located.

If VariableUnitsLine is specified as 0, then do not import the variable units. Otherwise, import the variable units from the specified line.

Example: opts.VariableUnitsLine = 8;
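
As a sketch of how the location properties work together, the following assumes a hypothetical file whose first line holds variable names, whose second line holds units, and whose data starts on line 3. The file contents are invented for illustration.

```matlab
% Hypothetical layout: line 1 = names, line 2 = units, line 3 onward = data.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'Height,Weight\nin,lb\n71,176\n69,163\n');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',2, ...
    'VariableTypes',{'double','double'}, ...
    'Delimiter',',');
opts.VariableNamesLine = 1;   % read variable names from line 1
opts.VariableUnitsLine = 2;   % read variable units from line 2
opts.DataLines = 3;           % data runs from line 3 to the end-of-file

T = readtable(fname,opts);
disp(T.Properties.VariableNames)
disp(T.Properties.VariableUnits)

delete(fname);                % clean up the temporary file
```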

Delimited Text Properties

`Delimiter` — Field delimiter characters
string array | character vector | cell array of character vectors

Field delimiter characters in a delimited text file, specified as a string array, character vector, or cell array of character vectors.

Example: "Delimiter","|"

Example: "Delimiter",[";","*"]

`Whitespace` — Characters to treat as white space
character vector | string scalar

Characters to treat as white space, specified as a character vector or string scalar containing one or more characters.

Example: 'Whitespace',' _'

Example: 'Whitespace','?!.,'

`LineEnding` — End-of-line characters
`["\n","\r","\r\n"]` (default) | string array | character vector | cell array of character vectors

End-of-line characters, specified as a string array, character vector, or cell array of character vectors.

Example: "LineEnding","\n"

Example: "LineEnding","\r\n"

Example: "LineEnding",["\b",":"]

`CommentStyle` — Style of comments
string array | character vector | cell array of character vectors

Style of comments, specified as a string array, character vector, or cell array of character vectors. For single- and multi-line comments, the starting identifier must be the first non-white-space character. For single-line comments, specify a single identifier to treat lines starting with the identifier as comments. For multi-line comments, lines from the starting (first) identifier to the ending (second) identifier are treated as comments. No more than two character vectors of identifiers can be specified.

For example, to ignore lines whose first non-white-space character is a percent symbol, specify CommentStyle as "%".

Example: "CommentStyle",["/*"]

Example: "CommentStyle",["/*","*/"]
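
The single-line comment case can be sketched as follows. The file contents are hypothetical: two data lines interleaved with lines that start with a percent symbol.

```matlab
% Write a file that mixes comment lines and data lines.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'%% header comment\n1,2\n%% another comment\n3,4\n');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',2, ...
    'VariableTypes',{'double','double'}, ...
    'Delimiter',',', ...
    'CommentStyle','%');      % treat lines starting with % as comments

T = readtable(fname,opts);
disp(T)

delete(fname);                % clean up the temporary file
```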

`ConsecutiveDelimitersRule` — Procedure to manage consecutive delimiters
`"split"` | `"join"` | `"error"`

Procedure to manage consecutive delimiters in a delimited text file, specified as one of the values in this table.

Value	Behavior
`"split"`	Split the consecutive delimiters into multiple fields.
`"join"`	Join the delimiters into one delimiter.
`"error"`	Return an error and cancel the import operation.

`LeadingDelimitersRule` — Procedure to manage leading delimiters
`"keep"` | `"ignore"` | `"error"`

Procedure to manage leading delimiters in a delimited text file, specified as one of the values in this table.

Value	Behavior
`"keep"`	Keep the delimiter.
`"ignore"`	Ignore the delimiter.
`"error"`	Return an error and cancel the import operation.

`TrailingDelimitersRule` — Procedure to manage trailing delimiters
`'keep'` | `'ignore'` | `'error'`

Procedure to manage trailing delimiters in a delimited text file, specified as one of the values in this table.

Trailing Delimiters Rule	Behavior
`'keep'`	Keep the delimiter.
`'ignore'`	Ignore the delimiter.
`'error'`	Return an error and abort the import operation.
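
To illustrate one of the delimiter rules, this sketch reads a hypothetical file in which each line contains two consecutive commas and sets ConsecutiveDelimitersRule to 'join' so that they act as a single delimiter.

```matlab
% Each line has two consecutive delimiters between the values.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'1,,2\n3,,4\n');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',2, ...
    'VariableTypes',{'double','double'}, ...
    'Delimiter',',', ...
    'ConsecutiveDelimitersRule','join');   % treat ",," as one delimiter

T = readtable(fname,opts);
disp(T)

delete(fname);   % clean up the temporary file
```

With 'split' instead, the same line would be divided into three fields, the middle one empty.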

`Encoding` — Character encoding scheme
`''` | `'UTF-8'` | `'system'` | `'ISO-8859-1'` | `'windows-1251'` | `'windows-1252'` | ...

Character encoding scheme associated with the file, specified as the comma-separated pair consisting of 'Encoding' and 'system' or a standard character encoding scheme name.

When you do not specify any encoding, the function uses automatic character set detection to determine the encoding when reading the file.

Example: 'Encoding','system' uses the system default encoding.

Data Types: char | string

Replacement Rules

`MissingRule` — Procedure to manage missing data
`'fill'` (default) | `'error'` | `'omitrow'` | `'omitvar'`

Procedure to manage missing data, specified as one of the values in this table.

Missing Rule	Behavior
`'fill'`	Replace missing data with the contents of the `FillValue` property. The `FillValue` property is specified in the `VariableImportOptions` object of the variable being imported. For more information on accessing the `FillValue` property, see `setvaropts`.
`'error'`	Stop importing and display an error message showing the missing record and field.
`'omitrow'`	Omit rows that contain missing data.
`'omitvar'`	Omit variables that contain missing data.

Example: opts.MissingRule = 'omitrow';
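
A brief sketch of the 'omitrow' behavior follows, assuming a made-up three-column file whose second row has an empty middle field.

```matlab
% The second row has a missing value in the middle field.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'1,2,3\n4,,6\n7,8,9\n');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',3, ...
    'VariableTypes',{'double','double','double'}, ...
    'Delimiter',',');

% Drop any row that contains missing data.
opts.MissingRule = 'omitrow';
T = readtable(fname,opts);
disp(T)

delete(fname);   % clean up the temporary file
```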

Data Types: char | string

`EmptyLineRule` — Procedure to handle empty lines
`'skip'` | `'read'` | `'error'`

Procedure to handle empty lines in the data, specified as 'skip', 'read', or 'error'. The importing function interprets white space as empty.

Empty Line Rule	Behavior
`'skip'`	Skip the empty lines.
`'read'`	Import the empty lines. The importing function parses the empty line using the values specified in `VariableWidths`, `VariableOptions`, `MissingRule`, and other relevant properties, such as `Whitespace`.
`'error'`	Display an error message and abort the import operation.

Example: opts.EmptyLineRule = 'skip';

Data Types: char | string

`ImportErrorRule` — Procedure to handle import errors
`'fill'` (default) | `'error'` | `'omitrow'` | `'omitvar'`

Procedure to handle import errors, specified as one of the values in this table.

Import Error Rule	Behavior
`'fill'`	Replace the data where the error occurred with the contents of the `FillValue` property. The `FillValue` property is specified in the `VariableImportOptions` object of the variable being imported. For more information on accessing the `FillValue` property, see `setvaropts`.
`'error'`	Stop importing and display an error message showing the error-causing record and field.
`'omitrow'`	Omit rows where errors occur.
`'omitvar'`	Omit variables where errors occur.

Example: opts.ImportErrorRule = 'omitvar';
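
The interaction between ImportErrorRule and FillValue can be sketched like this. The file contents and the fill value of -1 are assumptions chosen so that the replaced entry is easy to spot.

```matlab
% The text 'abc' cannot be converted to a number.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'1,2\n3,abc\n');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',2, ...
    'VariableTypes',{'double','double'}, ...
    'Delimiter',',');

% Replace values that fail conversion with the variable's FillValue.
opts.ImportErrorRule = 'fill';
opts = setvaropts(opts,'Var2','FillValue',-1);

T = readtable(fname,opts);
disp(T)

delete(fname);   % clean up the temporary file
```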

Data Types: char | string

`ExtraColumnsRule` — Procedure to handle extra columns
`'addvars'` | `'ignore'` | `'wrap'` | `'error'`

Procedure to handle extra columns in the data, specified as one of the values in this table.

Extra Columns Rule	Behavior
`'addvars'`	Create new variables to import the extra columns. If there are `N` extra columns, then the new variables are imported as `'ExtraVar1', 'ExtraVar2',..., 'ExtraVarN'`. Extra columns of data are imported as if their `VariableTypes` are `char`.
`'ignore'`	Ignore the extra columns of data.
`'wrap'`	Wrap the extra columns of data to new records. This action does not change the number of variables.
`'error'`	Display an error message and abort the import operation.

Data Types: char | string
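
The difference between 'ignore' and 'addvars' can be sketched with a hypothetical file that has three columns while the options object defines only two variables.

```matlab
% The file has three columns, but only two variables are defined.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'1,2,3\n4,5,6\n');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',2, ...
    'VariableTypes',{'double','double'}, ...
    'Delimiter',',');

% Drop the extra column.
opts.ExtraColumnsRule = 'ignore';
Tignore = readtable(fname,opts);
disp(Tignore)

% Import the extra column as an additional variable.
opts.ExtraColumnsRule = 'addvars';
Tadd = readtable(fname,opts);
disp(Tadd)

delete(fname);   % clean up the temporary file
```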

Object Functions

`getvaropts`	Get variable import options
`setvaropts`	Set variable import options
`setvartype`	Set variable data types
`preview`	Preview eight rows from file using import options

Examples

Define Import Options for Variables in Delimited Text File

Define an import options object to read multiple variables from patients.dat.

Based on the contents of your file, define these variable properties: names, types, delimiter character, data starting location, and the extra column rule.

varNames = {'LastName','Gender','Age','Location','Height','Weight','Smoker'};
varTypes = {'char','categorical','int32','char','double','double','logical'};
delimiter = ',';
dataStartLine = 2;
extraColRule = 'ignore';

Use the delimitedTextImportOptions function and your variable information to initialize the import options object opts.

opts = delimitedTextImportOptions('VariableNames',varNames,...
                                'VariableTypes',varTypes,...
                                'Delimiter',delimiter,...
                                'DataLines', dataStartLine,...
                                'ExtraColumnsRule',extraColRule);

Use the preview function with the import options object to preview the data.

preview('patients.dat',opts)

ans=8×7 table
      LastName      Gender    Age              Location               Height    Weight    Smoker
    ____________    ______    ___    _____________________________    ______    ______    ______

    {'Smith'   }    Male      38     {'County General Hospital'  }      71       176      false 
    {'Johnson' }    Male      43     {'VA Hospital'              }      69       163      false 
    {'Williams'}    Female    38     {'St. Mary's Medical Center'}      64       131      false 
    {'Jones'   }    Female    40     {'VA Hospital'              }      67       133      false 
    {'Brown'   }    Female    49     {'County General Hospital'  }      64       119      false 
    {'Davis'   }    Female    46     {'St. Mary's Medical Center'}      68       142      false 
    {'Miller'  }    Female    33     {'VA Hospital'              }      64       142      false 
    {'Wilson'  }    Male      40     {'VA Hospital'              }      68       180      false

Import the data using readtable.

T = readtable('patients.dat',opts);
whos T

  Name        Size            Bytes  Class    Attributes

  T         100x7             33987  table

Version History

Introduced in R2016b

R2018b: Create options object using `delimitedTextImportOptions` function

Use the delimitedTextImportOptions function to create a DelimitedTextImportOptions object. Previously, you could create this object only by using the detectImportOptions function.
