textscan

Read formatted data from text file or string

Syntax

C = textscan(fileID,formatSpec)

C = textscan(fileID,formatSpec,N)

C = textscan(chr,formatSpec)

C = textscan(chr,formatSpec,N)

C = textscan(___,Name,Value)

[C,position] = textscan(___)

Description

C = textscan(fileID,formatSpec) reads data from an open text file into a cell array, C. The text file is indicated by the file identifier, fileID. Use fopen to open the file and obtain the fileID value. When you finish reading from a file, close the file by calling fclose(fileID).

textscan attempts to match the data in the file to the conversion specifier in formatSpec. The textscan function reapplies formatSpec throughout the entire file and stops when it cannot match formatSpec to the data.

C = textscan(fileID,formatSpec,N) reads file data using the formatSpec N times, where N is a positive integer. To read additional data from the file after N cycles, call textscan again using the original fileID. If you resume a text scan of a file by calling textscan with the same file identifier (fileID), then textscan automatically resumes reading at the point where it terminated the last read.
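
As a minimal sketch of this resume behavior, the following writes five integers to a scratch file, reads two of them, and then reads the rest with a second call. The temporary file name and the variable names are stand-ins for illustration only.

```matlab
% Create a scratch file holding the integers 1 through 5, one per line.
tmpFile = [tempname '.txt'];          % hypothetical temporary file
fileID = fopen(tmpFile,'w');
fprintf(fileID,'%d\n',1:5);
fclose(fileID);

% Apply the format twice, then let a second call pick up where the first stopped.
fileID = fopen(tmpFile,'r');
firstTwo = textscan(fileID,'%d',2);   % reads 1 and 2
theRest  = textscan(fileID,'%d');     % continues with 3, 4, and 5
fclose(fileID);
delete(tmpFile);
```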

C = textscan(chr,formatSpec) reads the text from character vector chr into cell array C. When reading text from a character vector, repeated calls to textscan restart the scan from the beginning each time. To restart a scan from the last position, request a position output.

textscan attempts to match the data in character vector chr to the format specified in formatSpec.

C = textscan(chr,formatSpec,N) uses the formatSpec N times, where N is a positive integer.

C = textscan(___,Name,Value) specifies options using one or more Name,Value pair arguments, in addition to any of the input arguments in the previous syntaxes.

[C,position] = textscan(___) returns the position in the file or the character vector at the end of the scan as the second output argument. For a file, this is the value that ftell(fileID) would return after calling textscan. For a character vector, position indicates how many characters textscan read.

Examples

Read Floating-Point Numbers

Read a character vector containing floating-point numbers.

chr = '0.41 8.24 3.57 6.24 9.27';
C = textscan(chr,'%f');

The specifier '%f' in formatSpec tells textscan to match each field in chr to a double-precision floating-point number.

Display the contents of cell array C.

celldisp(C)

Read the same character vector, and truncate each value to one decimal digit.

C = textscan(chr,'%3.1f %*1d');

The specifier %3.1f indicates a field width of 3 digits and a precision of 1. The textscan function reads a total of 3 digits, including the decimal point and the 1 digit after the decimal point. The specifier, %*1d, tells textscan to skip the remaining digit.

Display the contents of cell array C.

celldisp(C)

Read Hexadecimal Numbers

Read a character vector that represents a set of hexadecimal numbers. Text that represents hexadecimal numbers includes the digits 0-9, the letters a-f or A-F, and optionally the prefixes 0x or 0X.

To match the fields in hexnums to hexadecimal numbers, use the '%x' specifier. The textscan function converts the fields to unsigned 64-bit integers.

hexnums = '0xFF 0x100 0x3C5E A F 10';
C = textscan(hexnums,'%x')

C = 1×1 cell array
    {6×1 uint64}

Display the contents of C as a row vector.

transpose(C{:})

ans = 1×6 uint64 row vector

     255     256   15454      10      15      16

You can convert the fields to signed or unsigned integers, having 8, 16, 32, or 64 bits. To convert the fields in hexnums to signed 32-bit integers, use the '%xs32' specifier.

C = textscan(hexnums,'%xs32');
transpose(C{:})

ans = 1×6 int32 row vector

     255     256   15454      10      15      16

You can also specify a field width for interpreting the input. In that case, the prefix counts towards the field width. For example, if you set the field width to three, as in %3x, then textscan splits the text '0xAF 100' into three pieces of text, '0xA', 'F', and '100'. It treats the three pieces of text as different hexadecimal numbers.

C = textscan('0xAF 100','%3x');
transpose(C{:})

ans = 1×3 uint64 row vector

    10    15   256

Read Binary Numbers

Read a character vector that represents a set of binary numbers. Text that represents binary numbers includes the digits 0 and 1, and optionally the prefixes 0b or 0B.

To match the fields in binnums to binary numbers, use the '%b' specifier. The textscan function converts the fields to unsigned 64-bit integers.

binnums = '0b101010 0b11 0b100 1001 10';
C = textscan(binnums,'%b')

C = 1×1 cell array
    {5×1 uint64}

Display the contents of C as a row vector.

transpose(C{:})

ans = 1×5 uint64 row vector

   42    3    4    9    2

You can convert the fields to signed or unsigned integers, having 8, 16, 32, or 64 bits. To convert the fields in binnums to signed 32-bit integers, use the '%bs32' specifier.

C = textscan(binnums,'%bs32');
transpose(C{:})

ans = 1×5 int32 row vector

   42    3    4    9    2

You can also specify a field width for interpreting the input. In that case, the prefix counts towards the field width. For example, if you set the field width to three, as in %3b, then textscan splits the text '0b1010 100' into three pieces of text, '0b1', '010', and '100'. It treats the three pieces of text as different binary numbers.

C = textscan('0b1010 100','%3b');
transpose(C{:})

ans = 1×3 uint64 row vector

   1   2   4

Read Different Types of Data

Load the data file and read each column with the appropriate type.

Load file scan1.dat and preview its contents in a text editor. A screen shot is shown below.

 filename = 'scan1.dat';

Open the file, and read each column with the appropriate conversion specifier. textscan returns a 1-by-9 cell array C.

fileID = fopen(filename);
C = textscan(fileID,'%s %s %f32 %d8 %u %f %f %s %f');
fclose(fileID);
whos C

  Name      Size            Bytes  Class    Attributes

  C         1x9              2393  cell

View the MATLAB® data type of each of the cells in C.

C

C=1×9 cell array
    {3×1 cell}    {3×1 cell}    {3×1 single}    {3×1 int8}    {3×1 uint32}    {3×1 double}    {3×1 double}    {3×1 cell}    {3×1 double}

Examine the individual entries. Notice that C{1} and C{2} are cell arrays. C{5} is of data type uint32, so the first two elements of C{5} are the maximum values for a 32-bit unsigned integer, or intmax('uint32').

celldisp(C)

 
C{1}{1} =
 
09/12/2005
 
 
C{1}{2} =
 
10/12/2005
 
 
C{1}{3} =
 
11/12/2005
 
 
C{2}{1} =
 
Level1
 
 
C{2}{2} =
 
Level2
 
 
C{2}{3} =
 
Level3
 
 
C{3} =
 
   12.3400
   23.5400
   34.9000

 
 
C{4} =
 
   45
   60
   12

 
 
C{5} =
 
   4294967295
   4294967295
       200000

 
 
C{6} =
 
   Inf
  -Inf
    10

 
 
C{7} =
 
       NaN
    0.0010
  100.0000

 
 
C{8}{1} =
 
Yes
 
 
C{8}{2} =
 
No
 
 
C{8}{3} =
 
No
 
 
C{9} =
 
   5.1000 + 3.0000i
   2.2000 - 0.5000i
   3.1000 + 0.1000i

Remove Literal Text

Remove the literal text 'Level' from each field in the second column of the data from the previous example. A preview of the file is shown below.

Open the file and match the literal text in the formatSpec input.

filename = 'scan1.dat';
fileID = fopen(filename);
C = textscan(fileID,'%s Level%d %f32 %d8 %u %f %f %s %f');
fclose(fileID);
C{2}

ans = 3×1 int32 column vector

   1
   2
   3

View the MATLAB® data type of the second cell in C. The second cell of the 1-by-9 cell array, C, is now of data type int32.

disp( class(C{2}) )

int32

Skip the Remainder of a Line

Read the first column of the file in the previous example into a cell array, skipping the rest of the line.

filename = 'scan1.dat';
fileID = fopen(filename);
dates = textscan(fileID,'%s %*[^\n]');
fclose(fileID);
dates{1}

ans = 3×1 cell
    {'09/12/2005'}
    {'10/12/2005'}
    {'11/12/2005'}

textscan returns a cell array dates.

Specify Delimiter and Empty Value Conversion

Load the file data.csv and preview its contents in a text editor. A screen shot is shown below. Notice the file contains data separated by commas and also contains empty values.

Read the file, converting empty cells to -Inf.

filename = 'data.csv';
fileID = fopen(filename);
C = textscan(fileID,'%f %f %f %f %u8 %f',...
'Delimiter',',','EmptyValue',-Inf);
fclose(fileID);
column4 = C{4}, column5 = C{5}

column4 = 2×1

     4
  -Inf

column5 = 2×1 uint8 column vector

    0
   11

textscan returns a 1-by-6 cell array, C. The textscan function converts the empty value in C{4} to -Inf, where C{4} is associated with a floating-point format. Because MATLAB® represents unsigned integer -Inf as 0, textscan converts the empty value in C{5} to 0, and not -Inf.

Specify Text to be Treated as Empty or Comments

Load the file data2.csv and preview its contents in a text editor. A screen shot is shown below. Notice the file contains data that can be interpreted as comments and other entries such as 'NA' or 'na' that may indicate empty fields.

filename = 'data2.csv';

Designate the input that textscan should treat as comments or empty values and scan the data into C.

fileID = fopen(filename);
C = textscan(fileID,'%s %n %n %n %n','Delimiter',',',...
'TreatAsEmpty',{'NA','na'},'CommentStyle','//');
fclose(fileID);

Display the output.

celldisp(C)

 
C{1}{1} =
 
abc
 
 
C{1}{2} =
 
def
 
 
C{2} =
 
     2
   NaN

 
 
C{3} =
 
   NaN
     5

 
 
C{4} =
 
     3
     6

 
 
C{5} =
 
     4
     7

Treat Repeated Delimiters as One

Load the file data3.csv and preview its contents in a text editor. A screen shot is shown below. Notice the file contains repeated delimiters.

filename = 'data3.csv';

To treat the repeated commas as a single delimiter, use the MultipleDelimsAsOne parameter, and set the value to 1 (true).

fileID = fopen(filename);
C = textscan(fileID,'%f %f %f %f','Delimiter',',',...
'MultipleDelimsAsOne',1);
fclose(fileID);


celldisp(C)

Specify Repeated Conversion Specifiers and Collect Numeric Data

Load the data file grades.txt for this example and preview its contents in a text editor. A screen shot is shown below. Notice the file contains repeated delimiters.

filename = 'grades.txt';

Read the column headers using the format '%s' four times.

fileID = fopen(filename);
formatSpec = '%s';
N = 4;
C_text = textscan(fileID,formatSpec,N,'Delimiter','|');

Read the numeric data in the file.

C_data0 = textscan(fileID,'%d %f %f %f')

C_data0=1×4 cell array
    {4×1 int32}    {4×1 double}    {4×1 double}    {4×1 double}

The default value for CollectOutput is 0 (false), so textscan returns each column of the numeric data in a separate array.

Set the file position indicator to the beginning of the file.

frewind(fileID);

Reread the file and set CollectOutput to 1 (true) to collect the consecutive columns of the same class into a single array. You can use the repmat function to indicate that the %f conversion specifier should appear three times. This technique is useful when a format repeats many times.

C_text = textscan(fileID,'%s',N,'Delimiter','|');
C_data1 = textscan(fileID,['%d',repmat('%f',[1,3])],'CollectOutput',1)

C_data1=1×2 cell array
    {4×1 int32}    {4×3 double}

The test scores, which are all double, are collected into a single 4-by-3 array.

Close the file.

fclose(fileID);

Read or Skip Quoted Text and Numeric Fields

Read the first and last columns of data from a text file. Skip a column of text and a column of integer data.

Load the file names.txt and preview its contents in a text editor. A screen shot is shown below. Notice that the file contains two columns of quoted text, followed by a column of integers, and finally a column of floating point numbers.

filename = 'names.txt';

Read the first and last columns of data in the file. Use the conversion specifier, %q to read the text enclosed by double quotation marks ("). %*q skips the quoted text, %*d skips the integer field, and %f reads the floating-point number. Specify the comma delimiter using the 'Delimiter' name-value pair argument.

fileID = fopen(filename,'r');
C = textscan(fileID,'%q %*q %*d %f','Delimiter',',');
fclose(fileID);

Display the output. textscan returns a cell array C where the double quotation marks enclosing the text are removed.

celldisp(C)

 
C{1}{1} =
 
Smith, J.
 
 
C{1}{2} =
 
Bates, G.
 
 
C{1}{3} =
 
Curie, M.
 
 
C{1}{4} =
 
Murray, G.
 
 
C{1}{5} =
 
Brown, K.
 
 
C{2} =
 
   71.1000
   69.3000
   64.1000
  133.0000
   64.9000

Read Foreign-Language Dates

Load the file german_dates.txt and preview its contents in a text editor. A screen shot is shown below. Notice that the first column of values contains dates in German and the second and third columns are numeric values.

filename = 'german_dates.txt';

Open the file. Specify the character encoding scheme associated with the file as the last input to fopen.

fileID = fopen(filename,'r','n','ISO-8859-15');

Read the file. Specify the format of the dates in the file using the %{dd MMMM yyyy}D specifier. Specify the locale of the dates using the DateLocale name-value pair argument.

C = textscan(fileID,'%{dd MMMM yyyy}D %f %f',...
    'DateLocale','de_DE','Delimiter',',');
fclose(fileID);

View the contents of the first cell in C. The dates display in the language MATLAB uses depending on your system locale.

C{1}

ans = 3×1 datetime
   01 January 2014 
   01 February 2014
   01 March 2014

Read Nondefault Control Characters

Use sprintf to convert nondefault escape sequences in your data.

Create text that includes a form feed character, \f. Then, to read the text using textscan, call sprintf to explicitly convert the form feed.

lyric = sprintf('Blackbird\fsinging\fin\fthe\fdead\fof\fnight');
C = textscan(lyric,'%s','delimiter',sprintf('\f'));
C{1}

ans = 7×1 cell
    {'Blackbird'}
    {'singing'  }
    {'in'       }
    {'the'      }
    {'dead'     }
    {'of'       }
    {'night'    }

textscan returns a cell array, C.

Resume Scanning

Resume scanning from a position other than the beginning.

If you resume a scan of the text, textscan reads from the beginning each time. To resume a scan from any other position, use the two-output argument syntax in your initial call to textscan.

For example, create a character vector called lyric. Read the first word of the character vector, and then resume the scan.

lyric = 'Blackbird singing in the dead of night';
[firstword,pos] = textscan(lyric,'%9c',1);
lastpart = textscan(lyric(pos+1:end),'%s');
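
To inspect what the two calls return, a short check along the following lines may help. It repeats the code above and then displays the first word and the number of words found in the remainder; the disp calls are an addition for illustration.

```matlab
lyric = 'Blackbird singing in the dead of night';
[firstword,pos] = textscan(lyric,'%9c',1);    % pos marks where the first scan ended
lastpart = textscan(lyric(pos+1:end),'%s');   % scan only the unread remainder

disp(firstword{1})          % the characters read by %9c
disp(numel(lastpart{1}))    % number of words found in the remainder
```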

Input Arguments

`fileID` — File identifier
numeric scalar

File identifier of an open text file, specified as a number. Before reading a file with textscan, you must use fopen to open the file and obtain the fileID.

Data Types: double

`formatSpec` — Format of the data fields
character vector | string

Format of the data fields, specified as a character vector or a string of one or more conversion specifiers. When textscan reads the input, it attempts to match the data to the format specified in formatSpec. If textscan fails to match a data field, it stops reading and returns all fields read before the failure.

The number of conversion specifiers determines the number of cells in output array, C.

Numeric Fields

This table lists available conversion specifiers for numeric inputs.

Numeric Input Type	Conversion Specifier	Output Class
Integer, signed	`%d`	`int32`
	`%d8`	`int8`
	`%d16`	`int16`
	`%d32`	`int32`
	`%d64`	`int64`
Integer, unsigned	`%u`	`uint32`
	`%u8`	`uint8`
	`%u16`	`uint16`
	`%u32`	`uint32`
	`%u64`	`uint64`
Floating-point number	`%f`	`double`
	`%f32`	`single`
	`%f64`	`double`
	`%n`	`double`
Hexadecimal number, unsigned integer	`%x`	`uint64`
	`%xu8`	`uint8`
	`%xu16`	`uint16`
	`%xu32`	`uint32`
	`%xu64`	`uint64`
Hexadecimal number, signed integer	`%xs8`	`int8`
	`%xs16`	`int16`
	`%xs32`	`int32`
	`%xs64`	`int64`
Binary number, unsigned integer	`%b`	`uint64`
	`%bu8`	`uint8`
	`%bu16`	`uint16`
	`%bu32`	`uint32`
	`%bu64`	`uint64`
Binary number, signed integer	`%bs8`	`int8`
	`%bs16`	`int16`
	`%bs32`	`int32`
	`%bs64`	`int64`
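
As a small illustration of the table above, the following sketch reads four fields with different numeric specifiers and lists the class of each output cell. The sample text and variable names are made up for this example.

```matlab
% One field per specifier; each output cell takes the class listed in the table.
C = textscan('7 200 1.5 2.5','%d8 %u16 %f32 %f');
classes = cellfun(@class,C,'UniformOutput',false)
```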

Nonnumeric Fields

This table lists available conversion specifiers for inputs that include nonnumeric characters.

Nonnumeric Input Type	Conversion Specifier	Details
Character	`%c`	Read any single character, including a delimiter.
Text Array	`%s`	Read as a cell array of character vectors.
Text Array	`%q`	Read as a cell array of character vectors. If the text begins with a double quotation mark (`"`), omit the leading quotation mark and its accompanying closing mark, which is the second instance of a lone double quotation mark. Replace escaped double quotation marks (for example, `""abc""`) with lone double quotation marks (`"abc"`). `%q` ignores any double quotation marks that appear after the closing double quotation mark. Example: `'%q'` reads `'"Joe ""Lightning"" Smith, Jr."'` as `'Joe "Lightning" Smith, Jr.'`.
Dates and time	`%D`	Read the same way as `%q` above, and then convert to a datetime value.
Dates and time	`%{fmt}D`	Read the same way as `%q` above, and then convert it to a datetime value. `fmt` describes the format of the input text. The `fmt` input is a character vector of letter identifiers that is a valid value for the `Format` property of a datetime. `textscan` converts text that does not match this format to `NaT` values. For more information about datetime display formats, see the `Format` property for datetime arrays. Example: `'%{dd-MMM-yyyy}D'` specifies the format of a date such as `'01-Jan-2014'`.
Duration	`%T`	Read the same way as `%q` above, and then convert to a duration value.
Duration	`%{fmt}T`	Read the same way as `%q` above, and then convert it to a duration value. `fmt` describes the format of the input text. The `fmt` input is a character vector of letter identifiers that is a valid value for the `Format` property of a duration. `textscan` converts text that does not match this format to `NaN` values. For more information about duration display formats, see the `Format` property for duration arrays. Example: `'%{hh:mm:ss}T'` specifies the format of a duration such as `'10:30:15'`, which represents 10 hours, 30 minutes, and 15 seconds.
Category	`%C`	Read the same way as `%q`, and then convert to a category name in a categorical array. `textscan` converts `<undefined>` text to an undefined value in the output categorical array.
Pattern-matching	`%[...]`	Read as a cell array of character vectors, the characters inside the brackets up to the first nonmatching character. To include `]` in the set, specify it first: `%[]...]`. Example: `%[mus]` reads `'summer '` as `'summ'`.
Pattern-matching	`%[^...]`	Exclude characters inside the brackets, reading until the first matching character. To exclude `]`, specify it first: `%[^]...]`. Example: `%[^xrg]` reads `'summer '` as `'summe'`.
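
The pattern-matching and datetime rows of the table can be tried out with a sketch like the one below. It reuses the table's own sample inputs, limits each scan to one application of the format, and sets DateLocale to 'en_US' on the assumption that the month abbreviation is English.

```matlab
% %[...] keeps reading while the characters belong to the set.
inSet = textscan('summer ','%[mus]',1);
% %[^...] keeps reading until a character from the set appears.
notInSet = textscan('summer ','%[^xrg]',1);
% %{fmt}D converts the field to a datetime value using the given format.
when = textscan('01-Jan-2014','%{dd-MMM-yyyy}D','DateLocale','en_US');

disp(inSet{1}{1})
disp(notInSet{1}{1})
disp(when{1})
```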

Optional Operators

Conversion specifiers in formatSpec can include optional operators, which appear in the following order: the ignore-field operator (*), the field width, the precision, and then the conversion character. For example, in %*5.2f, * designates a field to ignore, 5 is the field width, .2 is the precision, and f is the conversion character.

Optional operators include:

Fields and Characters to Ignore

textscan reads all characters in your file in sequence, unless you tell it to ignore a particular field or a portion of a field.

Insert an asterisk character (*) after the percent character (%) to skip a field or a portion of a character field.

Operator	Action Taken
`%*k`	Skip the field. `k` is any conversion specifier identifying the field to skip. `textscan` does not create an output cell for any such fields. Example: `'%s %*s %s %s %*s %*s %s'` (spaces are optional) converts the text `'Blackbird singing in the dead of night'` into four output cells with `'Blackbird' 'in' 'the' 'night'`.
`'%*ns'`	Skip up to `n` characters, where `n` is an integer less than or equal to the number of characters in the field. Example: `'%*3s %s'` converts `'abcdefg'` to `'defg'`. When the delimiter is a comma, the same format converts `'abcde,fghijkl'` to a cell array containing `'de';'ijkl'`.
`'%*nc'`	Skip `n` characters, including delimiter characters.
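
The skip operators can be sketched as follows, using the sample text from the table. The variable names are chosen for this illustration.

```matlab
lyric = 'Blackbird singing in the dead of night';
% Keep words 1, 3, 4, and 7; the starred specifiers produce no output cell.
C = textscan(lyric,'%s %*s %s %s %*s %*s %s');
kept = [C{:}]      % 1-by-4 cell array of character vectors

% Skip three characters of each comma-delimited field, then read the rest.
D = textscan('abcde,fghijkl','%*3s %s','Delimiter',',');
```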

Field Width

textscan reads the number of characters or digits specified by the field width or precision, or up to the first delimiter, whichever comes first. A decimal point, sign (+ or -), exponent character, and digits in the numeric exponent are counted as characters and digits within the field width. For complex numbers, the field width refers to the individual widths of the real part and the imaginary part. For the imaginary part, the field width includes + or − but not i or j. Specify the field width by inserting a number after the percent character (%) in the conversion specifier.

Example: %5f reads '123.456' as 123.4.

Example: %5c reads 'abcdefg' as 'abcde'.

When the field width operator is used with single characters (%c), textscan also reads delimiter, white-space, and end-of-line characters.

Example: %7c reads 7 characters, including white-space, so 'Day and night' reads as 'Day and'.

Precision

For floating-point numbers (%n, %f, %f32, %f64), you can specify the number of decimal digits to read.

Example: %7.2f reads '123.456' as 123.45.

Literal Text to Ignore

textscan ignores the text appended to the formatSpec conversion specifier.

Example: Level%u8 reads 'Level1' as 1.

Example: %u8Step reads '2Step' as 2.
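
The field width, precision, and literal-text examples above can be combined into one short sketch. Each call applies its format once (N = 1) so that leftover characters are not read as a further field; the variable names are illustrative.

```matlab
w = textscan('123.456','%5f',1);         % width 5 covers the characters '123.4'
p = textscan('123.456','%7.2f',1);       % precision 2 keeps two decimal digits
c = textscan('Day and night','%7c',1);   % %c counts white space toward the width
u = textscan('Level1','Level%u8');       % literal text 'Level' is matched and dropped

disp(w{1}), disp(p{1}), disp(c{1}), disp(u{1})
```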

Data Types: char | string

`N` — Number of times to apply `formatSpec`
`Inf` (default) | positive integer

Number of times to apply formatSpec, specified as a positive integer.

`chr` — Input text
character vector | string

Input text to read.

Data Types: char | string

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: C = textscan(fileID,formatSpec,'HeaderLines',3,'Delimiter',',') skips the first three lines of the data, and then reads the remaining data, treating commas as a delimiter.

Names are not case sensitive.

`CollectOutput` — Logical indicator determining data concatenation
`false` (default) | `true`

Logical indicator determining data concatenation, specified as the comma-separated pair consisting of 'CollectOutput' and either true or false. If true, then the importing function concatenates consecutive output cells of the same fundamental MATLAB® class into a single array.
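
A minimal sketch of the difference, using made-up two-row input text:

```matlab
txt = sprintf('1 2.5 3.5\n4 5.5 6.5');
separate = textscan(txt,'%d %f %f');                        % one cell per specifier
grouped  = textscan(txt,'%d %f %f','CollectOutput',true);   % the two double columns merge

size(separate)   % three cells
size(grouped)    % two cells: int32 column, then a 2-column double array
```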

`CommentStyle` — Symbols designating text to ignore
character vector | cell array of character vectors | string | string array

Symbols designating text to ignore, specified as the comma-separated pair consisting of 'CommentStyle' and a character vector, cell array of character vectors, string, or string array.

For example, specify a character such as '%' to ignore text following the symbol on the same line. Specify a cell array of two character vectors, such as {'/*','*/'}, to ignore any text between those sequences.

MATLAB checks for comments only at the start of each field, not within a field.

Example: 'CommentStyle',{'/*','*/'}

Data Types: char | string
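
For instance, the following sketch drops a trailing note that starts with a percent sign. The input text is invented for this example; the doubled %% in the sprintf call produces a single % character.

```matlab
txt = sprintf('1 2 %% trailing note\n3 4');
C = textscan(txt,'%f %f','CommentStyle','%');   % text after % on a line is ignored
[C{1} C{2}]
```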

`DateLocale` — Locale for reading dates
character vector | string

Locale for reading dates, specified as the comma-separated pair consisting of 'DateLocale' and a character vector in the form xx_YY, where xx is a lowercase ISO 639-1 two-letter code that specifies a language, and YY is an uppercase ISO 3166-1 alpha-2 code that specifies a country. For a list of common values for the locale, see the Locale name-value pair argument for the datetime function.

Use DateLocale to specify the locale in which textscan should interpret month and day of week names and abbreviations when reading text as dates using the %D format specifier.

Example: 'DateLocale','ja_JP'

`Delimiter` — Field delimiter characters
character vector | cell array of character vectors | string | string array

Field delimiter characters, specified as the comma-separated pair consisting of 'Delimiter' and a character vector or a cell array of character vectors. Specify multiple delimiters in a cell array of character vectors.

Example: 'Delimiter',{';','*'}

textscan interprets repeated delimiter characters as separate delimiters, and returns an empty value to the output cell.

Within each row of data, the default field delimiter is white-space. White-space can be any combination of space (' '), backspace ('\b'), or tab ('\t') characters. If you do not specify a delimiter, then the delimiter characters are the same as the white-space characters, and textscan interprets repeated white-space characters as a single delimiter. Use the 'Whitespace' name-value pair argument to specify alternate white-space characters.

When you specify one of the following escape sequences as a delimiter, textscan converts that sequence to the corresponding control character:

`\b`	Backspace
`\n`	Newline
`\r`	Carriage return
`\t`	Tab
`\\`	Backslash (`\`)

Data Types: char | string

`EmptyValue` — Returned value for empty numeric fields
`NaN` (default) | scalar

Returned value for empty numeric fields in delimited text files, specified as the comma-separated pair consisting of 'EmptyValue' and a scalar.
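
As a brief sketch, the comma-delimited text below has an empty middle field. The fill value -1 is an arbitrary choice for illustration.

```matlab
txt = '1,,3';
defaultFill = textscan(txt,'%f %f %f','Delimiter',',');                   % empty field becomes NaN
customFill  = textscan(txt,'%f %f %f','Delimiter',',','EmptyValue',-1);   % empty field becomes -1

disp(defaultFill{2})
disp(customFill{2})
```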

`EndOfLine` — End-of-line characters
character vector | string

End-of-line characters, specified as the comma-separated pair consisting of 'EndOfLine' and a character vector or string. The character vector must be '\r\n' or it must specify a single character. Common end-of-line characters are a newline character ('\n') or a carriage return ('\r'). If you specify '\r\n', then the importing function treats any of \r, \n, and the combination of the two (\r\n) as end-of-line characters.

The default end-of-line sequence is \n, \r, or \r\n, depending on the contents of your file.

If there are missing values and an end-of-line sequence at the end of the last line in a file, then the importing function returns empty values for those fields. This ensures that individual cells in output cell array, C, are the same size.

Example: 'EndOfLine',':'

Data Types: char | string

`ExpChars` — Exponent characters
`'eEdD'` (default) | character vector | string

Exponent characters, specified as the comma-separated pair consisting of 'ExpChars' and a character vector or string. The default exponent characters are e, E, d, and D.

Data Types: char | string

`HeaderLines` — Number of header lines
`0` (default) | positive integer

Number of header lines, specified as the comma-separated pair consisting of 'HeaderLines' and a positive integer. textscan skips the header lines, including the remainder of the current line.
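
A minimal sketch, assuming input text with a single header line (the text itself is made up):

```matlab
txt = sprintf('time value\n1 10\n2 20');
C = textscan(txt,'%f %f','HeaderLines',1);   % skip the 'time value' line
[C{1} C{2}]
```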

`MultipleDelimsAsOne` — Multiple delimiter handling
`0 (false)` (default) | `1 (true)`

Multiple delimiter handling, specified as the comma-separated pair consisting of 'MultipleDelimsAsOne' and either true or false. If true, then the importing function treats consecutive delimiters as a single delimiter. Repeated delimiters separated by white-space are also treated as a single delimiter. You must also specify the Delimiter option.

Example: 'MultipleDelimsAsOne',1
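
The effect can be seen on a short piece of invented text containing runs of commas:

```matlab
txt = '1,,2,,,3';
strict  = textscan(txt,'%f','Delimiter',',');                              % each extra comma yields an empty field
relaxed = textscan(txt,'%f','Delimiter',',','MultipleDelimsAsOne',true);   % runs of commas collapse

numel(strict{1})
relaxed{1}
```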

`ReturnOnError` — Behavior when `textscan` fails to read or convert
`1 (true)` (default) | `0 (false)`

Behavior when textscan fails to read or convert, specified as the comma-separated pair consisting of 'ReturnOnError' and either true or false. If true, textscan terminates without an error and returns all fields read. If false, textscan terminates with an error and does not return an output cell array.
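
The two settings can be compared with a sketch like the following, in which the third field of some invented text is not numeric. The try/catch block and the flag variable are additions for illustration.

```matlab
txt = '1 2 x 4';
partial = textscan(txt,'%f');          % default: stop quietly at 'x'

try
    textscan(txt,'%f','ReturnOnError',false);
    failed = false;
catch
    failed = true;                     % the mismatch is raised as an error
end

disp(partial{1})
disp(failed)
```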

`TreatAsEmpty` — Placeholder text to treat as empty value
character vector | cell array of character vectors | string | string array

Placeholder text to treat as empty value, specified as the comma-separated pair consisting of 'TreatAsEmpty' and a character vector, cell array of character vectors, string, or string array. This option only applies to numeric fields.

Data Types: char | string
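
For example, the sketch below treats the placeholder 'NA' in a numeric field as empty, so it takes the default empty value (NaN). The input text is invented for this illustration.

```matlab
C = textscan('1,NA,3','%f %f %f','Delimiter',',','TreatAsEmpty','NA');
[C{:}]
```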

`Whitespace` — White-space characters
`' \b\t'` (default) | character vector | string

White-space characters, specified as the comma-separated pair consisting of 'Whitespace' and a character vector or string containing one or more characters. textscan adds a space character, char(32), to any specified Whitespace, unless Whitespace is empty ('') and formatSpec includes any conversion specifier.

When you specify one of the following escape sequences as any white-space character, textscan converts that sequence to the corresponding control character:

`\b`	Backspace
`\n`	Newline
`\r`	Carriage return
`\t`	Tab
`\\`	Backslash (`\`)

Data Types: char | string

`TextType` — Output data type of text
`'char'` (default) | `'string'`

Output data type of text, specified as the comma-separated pair consisting of 'TextType' and either 'char' or 'string'. If you specify the value 'char', then textscan returns text as a cell array of character vectors. If you specify the value 'string', then textscan returns text as an array of type string.
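
A short sketch of the two return types, using made-up input text:

```matlab
asChar   = textscan('red green blue','%s');                        % cell array of character vectors
asString = textscan('red green blue','%s','TextType','string');    % string array

class(asChar{1})
class(asString{1})
```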

Output Arguments

`C` — File or text data
cell array

File or text data, returned as a cell array.

For each numeric conversion specifier in formatSpec, the textscan function returns a K-by-1 MATLAB numeric vector to the output cell array, C, where K is the number of times that textscan finds a field matching the specifier.

For each text conversion specifier (%s, %q, or %[...]) in formatSpec, the textscan function returns a K-by-1 cell array of character vectors, where K is the number of times that textscan finds a field matching the specifier. For each character conversion that includes a field width operator, textscan returns a K-by-M character array, where M is the field width.

For each datetime or categorical conversion specifier in formatSpec, the textscan function returns a K-by-1 datetime or categorical vector to the output cell array, C, where K is the number of times that textscan finds a field matching the specifier.
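
As an illustration of the character-array case, this sketch reads made-up text with a %c specifier of width 3, which gives one row per match:

```matlab
C = textscan('abcdef','%3c');
C{1}          % character array with one 3-character row per match
size(C{1})
```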

`position` — Position in the file or character vector
integer

Position at the end of the scan, in the file or the character vector, returned as an integer of class double. For a file, ftell(fileID) would return the same value after calling textscan. For a character vector, position indicates how many characters textscan read.

Algorithms

textscan converts numeric fields to the specified output type according to MATLAB rules regarding overflow, truncation, and the use of NaN, Inf, and -Inf. For example, MATLAB represents an integer NaN as zero. If textscan finds an empty field associated with an integer format specifier (such as %d or %u), it returns the empty value as zero and not NaN.

When matching data to a text conversion specifier, textscan reads until it finds a delimiter or an end-of-line character. When matching data to a numeric conversion specifier, textscan reads until it finds a nonnumeric character. When textscan can no longer match the data to a particular conversion specifier, it attempts to match the data to the next conversion specifier in the formatSpec. Sign (+ or -), exponent characters, and decimal points are considered numeric characters.

In the number -12.345e+6, the - and + are signs. The 1, 2, 3, 4, 5, and 6 are digits. The period or dot is the decimal point. The e is the exponent character.

Sign	Digits	Decimal Point	Digits	Exponent Character	Sign	Digits
Read one sign character if it exists.	Read one or more digits.	Read one decimal point if it exists.	If there is a decimal point, read one or more digits that immediately follow it.	Read one exponent character if it exists.	If there is an exponent character, read one sign character if it exists.	If there is an exponent character, read one or more digits that follow it.
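The reading order in the table can be tried on the sample number from the text. This is a minimal sketch; the second input, which uses d as the exponent character, assumes the default ExpChars value of 'eEdD'.

```matlab
% Sign, digits, decimal point, digits, exponent character, sign, digits.
C = textscan('-12.345e+6', '%f');
x = C{1};

% 'd' is also accepted as an exponent character under the default ExpChars.
D = textscan('1.5d2', '%f');
y = D{1};

fprintf('%.1f\n', x);
fprintf('%.1f\n', y);
```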

textscan imports any complex number as a whole into a complex numeric field, converting the real and imaginary parts to the specified numeric type (such as %d or %f). Valid forms for a complex number are:

±`<real>`±`<imag>i|j`	Example: `5.7-3.1i`
±`<imag>i|j`	Example: `-7j`

Do not include embedded white space in a complex number. textscan interprets embedded white space as a field delimiter.
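A short sketch of both accepted forms, and of the white-space caveat, might look like this. The input strings are invented for the example.

```matlab
% Two complex values separated by a space: one of each accepted form.
C = textscan('5.7-3.1i -7j', '%f');
z = C{1};

% With a space inside the number, the parts are scanned as separate fields.
S = textscan('5.7 -3.1i', '%f');
parts = S{1};

disp(z);
disp(numel(parts));
```

Here z is a complex column vector with one element per field, while parts contains two elements because the space splits the value into a real field and an imaginary field.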

Extended Capabilities

Thread-Based Environment
Run code in the background using MATLAB® `backgroundPool` or accelerate code with Parallel Computing Toolbox™ `ThreadPool`.

This function fully supports thread-based environments. For more information, see Run MATLAB Functions in Thread-Based Environment.

Version History

Introduced before R2006a

R2022b: Use function in thread-based environments

This function supports thread-based environments.
