isstrprop

Determine which characters in input strings are of specified category

Description

TF = isstrprop(str,category) determines if characters in the input text are of the specified category, such as letters, numbers, or whitespace. For example, isstrprop('ABC123','alpha') returns a 1-by-6 logical array, [1 1 1 0 0 0], indicating that the first three characters are letters.

  • If str is a character array, string scalar, or numeric array, then isstrprop returns a logical array.

  • If str is a cell array of character vectors or a string array, then isstrprop returns a cell array of logical vectors.

TF = isstrprop(str,category,'ForceCellOutput',tf), where tf is 1 (true), returns TF as a cell array even when str is a character array, string scalar, or numeric array. The default for tf is 0 (false).

Examples

Create a character vector and determine which characters are letters.

chr = '123 Maple Street'
chr = 
'123 Maple Street'
TF = isstrprop(chr,'alpha')
TF = 1x16 logical array

   0   0   0   0   1   1   1   1   1   0   1   1   1   1   1   1

Find indices for the letters in chr using TF.

idx = find(TF)
idx = 1×11

     5     6     7     8     9    11    12    13    14    15    16

chr(idx)
ans = 
'MapleStreet'
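
Because TF is a logical array of the same size as chr, it can also be used directly as a logical index, without calling find. The following is a minimal sketch of that alternative; the variable names letters and numLetters are chosen here for illustration.

```matlab
chr = '123 Maple Street';
TF = isstrprop(chr,'alpha');

% Logical indexing selects the same characters as chr(find(TF)).
letters = chr(TF);

% Count the letters without building an index vector.
numLetters = nnz(TF);
```

Here letters is 'MapleStreet' and numLetters is 11, matching the length of idx above.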

Create string arrays. Then determine which characters belong to various categories using the isstrprop function.

Create a string scalar and determine which of its characters are numeric digits.

str = "123 Maple Street"
str = 
"123 Maple Street"
TF = isstrprop(str,'digit')
TF = 1x16 logical array

   1   1   1   0   0   0   0   0   0   0   0   0   0   0   0   0

Create a nonscalar string array. Determine which characters in each string are whitespace characters. isstrprop returns a cell array in which each cell contains results for a string in str.

str = ["123 Maple St.";"456 Oak St."]
str = 2x1 string
    "123 Maple St."
    "456 Oak St."

TF = isstrprop(str,'wspace')
TF = 2×1 cell array
    {[0 0 0 1 0 0 0 0 0 1 0 0 0]}
    {[    0 0 0 1 0 0 0 1 0 0 0]}

To display the results for the second string, str(2), access the contents of the second cell with TF{2}.

TF{2}
ans = 1x11 logical array

   0   0   0   1   0   0   0   1   0   0   0

Create a cell array of character vectors. Determine which characters are whitespace characters.

C = {'123 Maple St.';'456 Oak St.'}
C = 2x1 cell
    {'123 Maple St.'}
    {'456 Oak St.'  }

TF = isstrprop(C,'wspace')
TF = 2×1 cell array
    {[0 0 0 1 0 0 0 0 0 1 0 0 0]}
    {[    0 0 0 1 0 0 0 1 0 0 0]}
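
When the result is a cell array, one way to work with every cell at once is cellfun. The sketch below counts the whitespace characters in each element of C and then removes them; numSpaces and stripped are names introduced here for illustration only.

```matlab
C = {'123 Maple St.';'456 Oak St.'};
TF = isstrprop(C,'wspace');

% Count the whitespace characters in each element of C.
numSpaces = cellfun(@nnz,TF);

% Remove the whitespace from each element. Because the results differ
% in length, collect them in a cell array.
stripped = cellfun(@(s,t) s(~t),C,TF,'UniformOutput',false);
```

For this input, numSpaces is [2;2] and stripped contains '123MapleSt.' and '456OakSt.'.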

Find the punctuation characters in a character vector. isstrprop returns a logical vector indicating which characters belong to that category. Force isstrprop to return the logical vector in a cell array.

chr = 'A horse! A horse! My kingdom for a horse!'
chr = 
'A horse! A horse! My kingdom for a horse!'
TF = isstrprop(chr,'punct','ForceCellOutput',true)
TF = 1x1 cell array
    {[0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1]}

Find indices for the punctuation marks in chr using TF{1}.

find(TF{1})
ans = 1×3

     8    17    41
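
One possible use of 'ForceCellOutput' is to write code that handles every input type the same way, since the output is then always a cell array. The following sketch assumes that goal; countPunct is a hypothetical helper defined here, not a MATLAB function.

```matlab
% countPunct is a hypothetical helper: it counts punctuation characters
% per element, whether the input is a character vector or a string array.
countPunct = @(txt) cellfun(@nnz, ...
    isstrprop(txt,'punct','ForceCellOutput',true));

n1 = countPunct('A horse! A horse!');           % character vector
n2 = countPunct(["Hello, world!";"No marks"]);  % string array
```

With these inputs, n1 is 2 and n2 is [2;0]. Without 'ForceCellOutput', the first call would return a logical array rather than a cell array, and cellfun would not accept it.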

Create a numeric array. Determine which numbers correspond to character codes for letters.

X = [77 65 84 76 65 66]
X = 1×6

    77    65    84    76    65    66

TF = isstrprop(X,'alpha')
TF = 1x6 logical array

   1   1   1   1   1   1

isstrprop identifies all the numbers as character codes for letters. Convert the numbers to their corresponding characters with the char function.

c = char(X)
c = 
'MATLAB'

Input Arguments

str: Input array, specified as a string array, character array, cell array of character vectors, or numeric array.

If str is a numeric array, then isstrprop treats the numbers as Unicode® character codes. If the numbers are double- or single-precision floating-point numbers, then isstrprop rounds them to the nearest integer values before interpreting them as character codes.

Data Types: string | char | cell | double | single | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
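
To illustrate the rounding of floating-point input described above, the sketch below classifies a noninteger array and, for comparison, the same array rounded explicitly with round. The values in X are arbitrary choices for this illustration.

```matlab
X = [64.6 64.4 97.2];

% Classify the floating-point values directly.
TF = isstrprop(X,'alpha');

% For comparison, round explicitly before classifying.
TFrounded = isstrprop(round(X),'alpha');
c = char(round(X));
```

The explicitly rounded codes are 65, 64, and 97, so c is 'A@a' and TFrounded is [1 0 1]. If isstrprop rounds as described, TF should match TFrounded.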

category: Character category, specified as a character vector or string scalar. isstrprop classifies the characters in str according to categories defined by the Unicode standard.

Category: Description

alpha: Letters.

alphanum: Letters or numeric digits.

cntrl: Control characters (for example, char(0:20)).

digit: Numeric digits.

graphic: Graphic characters. isstrprop treats all Unicode characters as graphic characters, except for the following:

  • Unassigned characters

  • Whitespace characters

  • The line separator

  • The paragraph separator

  • Control characters

  • Private user-defined characters

  • Surrogate characters

lower: Lowercase letters.

print: Graphic characters, plus char(32).

punct: Punctuation characters.

wspace: Whitespace characters. This range includes the ANSI® C definition of whitespace, {' ','\t','\n','\r','\v','\f'}, in addition to some other Unicode characters.

upper: Uppercase letters.

xdigit: Valid hexadecimal digits.
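
Several categories can be applied to the same text and the resulting logical arrays combined. The following sketch uses an arbitrary ASCII character vector chosen for illustration; the variable names are not part of the isstrprop interface.

```matlab
chr = 'Room 12B, floor 3';

isAlpha = isstrprop(chr,'alpha');
isDigit = isstrprop(chr,'digit');
isAlnum = isstrprop(chr,'alphanum');
isHex   = isstrprop(chr,'xdigit');

% For this ASCII text, alphanum matches the union of alpha and digit.
sameAsUnion = isequal(isAlnum, isAlpha | isDigit);

% xdigit accepts 0-9, a-f, and A-F, so 'B' and 'f' qualify but 'R' does not.
hexChars = chr(isHex);
```

For this input, sameAsUnion is true and hexChars is '12Bf3'.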

tf: Value of the 'ForceCellOutput' option, specified as 1 (true) or 0 (false).

Output Arguments

TF: True or false, returned as a logical array or cell array of logical vectors.

  • If str is a character vector, string scalar, or numeric array, then TF is a logical array indicating which characters belong to the specified category.

  • If str is a cell array of character vectors or a string array, then TF is a cell array. For each element of str, the corresponding cell of TF contains a logical vector indicating which characters in that element belong to the specified category.

Tips

Whitespace characters for which the wspace option returns true include tab, line feed, vertical tab, form feed, carriage return, and space, in addition to some other Unicode characters. To see all characters for which the wspace option returns true, enter the following command, and then look up the returned decimal codes in a Unicode reference:

find(isstrprop(char(1):char(intmax('uint16')),'wspace'))
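
Unicode references usually list code points in hexadecimal, so it can help to convert the decimal codes before looking them up. The sketch below is one way to do that with dec2hex; the variable names codes and hexCodes are chosen here for illustration.

```matlab
% Decimal codes of all characters in the range that wspace accepts.
codes = find(isstrprop(char(1:65535),'wspace'));

% Format each code as four hexadecimal digits, one code per row,
% for example 0020 for the space character.
hexCodes = dec2hex(codes,4);
```

Because the range starts at char(1), each index returned by find equals the character code itself.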

Version History

Introduced before R2006a