## Customizable Natural-Order Sort

version 1.11.0.0 (12.5 KB) by Stephen Cobeldick

Natural-order sort of a cell array of strings, with customizable numeric format.

20 Downloads

Updated 25 Mar 2018

Editor's Note: This file was selected as MATLAB Central Pick of the Week

To sort filenames or filepaths use NATSORTFILES:
http://www.mathworks.com/matlabcentral/fileexchange/47434-natural-order-filename-sort
To sort the rows of a cell array of strings use NATSORTROWS:
http://www.mathworks.com/matlabcentral/fileexchange/47433-natural-order-row-sort

### Summary ###

Alphanumeric sort of a cell array of strings (1xN char). Sorts the strings taking into account the values of any numeric substrings occurring within those strings. Compare for example:

X = {'a2', 'a10', 'a1'};
sort(X)
ans = 'a1' 'a10' 'a2'
natsort(X)
ans = 'a1' 'a2' 'a10'
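
To illustrate the idea behind the comparison above, here is a minimal sketch that uses only built-in functions. It zero-pads every run of digits to a common width so that the ordinary character sort agrees with numeric order. This is a stand-in technique for illustration, not NATSORT's own algorithm, and the variable names (`digitRuns`, `textParts`, `keys`) are hypothetical.

```matlab
X = {'a2', 'a10', 'a1'};

% Split each string into digit runs and the text around them.
[digitRuns, textParts] = regexp(X, '\d+', 'match', 'split');

% Width of the longest digit run found in any string.
width = max(cellfun(@numel, [digitRuns{:}]));

% Build one sort key per string: text parts unchanged, digit runs zero-padded.
keys = cell(size(X));
for k = 1:numel(X)
    key = textParts{k}{1};
    for j = 1:numel(digitRuns{k})
        run = digitRuns{k}{j};
        padded = [repmat('0', 1, width - numel(run)), run];
        key = [key, padded, textParts{k}{j+1}]; %#ok<AGROW>
    end
    keys{k} = key;
end

% Sort the keys, then apply the same permutation to the original strings.
[~, idx] = sort(keys);
Y = X(idx);
```

The sketch is case-sensitive, treats digit runs that differ only in leading zeros (such as `'a01'` and `'a1'`) as equal, and handles only unsigned integers, which is where the options described below come in.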

By default NATSORT treats each run of consecutive digits as an integer value. NATSORT also provides optional user control over numeric substring recognition and parsing via a regular expression, allowing the numeric substrings to have:
* a +/- sign
* a decimal point and decimal fraction
* E-notation exponent
* decimal, octal, hexadecimal or binary notation
* prefixes/suffixes/literals which can be ignored
* Inf or NaN value
* any feature supported by regular expressions, including look-arounds, quantifiers, etc.

The numeric class can be chosen to suit the substrings' numeric data:
* DOUBLE
* INT*
* UINT*

And of course the sorting itself can also be controlled:
* ascending/descending sort direction
* character case sensitivity/insensitivity
* relative order of numeric substrings vs. characters

### Examples ###

The default is for integer numeric substrings, as shown in the example in the introduction.

%% Multiple integer substrings (e.g. release version numbers):
B = {'v10.6', 'v9.10', 'v9.5', 'v10.10', 'v9.10.20', 'v9.10.8'};
sort(B)
ans = 'v10.10' 'v10.6' 'v9.10' 'v9.10.20' 'v9.10.8' 'v9.5'
natsort(B)
ans = 'v9.5' 'v9.10' 'v9.10.8' 'v9.10.20' 'v10.6' 'v10.10'

%% Integer, decimal or Inf number substrings, possibly with +/- signs:
C = {'test+Inf', 'test11.5', 'test-1.4', 'test', 'test-Inf', 'test+0.3'};
sort(C)
ans = 'test' 'test+0.3' 'test+Inf' 'test-1.4' 'test-Inf' 'test11.5'
natsort(C, '(-|+)?(Inf|\d+\.?\d*)')
ans = 'test' 'test-Inf' 'test-1.4' 'test+0.3' 'test11.5' 'test+Inf'

%% Integer or decimal number substrings, possibly with an exponent:
D = {'0.56e007', '', '4.3E-2', '10000', '9.8'};
sort(D)
ans = '' '0.56e007' '10000' '4.3E-2' '9.8'
natsort(D, '\d+\.?\d*(E(+|-)?\d+)?')
ans = '' '4.3E-2' '9.8' '10000' '0.56e007'

%% Hexadecimal number substrings (possibly with '0X' prefix):
E = {'a0X7C4z', 'a0X5z', 'a0X18z', 'aFz'};
sort(E)
ans = 'a0X18z' 'a0X5z' 'a0X7C4z' 'aFz'
natsort(E, '(?<=a)(0X)?[0-9A-F]+', '%x')
ans = 'a0X5z' 'aFz' 'a0X18z' 'a0X7C4z'
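
The order in the hexadecimal example can be checked by hand with the built-in `sscanf`, whose `'%x'` conversion reads base-16 digits. In this sketch the `'a'`, `'0X'` and `'z'` parts have been stripped manually, which is an assumption made for illustration and says nothing about how NATSORT handles them internally.

```matlab
E = {'a0X7C4z', 'a0X5z', 'a0X18z', 'aFz'};

% Hexadecimal digits of each element of E, extracted by hand.
hexDigits = {'7C4', '5', '18', 'F'};

% Convert each digit string to its numeric value: 1988, 5, 24 and 15.
values = cellfun(@(s) sscanf(s, '%x'), hexDigits);

% Sorting the values gives the permutation to apply to E.
[~, idx] = sort(values);
sortedE = E(idx);
```

Here `sortedE` comes out in the same order as the NATSORT result shown above.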

%% Binary number substrings (possibly with '0B' prefix):
F = {'a11111000100z', 'a0B101z', 'a0B000000000011000z', 'a1111z'};
sort(F)
ans = 'a0B000000000011000z' 'a0B101z' 'a11111000100z' 'a1111z'
natsort(F, '(0B)?[01]+', '%b')
ans = 'a0B101z' 'a1111z' 'a0B000000000011000z' 'a11111000100z'

%% UINT64 number substrings (with full precision!):
natsort({'a18446744073709551615z', 'a18446744073709551614z'}, [], '%lu')
ans = 'a18446744073709551614z' 'a18446744073709551615z'
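
Why full precision matters here: the two numbers differ by one, but both lie beyond the range where DOUBLE can represent every integer exactly. The sketch below, which uses only the built-in functions `str2double` and `sscanf`, shows that the two values collapse to the same DOUBLE yet remain distinct when read as UINT64. The variable names are hypothetical.

```matlab
a = '18446744073709551615';
b = '18446744073709551614';

% Parsed as double, both strings round to the same value (2^64).
sameAsDouble = str2double(a) == str2double(b);

% Parsed with the 64-bit unsigned format, the values stay distinct.
ua = sscanf(a, '%lu');
ub = sscanf(b, '%lu');
distinctAsUint64 = ua ~= ub;
```

A sort based on DOUBLE values could therefore not tell these two strings apart, whereas a comparison of the UINT64 values can.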

%% Case sensitivity:
G = {'a2', 'A20', 'A1', 'a10', 'A2', 'a1'};
natsort(G, [], 'ignorecase') % default
ans = 'A1' 'a1' 'a2' 'A2' 'a10' 'A20'
natsort(G, [], 'matchcase')
ans = 'A1' 'A2' 'A20' 'a1' 'a2' 'a10'

%% Sort direction:
H = {'2', 'a', '3', 'B', '1'};
natsort(H, [], 'ascend') % default
ans = '1' '2' '3' 'a' 'B'
natsort(H, [], 'descend')
ans = 'B' 'a' '3' '2' '1'

%% Relative sort-order of number substrings compared to characters:
V = num2cell(char(32+randperm(63)));
cell2mat(natsort(V, [], 'asdigit')) % default
ans = '!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_'
cell2mat(natsort(V, [], 'beforechar'))
ans = '0123456789!"#$%&'()*+,-./:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_'
cell2mat(natsort(V, [], 'afterchar'))
ans = '!"#$%&'()*+,-./:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_0123456789'

### Comments and Ratings (8)

H. M. Villanueva López

Saved me days of programming. Got this going in 1 minute.

Xiaohui Gu

Very useful!

Stephen Cobeldick

@Akim Borbuev: the function NATSORT will parse all of the input cell array, so if you only want to sort by the first column then you can use indexing to select just that column, e.g. given an nx2 cell array C:

[~,idx] = natsort(C(:,1));
D = C(idx,:);

I then used the second output from NATSORT to sort the rows of the cell array. Does that do what you want?

Akim Borbuev

Hello,

I have an nx2 cell array and would like to apply your natural-order sort to the first column while keeping the corresponding values in the second column alongside. Is that possible?
Thank you.

Regards,
Akim

Jan van den Broecke

Junjie Wang

qiap chen

It's a good tool.

Chang hsiung

### Updates

25 Mar 2018 1.11.0.0
* Consistent internal variable names.

22 Mar 2018 1.10.0.0
* Improve blurb and HTML.

8 Mar 2017 1.10.0.0
* Minor help edit.

29 Sep 2016 1.10.0.0
* Add HTML documentation.

28 Feb 2016 1.10.0.0
* Improve input checking.

7 Jan 2016 1.9.0.0
* Improve binary numeric handling.
* Improve handling of skipped fields.
* Add an example of skipped field usage.

25 Feb 2015 1.8.0.0
* Improved binary substring parsing.
* Better examples.

20 Dec 2014 1.7.0.0
* Update documentation only, improve examples.

3 Aug 2014 1.6.0.0
* Add binary numeric parsing.
* Improve input checking.
* Replace multiple debugging output arrays with one cell array.
* Allow lookarounds in regular expression.

2 Jul 2014 1.5.0.0
* Simplify hexadecimal example.
* Correct output summary.

26 Apr 2014 1.4.0.0
* Now parses hexadecimal and octal substrings.
* int64 and uint64 parsed at full precision.
* Allow in any order.
* For debugging: return indices of character and numeric arrays.

23 Aug 2012 1.3.0.0
* Implement more compact sort algorithm.
* "sscanf" numeric format can be controlled by an optional input argument.
* Provide use examples.
* Output debugging arrays now char+numeric.

14 Feb 2012 1.1.0.0
* Add examples showing different numeric tokens.
* Case-insensitive sort is now default.

##### MATLAB Release Compatibility
Created with R2010b
Compatible with any release
##### Platform Compatibility
Windows macOS Linux