Customizable Natural-Order Sort

Version 3.4.6 (81.8 KB) by Stephen23
Alphanumeric sort of a cell/string/categorical array, with customizable number format.
5.1K downloads
Updated 1 May 2024

Editor's Note: This file was selected as MATLAB Central Pick of the Week

To sort file names or folder names, use NATSORTFILES.
To sort the rows of a string/cell array, use NATSORTROWS.
Summary
Alphanumeric sort of the text in a string/cell/categorical array. The text is sorted by character code, taking into account the values of any number substrings. Compare, for example:
X = {'a2', 'a10', 'a1'};
sort(X)
ans = 'a1' 'a10' 'a2'
natsort(X)
ans = 'a1' 'a2' 'a10'
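The idea behind the comparison above can be sketched with built-in functions only. This is an illustrative sketch, not NATSORT's own code: it assumes every element is some text followed by exactly one integer, and the variable names (tok, txt, num, grp, idx) are made up for the illustration.

```matlab
% Illustrative sketch only: built-in functions, not NATSORT's own code.
X = {'a2', 'a10', 'a1'};

% Split each element into leading non-digit text and a trailing integer.
tok = regexp(X, '^(\D*)(\d+)$', 'tokens', 'once');
txt = cellfun(@(t) t{1}, tok, 'UniformOutput', false);
num = cellfun(@(t) str2double(t{2}), tok);

% Rank the text parts by character code, then order by (text rank, number).
[~, ~, grp] = unique(txt);
[~, idx] = sortrows([grp(:), num(:)]);

Y = X(idx);   % {'a1', 'a2', 'a10'}
```

Comparing the numbers by value rather than character by character is what places 'a2' before 'a10'.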
By default NATSORT interprets all consecutive digits as integer numbers. The number substring recognition can be specified using a regular expression, allowing the number substrings to have:
  • a +/- sign
  • a decimal point and decimal fraction
  • E-notation exponent
  • decimal, octal, hexadecimal or binary notation
  • Inf or NaN values
  • criteria supported by regular expressions: lookarounds, quantifiers, etc.
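To get a feel for what such a regular expression recognizes, it can be tried directly with the built-in regexp and str2double functions. The sketch below reuses the pattern from the signed/decimal example further down. It only shows which substrings are matched and how they order by value, and it is not a description of NATSORT's internals. The variable names are hypothetical.

```matlab
% Sketch: inspect which number substrings a pattern matches (built-ins only).
B = {'test+NaN', 'test11.5', 'test-1.4', 'test-Inf', 'test+0.3'};
pat = '[-+]?(NaN|Inf|\d+\.?\d*)';

str = regexp(B, pat, 'match', 'once');   % {'+NaN','11.5','-1.4','-Inf','+0.3'}
val = str2double(str);                   % [NaN 11.5 -1.4 -Inf 0.3]

% An ascending sort of the values places NaN last.
[~, idx] = sort(val);
S = B(idx);   % {'test-Inf','test-1.4','test+0.3','test11.5','test+NaN'}
```

Testing a pattern this way before passing it to NATSORT makes it easier to see whether signs, decimal fractions, or special values are being picked up as intended.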
And of course the sorting itself can also be controlled:
  • ascending/descending sort direction
  • character case sensitivity/insensitivity
  • relative order of numbers vs. characters
  • relative order of numbers vs. NaNs
Examples
%% Multiple integers (e.g. release version numbers):
>> A = {'v10.6', 'v9.10', 'v9.5', 'v10.10', 'v9.10.20', 'v9.10.8'};
>> sort(A) % for comparison.
ans = 'v10.10' 'v10.6' 'v9.10' 'v9.10.20' 'v9.10.8' 'v9.5'
>> natsort(A)
ans = 'v9.5' 'v9.10' 'v9.10.8' 'v9.10.20' 'v10.6' 'v10.10'
%% Integer, decimal, NaN, or Inf numbers, possibly with +/- signs:
>> B = {'test+NaN', 'test11.5', 'test-1.4', 'test', 'test-Inf', 'test+0.3'};
>> sort(B) % for comparison.
ans = 'test' 'test+0.3' 'test+NaN' 'test-1.4' 'test-Inf' 'test11.5'
>> natsort(B, '[-+]?(NaN|Inf|\d+\.?\d*)')
ans = 'test' 'test-Inf' 'test-1.4' 'test+0.3' 'test11.5' 'test+NaN'
%% Integer or decimal numbers, possibly with an exponent:
>> C = {'0.56e007', '', '43E-2', '10000', '9.8'};
>> sort(C) % for comparison.
ans = '' '0.56e007' '10000' '43E-2' '9.8'
>> natsort(C, '\d+\.?\d*(E[-+]?\d+)?')
ans = '' '43E-2' '9.8' '10000' '0.56e007'
%% Hexadecimal numbers (with '0X' prefix):
>> D = {'a0X7C4z', 'a0X5z', 'a0X18z', 'a0XFz'};
>> sort(D) % for comparison.
ans = 'a0X18z' 'a0X5z' 'a0X7C4z' 'a0XFz'
>> natsort(D, '0X[0-9A-F]+', '%i')
ans = 'a0X5z' 'a0XFz' 'a0X18z' 'a0X7C4z'
%% Binary numbers:
>> E = {'a11111000100z', 'a101z', 'a000000000011000z', 'a1111z'};
>> sort(E) % for comparison.
ans = 'a000000000011000z' 'a101z' 'a11111000100z' 'a1111z'
>> natsort(E, '[01]+', '%b')
ans = 'a101z' 'a1111z' 'a000000000011000z' 'a11111000100z'
%% Case sensitivity:
>> F = {'a2', 'A20', 'A1', 'a10', 'A2', 'a1'};
>> natsort(F, [], 'ignorecase') % default
ans = 'A1' 'a1' 'a2' 'A2' 'a10' 'A20'
>> natsort(F, [], 'matchcase')
ans = 'A1' 'A2' 'A20' 'a1' 'a2' 'a10'
%% Sort order:
>> G = {'2', 'a', '', '3', 'B', '1'};
>> natsort(G, [], 'ascend') % default
ans = '' '1' '2' '3' 'a' 'B'
>> natsort(G, [], 'descend')
ans = 'B' 'a' '3' '2' '1' ''
>> natsort(G, [], 'num<char') % default
ans = '' '1' '2' '3' 'a' 'B'
>> natsort(G, [], 'char<num')
ans = '' 'a' 'B' '1' '2' '3'
%% UINT64 numbers (with full precision):
>> natsort({'a18446744073709551615z', 'a18446744073709551614z'}, [], '%lu')
ans = 'a18446744073709551614z' 'a18446744073709551615z'
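The third input in the hexadecimal and UINT64 examples above looks like an sscanf conversion specifier, and the release notes mention that the "sscanf" numeric format can be controlled by an optional input. Assuming that is how the matched substrings are converted, the effect of '%i' and '%lu' can be explored with the built-in sscanf alone. The sketch below is an illustration under that assumption, with hypothetical variable names. It does not cover '%b', which is not an sscanf specifier.

```matlab
% Sketch: what the '%i' and '%lu' formats do to matched substrings.
D = {'a0X7C4z', 'a0X5z', 'a0X18z', 'a0XFz'};
hexStr = regexp(D, '0X[0-9A-F]+', 'match', 'once');
hexVal = cellfun(@(s) sscanf(s, '%i'), hexStr);   % '%i' reads the 0X prefix as base 16
[~, idx] = sort(hexVal);
Dsorted = D(idx);   % {'a0X5z', 'a0XFz', 'a0X18z', 'a0X7C4z'}

% '%lu' returns a uint64, so the last digit of a large integer is kept.
big = sscanf('18446744073709551615', '%lu');
bigClass = class(big);   % 'uint64'

% Converted to double, the two neighbouring values become indistinguishable.
sameAsDouble = str2double('18446744073709551615') == str2double('18446744073709551614');
```

This is why the UINT64 example needs the '%lu' format: with double precision both strings would compare as equal.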

Cite As

Stephen23 (2024). Customizable Natural-Order Sort (https://www.mathworks.com/matlabcentral/fileexchange/34464-customizable-natural-order-sort), MATLAB Central File Exchange. Retrieved.

MATLAB Release Compatibility
Created with R2010b
Compatible with R2009b and later releases
Platform Compatibility
Windows macOS Linux
Categories
Find more on String Parsing in Help Center and MATLAB Answers

Version Published Release Notes
3.4.6

* Documentation improvements.

3.4.5

* Accept decimal comma as well as decimal point.
* HTML examples use string arrays.

3.4.4

* Add testcases.

3.4.3

* Now R2009b compatible.

3.4.2

* Edit description & help.

3.4.1

* Edit description & help.

3.4.0

* Add plenty of testcases.
* Fix bug in descending sort with an empty input array.

3.3.0

* Improve test function, add test cases.

3.2.0

* Update TESTFUN.

3.1.0

* More robust TESTFUN pretty-print code.
* Improve option checking.

3.0.5

* Improve examples.

3.0.4

* Correct summary.

3.0.3

* Improve string handling.

3.0.2

* Simplify numeric class handling.
* Add permutations test examples.

3.0.1

* Handle single element with no number.

3.0.0

* Accepts and sorts a string array, categorical array, cell array of char, etc.
* Regular expression and optional arguments may be string or char.
* Simplify char<num algorithm.
* Simplify debugging output cell array.

2.1.2

* Consistent alignment tab/spaces.

2.1.1

* Add error IDs.

2.1.0

* Fix handling of char<num.

2.0.0

Total rewrite: faster and uses less memory.
* Remove 'asdigit' option.
* Rename 'beforechar' and 'afterchar' to 'num<char' and 'char<num'.
* Add options 'num<NaN' and 'NaN<num'.
* Improve HTML documentation.
* Include testcases.

1.11.0.0

* Consistent internal variable names.

1.10.0.0

* Minor help edit.
* Improve input checking.
* Improve blurb and HTML.
* Add HTML documentation.

1.9.0.0

* Improve binary numeric handling.
* Improve handling of skipped fields.
* Add an example of skipped field usage.

1.8.0.0

* Improved binary substring parsing.
* Better examples.

1.7.0.0

- Update documentation only, improve examples.

1.6.0.0

- Add binary numeric parsing.
- Improve input checking.
- Replace multiple debugging output arrays with one cell array.
- Allow lookarounds in regular expression.

1.5.0.0

- Simplify hexadecimal example.
- Correct output summary.

1.4.0.0

- Now parses hexadecimal and octal substrings.
- int64 and uint64 parsed at full precision.
- Allow <options> in any order.
- For debugging: return indices of character and numeric arrays.

1.3.0.0

- Implement more compact sort algorithm.
- "sscanf" numeric format can be controlled by an optional input argument.
- Provide use examples.
- Output debugging arrays now char+numeric.

1.1.0.0

- Add examples showing different numeric tokens.
- Case-insensitive sort is now default.

1.0.0.0