How do I get the MATLAB Editor to read UTF-8 characters? UTF-8 characters show as blank squares in the Editor, but the Command Window and Workspace work fine.

I have a project whose comments are written in UTF-8 characters.
I tried changing the system locale on my Windows 10 machine, but the MATLAB Editor still does not recognize UTF-8 characters (they appear as blank squares). I'm not sure what to do here.
If I open the same .m file in a text editor, it works fine.
How do I get the MATLAB Editor to read UTF-8? Thank you.
feature('DefaultCharacterSet')
feature('locale')
ans =
UTF-8
ans =
ctype: 'en_US.windows-1252'
collate: 'en_US.windows-1252'
time: 'en_US.windows-1252'
numeric: 'en_US_POSIX.windows-1252'
monetary: 'en_US.windows-1252'
messages: 'en_US.windows-1252'
encoding: 'windows-1252'
terminalEncoding: 'windows-949'
jvmEncoding: 'Cp1252'
status: 'MathWorks locale management system initialized.'
warning: 'System locale setting, ko_KR, is different from user locale setti…'
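To see at a glance which of these encodings differs from UTF-8, the fields of the struct returned by feature('locale') can be printed individually. This is a minimal sketch that only uses the calls and field names visible in the output above; feature is undocumented, so the field names are an assumption that may not hold in every release.

```matlab
% Sketch: print the encodings MATLAB reports.
% feature() is undocumented; field names are taken from the output above.
loc = feature('locale');
fprintf('User encoding:     %s\n', loc.encoding);
fprintf('Terminal encoding: %s\n', loc.terminalEncoding);
fprintf('JVM encoding:      %s\n', loc.jvmEncoding);
if ~strcmpi(loc.encoding, 'UTF-8')
    fprintf('Locale encoding is %s, not UTF-8.\n', loc.encoding);
end
```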
3 Comments
Hongyu Shi on 1 May 2016
I have had the same issue for a long time. MATLAB can sometimes handle Chinese characters when I switch my system's default language to Chinese, but on an English system they just become a lot of question marks.
Keith Hooks on 30 Aug 2016
I'm having the same problem. It seems that editing the lcdata.xml file to change the encoding of en_US has no effect at all. It stays windows-1252 no matter what I change it to in the file.


Accepted Answer

Jinghao Lei on 20 Oct 2016
I have a rather hacky way to solve this problem, and it seems to work. In my case (Windows, MATLAB R2016b x64),
feature('locale')
always returned the output below, even after I had modified lcdata.xml:
ctype: 'zh_CN.GBK'
...
So I deleted this entry from lcdata.xml (in the codeset section):
<encoding name="GBK">
<encoding_alias name="936"/>
</encoding>
Then I changed the following
<encoding name="UTF-8">
<encoding_alias name="utf8"/>
</encoding>
to
<encoding name="UTF-8">
<encoding_alias name="utf8"/>
<encoding_alias name="GBK"/>
</encoding>
The point is to trick MATLAB into treating GBK as just an alias of UTF-8.
9 Comments
Isidora Timkov-Glumac on 14 Sep 2023
@Jinghao Lei I have the same version of MATLAB as you, but my ctype is 'en_US_POSIX.US-ASCII'. How should I modify the lcdata file so that this hack works for me? Thank you!
Simon Diehl on 30 Nov 2023
This solution worked for me as well. I'm using MATLAB R2023b on Windows Server 2019. My encoding was windows-1252.
This is not necessary on newer operating systems. I don't experience this problem with R2023b on Windows 10.
My steps were:
  1. Go to Program Files\MATLAB\R2023b\bin
  2. Rename lcdata.xml to lcdata_old.xml
  3. Copy lcdata_utf8.xml and rename it to lcdata.xml
  4. Open the file and go to section codeset <!-- Codeset entry -->
  5. Comment out <encoding name="windows-1252" ...
  6. Go to <encoding name="UTF-8"> and add the alias <encoding_alias name="1252"/>
The final sections look like this:
<!-- <encoding name="windows-1252" jvm_encoding="Cp1252">
<encoding_alias name="1252"/>
</encoding> -->
<encoding name="UTF-8">
<encoding_alias name="utf8"/>
<encoding_alias name="1252"/>
</encoding>
And the result is:
feature("locale")
ans =
struct with fields:
ctype: 'de_DE.UTF-8'
collate: 'de_DE.UTF-8'
time: 'de_DE.UTF-8'
numeric: 'en_US_POSIX.UTF-8'
monetary: 'de_DE.UTF-8'
messages: 'de_DE.UTF-8'
encoding: 'UTF-8'
terminalEncoding: 'IBM850'
jvmEncoding: 'UTF-8'
status: 'MathWorks locale management system initialized.'
warning: ''
The Editor now works as expected. Thank you very much for the solution, @Jinghao Lei.


More Answers (1)

Michael Cappello on 31 Oct 2017
Edited: Rik on 30 Mar 2023
% read in the file as raw bytes
fID = fopen(filename, 'r', 'n', 'UTF-8');
bytes = fread(fID);
fclose(fID);
The data read from the file can then be converted into Unicode characters, like so:
unic = native2unicode(bytes, 'UTF-8');
If you want, clear the carriage returns (13) and set the line feeds (10) to a space:
unic(unic == 13) = []; unic(unic == 10) = ' ';
disp(unic'); % display the Unicode text
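
The byte-to-character conversion used in this answer can be tried without an existing file. The following is a minimal sketch, not part of the original answer: the sample text and the temporary file are hypothetical and exist only for illustration. It writes a short UTF-8 string to disk, reads the raw bytes back, and decodes them with native2unicode.

```matlab
% Hypothetical sample text: "cafe" with an accented e, a space, two Hangul syllables
sample = ['caf' char(233) ' ' char(54620) char(44544)];
tmp = [tempname '.m'];               % temporary file, illustration only

% Write the text as UTF-8 bytes
fid = fopen(tmp, 'w');
fwrite(fid, unicode2native(sample, 'UTF-8'), 'uint8');
fclose(fid);

% Read the raw bytes back as a uint8 row vector
fid = fopen(tmp, 'r');
bytes = fread(fid, '*uint8')';
fclose(fid);
delete(tmp);

% Decode the bytes as UTF-8 and compare with the original text
decoded = native2unicode(bytes, 'UTF-8');
roundTripOk = isequal(decoded, sample);
```

If roundTripOk is true, the bytes on disk were valid UTF-8 and were decoded correctly, independent of the locale encoding that MATLAB reports.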
