> On 16 Apr 2018, at 17:06, Benoit St-Jean via Pharo-users <pharo-users@lists.pharo.org> wrote:
>
>
> From: Benoit St-Jean <bstj...@yahoo.com>
> Subject: UTF-8 encoding
> Date: 16 April 2018 at 17:06:28 GMT+2
> To: Any question about pharo is welcome <pharo-users@lists.pharo.org>
> Reply-To: Benoit St-Jean <bstj...@yahoo.com>
>
>
> Regarding the problems I have with my firstname and file paths and utf-8
> encoding, I found something weird in the UTF-8 encoding. In fact, to be more
> precise, I found something strange when converting a String to a ByteArray
> (which UTF-8 encoders convert from)
>
> If I look at the example in the comment of ByteArray>>utf8Decoded, 'Les
> élèves français' is encoded as:
>
> #[76 101 115 32 195 169 108 195 168 118 101 115 32 102 114 97 110 195 167 97
> 105 115]
>
> NOW, if I take that very same string, 'Les élèves français' , and convert it
> to a ByteArray, I get :
>
> 'Les élèves français' asByteArray printString.
You cannot do that: it uses a null encoding, which is almost always wrong. A null encoding writes each character's code point as a single byte, so é becomes 233 rather than the UTF-8 pair 195 169.

Please read (the first part of) https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/Zinc-Encoding-Meta/Zinc-Encoding-Meta.html carefully.

> #[76 101 115 32 233 108 232 118 101 115 32 102 114 97 110 231 97 105 115]
>
> The 2 don't match!
>
> By the way, this problem exists on Pharo 5.1, 6.1 and 7.1 (on Windows 10 )
>
> Can anyone confirm/infirm on another platform to see if this is
> Windows-specific?

This behaviour is not Windows-specific: it is the same on all platforms, and it is correct ;-)

> -----------------
> Benoît St-Jean
> Yahoo! Messenger: bstjean
> Twitter: @BenLeChialeux
> Pinterest: benoitstjean
> Instagram: Chef_Benito
> IRC: lamneth
> Blogue: endormitoire.wordpress.com
> "A standpoint is an intellectual horizon of radius zero". (A. Einstein)
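
For illustration only, the two byte arrays quoted above can be reproduced outside Pharo. The following is a minimal Python sketch, not Pharo code. It assumes that the null encoding of this particular string coincides with Latin-1, which holds here because every character has a code point below 256. The variable names are made up for the example.

```python
# The string from the thread, written with escapes so the example does not
# depend on the encoding of this source file: é = U+00E9, è = U+00E8, ç = U+00E7.
text = "Les \u00e9l\u00e8ves fran\u00e7ais"

# "Null encoding": each character's code point is written as one byte.
# Assumption: for code points below 256 this is the same as Latin-1.
null_encoded = list(text.encode("latin-1"))

# UTF-8: ASCII characters stay one byte, é, è and ç take two bytes each.
utf8_encoded = list(text.encode("utf-8"))

print(null_encoded)
print(utf8_encoded)

# Decoding the UTF-8 bytes as UTF-8 gives back the original string.
round_trip = bytes(utf8_encoded).decode("utf-8")
print(round_trip == text)
```

The first list has 19 entries, one per character; the second has 22, because é, è and ç each take two bytes in UTF-8. Both results are correct for the encoding that produced them, they are just not the same encoding.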