Hi Craig,

> On 3 Mar 2022, at 23:15, craig <cr...@hivemind.net> wrote:
>
> Hi Guys,
>
> I'm reading a text file which is supposed to be ASCII encoded. This file
> contains a list of filepaths and was created by a Python program.
>
> Well, it turns-out that file names on Windows can contain illegal UTF-8
> characters. This causes ZnUTF8Encoder to signal 'Illegal leading byte for
> utf-8 encoding' and crash the program.
>
> I would like to handle this situation more elegantly, is there a more
> appropriate code-page to use for the Windows filesystem?
>
> Craig

We support more than 80 different character encoders. Of course, you first need to know which encoding was used to write the file; once you know that, it is easy to switch to a different encoder. Consider:

'/tmp/foo.txt' asFileReference readStreamEncoded: #utf8 do: [ :in | in upToEnd ].

'/tmp/foo.txt' asFileReference readStreamEncoded: #windows1252 do: [ :in | in upToEnd ].

'/tmp/foo.txt' asFileReference readStreamEncoded: #latin1 do: [ :in | in upToEnd ].

To list the known encoding identifiers, or to get an encoder instance from an identifier:

ZnCharacterEncoder knownEncodingIdentifiers.

#windows1252 asZnCharacterEncoder.

If you could post a small example of your file, I could try to help. It will probably be #windows1252 or #latin1.

HTH,

Sven
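
P.S. If the encoding cannot be known up front, one way to handle the failure more gracefully is to try UTF-8 first and fall back to another encoder when decoding fails. This is only a minimal sketch, not a definitive recipe: the path is the same placeholder as above, the choice of #windows1252 as the fallback is a guess, and it assumes the UTF-8 decoding failure is signalled as a ZnCharacterEncodingError (or a subclass) in your image.

```smalltalk
"Sketch: read as UTF-8, fall back to Windows-1252 if decoding fails.
 Assumptions: placeholder path; #windows1252 is only a guess at the real encoding;
 the decoding failure is a ZnCharacterEncodingError or one of its subclasses."
| file contents |
file := '/tmp/foo.txt' asFileReference.
contents := [ file readStreamEncoded: #utf8 do: [ :in | in upToEnd ] ]
	on: ZnCharacterEncodingError
	do: [ :error |
		"Re-read the whole file with the fallback encoder."
		file readStreamEncoded: #windows1252 do: [ :in | in upToEnd ] ].
contents
```

Note that the fallback read can itself fail if the file contains byte values the fallback encoder does not define, so this does not replace finding out which encoding the Python program actually used.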