Hi Tim,

An introduction to this part of the system is in the chapter "Character Encoding and Resource Meta Description" of the "Enterprise Pharo" book:

https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/Zinc-Encoding-Meta/Zinc-Encoding-Meta.html

The error means that a file that you try to read as UTF-8 does contain things that are invalid with respect to the UTF-8 standard.

Are you sure the file is in UTF-8? Maybe it is in ASCII, Latin-1 or something else?

It is possible to customise the encoding to something different than the default UTF-8. For non-UTF encoders, there is a strict/lenient option to disallow/allow illegal stuff (but then you will get these in your strings).

I can show you how to do that if you want.

Sven

> On 20 Jul 2021, at 10:31, Tim Mackinnon <tim@testit.works> wrote:
>
> Hi - I'm doing a bit of log file processing with Pharo - and I've hit an
> unexpected error and am wondering what the best way to approach it is.
>
> It seems that I have a log file that has unexpected characters, and so my
> readStream loop that reads lines gets an error: "ZnInvalidUTF8: Illegal
> continuation byte for utf-8 encoding".
>
> For some reason this file (unlike my others) seems to contain characters that
> it shouldn't - but what is the best way for me to continue processing? Should
> I be opening my files in a different way - or can I resume the error somehow -
> I'm not familiar with this area of Pharo and am after a bit of advice.
>
> My code is like this (and I get the error when doing nextLine)
>
> parseStream: aFileStream with: aBlock
>     | line items |
>     [ (line := aFileStream nextLine) isNil ]
>         whileFalse: [
>             items := $/ split: line.
>             items size = 3 ifTrue: [aBlock value: items]]
>
> My stream is created like this:
>
> firmEfs := (pathName , '/' , firmName , '_files') asFileReference.
> details parseStream: firmEfs readStream.
>
> Should I be opening the stream a bit differently - or can I catch that
> encoding error and resume it with some safe character?
>
> Thanks for any help.
>
> Tim
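
For reference, here is a rough sketch of the custom-encoding option mentioned in the reply. It is only a sketch under assumptions: it guesses that the file is really Latin-1 (substitute whatever the actual encoding turns out to be), it reuses pathName, firmName and details from the quoted code, and the block passed to parseStream:with: is a hypothetical placeholder. It relies on ZnCharacterEncoder class>>newForEncoding:, #beLenient and FileReference>>readStreamEncoded:do:, which to my knowledge are available in recent Pharo images; check your own image before depending on them.

```smalltalk
| encoder fileRef |
"Assumption: the log file is Latin-1, not UTF-8."
encoder := ZnCharacterEncoder newForEncoding: 'latin1'.
"Lenient mode lets bytes without a defined mapping through instead of
 signalling an error; they then show up as odd characters in the strings."
encoder beLenient.
fileRef := (pathName , '/' , firmName , '_files') asFileReference.
"Open the file with the chosen encoder; the stream is closed when the block ends."
fileRef
    readStreamEncoded: encoder
    do: [ :stream |
        details
            parseStream: stream
            with: [ :items | Transcript show: items printString; cr ] ]
```

Opening the stream this way replaces the plain readStream, which decodes as UTF-8 by default. Reading as Latin-1 only changes how the bytes are interpreted, so any characters that were written in some other encoding will still come out wrong, just without an error.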