> On 20 Jul 2021, at 11:45, Sven Van Caekenberghe <s...@stfx.eu> wrote:
>
>> On 20 Jul 2021, at 11:03, Sven Van Caekenberghe <s...@stfx.eu> wrote:
>>
>> Hi Tim,
>>
>> An introduction to this part of the system is in
>> https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/Zinc-Encoding-Meta/Zinc-Encoding-Meta.html
>> [Character Encoding and Resource Meta Description] from the "Enterprise
>> Pharo" book.
>>
>> The error means that a file that you try to read as UTF-8 contains
>> bytes that are invalid with respect to the UTF-8 standard.
>>
>> Are you sure the file is in UTF-8? Maybe it is in ASCII, Latin-1 or
>> something else.
>>
>> It is possible to customise the encoding to something different than the
>> default UTF-8. For non-UTF encoders, there is a strict/lenient option to
>> disallow/allow illegal stuff (but then you will get these in your strings).
>>
>> I can show you how to do that if you want.
>
> '/var/log/system.log' asFileReference readStreamDo: [ :in | in upToEnd ].
>
> '/var/log/system.log' asFileReference binaryReadStreamDo: [ :in |
>   (ZnCharacterReadStream on: in encoding: #ascii) upToEnd ].
>
> '/var/log/system.log' asFileReference binaryReadStreamDo: [ :in |
>   (ZnCharacterReadStream on: in encoding: ZnCharacterEncoder ascii
>     beLenient) upToEnd ].

There is also readStreamEncoded:[do:], which is a bit more concise but does the same :)

> HTH
>
>> Sven
>>
>>> On 20 Jul 2021, at 10:31, Tim Mackinnon <tim@testit.works> wrote:
>>>
>>> Hi - I’m doing a bit of log file processing with Pharo - and I’ve hit an
>>> unexpected error and am wondering what the best way to approach it is.
>>>
>>> It seems that I have a log file that has unexpected characters, and so my
>>> readStream loop that reads lines gets an error: "ZnInvalidUTF8: Illegal
>>> continuation byte for utf-8 encoding".
>>>
>>> For some reason this file (unlike my others) seems to contain characters
>>> that it shouldn’t - but what is the best way for me to continue processing?
>>> Should I be opening my files in a different way - or can I resume the error
>>> somehow - I’m not familiar with this area of Pharo and am after a bit of
>>> advice.
>>>
>>> My code is like this (and I get the error when doing nextLine)
>>>
>>>
>>> parseStream: aFileStream with: aBlock
>>>   | line items |
>>>   [ (line := aFileStream nextLine) isNil ]
>>>     whileFalse: [
>>>       items := $/ split: line.
>>>       items size = 3 ifTrue: [aBlock value: items]]
>>>
>>> My stream is created like this:
>>>
>>> firmEfs := (pathName , '/' , firmName , '_files') asFileReference.
>>> details parseStream: firmEfs readStream.
>>>
>>>
>>> Should I be opening the stream a bit differently - or can I catch that
>>> encoding error and resume it with some safe character?
>>>
>>> Thanks for any help.
>>>
>>> Tim
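
For reference, the readStreamEncoded:do: variant mentioned in the reply at the top might look like the sketch below. It reuses the example path from Sven's snippets and assumes, purely for illustration, that the file is Latin-1; the thread never establishes the real encoding. The second expression passes the lenient ASCII encoder from Sven's example in place of an encoding name, on the assumption that the selector accepts an encoder instance as well as a name.

```smalltalk
"Assumption: the log is Latin-1. Substitute the encoding the file really uses."
'/var/log/system.log' asFileReference
	readStreamEncoded: 'latin1'
	do: [ :in | in upToEnd ].

"Assumption: an encoder instance is accepted here, as it is by
 ZnCharacterReadStream on:encoding: in the snippets above.
 Lenient ASCII lets illegal bytes through instead of signalling an error."
'/var/log/system.log' asFileReference
	readStreamEncoded: ZnCharacterEncoder ascii beLenient
	do: [ :in | in upToEnd ].
```

Both forms should close the stream when the block finishes, so they can stand in for the binaryReadStreamDo: plus ZnCharacterReadStream combination when no other stream wrapping is needed.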