> On 20 Jul 2021, at 11:45, Sven Van Caekenberghe <s...@stfx.eu> wrote:
> 
> 
> 
>> On 20 Jul 2021, at 11:03, Sven Van Caekenberghe <s...@stfx.eu> wrote:
>> 
>> Hi Tim,
>> 
>> An introduction to this part of the system is in 
>> https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/Zinc-Encoding-Meta/Zinc-Encoding-Meta.html
>>  [Character Encoding and Resource Meta Description] from the "Enterprise 
>> Pharo" book.
>> 
>> The error means that the file you are trying to read as UTF-8 contains 
>> byte sequences that are invalid according to the UTF-8 standard.
>> 
>> Are you sure the file is in UTF-8? Maybe it is in ASCII, Latin-1 or 
>> something else.
>> 
>> It is possible to customise the encoding to something other than the 
>> default UTF-8. For non-UTF encoders, there is a strict/lenient option to 
>> disallow/allow illegal bytes (but in lenient mode those bytes end up in 
>> your strings).
>> 
>> I can show you how to do that if you want.
> 
> '/var/log/system.log' asFileReference readStreamDo: [ :in | in upToEnd ].
> 
> '/var/log/system.log' asFileReference binaryReadStreamDo: [ :in |
>       (ZnCharacterReadStream on: in encoding: #ascii) upToEnd ].
> 
> '/var/log/system.log' asFileReference binaryReadStreamDo: [ :in |
>       (ZnCharacterReadStream on: in encoding: ZnCharacterEncoder ascii beLenient) upToEnd ].

There is also readStreamEncoded:[do:], which is a bit more concise but does the 
same :)
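
As a rough sketch (untested against this particular log file), the lenient ASCII variant quoted above could be written with readStreamEncoded:do: like this. The #ascii identifier and the lenient encoder are taken from Sven's examples; the block passed to with: is only a placeholder, and firmEfs and details are the variables from Tim's original message.

```smalltalk
"Read the whole file as ASCII; same effect as the binaryReadStreamDo: form."
'/var/log/system.log' asFileReference
	readStreamEncoded: #ascii
	do: [ :in | in upToEnd ].

"Lenient ASCII, handing the decoded stream to the line parser.
 The with: block is a placeholder that just prints each matching line's items."
firmEfs
	readStreamEncoded: ZnCharacterEncoder ascii beLenient
	do: [ :in |
		details
			parseStream: in
			with: [ :items | Transcript show: items printString; cr ] ].
```

With the lenient encoder the out-of-range bytes are passed through into the resulting strings instead of raising an error, so the nextLine loop keeps going.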

> 
> HTH
> 
>> Sven
>> 
>>> On 20 Jul 2021, at 10:31, Tim Mackinnon <tim@testit.works> wrote:
>>> 
>>> Hi - I’m doing a bit of log file processing with Pharo - and I’ve hit an 
>>> unexpected error and am wondering what the best way to approach it is.
>>> 
>>> It seems that I have a log file that has unexpected characters, and so my 
>>> readStream loop that reads lines gets an error: "ZnInvalidUTF8: Illegal 
>>> continuation byte for utf-8 encoding".
>>> 
>>> For some reason this file (unlike my others) seems to contain characters 
>>> that it shouldn’t - but what is the best way for me to continue processing? 
>>> Should I be opening my files in a different way - or can I resume the error 
>>> somehow? I’m not familiar with this area of Pharo and am after a bit of 
>>> advice.
>>> 
>>> My code is like this (and I get the error when doing nextLine)
>>> 
>>> 
>>> parseStream: aFileStream with: aBlock
>>>     | line items |
>>>     [ (line := aFileStream nextLine) isNil ]
>>>             whileFalse: [ 
>>>                     items := $/ split: line.
>>>                     items size = 3 ifTrue: [aBlock value: items]]
>>> 
>>> My stream is created like this:
>>> 
>>> firmEfs := (pathName , '/' , firmName , '_files') asFileReference.
>>> details parseStream: firmEfs readStream.
>>> 
>>> 
>>> Should I be opening the stream a bit differently - or can I catch that 
>>> encoding error and resume it with some safe character?
>>> 
>>> Thanks for any help.
>>> 
>>> Tim
