On 19/10/2020 05:58, Mladen Gogala via Python-list wrote:
On Sun, 18 Oct 2020 21:00:18 +1300, dn wrote:
On 18/10/2020 12:58, Mladen Gogala via Python-list wrote:
On Sat, 17 Oct 2020 22:51:11 +0000, Mladen Gogala wrote:
BTW, I used this
cp /var/log/syslog ./in-file.log
#!/usr/bin/env python3
import io
with open("in-file.log","r") as infile:
      for line in infile:
          print(line)
I got a different error:
Traceback (most recent call last):
    File "./test.py", line 4, in <module>
      for line in infile:
    File "/usr/lib/python3.8/codecs.py", line 322, in decode
      (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd8 in position 897: invalid continuation byte

@Mladen: is syslog a text file or binary format?
Hi!
Syslog is the system log. It's a text file. This only happens if I use
infile as an iterable. If I use readline, all is well:

#!/usr/bin/env python3
import io
with open("in-file.log","r") as infile:
     while True:
         line=infile.readline()
         if not line:
             break
         print(line)

I don't particularly like this idiom, but it works. That is probably a bug
in the utf-8 decoder on Ubuntu. It doesn't happen on my Fedora 32 VM. I
haven't tried with infile.reconfigure(encoding=None)

[Slightly OT from OP]

Some logging has started to move from simple text to a more compressed/efficient 'binary' format - hence my thinking.
Your observation is doubly interesting.

Fedora uses UTF-8 by default. I would have expected the same of Ubuntu. One wonders if different decoder/encoder defaults are set by the repo-managers, or some-such explanation.
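
As a rough way to compare the two machines, something like the following could be run on both. It is only a sketch: encoding_defaults is a made-up helper name, and it assumes that open() without an encoding= argument falls back to the locale's preferred encoding, which is what the Python 3.8 documentation describes.

```python
#!/usr/bin/env python3
import locale
import os


def encoding_defaults():
    """Collect the encodings Python will use when none is given explicitly."""
    # Open a throwaway file in text mode, without encoding=, to see what open() picks.
    with open(os.devnull, "r") as probe:
        open_default = probe.encoding
    return {
        "locale_preferred": locale.getpreferredencoding(False),
        "open_default": open_default,
    }


if __name__ == "__main__":
    for name, value in encoding_defaults().items():
        print(f"{name}: {value}")
```

If the two systems report different values, that would point towards locale configuration rather than the decoder itself. If they report the same value, the difference is more likely in the data.
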
Using Fedora 32 (as before) and a copy of "/var/log/messages", because
Fedora doesn't use "syslog", it works happily:
>>> with open( "messages", "r" ) as infile:
...      for line in infile:
...          print(line)
...          break
...
Oct 18 00:01:01 JrBrown systemd[1]: Starting update of the root trust anchor for DNSSEC validation in unbound...
However, the decisive point is the actual data. Have you worked out
which line in the log causes the error, and thus the offending string
of characters?
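
One way to answer that, sketched under a few assumptions: read the copy in binary mode so nothing is decoded on the way in, then try to decode each line by hand. find_bad_lines is a hypothetical helper name, and UTF-8 is assumed to be the intended encoding of the log.

```python
#!/usr/bin/env python3
import sys


def find_bad_lines(path, encoding="utf-8"):
    """Return (line_number, offset_in_line, raw_bytes) for each line that fails to decode."""
    bad = []
    # Binary mode: lines are split on b"\n" and no decoding happens while reading.
    with open(path, "rb") as infile:
        for lineno, raw in enumerate(infile, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError as exc:
                # exc.start is the index of the first undecodable byte within this line.
                bad.append((lineno, exc.start, raw))
    return bad


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: pass the log copy as the first argument, e.g. in-file.log
    for lineno, offset, raw in find_bad_lines(sys.argv[1]):
        print(f"line {lineno}, byte {offset}: {raw!r}")
```

Printing the raw bytes with !r shows the offending byte as an escape such as \xd8, which should make it easier to see what wrote that line and in which encoding.
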
--
Regards =dn