Hi,

Do all those people in Cc need to read this? If you really want to keep
this public, maybe debian-qa@ is enough?

(I personally don't feel I need to read this at this time; if I had time
to spend on UDD, I would fix actual bugs)

Thanks

Lucas

On 18/05/20 at 21:57 +0200, Andreas Tille wrote:
> On Mon, May 18, 2020 at 08:35:33PM +0200, Stéphane Blondon wrote:
> >
> > Can you send me the file 'gatherer.${I_dont_know_the_command}' which
> > raises the UnicodeDecodeError exception? I will try to write a working
> > patch.
>
> I simply added a debug line:
>
> udd(python3) $ git diff
> diff --git a/udd/ddtp_gatherer.py b/udd/ddtp_gatherer.py
> index bbf041b..d32b85f 100644
> --- a/udd/ddtp_gatherer.py
> +++ b/udd/ddtp_gatherer.py
> @@ -239,6 +239,7 @@ class ddtp_gatherer(gatherer):
>              self.log.exception("Error reading %s%s", dir, filename)
>
>  def _open_file(path):
> +    print(path)
>      with open(path, 'rb') as f:
>          raw_content = f.read()
>      encoding = chardet.detect(raw_content)["encoding"]
>
>
> which leads to
>
>
> udd(python3) $ ./update-and-run.sh ddtp
> /srv/mirrors/debian/dists/squeeze-proposed-updates/main/i18n/Translation-en.bz2
> /srv/mirrors/debian/dists/squeeze-proposed-updates/non-free/i18n/Translation-en.bz2
> /srv/mirrors/debian/dists/squeeze-proposed-updates/contrib/i18n/Translation-en.bz2
> /srv/mirrors/debian/dists/stretch-proposed-updates/main/i18n/Translation-en.bz2
> Traceback (most recent call last):
>   File "/srv/udd.debian.org/udd//udd.py", line 88, in <module>
>     exec("gatherer.%s()" % command)
>   File "<string>", line 1, in <module>
>   File "/srv/udd.debian.org/udd/udd/ddtp_gatherer.py", line 127, in run
>     h.update(f.read())
>   File "/usr/lib/python3.8/codecs.py", line 322, in decode
>     (result, consumed) = self._buffer_decode(data, self.errors, final)
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc5 in position 11:
> invalid continuation byte
>
>
> While you can download the files from any Debian mirror I've attached
>
>    /srv/mirrors/debian/dists/stretch-proposed-updates/main/i18n/Translation-en.bz2
>
> to this mail. My guess is that translations from stretch will not be
> touched any more and thus we need to cope somehow with the existing
> encoding.
>
> Thanks a lot for your help
>
>       Andreas.
>
> --
> http://fam-tille.de
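
For reference, the traceback above fails inside codecs.py while
h.update(f.read()) runs at line 127 of ddtp_gatherer.py, which suggests
(this is an assumption, the surrounding code is not shown in the thread)
that the compressed file is read through a text-mode handle there. A
minimal sketch of one way to cope, not the actual UDD code: hash the raw
bytes, and decode the decompressed content with a fallback encoding. The
names file_digest and read_translation, the choice of SHA-1, and the
latin-1 fallback are all hypothetical.

```python
import bz2
import hashlib


def file_digest(path, chunk_size=65536):
    """Return a hex digest of the file exactly as stored on disk.

    SHA-1 is an assumption; the thread does not say which hash UDD uses.
    """
    h = hashlib.sha1()
    # 'rb' matters: in text mode Python 3 would try to decode the
    # compressed stream and could raise UnicodeDecodeError.
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()


def read_translation(path, encodings=('utf-8', 'latin-1')):
    """Decompress a .bz2 file and decode it, trying encodings in order.

    The encoding list is hypothetical; latin-1 accepts any byte sequence,
    so with the default list the final fallback is never reached.
    """
    with bz2.open(path, 'rb') as f:
        raw = f.read()
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Only reached if a caller passes a list without a catch-all encoding.
    return raw.decode('utf-8', errors='replace')
```

Trying utf-8 first keeps current files unchanged, while older files such
as the stretch one would fall through to the second encoding instead of
aborting the whole run.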