Control: retitle -1 Incorrectly parsing whitespace in Deb822.iter_paragraphs

On Tue, 13 Nov 2018 at 23:42, Marcus Furlong <furlo...@gmail.com> wrote:
>
> > > I have come across a case where whitespace is added in
> > > Packages{.gz,.bz2} and I am not sure how it should be parsed.
> > [...]
> > > Should this whitespace be parsed as a paragraph delimiter?
> >
> > For a Packages file, each paragraph is defined as a set of DEBIAN/control
> > paragraphs; the Description field is not allowed to contain lines that are
> > whitespace-only.
> >
> > https://wiki.debian.org/DebianRepository/Format#A.22Packages.22_Indices
> >
> > https://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-description
> >
> > So the strict answer is yes, it should be a paragraph delimiter but most
> > implementations seem to be more forgiving in what they accept.
> >
> > Note that for debian/control files in source packages, whitespace-only lines
> > are treated as paragraph separators so that whitespace errors in an editor
> > don't accidentally make packages disappear from the archive.
> >
> >
> > > Currently, the whitespace is being treated as a paragraph delimiter,
> > > in python-debian, but not by apt-get, etc.
> >
> > Could you expand on this with an example, perhaps?
> >
> > python-debian actually uses python-apt for dealing with Sources and Packages
>
> I was incorrect. As you have shown, python-apt works correctly.
>
> > files (i.e. the exact same code as apt) and already does treat
> > whitespace-only
> > lines as being part of a paragraph rather than breaking them:
> >
> >
> > $ ipython3
> > Python 3.6.7 (default, Oct 21 2018, 08:08:16)
> > Type "copyright", "credits" or "license" for more information.
> >
> > In [1]: from debian.deb822 import Packages
> >
> > In [2]: with open('Packages') as fh:
> >    ...:     for p in Packages.iter_paragraphs(fh):
> >    ...:         if p['Version'] == '1.25.0-1529904044':
> >    ...:             print(p)
>
> I've narrowed down where the issue occurs. It happens when passing the
> contents rather than the file handle to iter_paragraphs:
>
> ~# ipython3
> Python 3.5.3 (default, Jan 19 2017, 14:11:04)
> Type "copyright", "credits" or "license" for more information.
>
> IPython 5.1.0 -- An enhanced Interactive Python.
> ?         -> Introduction and overview of IPython's features.
> %quickref -> Quick reference.
> help      -> Python's own help system.
> object?   -> Details about 'object', use 'object??' for extra details.
>
> In [1]: from debian.deb822 import Packages
>
> In [2]: with open('Packages') as fh:
>    ...:     for p in Packages.iter_paragraphs(fh.read()):
>    ...:         if 'version' not in p:
>    ...:             print(p)
>    ...:
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
> Homepage: https://code.visualstudio.com/
>
>
> In [3]:
>
> Passing the contents does the correct thing in all other cases, so not
> sure why it would be having an issue with this?
>
> --
> Marcus Furlong
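
To make the two behaviours discussed above concrete, here is a minimal,
self-contained sketch. It is not the code of python-debian or python-apt:
split_paragraphs, its whitespace_separates flag and the sample stanza
(including the package names) are made up for illustration. It only shows how
a whitespace-only line, when treated as a separator, can leave an orphaned
Homepage paragraph like the ones printed above.

```python
def split_paragraphs(text, whitespace_separates=True):
    """Split deb822-style text into paragraphs (each a list of lines).

    Hypothetical helper for illustration only. With whitespace_separates=True
    a line containing only spaces or tabs ends the current paragraph (the
    strict reading); with False only a truly empty line does (the lenient
    reading).
    """
    paragraphs, current = [], []
    for line in text.splitlines():
        if whitespace_separates:
            is_separator = line.strip() == ""
        else:
            is_separator = line == ""
        if is_separator:
            # Close the paragraph in progress, ignoring repeated separators.
            if current:
                paragraphs.append(current)
                current = []
        else:
            current.append(line)
    if current:
        paragraphs.append(current)
    return paragraphs


# Invented sample: a stray " " line sits between Description and Homepage.
SAMPLE = (
    "Package: example\n"
    "Version: 1.25.0-1529904044\n"
    "Description: example entry\n"
    " \n"
    "Homepage: https://code.visualstudio.com/\n"
    "\n"
    "Package: other\n"
    "Version: 1.0\n"
)

strict = split_paragraphs(SAMPLE, whitespace_separates=True)
lenient = split_paragraphs(SAMPLE, whitespace_separates=False)

print(len(strict), "paragraphs when whitespace-only lines separate")
print(len(lenient), "paragraphs when only empty lines separate")
```

In the strict case the Homepage line ends up in a paragraph of its own with
no Version field, which matches the symptom in the session quoted above.
Whether this is what actually happens inside iter_paragraphs for string input
is an assumption that would need checking against the python-debian source.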
--
Marcus Furlong