Hi
I wanted you to be CCed on this bug. Unfortunately, my BTS X-CC appears
to not have worked (I do not see it on the relevant lists), so here goes. :)
Note there is already a follow up, which is not included here. You may
want to read https://bugs.debian.org/1064454 before answering.
On Thu, 22 Feb 2024 12:10:06 +0100 Niels Thykier <ni...@thykier.net> wrote:
Package: debian-policy
Severity: wishlist
X-Debbugs-Cc: ni...@thykier.net
X-Debbugs-Cc: lintian-ma...@debian.org
X-Debbugs-Cc: de...@lists.debian.org
X-Debbugs-Cc: debian-dpkg@lists.debian.org
Hi
I am hoping we can agree to some tighter bounds on the field names
followed by relevant tools implementing the same restrictions in our
next versions. Ideally all deb822 parsers should follow suit; I have
CC'ed the major ones I could think of.
I will submit a MR for the python-debian's parser if a change is
agreed, which would eventually cover dak.
My rationale for this request is as follows.
The policy says the following about a field name:
> The field name is composed of US-ASCII characters excluding control
characters, space, and colon (i.e., characters in the ranges U+0021 (!) through
U+0039 (9), and U+003B (;) through U+007E (~), inclusive). Field names must not
begin with the comment character (U+0023 #), nor with the hyphen character (U+002D
-).
Which leads to the interesting case that:
```
Package: foo
Depends: bar,
${shlib:Depends}
```
is a *valid* deb822 paragraph with 3 fields. The last being named
`${shlib` with the value `Depends}`. I spotted this while debugging the
python-debian parser that did not react to what I thought was a clear
syntax error.
While the parser is technically correct given the Policy description
above, I cannot see any case where this is a desirable outcome. Instead,
I think we should classify this as a syntax error such that users are
instructed to indent `${shlib:Depends}` token.
A simple fix would be to forbid `$` at the start of the field. Though, I
think at this point it would make sense to prune more special characters
from the list like `%&{}[]()/\\<>|$` from anywhere in the field. Note
that we definitely need to keep `_` as it is used in debconf template
files for translations. Both single and double underscore is used here,
though they always come first in the field name.
The reason why I want a tighter bound is that currently, the following
things are also considered valid field names:
>
|foo(>=1:2.0),
foo(>=2:2.0),
,foo(>=3:2.0)
,foo(>=4:2.0)[amd64]<!nodeb>
)': We can make inverted crying smileys as field names!
`Foo`: Now with markdown formatting of the field name!
$(foo): Can we trick something into running shell commands?
/etc/passwd: Maybe path traversal
In all cases, I see no value to allowing these absurd field names and
only potential downsides.
Best regards,
Niels
PS: I doubt anything actually uses field names in an unsafe manner any
more. But we do have some history with deb822 files.
But someone will eventually try to do this. We already received a bug
report against debhelper long ago about being able to do path traversal
via questionable package names. Largely, this was considered a non-issue
because dpkg in its default configuration would abort and debhelper has
a "run arbitrary code via the upstream build system" feature anyway.
Additionally, 10+ years ago. lintian would create a file per field of
the file it scanned. Here, if it had used perl's "2-arg" open, you might
have been able to do something "fun". I do not remember if it did and
the exploit is a decade overdue even if.