Hello everyone,

I've written another revision of my proposal. This is version 3; the previous ones are in this email thread on debian-security@lists.debian.org.
I did get some feedback from the Security Team privately. It wasn't anything confidential; some members of the team only noticed my proposal after I sent it to the private mailing list, and the biggest part of the feedback was that they wanted some time to think about it.

This time I'm cc'ing the team's mailing list as well, so replies will show up here.

Not much has changed in this version, but it should be better than the previous one. This is more of a chance to get feedback from the team again.

********************************************************************************
## A proposal to significantly reduce reported false-positives (no affected-code shipped) - version 3

I would like to propose something which will lower the number of false-positive CVEs reported to our users by about 20%.

********************************************************************************
## tl;dr

Debian over-reports the number of CVEs affecting it. The main reason is that we don't have a distinct way of stating that a CVE does not affect Debian when we don't build the affected feature of a package (or when hardening blocks the exploit). In these cases, we mark the CVEs as affecting our packages because the source code contains the vulnerable code (the binary package doesn't).

This forces us and our users to manually distinguish which CVEs affect them and which don't any time there's a need to look at the data. It's effectively noise, and we end up reporting the binary packages as affected when that's not true (both in the OVAL files and in the security-tracker json file we generate).

I propose we mark those cases as not-affected.

Alternatively, I mention an option to create a new state to indicate that the resulting package is not affected due to the build options, but that the source code contains the vulnerability. I also explain why that's not my preferred approach.

********************************************************************************
## Problem statement

The possible outcomes of a CVE assessment in our security-tracker are [0]:

> <no-dsa> | <unfixed> | <undetermined> | <not-affected> | <itp> | <ignored> |
> <postponed>

We also have the following severity levels [0]:

> SEVERITY_LEVEL : (unimportant) | (low) | (medium) | (high)

"unimportant" being defined as:

> unimportant: This problem does not affect the Debian binary package, e.g., a
> vulnerable source file, which is not built, a vulnerable file in
> doc/foo/examples/, PHP Safe mode bugs, path disclosure (doesn't matter on
> Debian). All "non-issues in practice" fall also into this category, like
> issues only "exploitable" if the code in question is setuid root, exploits
> which only work if someone already has administrative privileges or similar.
> This severity is also used for vulnerabilities in packages which are not
> covered by security support.

We have a problem in the way we assess CVEs when the generated package is not affected but the source code contains the vulnerability.

Our current process is to set "no-dsa" and lower the severity to "unimportant", although it's also possible that in some cases people are making use of "ignored", which represents "won't fix".

The result is that "unimportant/no-dsa" CVEs can mean two things:

1) We are affected but the severity is too low, e.g. packages not covered by security support, or the CVE is considered a non-issue by our security team but we are still affected.
2) We are definitely not affected, since we don't build that feature of the software or we have hardening in place which prevents this from being exploited.

This leads to our users, who are interested in knowing which CVEs affect their systems, having to check the notes of every CVE on security-tracker to filter out the false-positives.
Besides that, we also struggle with this ourselves, as anyone who would like to fix CVEs has to filter out these false-positives themselves.

Considering the broad usage of Debian (especially in containers), being able to correctly mark these cases as not affecting the binary packages will have a huge impact across the industry. I'm not being over-optimistic here: a lot of effort ends up being spent on generating CVE reports and then having to justify why each one is not fixed.

Whether the requirements around CVE fixing are right or wrong is a story of its own, but we have the potential to make our own and our users' lives easier with this.

********************************************************************************
## Proposed solution

I propose that we start setting CVEs to "not-affected" when at least one of the following is true for all officially supported architectures:

* We don't ship the affected source package.
* We don't build the affected feature.
* We have hardening which makes the exploit impossible (only in the cases when there's no doubt about it).

If we still want to flag the cases where a build with different flags might change that assertion, we can use the "(free text comment)" section of the NOTES [0] to mention it.

In other words, we keep tracking source packages for our assessments; the difference is that when the built package is not affected, our assessment will be "not-affected" for that release.

Effectively, this proposal means I would push an MR updating the documentation [0] and start changing those CVEs to not-affected. I'm not asking anyone else to do the work.

As a point of reference, I'm not aware of anyone else evaluating CVEs the way we do today; the general expectation is that "not-affected" is used if it's impossible to exploit the installed package. If someone performs their own build of a package with different flags, that's no longer officially supported by us.
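
To make the rule in the proposed solution concrete, here is a minimal Python sketch of it. It assumes the three conditions are alternatives (any one of them is enough) and that the chosen condition has to hold on every officially supported architecture. `ArchFinding`, `Assessment`, `arch_is_clear` and `assess` are hypothetical names made up for illustration; none of this exists in the security-tracker code, and `needs-triage` is a placeholder, not a tracker state.

```python
from dataclasses import dataclass
from enum import Enum


class Assessment(Enum):
    NOT_AFFECTED = "not-affected"
    # Hypothetical placeholder: the CVE still goes through the usual triage.
    NEEDS_TRIAGE = "needs-triage"


@dataclass(frozen=True)
class ArchFinding:
    """What a reviewer concluded for one architecture (hypothetical model)."""
    ships_source: bool      # the affected source package is in the release
    builds_feature: bool    # the vulnerable feature ends up in the binary
    hardening_blocks: bool  # hardening makes the exploit impossible, no doubt


def arch_is_clear(finding: ArchFinding) -> bool:
    """True if at least one of the three conditions holds for this arch."""
    return (
        not finding.ships_source
        or not finding.builds_feature
        or finding.hardening_blocks
    )


def assess(findings: dict[str, ArchFinding]) -> Assessment:
    """Return not-affected only if every listed architecture is clear."""
    # An empty mapping means nobody looked yet, so it is never "not-affected".
    if findings and all(arch_is_clear(f) for f in findings.values()):
        return Assessment.NOT_AFFECTED
    return Assessment.NEEDS_TRIAGE


if __name__ == "__main__":
    findings = {
        "amd64": ArchFinding(ships_source=True, builds_feature=False,
                             hardening_blocks=False),
        "arm64": ArchFinding(ships_source=True, builds_feature=False,
                             hardening_blocks=False),
    }
    print(assess(findings).value)
```

If a single architecture builds the feature without hardening that blocks the exploit, the sketch falls back to normal triage for the whole release, which matches the "all officially supported architectures" wording.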

********************************************************************************
## Stats

As a way of sampling the impact of this issue, I've done a high-level check on how many affected package/CVE pairs we have in our debian:stable docker image [1].

Out of the 82 affected package/CVE pairs, 15 were clear cases of our packages not being affected. Of the rest, the majority are other cases where we are reporting non-issues, but those require a deeper investigation, so I don't want to assume they also fall under this case.

So 18% of the reported affected packages are false-positives. Based on what I've seen, I believe this is a fair estimate to extrapolate.

I've listed some examples of this issue at [2].

I'm confident this is the main reason we over-report the number of affected CVEs for our releases, the second one being that we tend not to double-check whether older releases are affected if the CVE is not important [6].

********************************************************************************
## Alternative solution

If using the "free text comment" [0] is not a good enough way of stating that only the source contains the vulnerable code:

## A1) Add a new sub-state "only-source-vulnerable", to be used in addition to "not-affected"

## A2) Add a new mutually exclusive state to the set: "not-affected-build-artifacts"

I don't like these approaches because they increase the complexity of our process: a new state is more costly than a free text mention, and there's no clear benefit or motivation for it. What's the value in saying the sources carry the vulnerable code? If someone does their own modified build of a package, all bets are off and that's not an official package.

This also means we would have to modify:

1) The code that generates the OVAL files [3];
2) The code that generates security-tracker json [4];
3) The git hook that validates the contents of the tracking file [5].

The amount of code we need to modify is not an argument for or against this being the right thing to do, but at some point the implementation cost outweighs the benefit, and it's not clear to me what the benefit is.

It should also be mentioned that identifying cases where only the source code is vulnerable will never be done perfectly, because it's easy to miss a bundled library which is not used. For example, rsync bundles zlib and we do not mark rsync as affected for all zlib CVEs (rsync does not use the bundled lib). Would we want that to be otherwise?

Then comparing A1 with A2: coming up with a new state is confusing, as systems/people reading it might end up parsing it as "affected". So I prefer A1 over A2 if my preferred option is not chosen.

This being said, the non-preferred alternatives are still better than the current situation IMHO.

********************************************************************************
[0] https://security-team.debian.org/security_tracker.html#summary-of-tracker-syntax
    "ignored" and "postponed" are sub-states, supposed to be used together with "no-dsa".
[1] $ grype debian:stable
[2] https://security-tracker.debian.org/tracker/CVE-2011-3374
[2] https://security-tracker.debian.org/tracker/CVE-2022-0563
[2] https://security-tracker.debian.org/tracker/CVE-2017-18018
[2] https://security-tracker.debian.org/tracker/CVE-2019-19882
[2] https://security-tracker.debian.org/tracker/CVE-2023-28320
[3] https://www.debian.org/security/oval/
[4] https://security-tracker.debian.org/tracker/data/json
[5] https://salsa.debian.org/security-tracker-team/security-tracker/-/blob/ab5ceb2e9d531c73d59bb26d67505d24eec16c22/bin/check-syntax
[6] I've seen a few cases where the vulnerability didn't exist in the versions we shipped in oldstable or oldoldstable, but we didn't check it due to the low severity, so we report them as "affected" to be on the safe side.
I have fixed a few cases of this myself, but I have some ideas on how to automate some of it by comparing with other distros. This is something I plan to work on, but only after solving the issue in this proposal.

All of this being said, I think Debian is exceptionally good, and still a reference for the industry, when it comes to identifying the exact range of affected versions of a piece of software and publishing that for everyone to see in the security-tracker.

Cheers,

--
Samuel Henrique <samueloph>