Sorry for resurrecting this old thread, but I've been looking at how to deal with renamed packages in CVE triaging again. When we last talked about this, we observed that we were sometimes missing packages during triage, e.g. `tiff3` that was present in wheezy. That's not an issue anymore since wheezy is gone, but the problem occurs more broadly in other packages.

In fact, it seems to me this is similar to the broader problem of embedded code copies: we could generalize renamed packages to the embedded code copies problem. We already have a database of those in data/embedded-code-copies, although I'm not sure how up to date that file actually is, nor how it is currently used in the workflow. Any database of renames we could build would clearly overlap with the embedded-code-copies file, so I figured I would write a parser for it (in Python, since we already have Perl and bash ones...) to start with. I tried to upload this in a fork on salsa but gave up as the push (of a single commit!) got stuck "resolving deltas"... Anyways, here's the snippet:

    https://salsa.debian.org/anarcat/security-tracker/snippets/70

The next step is to figure out how to actually modify the data/CVE/list file to introduce the changes. Considering the large number of packages in the embedded-code-copies file, I am not sure we want to retroactively change all previous entries. jmm suggested we run a cronjob that would keep track of where it is in history, which would resolve this nicely.

One question that remains is what, exactly, to add in the CVE metadata. One problem we faced the last time we looked at this is that we needed to add an entry like:

    SOURCEPACKAGE <removed>

... which would (e.g.) get triaged to:

    SOURCEPACKAGE <removed>
    [wheezy] SOURCEPACKAGE <not-affected>

(or whatever) later on. This requires inside knowledge of the suites and their packages, something I find surprisingly hard to do in the security tracker. With embedded-code-copies, we will also have to add something for all the other source packages, e.g.:

    OTHERSOURCE <undetermined>

Right now, it seems that all the scripts that hammer at those files do so with their own ad-hoc parsing code. Is that the recommended way of chopping those files up? Or is there a better parsing library out there?

Thanks for any advice,

A.
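P.S. To make that last question more concrete, here is roughly the kind of ad-hoc parsing I keep rewriting. This is only a sketch, assuming the usual data/CVE/list layout (a CVE or TEMP header at column 0 followed by tab-indented annotation lines); the names Entry and parse_cve_list are made up for the example:

    #!/usr/bin/python3
    # sketch of an ad-hoc data/CVE/list parser, assuming the usual layout:
    # a header at column 0, then tab-indented lines like "- pkg ...",
    # "[suite] - pkg ..." or "NOTE: ..."
    import re
    from collections import namedtuple

    Entry = namedtuple("Entry", ["header", "annotations"])
    HEADER_RE = re.compile(r"^(CVE-\d{4}-(?:\d+|XXXX)|TEMP-\S+)")

    def parse_cve_list(path):
        entries = []
        header, annotations = None, []
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if line.startswith("\t"):
                    # continuation line belonging to the current entry
                    annotations.append(line.lstrip("\t"))
                elif HEADER_RE.match(line):
                    if header is not None:
                        entries.append(Entry(header, annotations))
                    header, annotations = line, []
        if header is not None:
            entries.append(Entry(header, annotations))
        return entries

    if __name__ == "__main__":
        for entry in parse_cve_list("data/CVE/list"):
            # e.g. list CVEs that already carry a <removed> annotation
            if any("<removed>" in a for a in entry.annotations):
                print(entry.header.split()[0])

If there is already a shared module in the tracker that does this properly (and, ideally, understands embedded-code-copies as well), I would much rather reuse it than add yet another copy of that loop.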
In fact, it seems to me this is similar to the broader of embedded code copies. We could generalize renamed packages to the embedded code copies problem. We have a database of those in data/embedded-code-copies already, although I'm not sure how up to date that file actually is, nor how it is currently used in the workflow. It seems to me any database of renames we could be would clearly overlap with the embedded-code-copies file, so I figured I would write a (Python, we already have Perl and bash ones...) to start with. I have tried to upload this in a fork on salsa but gave up as push (of a single commit!) was stuck "resolving deltas"... Anyways, here's the snippet: https://salsa.debian.org/anarcat/security-tracker/snippets/70 The next step is to figure out how to actually modify the data/CVE/list file to introduce the changes. Considering the large number of packages in the embedded-code-copies file, I am not sure we want to retroactively change all previous entries. jmm suggested we run a cronjob that would keep track of where it is in history which would resolve this nicely. One question that remains is what, exactly, to add in the CVE metadata. One problem we faced last we looked at this is that we needed to add an entry like: SOURCEPACKAGE <removed> ... which would (e.g.) get triaged to: SOURCEPACKAGE <removed> [wheezy] SOURCEPACKAGE <not-affected> (or whatever) ... later on. This requires inside knowledge of the suites and their packages, something I find surprisingly hard to do in the security tracker. With embedded-code-copies, we will have to add something for all the other source packages, e.g.: OTHERSOURCE <undetermined> Right now, it seems that all scripts that hammer at those files do so with their own ad-hoc parsing code. Is that the recommended way of chopping those files up? Or is there a better parsing library out there? Thanks for any advice, A.