The tests include binary files. Unless something has changed, I can't create a quilt patch that modifies a binary file. The qpdf test suite is full of PDF files, and PDF files are binary files, even when they are hand-created, as many of mine are. This makes including tests in patches challenging. There are a few unit tests that we could include, but they will likely not backport cleanly, which adds risk, and I don't think including them in the patch will add any value: the tests are specifically crafted to prove that this bug is fixed, and we have already done that through current releases and the manual tests. The bug affects a very specific pattern in the data. The affected part of the code is not actively modified, and the bug is confined to an infrequently traversed section of code. The tokenizer is very low level and stable, and I audited the code again after this bug was found.

qpdf has been an open source project since 2008 and has been in Debian since the beginning. I am the author and Debian maintainer. It is exceedingly unlikely that there will be a "next" SRU that comes from anywhere other than the version control system, and unless I am hit by the proverbial bus, I will probably be the originator of it. While qpdf has a wide following, if I were not there to fix problems like this and push them through, it is not likely that problems like this would even be entered into an SRU process. It's been quite a long time since there's been any divergence between the Debian and Ubuntu versions of the package.

While I agree that more testing is better, I wonder if the benefit we will get is actually worth the effort. The new tests exercise the specific case related to this fix. If there is still another bug of this type, adding the new tests will not help us find it. I don't think it would be worth the hassle to tweak the test system or jump through hoops to allow the binary files involved in the tests to get into the patch -- that is more likely to cause a problem than to prevent one. The same is true of trying to backport the unit tests.

qpdf processes millions of pages a day across a wide range of free and commercial systems. My employer processes over a million pages a day all by itself. In qpdf's 21 year history, 15 of which have been open source, this is only the second data-loss bug, and it has the potential to cause silent corruption of data in rare corner cases. I appreciate the diligence here, but I really think the level of testing that has already gone into this, along with the inclusion of new patches in the latest release, is adequate. Now that the bug is known, our priority should be getting a well-reviewed, well-tested patch in front of users to minimize the risk of data loss. Please let me know what else I need to do to get this through. I will cooperate, of course.

My intention here is not to argue, but rather to assure you that the testing is already quite thorough, most likely exceeding the standard that most authors would apply to their packages. We are past the point of diminishing returns. Thanks for considering my argument.

--
You received this bug notification because you are a member of Desktop Packages, which is subscribed to qpdf in Ubuntu.
https://bugs.launchpad.net/bugs/2039804

Title:
  SRU request: qpdf: data loss bug affecting versions 11.0.0 through 11.6.2

Status in Qpdf:
  Fix Released
Status in qpdf package in Ubuntu:
  New
Status in qpdf source package in Lunar:
  New
Status in qpdf source package in Mantic:
  New

Bug description:

Notes:

* I am the upstream author and Debian maintainer for qpdf.
* This bug has been fixed in Debian unstable and testing with version 11.6.3, but because 24.04 is not yet open, it has not synced. This should not block fixing 23.04 and 22.04. I have uploaded 11.6.3 to my ppa: https://launchpad.net/~qpdf/+archive/ubuntu/qpdf
* I am attaching debdiffs for lunar and mantic

Upstream bug https://github.com/qpdf/qpdf/issues/1050 revealed a bug in qpdf's lexical layer that would cause qpdf to discard the character in a binary string following an octal quoted character with 1 or 2 digits. The PDF spec allows octal quoted characters to be written as \d, \dd, or \ddd, and allows the first two forms if the next character is other than an octal digit. Most PDF writers never use the \d or \dd forms, but some do.

With default options, qpdf does not parse or alter strings inside content streams, so this bug is not likely to affect page content. However, binary strings of this sort are common in the document /ID and may also appear in metadata for encrypted files. In some cases, such as the file in #1050, this bug can cause an error, in this case because the discarded character was the string end delimiter. In most cases, this bug results in silent data loss.

The fix is very small and locally contained.
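
To make the failure mode described above concrete, here is a minimal Python sketch of a literal-string decoder. It is not qpdf's code (qpdf is written in C++); the function name `decode_literal_string` and the `reprocess_terminator` flag are hypothetical and exist only to contrast the two behaviors. With the flag off, the character that terminates a 1-digit or 2-digit octal escape is consumed and dropped, which is the pattern the bug description reports. With the flag on, that character is handed back to the normal text state.

```python
OCTAL_DIGITS = b"01234567"
# Single-letter escapes; any other escaped byte stands for itself.
# Backslash followed by end-of-line (line continuation) is omitted for brevity.
SIMPLE_ESCAPES = {ord("n"): 10, ord("r"): 13, ord("t"): 9, ord("b"): 8, ord("f"): 12}


def decode_literal_string(data: bytes, reprocess_terminator: bool = True) -> bytes:
    """Decode a PDF-style literal string that starts with '('.

    Illustrative only. reprocess_terminator=False emulates the reported bug.
    """
    if not data.startswith(b"("):
        raise ValueError("literal string must start with '('")
    out = bytearray()
    state = "text"
    depth = 1          # nesting level of unescaped parentheses
    octal_value = 0
    octal_count = 0
    i = 1
    while i < len(data):
        ch = data[i]
        if state == "text":
            if ch == ord("\\"):
                state = "escape"
            elif ch == ord(")"):
                depth -= 1
                if depth == 0:
                    return bytes(out)
                out.append(ch)
            else:
                if ch == ord("("):
                    depth += 1
                out.append(ch)
            i += 1
        elif state == "escape":
            if ch in OCTAL_DIGITS:
                octal_value = ch - ord("0")
                octal_count = 1
                state = "octal"
            else:
                out.append(SIMPLE_ESCAPES.get(ch, ch))
                state = "text"
            i += 1
        else:  # state == "octal": one or two digits seen so far
            if ch in OCTAL_DIGITS:
                octal_value = octal_value * 8 + (ch - ord("0"))
                octal_count += 1
                if octal_count == 3:
                    out.append(octal_value & 0xFF)
                    state = "text"
                i += 1
            else:
                # A non-octal byte ends a short (\d or \dd) escape.
                out.append(octal_value & 0xFF)
                state = "text"
                if not reprocess_terminator:
                    i += 1  # buggy variant: the terminating byte is lost
                # correct variant: leave i alone so the byte is handled as text
    raise ValueError("EOF while reading string")
```

In this sketch, `decode_literal_string(b"(a\\53b)")` yields `b"a+b"`, while the buggy variant silently yields `b"a+"`. When the terminating character is the closing parenthesis, as in `b"(a\\5)"`, the buggy variant runs off the end of the input, which mirrors the "EOF while reading token" failure seen with the file from #1050.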

The upstream fix includes several new test cases, but the patch I will include to fix the issue only includes the relevant code change.

I also reported this as a debian bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1054158

It was approved as a stable update by debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1054119

[ Impact ]

The bug could result in silent corruption of binary strings in PDF metadata. It could also result in failure of qpdf to process a valid file. Data loss justifies a stable update.

[ Test Plan ]

The test file in https://github.com/qpdf/qpdf/issues/1050 can be used to prove that the bug exists in versions >= 11.0.0 and <= 11.6.2 and that the bug is fixed in 11.6.3.

The upstream fix includes several additional automated test cases. These are not included in the patch, but they are included in the upstream commit that fixes the bug: https://github.com/qpdf/qpdf/commit/1ecc6bb29e24a4f89470ff91b2682b46e0576ad4

How to test the SRU package on Ubuntu manually (copied from Jay's comment #6 below):

Running `qpdf --check 018.pdf` where `018.pdf` is the file attached to the upstream bug will reproduce the issue.

With the current version in 22.04 and 23.04, you will see something like this:

```
WARNING: /tmp/z/018.pdf (xref stream: object 17 1, offset 110340): EOF while reading token
WARNING: /tmp/z/018.pdf (xref stream: object 17 1, offset 110830): unexpected EOF
WARNING: /tmp/z/018.pdf (xref stream: object 17 1, offset 110830): parse error while reading object
WARNING: /tmp/z/018.pdf (xref stream: object 17 1, offset 110830): expected endobj
WARNING: /tmp/z/018.pdf: file is damaged
WARNING: /tmp/z/018.pdf (offset 110267): xref not found
WARNING: /tmp/z/018.pdf: Attempting to reconstruct cross-reference table
qpdf: /tmp/z/018.pdf: unable to find trailer dictionary while recovering damaged file
```

After the fix, you will see

```
checking /home/ejb/Downloads/018.pdf
PDF Version: 1.7
File is not encrypted
File is not linearized
No syntax or stream encoding errors found; the file may still contain errors that qpdf cannot detect
```

(obviously with the full paths based on whatever you call the file).

[ Where problems could occur ]

This fix has a very low risk of causing a regression. The fix is very localized to qpdf's lexical layer and is in a code path that only occurs when a 1-digit or 2-digit octal quoted character is terminated by other than an octal digit.

This is the first bug in qpdf's lexical layer in many years. It was introduced by a pull request from a reliable and consistent contributor who has made many improvements to qpdf's performance. The fix follows the established pattern of how to handle instances in which a character triggers a state change and has to be reprocessed in the new state.

qpdf has a rigorous test suite and an extremely good quality record. Many commercial entities use it to process millions of documents daily. My current employer runs millions of pages a day through qpdf.
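
For anyone who would rather script the manual check from the test plan, the following Python sketch wraps `qpdf --check`. The helper names `output_looks_clean` and `check_pdf` are hypothetical, and the success marker is simply the final line of the post-fix output quoted above. The sketch also assumes that `qpdf` is on the PATH and that it exits with status 0 when the check finds no errors or warnings.

```python
import subprocess

# Taken from the last line of the post-fix output quoted in the test plan.
CLEAN_MARKER = "No syntax or stream encoding errors found"


def output_looks_clean(output: str) -> bool:
    """Return True if the text resembles a clean `qpdf --check` run."""
    return CLEAN_MARKER in output and "WARNING:" not in output


def check_pdf(path: str, qpdf: str = "qpdf") -> bool:
    """Run `qpdf --check` on path and report whether the result looks clean.

    Assumption: a clean check exits with status 0.
    """
    result = subprocess.run(
        [qpdf, "--check", path],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and output_looks_clean(result.stdout + result.stderr)
```

Calling `check_pdf("018.pdf")` would be expected to return `False` with an affected package and `True` with the fixed one, given the outputs shown in the test plan.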

[ Other Info ]

See also:

* Upstream bug report: https://github.com/qpdf/qpdf/issues/1050
* Corresponding debian bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1054158
* Debian stable release approval: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1054119