On Wed, 2018-11-21 at 11:45 +0100, Fabian Groffen wrote:
> > > > > > 5. **Metadata is not compressed.** This is not a significant
> > > > > >    problem, it is just listed for completeness.
> > > > > >
> > > > > >
> > > > > > Goals for a new container format
> > > > > > --------------------------------
> > > > > >
> > > > > > The following goals have been set for a replacement format:
> > > > > >
> > > > > > 1. **The packages must remain contained in a single file.** As a
> > > > > >    matter of user convenience, it should be possible to transfer
> > > > > >    binary packages without having to use multiple files, and to
> > > > > >    install them from any location.
> > > > > >
> > > > > > 2. **The file format must be entirely based on common file formats,
> > > > > >    respecting best practices, with as little customization as
> > > > > >    necessary to satisfy the requirements.** In particular, it is
> > > > > >    unacceptable to create new binary formats.
> > > > >
> > > > > I take this as your personal opinion. I don't quite get why it is
> > > > > unacceptable to create a new binary format though. In particular when
> > > > > you're looking for efficiency, such format could serve your purposes.
> > > > > As long as it's clearly defined, I don't see the problem with a binary
> > > > > format either.
> > > > > Could you add why it is you think binary formats are unacceptable
> > > > > here?
> > > >
> > > > Because custom binary formats require specialized tooling, and are
> > > > a royal PITA when the user wants to do something that the author of
> > > > specialized tooling just happened not to think worthwhile, or when
> > > > the tooling is not available for some reason. And before you ask really
> > > > silly questions, yes, I did fight binary packages over hex editor
> > > > at some point.
> > >
> > > Which I still don't understand, to be frank. I think even Portage
> > > exposes python APIs to get to the data.
> >
> > Compare the time needed to make a trivial (but unforeseen) change
> > on a format that's transparent vs a format that requires you to learn
> > its spec and/or API, write a program and debug it.
>
> I was under the impression you could unpack a tbz2 into data and xpak,
> then unpack both, modify the contents with an editor or whatever, and
> then pack the whole stuff back into a tbz2 again. This can be done
> worst case scenario by emerge -k <pkg>, modifying the vdb and quickpkg
> <pkg> afterwards.

In the example I described, the need to modify the binary package arises
precisely because it is broken, and therefore unsuitable for 'emerge -k'.

> I know that with portage-utils you can do this easily with the qtbz2 and
> qxpak commands. No need to do anything with a hex editor, or know
> anything about how it's done.

Actually, you need to:

a. know that portage-utils has the appropriate tools (it's non-obvious),

b. know how to use portage-utils. This is non-obvious as well. It took me
   a while to figure out that I need to use qtbz2 before using qxpak (why
   would it work only on split data when the format is explicitly written
   to be used on top of a compressed archive?!).

> Obvious advantage of your approach is that you don't need q* tools, but
> can use tar instead. The editing is as trivial though. In your case
> you need a special procedure to reconstruct the binpkg should you want
> to keep your special properties (label, order) which equates to q* tools
> somewhat.

Except you don't need to keep them. The spec is quite explicit that
they're optimizations and that the package must work even if they're lost
as part of an editing exercise.

>
> > > > The most trivial case is an attempted recovery of a broken system.
> > > > If you don't have Portage working and don't have portage-utils
> > > > installed, do you really prefer a custom format which will require you
> > > > to fetch and compile special tools? Or is one that can be processed
> > > > with tools you're quite likely to have on every system, like tar?
> > >
> > > Well, I think the idea behind the original binpkg format was to use tar
> > > directly on the files in emergency scenarios like these...
> > > The assumption was bzip2 decompressor and tar being available.
> > > I think it is an example of how you add something, while still allowing
> > > to fallback on existing tools.
> >
> > Except progress in compressors has made it work less and less reliably.
> > It's mostly an example how to be *clever*.
> > However, being clever usually doesn't pay off in the long term, compared
> > to doing things *in a simple way*.
>
> We agree it is hackish, and we agree we can do without. You simply
> exaggerate the problem, IMO, which mostly isn't there, because it works
> fine today. It can also be solved today using shell tools.
>
> % head -c `grep -abo 'XPAKPACK'
> $EPREFIX/usr/portage/packages/sys-apps/sed-4.5.tbz2 | sed 's/:.*$//'`
> $EPREFIX/usr/portage/packages/sys-apps/sed-4.5.tbz2 | tar -jxf -
>
> results in no warnings/errors from bzip about trailing garbage, possible
> thanks to the spec being smart enough about this.

Well, you aren't going to call that simple, are you? Plus, I think your
solution would fail if the bzip2 output just happened to contain the
'XPAKPACK' string. I'm not saying it's likely to happen, but relying on
fixed strings not occurring accidentally is not good design.

-- 
Best regards,
Michał Górny
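
To illustrate the last point about fixed strings, here is a minimal Python sketch of splitting a tbz2 without searching for 'XPAKPACK' at all. It assumes the layout described in xpak(5): tarball, then the xpak segment, then a 4-byte big-endian length of that segment, then the literal "STOP". The names `split_tbz2` and `build_xpak` are made up for this example and are not part of Portage or portage-utils, and the "tarball" is a synthetic byte string rather than real bzip2 output.

```python
import struct


def split_tbz2(blob: bytes) -> tuple[bytes, bytes]:
    """Split a tbz2 image into (tarball, xpak) using the trailer.

    Assumed layout (per xpak(5)):
        tarball | xpak | 4-byte big-endian xpak length | b"STOP"
    where the xpak segment runs from b"XPAKPACK" to b"XPAKSTOP".
    """
    if len(blob) < 8 or blob[-4:] != b"STOP":
        raise ValueError("missing STOP trailer")
    (xpak_len,) = struct.unpack(">I", blob[-8:-4])
    xpak_start = len(blob) - 8 - xpak_len
    if xpak_start < 0:
        raise ValueError("recorded xpak length exceeds file size")
    xpak = blob[xpak_start:len(blob) - 8]
    # Sanity check: the markers must sit exactly where the trailer says.
    if not (xpak.startswith(b"XPAKPACK") and xpak.endswith(b"XPAKSTOP")):
        raise ValueError("xpak markers not found at the recorded offset")
    return blob[:xpak_start], xpak


def build_xpak(index: bytes, data: bytes) -> bytes:
    """Hypothetical helper: assemble a minimal xpak segment for the demo."""
    header = b"XPAKPACK" + struct.pack(">II", len(index), len(data))
    return header + index + data + b"XPAKSTOP"


# Synthetic stand-in for compressed data that happens to contain the marker.
fake_tarball = b"BZh9" + b"\x00" * 10 + b"XPAKPACK" + b"\xff" * 10
xpak = build_xpak(b"", b"")
image = fake_tarball + xpak + struct.pack(">I", len(xpak)) + b"STOP"

# A first-match string search lands inside the compressed data...
naive_offset = image.find(b"XPAKPACK")
# ...while the trailer points at the real boundary.
tarball, recovered = split_tbz2(image)
```

Here `naive_offset` falls in the middle of the payload, whereas `split_tbz2` returns the full payload. That is still a custom trailer you have to know about, which is rather the point of using plain tar instead.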