Hi,

I have drafted a proposal for an updated definition of the deb format. It is based on the deb manpage (part of dpkg-dev) and on reverse engineering of dpkg-deb/build.c. I hope I have written the standard in a correct and easily understandable way. I deliberately did not add anything about signatures etc.; I only wanted to document what we have at present. Discussion about additions should (IMHO) be kept separate.

IMHO this definition should become part of the policy; I propose either a new chapter 12 or an addition to chapter 3, Binary packages, whichever seems more appropriate. This also means that some parts of Appendix B could be removed on this occasion.

I'm also Ccing one apt-utils bug, from which I got some of the information, and debian-devel. Please restrict the crossposting in replies where that is useful.

Cheers,
Andi

DESCRIPTION

The .deb format is the Debian binary package file format. It is understood by dpkg 0.93.76 and later, and is generated by default by all versions of dpkg since 1.2.0 and all i386/ELF versions since 1.1.1elf.

The format described here has been in use since Debian 0.93; details of the old format are described in deb-old(5).

OVERALL FORMAT

The file is an ar archive in a specific ar version, with the magic number !<arch>. Due to the robustness principle, extracting tools should be able to cope with as many of the different ar versions as possible; if they don't, it is at most a wishlist bug. On the other hand, tools producing .deb files MUST only produce strictly standard-compliant files. Any other behaviour is a serious bug!

The first member of the archive is named debian-binary and contains a series of lines, separated by newlines. Currently only one line is present, the format version number. The 2.0 format is current, and it is the format described in this document. Programs which read .deb files should be prepared for the minor number to be increased and for new lines to be present, and should ignore these if this is the case. If the major number has a value a program doesn't know, an incompatible change has happened, and the program should abort with an error.

OVERALL AR FORMAT

The ar format is (on purpose) one of the most ancient formats. The reason is that it should be possible to unpack .deb files on as many different computers as possible. Furthermore, it also makes the format easier for our code to handle.
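
The version rule for debian-binary given under OVERALL FORMAT could be implemented roughly as follows. This is only a minimal Python sketch under my own assumptions: the names check_debian_binary and SUPPORTED_MAJOR are made up for illustration and are not taken from dpkg.

```python
SUPPORTED_MAJOR = 2  # assumption: this hypothetical reader only knows format 2.x


def check_debian_binary(body: bytes) -> tuple:
    """Return (major, minor) read from the body of the debian-binary member.

    Only the first line is interpreted. Any further lines are ignored, and
    any minor number is accepted. An unknown major number, or a first line
    that is not of the form "<major>.<minor>", raises ValueError.
    """
    lines = body.decode("ascii").split("\n")
    major_text, _, minor_text = lines[0].partition(".")
    major, minor = int(major_text), int(minor_text)
    if major != SUPPORTED_MAJOR:
        raise ValueError(f"unsupported .deb format version {lines[0]!r}")
    return major, minor
```

With this sketch, a body of b"2.1\nsome new line\n" is accepted and the extra line is skipped, while b"3.0\n" is rejected with an error.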

Any ar file can be written as AR-FILE := HEADER [MEMBER]*. The header is the string "!<arch>\n" (not null-terminated). Each member consists of the member head, the body and, if necessary, a padding '\n'.

All information in the member head is printable ASCII, and each value is padded with spaces on the right; at least one space must be present, so the information must be shorter than the maximum number of bytes available. The head is composed of the following fields, 60 bytes in total:

- the name (16 bytes)
- the date in seconds since the epoch (1970-1-1 0:00:00 UTC) in decimal notation (12 bytes)
- the uid and the gid of the owner in decimal notation (6 bytes each; usually both 0)
- the file mode of the member in octal notation, beginning with 1 (8 bytes; usually 100644)
- the size of the member body, measured without any padding added to the body (10 bytes)
- the two bytes "`\n"

After the member head, the member body follows unquoted. If the member body has an odd length, it is padded with a single '\n', so every member starts on an even byte boundary.

So the initial member looks like:

    debian-binary   1070194109  0     0     100644  4         `
    2.0

Newer ar features (such as longer file names, file names with spaces, ...) are a violation of this standard. Extracting tools should nevertheless try to support them as well as possible; if they do not, that is at most a wishlist bug.

DEB 2 ARCHIVE MEMBERS

Archives with the major number 2 must have, after the initial member debian-binary, the members control.tar.gz and data.tar.gz in exactly this order. After these, optional members can follow, but they must have '_' as the first character of their name.

control.tar.gz is a gzipped tar archive containing the package control information as a series of plain files, of which the file control is mandatory and contains the core control information. Please see the Debian Packaging Manual, section 2.2, for details of these files. The control tarball may optionally contain an entry for `.', the current directory.
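
To make the member head layout and the member order rule a bit more concrete, here is a rough Python sketch of a strict reader. It is not how dpkg-deb does it; all names (make_member, read_members, check_deb2_members) are hypothetical helpers of mine, and the sketch works on an archive held completely in memory.

```python
AR_MAGIC = b"!<arch>\n"
HEAD_SIZE = 60  # 16 + 12 + 6 + 6 + 8 + 10 + 2 bytes


def make_member(name: str, body: bytes, mtime: int = 0) -> bytes:
    """Build one member: head, body and, for odd-length bodies, one '\\n'."""
    # uid and gid are 0 and the mode is 100644, the usual values in the text.
    head = "%-16s%-12d%-6d%-6d%-8s%-10d`\n" % (name, mtime, 0, 0, "100644", len(body))
    padding = b"\n" if len(body) % 2 else b""
    return head.encode("ascii") + body + padding


def read_members(data: bytes) -> list:
    """Split an in-memory ar archive into (name, size, body) tuples."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("missing ar magic")
    members = []
    pos = len(AR_MAGIC)
    while pos < len(data):
        head = data[pos:pos + HEAD_SIZE]
        if len(head) != HEAD_SIZE or head[58:60] != b"`\n":
            raise ValueError("broken member head")
        name = head[0:16].decode("ascii").rstrip(" ")
        size = int(head[48:58].decode("ascii"))  # decimal, padded with spaces
        body = data[pos + HEAD_SIZE:pos + HEAD_SIZE + size]
        if len(body) != size:
            raise ValueError("truncated member body")
        members.append((name, size, body))
        # An odd-length body is followed by one padding byte.
        pos += HEAD_SIZE + size + (size % 2)
    return members


def check_deb2_members(names: list) -> None:
    """Raise ValueError unless the names follow the order for major number 2."""
    required = ["debian-binary", "control.tar.gz", "data.tar.gz"]
    if names[:3] != required:
        raise ValueError("required members missing or out of order")
    for extra in names[3:]:
        if not extra.startswith("_"):
            raise ValueError(f"unexpected member {extra!r}")
```

A quick way to try it is to build a small archive with make_member, run read_members over it and pass the resulting names to check_deb2_members. The sketch is deliberately strict, as a producing tool would have to be; an extracting tool following the robustness principle would want to be more forgiving.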

data.tar.gz contains the filesystem archive as a gzipped tar archive.

DEB 1 ARCHIVE MEMBERS

See the man page deb-old(5) for a definition.

--
http://home.arcor.de/andreas-barth/
PGP 1024/89FB5CE5 DC F1 85 6D A6 45 9C 0F 3B BE F1 D0 C5 D1 D9 0C