On 08/03/2016 09:00 PM, Andreas Tille wrote:
> 2b) Do the conversion of the format in postinst at the expense
>     of users' time, which is acceptable since the package usually
>     unpacks on high-performance machines and there are not so many
>     installations, which means bandwidth and disk space on the
>     Debian mirrors should be saved here instead of on the user's
>     machine.
>
>     Source tarball 256MB + binary package ~250MB (estimated)
Personally, I think that'd probably be the best solution, at least as
long as there are not too many updates to the package. I'm thinking
that if the data changes once or twice a year, that'd be OK. If it's
twice a week, then I think the only realistic solution would be 3b).

There are some large data packages in sid already though, even
reaching the sizes you describe, but if you can avoid this, especially
for low-popcon packages, I think having the user's computer do a
little more work in postinst is a reasonable trade-off here. (See the
postinst sketch in the P.S. below.)

For reference, the top 5 sorted by deb size:

Package               deb Size (GiB)   Installed Size (GiB)
---------------------------------------------------------------------
flightgear-data-base  1.06257          1.50826
freefoam-dev-doc      0.84636          1.49562
redeclipse-data       0.72715          0.832576
0ad-data              0.540366         1.4238
libpcl1.7-dbg         0.530659         0.578442

Top 5 sorted by installed size:

Package                           deb Size (GiB)  Installed Size (GiB)
----------------------------------------------------------------------
linux-image-4.6.0-1-rt-amd64-dbg  0.409186        3.09527
linux-image-4.6.0-1-amd64-dbg     0.410594        3.09287
linux-image-4.5.0-2-amd64-dbg     0.393085        2.87757
flightgear-data-base              1.06257         1.50826
freefoam-dev-doc                  0.84636         1.49562

Shell snippet I used to get this data (sorting on -k2 instead gives
the deb-size table):

awk '/^Package:/ { pkg = $2; } /^Installed-Size:/ { is = $2; } /^Size:/ { print pkg, $2, is }' \
    < /var/lib/apt/lists/*_debian_dists_sid_main_binary-amd64_Packages \
    | sort -k3 -n \
    | awk '{ print $1, $2 / 1024.0 / 1024.0 / 1024.0, $3 / 1024.0 / 1024.0 }' \
    | tail -n 5 \
    | tac

Using a similar snippet (a counting variant is in the P.S.), I could
determine that there are 34 packages with a deb size larger than
200 MiB in the archive at the moment; 51 larger than 150 MiB and 88
larger than 100 MiB. (This does not include -dbgsym packages in the
debug section.)

> (possibly upstream can be convinced to provide a *.bz2 tarball
> for maximum compression).

Please don't use bz2 anymore. It's really slow and doesn't do any
better than e.g. xz. (There's a reason why Debian migrated away from
it.) If you convince upstream to provide a better tarball, please
suggest a better algorithm. (The compression level with xz does play
a role when it comes to speed, but there you can choose a reasonable
trade-off; see the xz example in the P.S. The memory requirements of
xz should be irrelevant here, because I can't see the software you're
describing being used on an extremely low-memory system.)

> 3a) Use postinst [to download stuff]

I don't think that's a good idea, simply because of the amount of
data. Downloading things in postinst is OK if it's a couple of
megabytes (see e.g. ttf-mscorefonts-installer in contrib), but 250
Megs? Also, I could imagine that in a lab setup where this might be
useful, you'd maybe want to have an air-gapped computer and use
apt-offline to update it. And this would definitely break that kind
of thing.

> 3b) Inform user to call a download script manually, so as not to
>     block apt for a longer time dealing with potential download
>     problems.

That would be the only option if you don't bundle the data with the
package. In that case, maybe patch the binary to detect that
situation and inform the user they should still do this? (A rough
sketch of such a check is in the P.S.) But as I said: if your package
doesn't change that often, I don't think that adding 2x 250 MiB
(source and arch=all, right?) is necessarily excessive.

Regards,
Christian
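
P.S.: To make 2b) a bit more concrete, here is a rough, untested
sketch of what the postinst conversion could look like. "foo-convert"
and all the paths are invented for the example; substitute whatever
tool and locations your package actually ships:

#!/bin/sh
set -e

case "$1" in
    configure)
        # convert the shipped raw data once, on first configuration;
        # the converter and both paths below are placeholders
        if [ ! -e /var/lib/foo/data.converted ]; then
            foo-convert /usr/share/foo/data.raw \
                /var/lib/foo/data.converted
        fi
        ;;
esac

#DEBHELPER#

exit 0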
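
The counting variant of the snippet was roughly the following
(reconstructed, so treat it as an illustration; adjust the 200 for
the other thresholds):

awk '/^Package:/ { pkg = $2 } /^Size:/ { if ($2 > 200 * 1024 * 1024) print pkg }' \
    < /var/lib/apt/lists/*_debian_dists_sid_main_binary-amd64_Packages \
    | wc -l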
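
For the xz trade-off, something along these lines would do; foo-data/
is just a placeholder name:

# -6 is the default level; higher levels compress better but are
# slower and need more memory; -T0 uses all cores (needs xz >= 5.2)
tar -cf - foo-data/ | xz -6 -T0 > foo-data.tar.xz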
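
And for 3b), instead of patching the binary itself, the same check
could go into a wrapper script around it; the marker file and the
script name below are made up for the example:

if [ ! -e /var/lib/foo/data/.complete ]; then
    echo "foo: data files are missing." >&2
    echo "Please run 'foo-download-data' as root to fetch them (~250 MB)." >&2
    exit 1
fi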