Hi zimoun,

zimoun <zimon.touto...@gmail.com> writes:

> One question is how this database scales?
>
> For example, a quick back-to-envelop estimation leads to ~1.2GB metadata
> for ~14k packages and then an increase of ~700MB per year, both with the
> Ludo’s code [1].
>
> [1] <http://issues.guix.gnu.org/issue/42162#11>

It’s a good question.  A good part of the size comes from the
representation rather than the data.  Compression helps a lot here.

I have a database of 3,912 packages.  It’s 295M uncompressed (which is a
little better than your estimation).  If I pass each file through Lzip,
it shrinks down to 60M.  That’s more like 15.5K per package, which is
almost an order of magnitude smaller than the estimation you used
(120K).  I think that makes the numbers rather pleasant, but it comes at
the expense of easy storing in Git.

> As mentioned [2], should this service be part of SWH (download cooking
> task)?  Or project side?
>
> [2] <https://forge.softwareheritage.org/T2430#47486>

It would be interesting to just have SWH absorb the project.  Since
other distros already know how to produce a “sources.json” and how to
query the SWH archive, it would mean that they benefit for free (and so
would Guix, for that matter).  I’m open to that, but right now having
the freedom to experiment is important.

-- 
Tim
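
For anyone who wants to redo the per-package arithmetic above, here is a
minimal sketch.  It only reuses the figures quoted in this message (3,912
packages, 295M uncompressed, 60M after Lzip, ~14k packages as the target).
The function names are made up for illustration, and treating 1M as 1024K
is an assumption; with 1M = 1000K the results come out slightly lower.

```python
# Rough size estimates from the figures quoted in the message.
# Assumption: "M" and "K" are binary units (1M = 1024K); pass
# k_per_m=1000 to use decimal units instead.


def per_package_k(total_m, packages, k_per_m=1024):
    """Average size per package in K, given a total size in M."""
    return total_m * k_per_m / packages


def projected_m(total_m, packages, target_packages):
    """Linearly scale a measured total (in M) to another package count."""
    return total_m * target_packages / packages


if __name__ == "__main__":
    packages = 3912          # packages in the measured database
    uncompressed_m = 295     # total size before compression
    compressed_m = 60        # total size after Lzip

    print("uncompressed K/package:", round(per_package_k(uncompressed_m, packages), 1))
    print("compressed K/package:  ", round(per_package_k(compressed_m, packages), 1))
    print("compression ratio:     ", round(uncompressed_m / compressed_m, 1))
    # Hypothetical extrapolation to the ~14k packages mentioned in the quote,
    # assuming size grows linearly with the number of packages.
    print("compressed M at 14k:   ", round(projected_m(compressed_m, packages, 14000)))
```

The linear extrapolation is only a first guess: it ignores growth over
time (the ~700MB per year mentioned in the quote) and assumes the measured
packages are representative of the whole collection.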