On Tue, 27 Nov 2018 16:01:15 -0500 Rich Freeman <ri...@gentoo.org> wrote:
> Our repo is a linked list being constantly manipulated from the head
> backed by a hashed object store for the contents.  For that use case
> it is probably the ideal data structure.  Since our use case is
> actually the typical use case, it isn't a surprise that this was the
> design that was chosen...  :)
>
> Computers are pretty fast when you actually use the correct algorithm...

There's more to it than that.

If the linked list were all there was to it, imagine where we'd be
without all the compression and differencing tricks. The raw size of an
uncompressed, verbatim, non-differential repository for Gentoo would be
phenomenal.

As it is, it's fortunate we don't do a lot of things that *need* raw
access to non-tip commits, because doing so becomes very exhausting.

And were it not for git's compression techniques and the fact that our
use of Portage results in a vast number of highly self-similar entries,
we'd likely be slaughtered by disk IO alone, regardless of the
linked-list approach.

Just don't try using git filter-branch on a whole Gentoo repository;
you'll quickly learn why. (You'll find yourself having to employ lots of
tricks with git fast-export instead, just to avoid projected times
measured in weeks.)
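
To make the two halves of this concrete, here is a minimal Python sketch of a
linked list of commits whose contents live in a hashed, compressed object
store. It is a toy under stated assumptions, not git's actual on-disk format:
ObjectStore, commit and history are hypothetical names, each object is
zlib-compressed on its own, and there is no packing or delta compression.

```python
import hashlib
import zlib


class ObjectStore:
    """Toy content-addressed store: each object is keyed by the SHA-1 of its bytes."""

    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        # Identical content always hashes to the same key, so it is stored once.
        self.objects[key] = zlib.compress(data)
        return key

    def get(self, key: str) -> bytes:
        return zlib.decompress(self.objects[key])


def commit(store: ObjectStore, parent: str, blob_key: str, message: str) -> str:
    """Store a commit record pointing at its parent; an empty parent marks the root."""
    record = f"parent {parent}\nblob {blob_key}\n\n{message}\n"
    return store.put(record.encode())


def history(store: ObjectStore, head: str) -> list:
    """Walk the linked list from the head back to the root, newest message first."""
    messages = []
    while head:
        header, _, message = store.get(head).decode().partition("\n\n")
        fields = dict(line.split(" ", 1) for line in header.splitlines())
        messages.append(message.strip())
        head = fields["parent"]
    return messages


store = ObjectStore()
# Hypothetical, highly self-similar content, standing in for ebuild-like text.
ebuild = b'EAPI=7\nKEYWORDS="amd64 x86"\n' * 200
blob = store.put(ebuild)
first = commit(store, "", blob, "initial import")
second = commit(store, first, blob, "bump")
print(history(store, second))
print(len(ebuild), len(store.objects[blob]))
```

Appending to the head only ever writes new objects, which is why that part is
cheap. Anything that has to rewrite history must decompress and re-hash every
object it touches, which is where the cost shows up.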