I have a question/suggestion about the distributed substitutes project: would downloads be split into uniformly sized chunks, or could the sizes vary? Specifically, in the extreme case where an update introduced a single extra byte at the beginning of a file, would that result in completely new chunks?
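To make the concern concrete, here's a toy Python sketch (the 4-byte chunk size is purely illustrative): with fixed-size chunking, a one-byte insertion at the front shifts every chunk boundary, so none of the original chunk hashes survive and nothing can be reused.

```python
import hashlib

def fixed_size_chunks(data: bytes, size: int = 4):
    """Split data into fixed-size chunks and hash each one."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()[:8]
            for i in range(0, len(data), size)]

original = b"abcdefghijklmnop"
updated = b"X" + original  # one extra byte at the beginning

# Every boundary shifts, so no chunk hash is shared between the two.
print(fixed_size_chunks(original))
print(fixed_size_chunks(updated))
```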
An alternative I've been thinking about is this: find the store references in a file and split it along those references, optionally applying further chunking to the non-reference blobs. It would probably be best to do this at the NAR level. Storing reference offsets is already something we should be doing to speed up other operations, so this could tie in nicely with that. A rough sketch of the splitting step follows.
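Here's a minimal Python sketch of what I mean, with caveats: the `/gnu/store` prefix and the regex are illustrative assumptions (the real reference scanner matches hash parts rather than full paths), and `split_on_references` is a hypothetical helper, not anything that exists today.

```python
import re

# Illustrative pattern for store references: /gnu/store/<32-char
# nix-base32 hash>-<name>.  The character class is the nix-base32
# alphabet (no e, o, t, u).
STORE_REF = re.compile(rb"/gnu/store/[0-9a-df-np-sv-z]{32}-[^\x00/\s]+")

def split_on_references(data: bytes):
    """Yield (kind, payload) pieces: 'ref' for store references and
    'blob' for the bytes between them.  Blobs could then be fed to a
    content-defined chunker; refs would be substituted at offsets."""
    pos = 0
    for m in STORE_REF.finditer(data):
        if m.start() > pos:
            yield ("blob", data[pos:m.start()])
        yield ("ref", m.group(0))
        pos = m.end()
    if pos < len(data):
        yield ("blob", data[pos:])

sample = (b"ELF...\x00/gnu/store/"
          b"0123456789abcdfghijklmnpqrsvwxyz-glibc-2.35/lib\x00...")
for kind, piece in split_on_references(sample):
    print(kind, piece[:48])
```

The nice property is that a rebuild that only changes the hashes embedded in a file leaves the blobs between references byte-identical, so those pieces deduplicate regardless of how the reference lengths line up.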