Re: reftable [v4]: new ref storage format

2017-08-05 Thread Shawn Pearce
On Tue, Aug 1, 2017 at 6:51 PM, Michael Haggerty wrote: > On Tue, Aug 1, 2017 at 4:27 PM, Shawn Pearce wrote: >> On Mon, Jul 31, 2017 at 11:41 PM, Michael Haggerty >> wrote: >>> On Sun, Jul 30, 2017 at 8:51 PM, Shawn Pearce wrote: 4th iteration of the reftable storage format. [...] >

Re: reftable [v4]: new ref storage format

2017-08-03 Thread Shawn Pearce
On Thu, Aug 3, 2017 at 3:48 PM, Michael Haggerty wrote: > I've revised the blockless reftable proposal to address some feedback: I've been thinking more about your blockless proposal. I experimentally modified my reftable implementation to omit padding between blocks, bringing it a tiny bit clos

Re: reftable [v4]: new ref storage format

2017-08-03 Thread Michael Haggerty
I've revised the blockless reftable proposal to address some feedback: * Don't omit `prefix_len` for the first ref and first child in a block. It doesn't save much but makes the reader more complicated. * Get rid of `symref_target` (the backlink from a reference to the symlink(s) that point at it)

Re: reftable [v4]: new ref storage format

2017-08-03 Thread Shawn Pearce
On Thu, Aug 3, 2017 at 11:38 AM, Michael Haggerty wrote: > On Tue, Aug 1, 2017 at 7:38 PM, Shawn Pearce wrote: >> On Tue, Aug 1, 2017 at 6:51 PM, Michael Haggerty >> wrote: >>> On Tue, Aug 1, 2017 at 4:27 PM, Shawn Pearce wrote: On Mon, Jul 31, 2017 at 11:41 PM, Michael Haggerty wr

Re: reftable [v4]: new ref storage format

2017-08-03 Thread Shawn Pearce
On Wed, Aug 2, 2017 at 1:28 PM, Jeff King wrote: > On Wed, Aug 02, 2017 at 12:50:39PM -0700, Junio C Hamano wrote: > >> With the traditional "packed-refs plus loose" layout, no matter how >> many times a handful of selected busy refs are updated during the >> day, you'd need to open at most two fi

Re: reftable [v4]: new ref storage format

2017-08-03 Thread Michael Haggerty
On Tue, Aug 1, 2017 at 7:38 PM, Shawn Pearce wrote: > On Tue, Aug 1, 2017 at 6:51 PM, Michael Haggerty wrote: >> On Tue, Aug 1, 2017 at 4:27 PM, Shawn Pearce wrote: >>> On Mon, Jul 31, 2017 at 11:41 PM, Michael Haggerty >>> wrote: [...] >>> A couple of other notes about your contrasting d

Re: reftable [v4]: new ref storage format

2017-08-02 Thread Junio C Hamano
Shawn Pearce writes: > On Wed, Aug 2, 2017 at 6:50 PM, Junio C Hamano wrote: > ... >>Would it benefit us if we define the sort order of bytes slightly >>different from the ASCII order, so that a slash '/' sorts between >>NUL '\000' and SOH '\001', which is the order we should have us
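The ordering described in the quoted text (a slash sorting between NUL and SOH) can be sketched as a sort key. This is only an illustration of that byte order, not code from the thread; the name `dir_order_key` and the sample ref names are hypothetical.

```python
SLASH = ord("/")


def dir_order_key(refname: str) -> tuple:
    """Sort key that ranks '/' just after NUL and before every other byte."""

    def rank(b: int) -> int:
        if b == 0:
            return 0          # NUL stays first
        if b == SLASH:
            return 1          # '/' takes the slot right after NUL
        # Bytes below '/' move up one slot to make room; bytes above are unchanged.
        return b + 1 if b < SLASH else b

    return tuple(rank(b) for b in refname.encode("utf-8"))


# Hypothetical ref names chosen so that '-' (0x2d) and '/' (0x2f) compete.
names = ["refs/heads/a-b", "refs/heads/a/b", "refs/heads/a"]
ascii_sorted = sorted(names)                     # plain byte order
dir_sorted = sorted(names, key=dir_order_key)    # '/' sorts before '-'
```

Under plain ASCII order `refs/heads/a-b` comes before `refs/heads/a/b`; with the modified order the two swap, so names below `refs/heads/a/` sit directly after `refs/heads/a`.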

Re: reftable [v4]: new ref storage format

2017-08-02 Thread Shawn Pearce
On Wed, Aug 2, 2017 at 6:50 PM, Junio C Hamano wrote: > Junio C Hamano writes: > >> I like the general idea, what the file format can represent and how >> it does so, but I am a bit uneasy about how well this "stacked" part >> would work for desktop clients. > > Two more random things before I fo

Re: reftable [v4]: new ref storage format

2017-08-02 Thread Junio C Hamano
Junio C Hamano writes: > I like the general idea, what the file format can represent and how > it does so, but I am a bit uneasy about how well this "stacked" part > would work for desktop clients. Two more random things before I forget. * I understand that you would want to allow both a ref "

Re: reftable [v4]: new ref storage format

2017-08-02 Thread Jeff King
On Wed, Aug 02, 2017 at 12:50:39PM -0700, Junio C Hamano wrote: > With the traditional "packed-refs plus loose" layout, no matter how > many times a handful of selected busy refs are updated during the > day, you'd need to open at most two files to find out the current > value of a single ref (adm

Re: reftable [v4]: new ref storage format

2017-08-02 Thread Stefan Beller
> ### Ref block format
>
> A ref block is written as:
>
>     'r'
>     uint24( block_len )
>     ref_record+
>     uint32( restart_offset )+
>     uint16( restart_count )
>     padding?
>
So I learned that your current writer is a two block pass, i.e. the block is first written into memory and th
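A rough sketch of the quoted block layout follows. Because `block_len` precedes the records, a writer has to know the total size before emitting the header, which is one reason to assemble the block in memory first. Several details are assumptions not stated in the quoted text: big-endian integers, `block_len` counting every byte from the `'r'` through `restart_count`, zero-byte padding, and records treated as opaque, already-encoded byte strings. The function name `write_ref_block` is hypothetical.

```python
import struct


def write_ref_block(records, restart_offsets, block_size=0):
    """Assemble one ref block in memory and return its bytes.

    records: pre-encoded ref_record byte strings (format not modeled here).
    restart_offsets: offsets of the restart records within the block.
    block_size: if non-zero, pad the block with zero bytes to this size.
    """
    body = b"".join(records)
    # uint32( restart_offset )+  (assumed big-endian)
    restarts = b"".join(struct.pack(">I", off) for off in restart_offsets)
    # uint16( restart_count )
    count = struct.pack(">H", len(restart_offsets))

    # Assumption: block_len covers the 4-byte header, records, and restart table.
    block_len = 4 + len(body) + len(restarts) + len(count)
    if block_len >= 1 << 24:
        raise ValueError("block_len does not fit in uint24")

    block = b"r" + block_len.to_bytes(3, "big") + body + restarts + count

    if block_size:
        if block_len > block_size:
            raise ValueError("block exceeds the requested block size")
        block += b"\x00" * (block_size - block_len)   # padding?
    return block


# Two placeholder records; the offsets point just past the 4-byte header.
blk = write_ref_block([b"abc", b"de"], [4, 7])
padded = write_ref_block([b"abc", b"de"], [4, 7], block_size=32)
```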

Re: reftable [v4]: new ref storage format

2017-08-02 Thread Junio C Hamano
Shawn Pearce writes:
> ### Layout
>
> The `$GIT_DIR/refs` path is a file when reftable is configured, not a
> directory. This prevents loose references from being stored.
>
> A collection of reftable files are stored in the `$GIT_DIR/reftable/`
> directory:
>
>     0001_UF4paF
>     0002

Re: reftable [v4]: new ref storage format

2017-08-02 Thread Jeff King
On Wed, Aug 02, 2017 at 08:17:29AM -0700, Shawn Pearce wrote: > > Just peeking at torvalds/linux, we have some objects with ~35K refs > > pointing to them (e.g., the v2.6.11 tag). > > Oy. I'll bet that every occurrence winds up in its own block due to > the layout of the namespace, and so the obj

Re: reftable [v4]: new ref storage format

2017-08-02 Thread Jeff King
On Wed, Aug 02, 2017 at 08:20:44AM -0400, Dave Borowitz wrote: > >> OTOH a mythical protocol v2 might reduce the need to scan the > >> references for advertisement, so maybe this optimization will be more > >> helpful in the future? > > I haven't been following the status of the proposal, but I w

Re: reftable [v4]: new ref storage format

2017-08-02 Thread Junio C Hamano
Shawn Pearce writes: > On Wed, Aug 2, 2017 at 2:28 AM, Jeff King wrote: >> On Tue, Aug 01, 2017 at 07:38:37PM -0700, Shawn Pearce wrote: >> >>> > OBJS blocks can also be >>> > unbounded in size if very many references point at the same object, >>> > thought that is perhaps only a theoretical pro

Re: reftable [v4]: new ref storage format

2017-08-02 Thread Shawn Pearce
On Wed, Aug 2, 2017 at 2:28 AM, Jeff King wrote: > On Tue, Aug 01, 2017 at 07:38:37PM -0700, Shawn Pearce wrote: > >> > OBJS blocks can also be >> > unbounded in size if very many references point at the same object, >> > thought that is perhaps only a theoretical problem. >> >> Gah, I missed that

Re: reftable [v4]: new ref storage format

2017-08-02 Thread Dave Borowitz
On Tue, Aug 1, 2017 at 10:38 PM, Shawn Pearce wrote: >> Peff and I discussed off-list whether the lookup-by-SHA-1 feature is >> so important in the first place. Currently, all references must be >> scanned for the advertisement anyway, > > Not really. You can hide refs and allow-tip-sha1 so client

Re: reftable [v4]: new ref storage format

2017-08-02 Thread Jeff King
On Tue, Aug 01, 2017 at 07:38:37PM -0700, Shawn Pearce wrote: > > OBJS blocks can also be > > unbounded in size if very many references point at the same object, > > thought that is perhaps only a theoretical problem. > > Gah, I missed that in reftable. The block id pointer list could cause > a s

Re: reftable [v4]: new ref storage format

2017-08-01 Thread Shawn Pearce
On Tue, Aug 1, 2017 at 6:51 PM, Michael Haggerty wrote: > On Tue, Aug 1, 2017 at 4:27 PM, Shawn Pearce wrote: >> On Mon, Jul 31, 2017 at 11:41 PM, Michael Haggerty >> wrote: >>> On Sun, Jul 30, 2017 at 8:51 PM, Shawn Pearce wrote: 4th iteration of the reftable storage format. [...] >

Re: reftable [v4]: new ref storage format

2017-08-01 Thread Michael Haggerty
On Tue, Aug 1, 2017 at 4:27 PM, Shawn Pearce wrote: > On Mon, Jul 31, 2017 at 11:41 PM, Michael Haggerty > wrote: >> On Sun, Jul 30, 2017 at 8:51 PM, Shawn Pearce wrote: >>> 4th iteration of the reftable storage format. >>> [...] >> >> Before we commit to Shawn's reftable proposal, I wanted to

Re: reftable [v4]: new ref storage format

2017-08-01 Thread Michael Haggerty
On Tue, Aug 1, 2017 at 1:23 PM, Shawn Pearce wrote: > On Mon, Jul 31, 2017 at 11:41 PM, Michael Haggerty > wrote: >> On Sun, Jul 30, 2017 at 8:51 PM, Shawn Pearce wrote: >>> 4th iteration of the reftable storage format. >>> [...] >> >> Before we commit to Shawn's reftable proposal, I wanted to

Re: reftable [v4]: new ref storage format

2017-08-01 Thread Shawn Pearce
On Tue, Aug 1, 2017 at 4:27 PM, Shawn Pearce wrote: > On Mon, Jul 31, 2017 at 11:41 PM, Michael Haggerty > wrote: >> On Sun, Jul 30, 2017 at 8:51 PM, Shawn Pearce wrote: >>> 4th iteration of the reftable storage format. >>> [...] >> >> Before we commit to Shawn's reftable proposal, I wanted to

Re: reftable [v4]: new ref storage format

2017-08-01 Thread Shawn Pearce
On Mon, Jul 31, 2017 at 11:41 PM, Michael Haggerty wrote: > On Sun, Jul 30, 2017 at 8:51 PM, Shawn Pearce wrote: >> 4th iteration of the reftable storage format. >> [...] > > Before we commit to Shawn's reftable proposal, I wanted to explore > what a contrasting design that is not block based wou

Re: reftable [v4]: new ref storage format

2017-08-01 Thread Shawn Pearce
On Mon, Jul 31, 2017 at 11:41 PM, Michael Haggerty wrote: > On Sun, Jul 30, 2017 at 8:51 PM, Shawn Pearce wrote: >> 4th iteration of the reftable storage format. >> [...] > > Before we commit to Shawn's reftable proposal, I wanted to explore > what a contrasting design that is not block based wou

Re: reftable [v4]: new ref storage format

2017-08-01 Thread Shawn Pearce
On Mon, Jul 31, 2017 at 4:43 PM, Shawn Pearce wrote: > On Mon, Jul 31, 2017 at 12:42 PM, Junio C Hamano wrote: >> >> As a block cannot be longer than 16MB, allocating uint32 to a >> restart offset may be a bit overkill. I do not know if it is worth >> attempting to pack 1/3 more restart entries
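The arithmetic behind the quoted remark can be checked quickly. A 24-bit length field (the `uint24( block_len )` in the block format quoted elsewhere in this thread) caps a block at just under 16 MiB, so every offset inside a block fits in 3 bytes. The byte budget below is an arbitrary, hypothetical figure used only to show the ratio.

```python
# Largest value a uint24 can hold: one byte short of 16 MiB.
MAX_BLOCK_LEN = (1 << 24) - 1


def restart_entries(budget_bytes: int, offset_width: int) -> int:
    """Number of restart offsets that fit in a given byte budget."""
    return budget_bytes // offset_width


budget = 1200                                   # hypothetical space for the restart table
entries_uint32 = restart_entries(budget, 4)     # 4-byte offsets
entries_uint24 = restart_entries(budget, 3)     # 3-byte offsets
extra_fraction = (entries_uint24 - entries_uint32) / entries_uint32
```

With 3-byte offsets the same space holds 4/3 as many entries, which is the "1/3 more restart entries" mentioned in the quoted text.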

Re: reftable [v4]: new ref storage format

2017-08-01 Thread Shawn Pearce
On Tue, Aug 1, 2017 at 6:54 AM, Dave Borowitz wrote: > On Sun, Jul 30, 2017 at 11:51 PM, Shawn Pearce wrote: >> - Ref-like files (FETCH_HEAD, MERGE_HEAD) also use type 0x3. > >> - Combine reflog storage with ref storage for small transactions. >> - Separate reflog storage for base refs and histor

Re: reftable [v4]: new ref storage format

2017-08-01 Thread Dave Borowitz
On Sun, Jul 30, 2017 at 11:51 PM, Shawn Pearce wrote: > - Ref-like files (FETCH_HEAD, MERGE_HEAD) also use type 0x3. > - Combine reflog storage with ref storage for small transactions. > - Separate reflog storage for base refs and historical logs. How is the stash implemented in reftable? In par

Re: reftable [v4]: new ref storage format

2017-07-31 Thread Michael Haggerty
On Sun, Jul 30, 2017 at 8:51 PM, Shawn Pearce wrote: > 4th iteration of the reftable storage format. > [...] Before we commit to Shawn's reftable proposal, I wanted to explore what a contrasting design that is not block based would look like. So I threw together a sketch of such a design. It is n

Re: reftable [v4]: new ref storage format

2017-07-31 Thread Shawn Pearce
On Mon, Jul 31, 2017 at 12:42 PM, Junio C Hamano wrote: > Shawn Pearce writes: > >> ### Peeling >> >> References in a reftable are always peeled. > > This hopefully means "a record for an annotated (or signed) tag > records both the tag object and the object it refers to", and does > not include

Re: reftable [v4]: new ref storage format

2017-07-31 Thread Shawn Pearce
On Mon, Jul 31, 2017 at 12:01 PM, Stefan Beller wrote: > On Sun, Jul 30, 2017 at 8:51 PM, Shawn Pearce wrote: >> 4th iteration of the reftable storage format. >> >> You can read a rendered version of this here: >> https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/refta

Re: reftable [v4]: new ref storage format

2017-07-31 Thread Junio C Hamano
Shawn Pearce writes: > ### Peeling > > References in a reftable are always peeled. This hopefully means "a record for an annotated (or signed) tag records both the tag object and the object it refers to", and does not include peeling a commit down to its tree. > ### Reference name encoding > >

Re: reftable [v4]: new ref storage format

2017-07-31 Thread Stefan Beller
On Sun, Jul 30, 2017 at 8:51 PM, Shawn Pearce wrote: > 4th iteration of the reftable storage format. > > You can read a rendered version of this here: > https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/reftable.md > > Significant changes from v3: > - Incorporated Micha

Re: reftable [v4]: new ref storage format

2017-07-31 Thread Dave Borowitz
On Sun, Jul 30, 2017 at 11:51 PM, Shawn Pearce wrote: > - Near constant time verification a SHA-1 is referred to by at least > one reference (for allow-tip-sha1-in-want). I think I understated the importance of this when I originally brought up allow-tip-sha1-in-want. This is an important optim

Re: reftable [v4]: new ref storage format

2017-07-30 Thread Shawn Pearce
4th iteration of the reftable storage format. You can read a rendered version of this here: https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/reftable.md Significant changes from v3: - Incorporated Michael Haggerty's update_index concept for reflog. - Explicitly docume