On 10/26/2017 08:46 AM, Junio C Hamano wrote:
> Michael Haggerty <[email protected]> writes:
>
>> Even when we are deleting references, we needn't overwrite the
>> `packed-refs` file if the references that we are deleting are not
>> present in that file. Implement this optimization as follows:
>
> This goal I can understand. files-transaction-prepare may see a
> deletion and in order to avoid a deletion of a ref from the
> file-backend to expose a stale entry in the packed-refs file, it
> prepares a transaction to remove the same ref that might exist in
> it. If it turns out that there is no such ref in the packed-refs
> file, then we can leave the packed-refs file as-is without
> rewriting.
>
>> * Add a function `is_packed_transaction_noop()`, which checks whether
>> a given packed-refs transaction doesn't actually have to do
>> anything. This function must be called while holding the
>> `packed-refs` lock to avoid races.
>
> This checks three things only to cover the most trivial case (I am
> perfectly happy that it punts on more complex cases). If an update
>
> - checks the old value,
>
> - sets a new value, or
>
> - deletes a ref that is not on the filesystem,
>
> it is not a trivial case. I can understand the latter two (i.e. We
> are special casing the deletion, and an update with a new value is
> not. If removing a file from the filesystem is not sufficient to
> delete, we may have to rewrite the packed-refs), but not the first
> one. Is it because currently we do not say "delete this ref only
> when its current value is X"?
It wouldn't be too hard to allow updates with REF_HAVE_OLD to be
optimized away too. I didn't do it because
1. We currently only create updates for that packed_refs backend with
REF_HAVE_OLD turned off, so such could would be unreachable (and
untestable).
2. I wanted to keep the patch as simple as possible in case you decide
to merge it into 2.15.
There would also be a little bit of a leveling violation (or maybe the
function name is not ideal). `is_packed_transaction_noop()` could check
that `old_oid` values are correct, and if they all are, declare the
transaction a NOOP. (It wasn't *really* a NOOP, but its OP, namely
checking old values, has already been carried out.) But what if it finds
that an `old_oid` was incorrect? Right now
`is_packed_transaction_noop()` has no way to report something like a
TRANSACTION_GENERIC_ERROR.
It could be that long-term it makes more sense for this optimization to
happen in `packed_transaction_prepare()`. Except that function is
sometimes called for its side-effects, like sorting an existing file, in
which case overwriting the `packed-refs` file shouldn't be optimized away.
So overall it seemed easier to punt on this optimization at this point.
> Also it is not immediately obvious how the "is this noop" helper is
> checking the absence of the same-named ref in the current
> packed-refs file, which is what matters for the correctness of the
> optimization.
The check is in the second loop in `is_packed_transaction_noop()`:
if (!refs_read_raw_ref(ref_store, update->refname,
oid.hash, &referent, &type) ||
errno != ENOENT) {
/*
* We might have to actually delete that
* reference -> not a NOOP.
*/
ret = 0;
break;
}
If the refname doesn't exist, then `refs_read_raw_ref()` returns -1 and
sets errno to ENOENT, and the loop continues. Otherwise we exit with a
value of 0, meaning that the transaction is not a NOOP.
There are a lot of double-negatives here. It might be easier to follow
the logic if the sense of the function were inverted:
`is_packed_transaction_needed()`. Let me know if you have a strong
feeling about it.
Michael