Michael Haggerty <mhag...@alum.mit.edu> writes:

> But now that I'm writing this, a silly question occurs to me: Do we need
> an overall option like this at all? If I demote all blob-integrity
> checks to "ignore" via the mechanism that you have added, then shouldn't
> fsck automatically detect that it doesn't have to open the blobs at all
> and enable this speedup automatically?

That's brilliant.

Just to make sure I am reading you correctly, you mean the current
overall structure:

        if (! "is the blob loadable and well-formed?") {
                report("BAD BLOB");
                ... which is equivalent to ...
                if ("is bad_blob ignored?")
                        ; /* no-op */
                else {
                        output "BAD BLOB";
                        if ("is bad_blob an error?")
                                return 1; /* error */
                }
        }
        ... other checks ...

can be turned into this structure:

        if ("is bad_blob ignored?")
                ;
        else if (! "is the blob loadable and well-formed?") {
                report("BAD BLOB");
                ... which would be equivalent to ...
                output "BAD BLOB";
                if ("is bad_blob an error?")
                        return 1; /* error */
        }
        ... other checks ...

I think that makes tons of sense.  With one minor caveat.  In the
above "rewrite" I deliberately described report() in the updated
flow to always output, but that would force all checkers to adopt
"do not unconditionally check; if we do not report, do not even
bother checking", which (1) would be a large change to what Dscho
currently have, and (2) might not apply to certain kinds of error
conditions.

But that minor caveat is easily addressed by keeping the "if we are
set to ignore this error, just return 0" in report().  Some codepaths
(like the "BAD BLOB" above) may not exercise it by bypassing the
call to report() upfront and that is perfectly fine.

I like that idea.

> So maybe
> `--(no-)?check-blob-integrity` is actually a shorthand for turning a few
> more specific checks on/off at once.
>
> As for thinking of a shorter name for the option: assuming the blob
> integrity checks can be turned on and off independently as described
> above, then I think it is reasonable to *also* add a `--quick` option
> defined as
>
> --quick: Skip some expensive checks, dramatically reducing the
>     runtime of `git fsck`. Currently this is equivalent to
>     `--no-check-blob-integrity`.
>
> In the future if we invent other expensive checks we might also add them
> to the list of things that are skipped by `--quick`.

Yes, that is doubly brilliant. Taken together with the auto-skipping
of the checks based on the report settings, it makes it unnecessary
to even introduce --[no-]check-blob-integrity or any other new
knobs.

Very well analysed.  I am happy with the "--quick" with that
definition.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in

Reply via email to