Junio C Hamano <[email protected]> writes:
> David Kastrup <[email protected]> writes:
>
>> The previous implementation uses a sorted linear list of struct
>> blame_entry in a struct scoreboard for organizing all partial or
>> completed work. Every task that is done requires going through the
>> whole list where most entries are not relevant to the task at hand.
>>
>> This commit reorganizes the data structures in order to have each
>> remaining subtask work with its own sorted linear list it can work off
>> front to back. Subtasks are organized into "struct origin" chains
>> hanging off particular commits. Commits are organized into a priority
>> queue, processing them in commit date order in order to keep most of
>> the work affecting a particular blob collated even in the presence of
>> an extensive merge history. In that manner, linear searches can be
>> basically avoided anywhere. They still are done for identifying one
>> of multiple analyzed files in a given commit, but the degenerate case
>> of a single large file being assembled from a multitude of smaller
>> files in the past is not likely to occur often enough to warrant
>> special treatment.
>> ---
>
> Sign-off?
Not while this is not fit for merging because of #if 0 etc and missing
functionality. This is just for review.
> Actually, I'd like to take my previous suggestion to add this as
> blame2 (to replace blame in the future) back. That was based on my
> fear/hope to see an implementation based on a completely different
> approach, but the basic premise of having one blame_entry record per
> the lines of the final image in the scoreboard, using diff between
> parents to the child to find common lines for passing the blame up,
> etc. have not changed at all and the change is all about organizing
> the pieces of data in a *much* *better* way to avoid needlessly
> finding what we already have computed.
Yes. Basically, the call graph outside of blame.c itself should be
pretty much the same.
> Style; please make /* and */ sit on their own line without anything
> else in multi-line comments.
Will do.
>> + struct origin *next;
>> struct commit *commit;
>> + /* `suspects' contains blame entries that may be attributed to
>> + * this origin's commit or to parent commits. When a commit
>> + * is being processed, all suspects will be moved, either by
>> + * assigning them to an origin in a different commit, or by
>> + * shipping them to the scoreboard's ent list because they
>> + * cannot be attributed to a different commit.
>> + */
>> + struct blame_entry *suspects;
>> mmfile_t file;
>> unsigned char blob_sha1[20];
>> unsigned mode;
>> + /* shipped gets set when shipping any suspects to the final
>> + * blame list instead of other commits
>> + */
>> + char shipped;
>
> Unused?
More like "added, forgotten, added under yet another name":
>> + char guilty;
I'll have to decide which one to keep.
>> + /* Should be present exactly once in commit chain */
>> + for (p = o->commit->util; p; l = p, p = p->next) {
>> + if (p == o)
>> + {
>
> Style; please have { sit with the control structure that opens the
> block, unless it is the opening brace of the function body or
> struct/union definition.
Ok.
>> +static struct blame_entry *
>> +blame_merge(struct blame_entry *list1,
>> + struct blame_entry *list2)
>
> Style; we do not do GNU-ish "function name begins a line". Even
> though I personally find it easier to use for things like 'grep',
> that is not the style we use around here.
Ok. I'm also certain to have a few "space between function name and
arguments" left and will grep for those before submitting the next
version.
Next version will at least include -M option, possibly leaving -C for
later.
--
David Kastrup
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html