On 28/09/18 20:41, Ben Peart wrote:
>
>
> On 9/28/2018 1:07 PM, Junio C Hamano wrote:
>> Ben Peart <peart...@gmail.com> writes:
>>
>>>> Why does multithreading have to be disabled in this test?
>>>
>>> If multi-threading is enabled, it will write out the IEOT extension,
>>> which changes the SHA and causes the test to fail.
>>
>> I think it is a design mistake to let the writing process's
>> capability decide what is written in the file to be read later by a
>> different process, which may well have a different capability. If
>> you are not writing with multiple threads, it should not matter
>> whether that writer process is capable of, and configured to, spawn
>> 8 threads when reading a file; it is not reading the file it is
>> writing right now.
>>
>> I can understand if the design is to write IEOT only if the
>> resulting index is expected to become large enough (above an
>> arbitrary threshold like 100k entries) to matter. I also can
>> understand if IEOT is omitted when the repository configuration says
>> that no process is allowed to read the index with multi-threaded
>> codepath in that repository.
>>
>
> There are two different paths that determine how many blocks are written to
> the IEOT. The first is the default path. On this path, the number of blocks
> is the number of cache entries divided by THREAD_COST. If there are enough
> entries to make threading faster than a single-threaded read, it will
> automatically use enough blocks to optimize the performance of reading the
> entries across multiple threads.
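>
> In code, the default computation boils down to something like this (a
> simplified sketch; THREAD_COST is the constant defined in read-cache.c, and
> the full version, including the cap discussed below, is in the patch at the
> end of this mail):
>
>     /* default path: one IEOT block per THREAD_COST cache entries */
>     ieot_blocks = istate->cache_nr / THREAD_COST;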
>
> I currently cap the maximum number of blocks at the number of cores
> available to process them on that same machine, purely as an optimization.
> The majority of the time, the index will be read on the same machine it was
> written on, so this works well. Before I added that logic, you would
> usually end up with more blocks than available threads, which meant some
> threads had more work to do than others and resulted in worse performance.
> For example, 4 blocks across 3 threads leaves the 1st thread with twice as
> much work to do as the other threads.
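>
> To put concrete (purely illustrative) numbers on that: 100,000 entries
> split into 4 blocks of 25,000 each, read by 3 threads, means one thread
> ends up processing two blocks (50,000 entries) while the other two process
> 25,000 each, so the overall read time is gated by that slowest thread.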
>
> If the index is copied to a machine with a different number of cores, it
> will all still work - it just may not be optimal for that machine. This is
> self-correcting: as soon as the index is written out again, it will be
> optimized for that machine.
>
> If the "automatically try to make it perform optimally" logic doesn't work
> for some reason, we have path #2.
>
> The second path is when the user specifies a specific number of blocks via
> the GIT_TEST_INDEX_THREADS=<n> environment variable or the index.threads=<n>
> config setting. If they ask for n blocks, they will get n blocks. This is
> the "I know what I'm doing and want to control the behavior" path.
>
> I just added one additional check (see patch below) to avoid a
> divide-by-zero bug and simplify things a bit. With this change, if there
> are fewer than two blocks, the IEOT extension is not written out, as it
> isn't needed. The load would be single-threaded anyway, so there is no
> reason to write out an IEOT extension that won't be used.
>
>
>
> diff --git a/read-cache.c b/read-cache.c
> index f5d766088d..a1006fa824 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2751,18 +2751,23 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
>  		 */
>  		if (!nr) {
>  			ieot_blocks = istate->cache_nr / THREAD_COST;
> -			if (ieot_blocks < 1)
> -				ieot_blocks = 1;
>  			cpus = online_cpus();
>  			if (ieot_blocks > cpus - 1)
>  				ieot_blocks = cpus - 1;
So, am I reading this correctly - you need cpus > 2 before an
IEOT extension block is written out? (On the default path, ieot_blocks
is capped at cpus - 1, and the extension is only written when
ieot_blocks > 1.)

OK.
ATB,
Ramsay Jones
>  		} else {
>  			ieot_blocks = nr;
>  		}
> -		ieot = xcalloc(1, sizeof(struct index_entry_offset_table)
> -			+ (ieot_blocks * sizeof(struct index_entry_offset)));
> -		ieot->nr = 0;
> -		ieot_work = DIV_ROUND_UP(entries, ieot_blocks);
> +
> +		/*
> +		 * no reason to write out the IEOT extension if we don't
> +		 * have enough blocks to utilize multi-threading
> +		 */
> +		if (ieot_blocks > 1) {
> +			ieot = xcalloc(1, sizeof(struct index_entry_offset_table)
> +				+ (ieot_blocks * sizeof(struct index_entry_offset)));
> +			ieot->nr = 0;
> +			ieot_work = DIV_ROUND_UP(entries, ieot_blocks);
> +		}
>  	}
>  #endif
>
>