Re: Today's problem with GUB build

David Kastrup Wed, 15 Jul 2020 14:01:29 -0700

David Kastrup <d...@gnu.org> writes:

> Jonas Hahnfeld <hah...@hahnjo.de> writes:
>
>> On Wednesday, 15.07.2020 at 21:35 +0200, David Kastrup wrote:
>>> Jonas Hahnfeld <hah...@hahnjo.de> writes:
>>> > On Wednesday, 15.07.2020 at 17:31 +0100, Phil Holmes wrote:
>>> > > Here's the logfile and the ly file.
>>> > 
>>> > Could this be collisions of the random file names generated for
>>> > temporary files? The argument to backend-library.scm:248 comes
>>> > from create-file-exclusive which returns #f if the file already exists
>>> > (or could not be created).
>>> 
>>> I had commented on the respective issue without response that the
>>> parallel processes, without taking additional measures, will generate
>>> the same "random" sequence, making this no better than just using
>>> sequential numbers.
>>
>> "additional measures" are in place: multi-fork calls
>> randomize-rand-seed *after* forking. The seed is initialized based on
>> the current timestamp (might be the same) and the pid (different in the
>> course of one run). We can still have collisions, but the amount of
>> trouble (or rather the lack of reports until now) indicates that it is
>> better than sequential numbers. This was discussed (and answered) in
>> the review.
>
> Well ok.  But only 1000000 random numbers are being used (there is
> another call using 10000000 instead, the choice appearing random).
> Let's assume we have 10 processes going through 138 files each.  The
> processes are going to switch to the next output file asynchronously, so
> with every switch, there is a chance of the new number colliding with
> one of the numbers held by the other 9 processes.  The probability
> that a new number is different from an existing set of 9 is
> 999991/1000000.  If we do this switch 1380 times, the probability of a
> collision during one run is 1-(999991/1000000)^1380, about 1 in 80.
>
> Now if I remember correctly, there were some changes in how
> lilypond-book worked that typically resulted in twice as many
> processes getting spawned as asked for, which would give us 19 instead
> of 9 possibilities for collision.  That would raise the probability of a
> collision to about 1 in 40 runs.
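
For anyone wanting to check the arithmetic quoted above, here is a minimal
Python sketch.  The function name and its parameters are made up for this
illustration and are not taken from LilyPond; it assumes each new number
is drawn uniformly and independently and that the other processes hold
distinct numbers.

```python
def collision_probability(pool_size, other_processes, switches):
    """Chance of at least one collision over a whole run.

    Assumption: every new number is drawn uniformly from pool_size values,
    independently of earlier draws, and the other processes currently
    hold other_processes distinct numbers.
    """
    # Probability that a single new number misses all numbers in use.
    p_distinct = (pool_size - other_processes) / pool_size
    # Probability that at least one of the switches hits a number in use.
    return 1.0 - p_distinct ** switches


if __name__ == "__main__":
    switches = 10 * 138  # 10 processes going through 138 files each
    for others in (9, 19):
        p = collision_probability(1_000_000, others, switches)
        print(f"{others} other processes: p = {p:.5f}, about 1 in {1 / p:.0f}")
```

Running it prints the per-run collision probability for the 9 and the 19
case, so the "1 in 80" and "1 in 40" estimates can be compared directly.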


Not using random numbers at all but using the pid, in contrast, should be
collision-proof, assuming that we are not working on a shared file
system accessed by multiple computers with separate process id pools.
But in that case, locking is unlikely to work anyway.
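
A minimal sketch of what pid-based naming could look like, written in
Python rather than the Scheme of backend-library.scm.  The function name,
the "lily" prefix and the per-process counter are assumptions for
illustration only; the counter is there because one process may need more
than one temporary file at a time.  Exclusive creation is kept so that an
unexpected clash still fails loudly instead of silently sharing a file.

```python
import os
import tempfile


def create_pid_file(directory, counter, prefix="lily"):
    """Create a temporary file named after the current pid (hypothetical helper).

    Returns (path, fd).  Raises FileExistsError if the name is already taken.
    """
    name = f"{prefix}-{os.getpid()}-{counter}.tmp"
    path = os.path.join(directory, name)
    # O_EXCL makes creation fail if the file exists, so a clash is detected.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    return path, fd


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        for n in range(3):
            path, fd = create_pid_file(tmpdir, n)
            os.close(fd)
            print(path)
```

Two processes running at the same time on one machine never share a pid,
so their names cannot clash; the shared-file-system caveat above still
applies.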

-- 
David Kastrup
