David Kastrup <d...@gnu.org> writes:

> Jonas Hahnfeld <hah...@hahnjo.de> writes:
>
>> On Wednesday, 15.07.2020 at 21:35 +0200, David Kastrup wrote:
>>> Jonas Hahnfeld <hah...@hahnjo.de> writes:
>>> > On Wednesday, 15.07.2020 at 17:31 +0100, Phil Holmes wrote:
>>> > > Here's the logfile and the ly file.
>>> >
>>> > Could this be collisions of the random file names generated for
>>> > temporary files? The argument to backend-library.scm:248 comes
>>> > from create-file-exclusive which returns #f if the file already exists
>>> > (or could not be created).
>>>
>>> I had commented on the respective issue without response that the
>>> parallel processes, without taking additional measures, will generate
>>> the same "random" sequence, making this no better than just using
>>> sequential numbers.
>>
>> "additional measures" are in place: multi-fork calls
>> randomize-rand-seed *after* forking. The seed is initialized based on
>> the current timestamp (might be the same) and the pid (different in
>> the course of one run). We can still have collisions, but the amount
>> of trouble (or rather the lack of reports until now) indicates that it
>> is better than sequential numbers. This was discussed (and answered)
>> in the review.
>
> Well ok. But only 1000000 random numbers are being used (there is
> another call using 10000000 instead, the choice appearing random).
> Let's assume we have 10 processes going through 138 files each. The
> processes are going to switch to the next output file asynchronously, so
> with any change, there is a chance of the old number colliding with the
> other processes' numbers, and the new number colliding. The probability
> that a new number is different from an existing set of 9 is
> 999991/1000000. If we do this switch 1380 times, the probability of a
> collision during one run is 1-(999991/1000000)^1380, about 1 in 80.
>
> Now if I remember correctly, there were some changes in how
> lilypond-book worked that typically resulted in double the number of
> processes getting spawned than asked for which would give us 19 instead
> of 9 possibilities for collision. That would raise the probability of a
> collision to about 1 in 40 runs.
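
For anyone wanting to check the arithmetic, here is a rough Python sketch
of the estimate quoted above. The function name and parameters are made up
for illustration, and it assumes (as the estimate does) independent,
uniform draws from 1000000 values and the same 1380 switches in both cases.

```python
def collision_probability(others, switches, space=1_000_000):
    """Chance that at least one of `switches` fresh draws lands on one of
    the `others` numbers currently held by other processes.

    Assumes independent, uniformly distributed draws from `space` values.
    """
    p_miss = (space - others) / space  # one draw avoids all held numbers
    return 1.0 - p_miss ** switches    # at least one hit over all draws


# 10 processes going through 138 files each: 1380 switches, 9 other holders
p9 = collision_probability(9, 1380)
# Twice as many processes spawned as asked for: 19 other holders
p19 = collision_probability(19, 1380)

print(f"9 others:  {p9:.4f}, about 1 in {1 / p9:.0f} runs")
print(f"19 others: {p19:.4f}, about 1 in {1 / p19:.0f} runs")
```

The results should land close to the "1 in 80" and "1 in 40" figures
given above.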
Not using random at all but using the pid, in contrast, should be
collision-proof, assuming that we are not working on a shared file
system accessed by multiple computers with separate process id pools.
But then locking is likely to be non-working anyway.

-- 
David Kastrup
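
A minimal sketch of the pid-based naming suggested above, written in Python
rather than the Scheme LilyPond actually uses. The helper names, the file
name pattern and the per-process counter are hypothetical and not taken
from backend-library.scm. The exclusive-create helper returns None where
create-file-exclusive is described as returning #f.

```python
import os
import tempfile


def pid_temp_name(directory, counter, prefix="lilypond-tmp"):
    """Build a name from the process id plus a per-process counter.

    Hypothetical naming scheme: two live processes on one machine never
    share a pid, so the names cannot collide between them.
    """
    return os.path.join(directory, f"{prefix}-{os.getpid()}-{counter}")


def create_exclusive(path):
    """Create `path` only if it does not exist yet; return None otherwise."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except FileExistsError:
        return None
    return os.fdopen(fd, "w")


workdir = tempfile.mkdtemp()
name = pid_temp_name(workdir, 0)

first = create_exclusive(name)   # succeeds: the name is new
second = create_exclusive(name)  # None: the file already exists
first.close()
```

Keeping the exclusive create in place still catches stale files left over
from an earlier process that happened to have the same pid.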