Eric Bollengier <[EMAIL PROTECTED]> writes:

> I will upload a patch for the patch on sourceforge.
Glad to hear it!  I'm starting to suspect that there's more trouble
lingering in this area, though.  I was just reading through the
JOBMEDIA table for the sequence of jobs I ran to test that my fix was
correct, and I noticed some little weirdnesses...

These are the first jobs I ran, with a cleanly initialized catalog:

  1: ~26GB, priority 5, fast machine
  2: ~10GB, priority 5, slow machine
  3: ~34GB, priority 10, slow machine

...and then a number of smaller jobs, on various machines.  I have
things set up to allow up to three jobs to be spooling at the same
time, each using a maximum of 16GB of spool space.  Job 1, coming
from a fast system, hit 16GB before job 2 was done.  When it had
spooled those 16GB out, job 2 was done, and spooled its data out.
Job 1 then ran to completion, and spooled out.  Now look at this:

 jobmediaid | jobid | mediaid | firstindex | lastindex | startfile | endfile | startblock | endblock | volindex | copy
------------+-------+---------+------------+-----------+-----------+---------+------------+----------+----------+------
          1 |     1 |       1 |          1 |      5959 |         0 |       0 |          1 |    15499 |        1 |    0
          2 |     1 |       1 |       5959 |      9885 |         1 |       1 |          0 |    15499 |        2 |    0
          3 |     1 |       1 |       9885 |     15463 |         2 |       2 |          0 |    15499 |        3 |    0
          4 |     1 |       1 |      15463 |     15673 |         3 |       3 |          0 |    15499 |        4 |    0
          5 |     1 |       1 |      15673 |     15903 |         4 |       4 |          0 |    15499 |        5 |    0
          6 |     1 |       1 |      15903 |     16161 |         5 |       5 |          0 |    15499 |        6 |    0
          7 |     1 |       1 |      16161 |     16386 |         6 |       6 |          0 |    15499 |        7 |    0
          8 |     1 |       1 |      16386 |     16642 |         7 |       7 |          0 |    15499 |        8 |    0
          9 |     1 |       1 |      16642 |     16863 |         8 |       8 |          0 |    15499 |        9 |    0
         10 |     1 |       1 |      16863 |     17048 |         9 |       9 |          0 |    15499 |       10 |    0
         11 |     1 |       1 |      17048 |     17254 |        10 |      10 |          0 |    15499 |       11 |    0
         12 |     1 |       1 |      17254 |     17472 |        11 |      11 |          0 |    15499 |       12 |    0
         13 |     1 |       1 |      17472 |     17682 |        12 |      12 |          0 |    15499 |       13 |    0
         14 |     1 |       1 |      17682 |     17904 |        13 |      13 |          0 |    15499 |       14 |    0
         15 |     1 |       1 |      17904 |     21904 |        14 |      14 |          0 |    15499 |       15 |    0
         16 |     1 |       1 |      21904 |     28550 |        15 |      15 |          0 |    15499 |       16 |    0
         17 |     1 |       1 |      28550 |     28556 |        16 |      16 |          0 |    15499 |       17 |    0
         18 |     2 |       1 |          1 |      4229 |        17 |      17 |       2756 |    15499 |        1 |    0
         19 |     2 |       1 |       4229 |      4905 |        18 |      18 |          0 |    15499 |        2 |    0
         20 |     2 |       1 |       4905 |      9033 |        19 |      19 |          0 |    15499 |        3 |    0
         21 |     2 |       1 |       9033 |     11691 |        20 |      20 |          0 |    15499 |        4 |    0
         22 |     2 |       1 |      11691 |     14411 |        21 |      21 |          0 |    15499 |        5 |    0
         23 |     2 |       1 |      14411 |     15215 |        22 |      22 |          0 |    15499 |        6 |    0
         24 |     2 |       1 |      15215 |     15942 |        23 |      23 |          0 |    15499 |        7 |    0
         25 |     2 |       1 |      15942 |     16795 |        24 |      24 |          0 |    15499 |        8 |    0
         26 |     2 |       1 |      16795 |     17586 |        25 |      25 |          0 |    15499 |        9 |    0
         27 |     2 |       1 |      17586 |     17589 |        26 |      26 |          0 |    11094 |       10 |    0
         28 |     1 |       1 |      28556 |     28558 |        17 |      20 |          0 |     3397 |       18 |    0
         29 |     1 |       1 |      28558 |     28561 |        26 |      26 |      11095 |    15499 |       19 |    0
         30 |     1 |       1 |      28561 |     28707 |        27 |      27 |          0 |    15499 |       20 |    0
         31 |     1 |       1 |      28707 |     31670 |        28 |      28 |          0 |    15499 |       21 |    0
         32 |     1 |       1 |      31670 |     31684 |        29 |      29 |          0 |    15499 |       22 |    0
         33 |     1 |       1 |      31684 |     31812 |        30 |      30 |          0 |    15499 |       23 |    0
         34 |     1 |       1 |      31812 |     32824 |        31 |      31 |          0 |    15499 |       24 |    0
         35 |     1 |       1 |      32824 |     56664 |        32 |      32 |          0 |    15499 |       25 |    0
         36 |     1 |       1 |      56664 |     56668 |        33 |      33 |          0 |      717 |       26 |    0

Note the row with jobmediaid=28.  This is the record chronologically
following jobmediaid=17, flushed because the job was despooling new
data.  Both endfile and endblock are wrong.  If I were to hazard a
guess, I'd wonder whether they represent the current position of the
tape (despooling job 2) at the time job 1 finished spooling and hit
the despool_wait state -- but that's just a wild guess.
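Incidentally, a catalog query along these lines ought to dig out any
further records like that one.  This is only a sketch against the
PostgreSQL catalog the listing above came from, and it makes two
assumptions: zero-file records can be skipped, and two adjacent
despool batches may legitimately share one boundary block (hence the
strict comparisons), so only larger overlaps are suspect:

  -- Pairs of JobMedia records on the same volume whose (file, block)
  -- spans overlap by more than a shared boundary block.  The row
  -- constructors give lexicographic comparison of tape positions.
  SELECT a.jobmediaid AS id_a, a.jobid AS job_a,
         b.jobmediaid AS id_b, b.jobid AS job_b
    FROM jobmedia a
    JOIN jobmedia b ON b.mediaid = a.mediaid
   WHERE a.jobmediaid < b.jobmediaid             -- report each pair once
     AND a.firstindex > 0 AND b.firstindex > 0   -- skip zero-file records
     AND ROW(a.startfile, a.startblock) < ROW(b.endfile, b.endblock)
     AND ROW(b.startfile, b.startblock) < ROW(a.endfile, a.endblock);

Run against the catalog above, this flags jobmediaid=28 against job
2's records 18 through 21 -- and, at least among the rows shown here,
nothing else.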
Next, other jobs were spooling and despooling, intermixed with the
big job 3.  Looking at JOBMEDIA, I conclude that a tape file is
closed at the end of a despooling operation, even if it isn't full,
unless another job is waiting to despool, in which case it is
appended to.  However, I don't understand this sequence:

 jobmediaid | jobid | mediaid | firstindex | lastindex | startfile | endfile | startblock | endblock | volindex | copy
------------+-------+---------+------------+-----------+-----------+---------+------------+----------+----------+------
         83 |     3 |       3 |       5842 |      5998 |         6 |       6 |          0 |    14303 |       36 |    0
         84 |     8 |       3 |          0 |         0 |         7 |       7 |          0 |        0 |        1 |    0
         85 |     7 |       3 |          1 |       101 |         7 |       7 |          0 |      335 |        1 |    0
         86 |     9 |       3 |          1 |        21 |         8 |       8 |          0 |        0 |        1 |    0
         87 |    10 |       3 |          0 |         0 |         8 |       8 |          0 |        1 |        1 |    0
         88 |    13 |       3 |          0 |         0 |         8 |       8 |          1 |        2 |        1 |    0
         89 |    11 |       3 |          1 |     43370 |         8 |       8 |          0 |    15499 |        1 |    0

Job 3 finishes, and the file is closed.  Next, job 8 was an
incremental with no data to back up: no files, start and end both at
7/0.  Job 7 then uses 7/0 through 7/335 for its 101 files.  Then job
9 has 21 files, in 8/0 through 8/0 -- so they all fit in a single
block.  But see what happens next: job 10 has 0 files, and uses 8/0
through 8/1, and job 13, again with 0 files, uses 8/1 through 8/2.
What I don't get is why 0 files would bump the position to the next
block, when it's possible for 21 files not to do so.  (The trivial
query appended below makes the pattern explicit.)

-tih
-- 
Self documenting code isn't. User application constraints don't.  --Ed Prochak
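Here is the trivial query referred to above -- again just a sketch,
limited to records that stay within a single tape file so that plain
block subtraction is meaningful:

  -- Blocks advanced versus files written, per despool batch, for the
  -- volume in question (mediaid=3 above).  Zero-file records carry
  -- firstindex=0, so the CASE keeps their file count at 0.
  SELECT jobmediaid, jobid,
         CASE WHEN firstindex = 0 THEN 0
              ELSE lastindex - firstindex + 1
         END                    AS files_written,
         endblock - startblock  AS blocks_advanced
    FROM jobmedia
   WHERE mediaid = 3
     AND startfile = endfile    -- keep the subtraction honest
   ORDER BY jobmediaid;

For the rows shown, that yields 0 files and 0 blocks for job 8, 21
files and 0 blocks for job 9, but 0 files and 1 block for each of
jobs 10 and 13 -- the inconsistency in a nutshell.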