On Tuesday 14 March 2006 22:30, Patrick Van der Veken wrote:
> Kern Sibbald wrote:
> > On Tuesday 14 March 2006 18:35, Patrick Van der Veken wrote:
> >> Kern Sibbald wrote:
> >>> On Tuesday 14 March 2006 15:34, Ryan Novosielski wrote:
> >>>> Wanted to chime in here because after a long period of time spent
> >>>> tinkering with Bacula, I have had my first successful automatic tape
> >>>> rotation. I am running 1.38.5 at the moment, so I can say that tape
> >>>> recycling really does NOT appear to be broken in 1.38.5 and if you are
> >>>> having a problem it is probably misconfiguration. In my case, the
> >>>> reason it was not working previously was because I was new to using
> >>>> incremental tapes and was attempting to have them expire 1 day too
> >>>> early. Perhaps the tape is due to be pruned when you run the command,
> >>>> but not due a few mins earlier when the backup is scheduled? It helps
> >>>> to write out everything on paper or -- even better -- a calendar.
> >>>
> >>> Thanks. Good point. I have never been able to reproduce the problem.
> >>> One problem is that recycling is fairly sophisticated (or complicated
> >>> if you want) and probably some users are forgetting that Bacula will
> >>> not recycle a Volume even if it is totally pruned if the Volume
> >>> Retention period has not expired ...
> >>
> >> Hi,
> >
> > If you read my comments below, which were given not to criticize or annoy
> > but to indicate the kind of information that I need but that I never get,
> > perhaps you will better see why there isn't much I can do.
> >
> >> Good to know it works for some. But it still does not explain why it
> >> consistently keeps failing for us. I know *for sure* that the Volume
> >> period has indeed expired. We have 18 tapes rotating daily with a 10 day
> >> retention time. I have checked manually and counted (!) the days since
> >> the "Last written" value by hand before inserting a tape and the
> >> difference is at least 2,3 days, not seconds.
> >> If the volume retention was not expired then why did the prune work
> >> perfectly right before the mount request was generated?
> >
> > I never tried to claim that in your case the Volume retention has not
> > expired, but that it is one of many factors that has to be considered and
> > explicitly quantified to get to the bottom of this.
> >
> > Exactly what kind of pruning are you referring to? Automatic, Volume,
> > Job, File? How do you know it worked? In general, if it is automatic
> > pruning, Bacula does not say that all records have been pruned from a
> > Volume.
> >
> >> My suspicion is that the pruning does indeed work but that the volume
> >> status update does not happen correctly (or not fast enough). How else
> >> is it possible that this happens:
> >>
> >> 1. Start backup
> >> 2. Check tape (in status 'Used') and prune *all* jobs automatically as
> >>    volume retention has expired (because Recycle Current Volume,
> >>    Autoprune and Recycle are set)
> >> 3. Check tape (still in status 'Used')
> >> 4. Generate mount request.
> >> 5. Confirm manually the mount request (with the same tape and just
> >>    seconds later than the mount request!)
> >> 6. Backup starts on the same tape.
> >
> > Unfortunately, the above is a bit too sketchy for me to understand. To
> > understand what you are trying to say, I need to know who is doing what
> > (i.e. is it Bacula checking the tape or is it you?). If it is you checking
> > the tape status, what is the exact command?
> >
> > The only thing that I think I understand is that you say that a mount
> > request clears things up. There are situations depending on the OS and
> > the OS version where pthreads signals seem to be lost. This results in
> > the SD not being able to read a tape thinking it is in use or something.
> > A mount command often clears this condition up. If this is what you are
> > seeing, the problem is probably not related to pruning or Volume status.
> >
> > What I need is a "llist" of the Volume before and after pruning and the
> > exact output of the job that pruned the volume then could not continue as
> > well as some information about the job to know if the Pool and Media Type
> > for the Volume in question are appropriate -- alternatively, some output
> > that shows a mount or some other command deblocks the situation.
> >
> >> My err for calling this a "pruning bug". More likely a "volume update
> >> issue"?
> >
> > Although this is possible, it doesn't seem too probable to me.
> >
> > The guaranteed way to get this fixed would be for someone to write a
> > Bacula regression script (see the regress CVS section of Bacula on
> > SourceForge) that demonstrates the problem. By definition, regression
> > scripts are repeatable, which means if I can repeat the problem here, I
> > can analyze it and fix it.
> >
> > Regards, Kern
>
> Hi Kern,
>
> Sorry, I did not mean to sound pedantic or arrogant if it came across
> that way.

I didn't take it that way at all. My problem was understanding what you wrote
because it was a bit abbreviated.

> The example I described were all actions executed by bacula
> except for confirming the mount request: This is transcript of the
> backup run:
>
> 13-Mar 14:18 admt-dir: Pruned 2 Jobs on Volume "full_admt-0002" from
> catalog.
> 13-Mar 14:18 admt-sd: test_backup.2006-03-13_14.18.36 Warning: Director
> wanted Volume "full_admt-0017".
> Current Volume "full_admt-0002" not acceptable because:
> 1998 Volume "full_admt-0002" status is Used, but should be Append,
> Purged or Recycle (cannot automatically recycle current volume, as it
> still contains unpruned data).
> 13-Mar 14:18 admt-sd: Please mount Volume "full_admt-0017" on Storage
> Device "DDS-5" (/dev/nst0) for Job test_backup.2006-03-13_14.18.36
>
> =ran "query" to check remaining jobs/files for the media id with output:
> *no results listed
>
> *mount
> The defined Storage resources are:
> 1: File
> 2: DDS-5
> Select Storage resource (1-2): 2
> Connecting to Storage daemon DDS-5 at admt:9103 ...
> 3001 OK mount. Device="DDS-5" (/dev/nst0)
> *mess
> 13-Mar 14:21 admt-dir: Recycled current volume "full_admt-0002"
> 13-Mar 14:21 admt-sd: Recycled volume "full_admt-0002" on device "DDS-5"
> (/dev/nst0), all previous data lost.
>
> However, going back over the log file I noticed the following: if I look
> at a typical tape I can see that Bacula writes 4 Files to each tape:
>
> | MediaId | VolumeName     | VolStatus | VolBytes      | VolFiles | VolRetention | Recycle | Slot | InChanger | MediaType | LastWritten         |
> | 10      | full_admt-0010 | Used      | 2,886,626,689 | 4        | 864,000      | 1       | 0    | 1         | DDS-5     | 2006-03-03 01:01:49 |
> | 11      | full_admt-0011 | Used      | 2,903,362,231 | 4        | 864,000      | 1       | 0    | 1         | DDS-5     | 2006-03-04 01:01:56 |
> | 12      | full_admt-0012 | Used      | 2,919,102,299 | 4        | 864,000      | 1       | 0    | 1         | DDS-5     | 2006-03-07 01:01:56 |
>
> Now we only actually run 2 backups each night, both of them followed by
> a VolumeToCatalog verify job. At the start of the first backup Bacula
> prunes 2 jobs (see above) but not 4. Is it possible that Bacula marks
> these verify jobs also as a 'VolFile' and that these are blocking the
> recycling mechanism?

No, this is definitely not the problem. The VolFiles is the count of EOF marks
on the tape, and one is put every 1GB and at the end of each Job. To see what
jobs are still on the tape, you can use one of the options in the query
command.

Regards, Kern

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
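
As a worked illustration of the retention arithmetic discussed in this thread, here is a minimal Python sketch that adds VolRetention (in seconds) to LastWritten for the three volumes in the listing and compares the result with the time of a job. The 13-Mar 14:18 job time is taken from the transcript; the function names are invented for this example, and the check covers only the Volume Retention condition, not Bacula's full recycling logic (Volume status, Job and File pruning, Pool settings).

```python
from datetime import datetime, timedelta

# Values copied from the volume listing in the thread.
# VolRetention is in seconds: 864,000 s = 10 days.
VOLUMES = [
    ("full_admt-0010", "2006-03-03 01:01:49", 864_000),
    ("full_admt-0011", "2006-03-04 01:01:56", 864_000),
    ("full_admt-0012", "2006-03-07 01:01:56", 864_000),
]


def retention_expiry(last_written: str, vol_retention_s: int) -> datetime:
    """Return the moment at which LastWritten + VolRetention is reached."""
    written = datetime.strptime(last_written, "%Y-%m-%d %H:%M:%S")
    return written + timedelta(seconds=vol_retention_s)


def retention_expired(last_written: str, vol_retention_s: int,
                      job_start: datetime) -> bool:
    """True if the retention period has run out by the time the job starts.

    Hypothetical helper: it checks the retention window only and says
    nothing about whether the Volume has actually been pruned.
    """
    return job_start >= retention_expiry(last_written, vol_retention_s)


if __name__ == "__main__":
    # Assumption: the job starts at the time of the 13-Mar 14:18 log entries.
    job_start = datetime(2006, 3, 13, 14, 18)
    for name, last_written, retention in VOLUMES:
        expiry = retention_expiry(last_written, retention)
        expired = retention_expired(last_written, retention, job_start)
        print(f"{name}: retention ends {expiry}, expired at job start: {expired}")
```

With these inputs only full_admt-0010 is past its retention window at 14:18 on 13 March; the other two volumes still have between one and four days to run, which is the kind of off-by-a-day effect Ryan describes.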