Hello, I regret to have to announce that there is a rather serious bug in Bacula.
Bacula bug #935 reports that during a restore, a large number of files are missing and thus not restored. This is quite surprising because we have a fairly extensive regression test suite that explicitly tests for this kind of problem many times. Despite our testing, there is indeed a bug in Bacula that has the following characteristics:

1. It happens only when multiple simultaneous Jobs are run (regardless of whether or not data spooling is enabled).
2. It has only been observed on disk based backup, but not on tape.
3. Under the right circumstances (timing), it could and probably does happen on tape backups.
4. It seems to be timing dependent, and requires multiple clients to reproduce.
5. Analysis indicates that it happens most often when the clients are slow (e.g. doing Incremental backups).
6. It has been verified to exist in versions 2.0.x and 2.2.x.
7. It should also be in version 1.38, but could not be reproduced in testing, perhaps due to timing considerations or the fact that the test FD daemons were version 2.2.2.
8. The data is correctly stored on the Volume, but incorrect index (JobMedia) records are stored in the database. (The JobMedia record generated during the Volume change contains the index of the new Volume rather than the previous Volume.)
9. You can prevent the problem from occurring either by turning off multiple simultaneous Jobs or by ensuring that, while running multiple simultaneous Jobs, those Jobs do not span Volumes. E.g. you could manually mark Volumes as full when they are sufficiently large.
10. If you are not running multiple simultaneous Jobs, you will not be affected by this bug.
11. If you are running multiple simultaneous Jobs to tapes, I believe there is a reasonable probability that this problem could show up when Jobs are split across tapes.
12. If you are running multiple simultaneous Jobs to disks, I believe there is a high probability that this problem will show up when Jobs are split across disk Volumes.
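To illustrate characteristic 8, here is a small toy model in Python of how a restore that relies on index records can miss files even though the data is intact on the Volume. This is only a sketch under assumptions: the names `JobMediaRecord`, `restorable`, `Vol-A` and `Vol-B` are hypothetical, the record layout is simplified and is not Bacula's actual catalog schema or code, and it assumes the faulty record points at the new Volume for files that were really written to the previous one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobMediaRecord:
    """Hypothetical, simplified index record: which file indexes of a Job
    are said to live on which Volume (not Bacula's real schema)."""
    job_id: int
    volume: str
    first_index: int
    last_index: int


def restorable(volumes, records, job_id):
    """Return the file indexes of job_id that a restore driven purely by
    the index records would actually find on the Volumes."""
    found = set()
    for rec in records:
        if rec.job_id != job_id:
            continue
        # What is really stored on the Volume the record points at.
        on_volume = volumes.get(rec.volume, {}).get(job_id, set())
        for idx in range(rec.first_index, rec.last_index + 1):
            if idx in on_volume:
                found.add(idx)
    return found


# Job 1 wrote files 1-3 to Vol-A, then the Volume changed and it wrote
# files 4-5 to Vol-B. The data itself is correct in both scenarios.
volumes = {
    "Vol-A": {1: {1, 2, 3}},
    "Vol-B": {1: {4, 5}},
}

# Correct index records: each range points at the Volume holding the data.
correct = [
    JobMediaRecord(1, "Vol-A", 1, 3),
    JobMediaRecord(1, "Vol-B", 4, 5),
]

# Faulty index records (assumed shape of the bug): the record written at
# the Volume change names the new Volume instead of the previous one.
buggy = [
    JobMediaRecord(1, "Vol-B", 1, 3),
    JobMediaRecord(1, "Vol-B", 4, 5),
]

print(sorted(restorable(volumes, correct, 1)))  # [1, 2, 3, 4, 5]
print(sorted(restorable(volumes, buggy, 1)))    # [4, 5]
```

In this toy model the files written before the Volume change are still on the first Volume, but the restore never looks for them there, which matches the symptom of files silently missing from a restore.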

I have uploaded patches to bug #935 (bugs.bacula.org) that will correct versions 2.2.0, 2.2.1, and 2.2.2. The patch has been tested only on version 2.2.2 and passes all regression tests as well as the specific test that reproduced the problem. After a little more testing, I plan to release version 2.2.3, probably on Monday the 10th or Tuesday.

At this time, I do not have a patch for 2.0.x versions, and unless there is some really compelling reason to create one, I would prefer not to -- it would not be a huge effort to back port the patch, but it would require rather extensive testing.

Though it is hard to make a specific recommendation, I believe that it will probably be wisest and simplest either to patch version 2.2.x, if that is what you are currently running, or to upgrade to version 2.2.3 when it is released.

It *could* be possible to manually correct the bad JobMedia records in the catalog, but it is not something that I would personally recommend. If you *really* need data off an old tape, I recommend first trying a restore. Sometime tomorrow, I will provide more detailed instructions on several ways to correct the problem if necessary -- all of them are somewhat painful.

Kern

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users