Hello Kern!

<[EMAIL PROTECTED]> aka Kern Sibbald wrote on Tue, 26 Feb 2008 21:57:10 +0100 in m2n.bacula.devel:

|Yes, there are some problems with migration, but you don't explicitly mention
|what deficiencies you expect to run into.

Oops... should I?

To be honest, I am a little worried, because I do not at all want to frustrate you. And I am in a dilemma: on the one hand, Bacula is something like a dream come true. I often thought of fetching a copy of IBM's TSM for my installation (I think I could get an employee's evaluation copy), but 1.) it is just too bulky for a couple of home computers, and 2.) it doesn't run on FreeBSD. So now I have what I wished to have...

On the other hand, when trying to get the most out of Bacula, I find many of these small things that might still need a little fix-up. Now, if I report all of these, then I am the guy who is always criticising. :-/

But ok, now just four things that come to my mind:

1. Every time a job is migrated, the "Run=" directives in the job resource are executed again. This is almost never what one wants to happen, and in fact it tends to disrupt backup cycles severely.

2. This is the thing that I have been worrying about the most. I have followed various theories about what might be happening there, to no avail. The last of my theories was that it might have to do with the migrations, but currently I tend to dismiss this theory as well. In fact, I am still clueless.
What happens is that the Director puts all jobs (and all newly started jobs) into either "waiting on max Storage jobs" or "waiting execution", while there is no job running on any client and no job running on the SD. It just does nothing and has to be restarted.
What I have learned from reading bacula-users is that most people do not run such quantities of jobs as I do. So maybe this is the reason.

3. When running a migration that will move multiple jobs, there is a kind of "envelope" job: the "g" job that is started first will start all the other "g" jobs that are needed. After this, the "envelope" job itself will also do one of the migrations.
But occasionally this job just disappears silently and its activity is not to be found in the logfile. On one occasion it gave me a sig-11, which might give some hint at what is going on there. From the logfile:

25-Feb 08:56 BxDir JobId 9595: The following 163 JobIds were chosen to be migrated: 7705,7714,7723,7732,7741,7750,7759,...
25-Feb 08:56 BxDir JobId 9595: Job queued. JobId=9596
25-Feb 08:56 BxDir JobId 9595: Migration JobId 9596 started.
25-Feb 08:56 BxDir JobId 9595: Job queued. JobId=9597
25-Feb 08:56 BxDir JobId 9595: Migration JobId 9597 started.
..

The interesting thing here is that this output is not held back until job 9595 finishes; instead it is written to the logfile immediately at the start of the job. And it ends in the middle of a line:

25-Feb 08:57 BxDir JobId 9595: Migration JobId 9742 started.
25-Feb 08:57 BxDir JobId 9595: Job queued. JobId=9743
25-Feb 08:57 BxDir Jo25-Feb 08:57 BxDir JobId 9773: The following 163 JobIds were chosen to be migrated: 7706,7715,7724,7733,...
25-Feb 08:57 BxDir JobId 9773: Job queued. JobId=9774
25-Feb 08:57 BxDir JobId 9773: Migration JobId 9774 started.
25-Feb 08:57 BxDir JobId 9773: Job queued. JobId=9775
25-Feb 08:57 BxDir JobId 9773: Migration JobId 9775 started.
..

The remaining part of the log of job 9595 follows a couple of hours later:

25-Feb 10:52 BxDir: Fatal Error because: Bacula interrupted by signal 11: Segmentation violation
bId 9595: Migration JobId 9743 started.
25-Feb 08:57 BxDir JobId 9595: Job queued. JobId=9744
25-Feb 08:57 BxDir JobId 9595: Migration JobId 9744 started.
25-Feb 08:57 BxDir JobId 9595: Job queued. JobId=9745
25-Feb 08:57 BxDir JobId 9595: Migration JobId 9745 started.
..

At that point I concluded that there is some problem, but one that is not too easy to find and fix. So I decided to postpone the issue for now (indefinitely), and instead to redesign my schedules so that they create fewer jobs.
(I was saving database redo-logs via a Bacula schedule, which means checking every quarter of an hour whether there are any to save. Every such check creates an empty job that qualifies for later migration, and that nicely disappears during that migration. Now I have allowed the database to call bconsole on demand, and only after it has batched up a couple of logs.)

4. When migrating from disk to tape, there should be no need to do SD data spooling: as the data is already packed up, it will flow quickly to the tape, and data spooling would only slow down the process.
But in that case it is quite possible that multiple jobs write simultaneously to the tape. When later restoring such jobs, each job must be restored by a separate restore command, which can make the process very slow.
If not, that is, if multiple jobs that have intermingled on tape are restored by one and the same restore command, then the names of the restored files will all be correct, but the sizes may be wrong and the contents may be garbage.

So, this is more or less the background that led me to my statement that pervasive use of migration would currently show some deficiencies...

I hope you understand...

best regards,
PMc