Hello Ryan,

  Sorry this got a little delayed. I am copying to your email as well.

<[EMAIL PROTECTED]> aka Ryan Novosielski wrote
on Sun, 10 Feb 2008 22:37:54 -0500 in m2n.bacula.users:

|> |To the technical points you discuss, you will discover that tape
|> |mirroring is a problem, as currently choosing which media to
|> |restore from is not something that would work with Bacula the way
|> |things are now.
|> 
|> Hm, I'm not sure I get your point.
|> As far as I can see, it already works; it is just not yet
|> supported or exposed in the configuration options, etc.
|> Actually, your post inspired me to experiment a bit, and (besides
|> finding another bug in the migration code - patch will follow)
|> I could create identical mirrors of already completed backup jobs
|> and move them from storage to storage.
|
|Really? How is this done? My understanding is that this was not
|currently possible, at least not inside the software. Sure, you could

It depends on how you define "possible". It does not seem to be
currently supported, true. But for all practical purposes, a
"migration" as currently done in Bacula *is* a copy.
When you migrate a backup job from one storage pool to another,
all the data will be present in the new storage pool - but the
identical data in the old storage pool is also still present and
untouched - obviously, since that tape or disk was only read,
not deleted.
The only things that have changed are two values in the catalog,
which mark the old job as migrated and the new job as the target
of that migration. Now, if one were somehow to change these two
values back to their defaults, one would have two jobs with
different job IDs but the same content. Certainly, some piece of
the Bacula code might be incompatible with such an action - as I
said, it seems currently unsupported. But there is not much work
needed to get it implemented.
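
To make that concrete, here is roughly what I see in my catalog
(a sketch only - the JobIds are made up, and the column names Type
and PriorJobId are what my 2.x schema shows; verify against your
own installation before changing anything):

  -- Say job 100 was migrated into the new job 101. The old job
  -- is marked as migrated ('M'), the new one points back at it:
  SELECT JobId, Type, PriorJobId FROM Job WHERE JobId IN (100, 101);
  --   100 | M |   0
  --   101 | B | 100

  -- Resetting those two values would leave two independent backup
  -- jobs with identical content:
  UPDATE Job SET Type = 'B' WHERE JobId = 100;
  UPDATE Job SET PriorJobId = 0 WHERE JobId = 101;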

So, if I were in need of such functionality, and since the Bacula
license allows us to modify the software as we see fit, I would
just make it work - at my own risk, on my own responsibility.
(Alternatively, one could hire somebody who is willing and able
to take that responsibility.)

|> Restore is not my concern, because I would direct such a mirror
|> into a Pool with "CatalogFiles=no", so there will be no restore
|> from it.
|
|Fair enough. What is the rationale behind that?

A smaller catalog.
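
Such a mirror pool would look roughly like this (a sketch; the
names are made up):

  Pool {
    Name = MirrorPool
    Pool Type = Backup
    # Do not write per-file records into the catalog for jobs
    # saved into this pool. The catalog stays small, but such
    # jobs can only be restored as a whole.
    CatalogFiles = no
  }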

|> In the event that the primary backup gets lost/destroyed,
|> that media could be replaced and the mirror migrated back to
|> the new media - this step will automatically (actually as a
|> side effect) recreate the file lists in the catalog.
|
|Is that definitely the case? That sounds more like using bscan than a
|migration job. Can it be migrated without catalog information?

No, not completely without catalog information. But it should work
without the file records. In the database we have a record for
each job, and we have records for each file in every job. (The
latter is what makes the database grow large.) It is possible to
do a backup without creating the file records in the database
(that's what the "CatalogFiles=no" option in the Pool resource is
good for), to keep the database smaller. The consequence is that
you cannot restore individual files from such a backup - you can
only restore all the files in a job at once.
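
You can see the two kinds of records directly in the catalog,
e.g. (plain SQL against the Bacula schema; the JobId is made up):

  -- One row per job:
  SELECT JobId, Name, JobFiles, JobBytes FROM Job WHERE JobId = 123;
  -- One row per file in that job - zero rows when the job was
  -- saved into a "CatalogFiles=no" pool:
  SELECT COUNT(*) FROM File WHERE JobId = 123;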

Now, the migration job does *not* just update the file records to
the new job ID - it deletes them and creates them anew. (I noticed
this from the huge amount of database redo logs that a migration
produces, so I looked at what was happening there.)
And since bscan can recover the file information, it is obviously
present in the backup data itself. So the migration job can get it
from there just as well.
You can try it out. Create a pool with "CatalogFiles=no", save
some data into it, and you will see there is no file information
present in the catalog. Then migrate this data to another pool
that does not have "CatalogFiles=no" set, and watch your file
information appear. :)
As I said: Migration is where the real fun starts...
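
In bconsole the experiment looks roughly like this (the job, pool
and jobid values are made up; MigrateFromNoFiles would be a job of
Type = Migrate selecting jobs from that pool):

  * run job=SomeBackup pool=NoFilesPool yes
  * list files jobid=123
  (nothing - the pool has CatalogFiles=no)
  * run job=MigrateFromNoFiles yes
  * list files jobid=124
  (the file records of the new, migrated job appear)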

Now, I cannot say if this is a supported operation - it is just how
the software currently works.

|> Surely, there is still some way to go until this can be used
|> in production, but the road should be more or less clear.
|
|Have you been reading the code, and that's why you know this is in
|there, or are you using a currently documented procedure in the current
|version?

Well, I do not care much about "documented procedures" in
Freeware/GPL software - it doesn't pay off, as there is nobody
to sue anyway ;) I just try to figure out what the software
actually does. And when it comes to Bacula, I found it most
helpful to watch my database and what is happening in there,
because the database entries are perfectly normalized and very
self-explanatory. And yes, if something happens that does not
really make sense to me, then I also read the code - and try to
make it work the way I need it.

|> So my biggest concern is not how to do it. My biggest concern is:
|> how well does Bacula handle heavy concurrent activity? When doing
|> backup to disk to tape, there should be at least 8 concurrent jobs
|> on a disk storage object, and they should do something sensible...
|
|Are you talking about the case of 8 individual concurrent jobs running,
|or one job running and then subsequently splitting it somehow into a
|number of copies (either manually copying them or some sort of migration
|trick, etc.)?

No - I'm speaking about the problem of getting the daily amount of
data through some drives' R/W heads within 24 hours. There are
sites where *this* is the real problem, where the backup system
runs continuously on a saturated network, and where you calculate
the number of tape drives needed as the daily amount of data
divided by what one drive can write in 24 hours (or better 16, for
practical purposes). Such sites cannot avoid concurrency.
While I would currently not suggest Bacula as a possible solution
for that kind of site, I think it a good idea to keep the scenario
in mind.
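
To put numbers on it (made-up but realistic figures): an LTO-3
drive writes about 60 MB/s native, which is roughly 3.4 TB in a
16-hour window (60 MB/s x 57,600 s). A site producing 10 TB of
backup data per day would therefore need at least three drives
writing concurrently - before even thinking about restores or
migration.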

With this background, the eight jobs are just the minimum building
block needed to get anywhere near such a scenario: I want at least
two clients saving onto one storage object (because if each client
must run separately, then I simply do not need a network backup
solution), and I have long- and short-running jobs that may
overlap (for instance, saving away database redo logs must happen
promptly, even during a running full backup), so that makes 4.
Then I have the migration jobs that read from the disk storage
object, so add another two. And then some folks may want to
restore something at any time, so add another two.
And I have not yet found a way to design a disk storage object in
Bacula that can cope with that in a reliable manner.

If I could get that together, it should be the building block for
everything else, no matter how large.
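
For the record, the direction I have been experimenting in looks
like this (a sketch only - the directive names are from the Bacula
manual, the host, path and numbers are made up, and whether one
File device copes gracefully with eight interleaved jobs is
exactly the open question):

  # bacula-dir.conf - note that the Director resource itself also
  # has a "Maximum Concurrent Jobs" that must be raised
  Storage {
    Name = DiskStorage
    Address = sd.example.org
    Password = "secret"
    Device = FileDev
    Media Type = File
    Maximum Concurrent Jobs = 8   # 2 backups + 2 short-runners
  }                               # + 2 migrations + 2 restores

  # bacula-sd.conf
  Device {
    Name = FileDev
    Media Type = File
    Archive Device = /backup/spool
    Random Access = yes
    Automatic Mount = yes
    Label Media = yes
  }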

rgds,
PMc
