On Tuesday 04 October 2005 15:15, David Boyes wrote:
>  I scoped the problem as two major projects:
> > > 1) implementation of "copy pools" -- where files written to a pool
> > > were automatically also written to up to 3 additional pools using the
> > > same volume selection criteria as exist now (essentially getting the
> > > SD to act as a FD to more than one FDs to ensure synchronous updates,
> > > or creating a SD-mux function to split the FD data stream to N SDs).
> >
> > In a certain sense this is already implemented in 1.37.  1.37 permits
> > a job to start a second job in a way that makes the second job
> > duplicate the "since" date/time.  This effectively allows making
> > identical copies (except for files that change during the backup).
>
> Hmm. Does the second job transfer the data from the FD again? If so, then
> that doesn't (IMHO) quite do what I want to do here. I really want to
> transfer the data only once (the only guarantee we have of getting the same
> data on all the copies) and create the replicas on the server side.

Yes, it starts a second job, so the data is transferred from the FD again.
The disadvantage of this is that the data is not 100% identical if anything
is changing on the FD.  The advantage is that it avoids a whole bunch of
complications, which I have not yet logically resolved, concerning having two
backups of the same thing in the same job.

>
> > > 2) implementation of pool to pool migration as discussed on the list
> > > previously.
> >
> > Pool to Pool, or direct writing by the SD to several devices within
> > one Job both require a few more changes to the SD.  All the basic data
> > structures exist so that the SD can have multiple I/O packets open, but
> > as of 1.37, the SD only has a single I/O packet per job.  In older
> > versions of Bacula, there were only DEVICE structures, one per physical
> > device.  Now, there are DCRs (device control records) that sit "above"
> > the DEVICE structure.  There can be multiple DCRs that use the same
> > DEVICE in 1.37 -- this is no problem.  However, the next
> > generalization, and most of the code is in, is to allow a job to have
> > multiple DCRs.  The job must be taught to open multiple devices (and
> > thus create multiple DCRs), and then either read from one DCR and write
> > to another (copy), and/or write to multiple DCRs.
>
> (As a side issue, I'm beginning to wonder if overall we need a more
> generalized job manager. This is sort of sounding like we need something
> like JCL, and then this could all be handled in a more systematic way.
> That's a much bigger project, though.)

Perhaps if I were starting to design Bacula with the knowledge I have today, I 
would have a different structure.  However, I have to live with the current 
code, and at the current time, I am, unfortunately, the only one who 
understands it and who is continuously working on the project.  Making any 
major design changes is not something I can handle without a team of 
programmers.  By myself, I can continue the same path I have taken over the 
years -- slowly evolve it to provide all the functionality we want.

My next email on "Project management" will discuss the above point in a bit
more detail.

>
> OK, I can see how that could work.
>
> > The only part of this implementation that I have not worked out in my
> > head is how to deal with the catalog.  If there are two identical
> > backups of a Job on two different Volumes, the Director is not
> > currently aware of it.  There needs to be some way of flagging two
> > copies as being identical, then using only one of those copies for any
> > given restore.
>
> What I was thinking with the copypool idea was to have multiple volume
> records for a file, and sorting by copypool order, eg pool A has copypools
> B and C. During backup, the file is stored on a volume in pool A, and also
> stored on volumes selected from pool B and C (finite number of these, to
> avoid impacting performance significantly). The database would reflect 1
> file record with multiple volume records pointing to the volumes selected
> from pool A, B, and C, with an indication of a pool priority (based on the
> sequence of the copypools) to indicate which version to try first.

This could be a way to do it, but it doesn't fit in with the current Bacula 
scheme.  Any restore can have Volumes from multiple pools (typically not from 
a single job).  Many users separate their Volumes into Full, Diff, Inc pools.

So, IMO, unless I am missing something you are saying, a Pool is not a good 
way to separate multiple copies.  I do have a database column designed to 
indicate what copy a particular Volume record is from (I also have a stripe 
database column).  Since they are not yet fully implemented, they are not yet 
stored in the DB to conserve space, but this info is passed from the SD to 
the DIR.
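
To make the copy column idea a bit more concrete, here is a rough sketch in
Python of how a per-Volume-record copy number could be used to pick exactly
one copy of each job for a restore, independent of which pool the Volumes
live in.  The record layout and the names VolumeRecord and
volumes_for_restore are made up for illustration; this is not the actual
catalog schema or Director code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeRecord:
    """Hypothetical catalog row linking a job to one Volume it was written to."""
    job_id: int
    volume: str
    copy: int        # 1 = first copy, 2 = second copy, ...
    stripe: int = 0  # carried along but unused in this sketch


def volumes_for_restore(records, job_ids):
    """Return, for each job, the Volumes of a single copy.

    Assumption: the lowest copy number present for a job is preferred.
    Volumes of different copies of the same job are never mixed.
    """
    by_job = defaultdict(lambda: defaultdict(list))
    for rec in records:
        by_job[rec.job_id][rec.copy].append(rec.volume)

    chosen = {}
    for job_id in job_ids:
        copies = by_job.get(job_id)
        if not copies:
            raise LookupError(f"no Volume records for job {job_id}")
        best_copy = min(copies)
        chosen[job_id] = sorted(copies[best_copy])
    return chosen


records = [
    VolumeRecord(job_id=1, volume="Full-0001", copy=1),
    VolumeRecord(job_id=1, volume="Full-0002", copy=1),
    VolumeRecord(job_id=1, volume="Copy-0001", copy=2),
    VolumeRecord(job_id=2, volume="Inc-0007", copy=2),
]
restore_plan = volumes_for_restore(records, [1, 2])
```

Because the copy number hangs off the Volume record rather than the Pool, a
restore that spans Full, Diff and Inc pools can still resolve each job to one
consistent copy.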

>
> When traversing the file table for a restore, you can retrieve the list of
> volume records containing that file, and iterate through them either in
> copypool priority sequence or try them at random, prioritizing for volumes
> currently mounted and available. If the volume you try is in the changer
> and available, use it, else try the next one.
>
> > Also, there needs to be a bit more work in implementing some better
> > user interface for "archiving" i.e. taking Volumes (possibly identical
> > copies) out of the Volume pool, and then later re-introducing them.
>
> With the approach above, just taking the volumes in and out of the changers
> does the job for you. No new wheels needed.

Yes, this would work for big shops where everything is in the changer, but for 
the other 99.9% of us who either don't have changers or who are obligated to 
remove Volumes from the changers, it would leave the problem of deciding what 
Volume to take, and how to tell Bacula in a user friendly way that certain 
Volumes may be offsite.
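
As a sketch of what "telling Bacula that a Volume is offsite" could mean for
restore selection, the Python below ranks the copies of a Volume by where
they currently are and falls back to copy number as a tie-breaker.  The
Location states and the names CopyVolume, pick_copy and needs_operator are
assumptions for illustration only; nothing like this exists in the catalog
today.

```python
from dataclasses import dataclass
from enum import IntEnum


class Location(IntEnum):
    """Hypothetical per-Volume location, ordered from easiest to hardest to reach."""
    MOUNTED = 0
    IN_CHANGER = 1
    ONSITE_SHELF = 2
    OFFSITE = 3


@dataclass(frozen=True)
class CopyVolume:
    volume: str
    copy: int
    location: Location


def pick_copy(candidates):
    """Choose the most accessible copy; ties go to the lowest copy number."""
    if not candidates:
        raise ValueError("no candidate Volumes")
    return min(candidates, key=lambda c: (c.location, c.copy))


def needs_operator(choice):
    """True when someone has to fetch the Volume before the restore can run."""
    return choice.location >= Location.ONSITE_SHELF


candidates = [
    CopyVolume("Full-0001", copy=1, location=Location.OFFSITE),
    CopyVolume("Copy-0001", copy=2, location=Location.IN_CHANGER),
]
choice = pick_copy(candidates)
```

In this example the second copy is chosen because it is in the changer,
while a site with every copy offsite would get the first copy back together
with a hint that an operator has to retrieve it.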


-- 
Best regards,

Kern

  (">
  /\
  V_V

