Re: [Bacula-users] [Bacula-devel] Storage definitions

David Boyes Fri, 24 Nov 2006 12:07:24 -0800

> Thanks for your thoughts.  I think there are several points that you have
> minimized or overlooked in your response:
> 
> 1. Bacula currently permits specifying multiple Media Types in a Pool.
> 2. Bacula currently permits Storage devices to be specified in the Job
> resource
> 3. Bacula currently permits both Pool and Storage overrides in Run
> resources.
> 
> None of the above can be removed or changed without causing total disaster
> to a very large Bacula community.


All understood. I'm still thinking about how to get to that future state
where these things are controlled by Bacula, and the user doesn't have
to concern themselves with it. Since migration is a significant new
feature, it strikes me as a very good opportunity to start heading in
that direction. 

> For version 1.39.x, you can run Bacula exactly as you want using Storage
> resources in Pools and in the Next Pool.  Both methodologies coexist in
> Bacula (though they probably don't function very well together).

Yes. It's a tradeoff of one-time setup complexity for pain-free normal
operation. The bigger problem (as you pointed out) is how to get there
from here. 


> > > Migration job:
> > >  read storage: job storage resource
> >
> > Shouldn't the migration job always be using the Pool resource being
> > migrated
> 
> One can do that, but it is not the only way of doing things.  The read part
> remains compatible with the existing code.

I guess I'm not seeing why it's useful to override device selection for
reads from pooled volumes -- I'm sure there's a reason, but I can't
figure it out. The volumes (and thus the jobs you are migrating) are in
a pool and are in a changer -- otherwise they wouldn't be eligible for
migration -- thus they are associated implicitly with an SD managing that
pool. Other than academic completeness, why complicate operations by
allowing/encouraging the user to meddle in Bacula's device selection and
management process outside of a real total disaster? That seems to be the
core of the problem -- Bacula having to cope with things outside its
control or users doing things that are dumb. Should we use the
opportunity to try to remove that possibility? Migration is a
*major* change in capability -- I guess I'm thinking that it's also the
beginning of the kinds of changes I suggested in order to scale up.

I know we can't break the world as it is now. It might be worth breaking
migration to work this way and start things in the direction we want to
go in the future. 

> > For migration, we're looking at
> > volumes in pools, not individual volumes (other than the case where we
> > have deliberately limited the selection criteria to a specific volume
> > within a pool via the selection keyword),
> 
> No, that is not really correct.  For migration, as it is currently
> implemented, we are looking at Jobs.  Everything breaks down to a Job.  This
> has certain constraints, but I could see no other way of implementing
> Migration in the current Bacula otherwise.

I think this is one of those which-perspective-do-you-start-from issues.
At the lowest level, that's true - jobs are the most granular thing in
Bacula, and those are the units that the migration code deals with. From
the larger-scale storage manager's perspective, jobs are the *contents*
of a volume, and what they're interested in is managing the availability
of volumes (empty, partially-full, full). The migration code identifies
and moves the jobs contained on volumes or the volumes contained within a
pool, but what I care about is whether I have enough available space on
volumes to meet the requirements.

> > so there should be no reason
> > to specify individual devices for migration jobs.
> 
> You have to be able to properly get to the original data, and I don't think
> it would be wise to have one syntax for restores and a different one to find
> the read side of a Migration.

I probably didn't explain this point well. I intend the syntax to be the
same in both cases, but in the case of a total disaster (where the
Bacula configuration and/or database is lost or otherwise unavailable),
you would have to ADD parameters identifying the SD and device to the
normal syntax. In a normal case, the job would get that data from the
Pool definitions (and their associated Storage definitions), and the
user should let Bacula do its job and select appropriate devices.
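
As a rough illustration of that rule (same syntax, extra parameters only
in a disaster), here is a small Python sketch. It is not Bacula code: the
function name, the catalog dictionary, and the SD and device names are
all hypothetical, chosen only to show the two cases side by side.

```python
from typing import Dict, Optional, Tuple

# Hypothetical catalog: volume name -> (storage daemon, device), standing in
# for what the Pool definition and its associated Storage definition provide.
Catalog = Dict[str, Tuple[str, str]]


def read_source(volume: str,
                catalog: Optional[Catalog],
                sd: Optional[str] = None,
                device: Optional[str] = None) -> Tuple[str, str]:
    """Pick the SD and device to read a volume from."""
    if catalog is not None and volume in catalog:
        # Normal case: the Pool/Storage definitions decide; any explicit
        # sd/device arguments are ignored in this sketch.
        return catalog[volume]
    if sd is None or device is None:
        raise ValueError("catalog unavailable: sd and device must be given explicitly")
    # Disaster case: the operator supplies what the catalog would have known.
    return sd, device


catalog = {"Vol-0001": ("sd-main", "LTO-Drive-0")}
normal = read_source("Vol-0001", catalog)
disaster = read_source("Vol-0001", None, sd="sd-rescue", device="Standalone-Drive")
print(normal, disaster)
```

The call looks the same in both cases; the disaster call simply carries
two extra arguments.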


> > >  read storage: pool storage resource
> >
> > Correct.
> >
> > >  read storage: run override resource
> >
> > See above. Migration is about pools, not devices.
> 
> No, it is about Jobs, with the caveat that Next Pool does define what set of
> devices it can be migrated to.

See above discussion on perspective. The mechanics in the code are about
jobs, but that's more detail than the average storage manager cares
about. 

> 
> > If you need to migrate
> > to volumes that are not in a pool then you set up a temporary pool and
> > draw its volumes from the scratch pool.
> 
> All Volumes by definition are in one and only one pool, and as I said, we
> are migrating jobs not volumes and not pools.

OK, my bad explanatory skills again. I was thinking of the case where I
need to dump a bunch of data to tapes that have not been previously used
with Bacula. Since I typically label and assign volumes to specific
pools for specific reasons, a bunch of "generally available" volumes
aren't part of that scheme. If I define a pool for that purpose (say,
"Hurricane-Coming-Mass-Export") and allow that pool to draw from the
scratch pool (rather than specifically assigning tapes to it), then
that's what I would use as the NextPool for the migration job. I would
be clearing all jobs on my normal volumes onto the new tapes in the
Hurricane-Coming-Mass-Export pool so I could put the tapes in the
station wagon and flee. In that case *I* don't care about the individual
jobs; I care about whether I have copies of all the data on the normal
volumes -- it's the job of the Migration code (in my view) to worry
about those details while I worry about the physical things I can see
and manage.
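
The scratch-pool idea can be sketched in a few lines of Python. This is a
toy model, not Bacula's implementation: the Pool class, its take_volume
method, and the volume names are invented for illustration. The only
behaviour it assumes is that a pool with no free volume of its own pulls
one from the scratch pool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pool:
    name: str
    free_volumes: List[str] = field(default_factory=list)
    scratch: Optional["Pool"] = None   # pool to borrow blank volumes from

    def take_volume(self) -> str:
        """Hand out a free volume, borrowing from the scratch pool if needed."""
        if not self.free_volumes and self.scratch is not None and self.scratch.free_volumes:
            # Move one volume out of scratch; it now belongs to this pool only.
            self.free_volumes.append(self.scratch.free_volumes.pop(0))
        if not self.free_volumes:
            raise RuntimeError(f"no volumes available for pool {self.name!r}")
        return self.free_volumes.pop(0)


scratch = Pool("Scratch", free_volumes=["TAPE01", "TAPE02"])
export = Pool("Hurricane-Coming-Mass-Export", scratch=scratch)

first = export.take_volume()
second = export.take_volume()
print(first, second, scratch.free_volumes)
```

Nothing is assigned to the export pool up front; it only takes tapes as
the migration needs them, and each tape still lives in exactly one pool
at any time.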

> > >  write storage: pool next pool storage resource
> > >    (yea, try to figure out the above in C it is
> > >      job->pool->next_pool->storage
> >
> > Correct as I understand the implementation. See comments above.
> 
> Yes, but it is terribly complicated, and IMO the "average" user doesn't
> understand Pools so is going to break his brain on this.

On the other hand, it boils down to: 

Initial setup of Bacula: 

1) Define your devices in a Storage resource. 
2) Define a Pool resource to hold your normal backup media and specify
which Storage resource will contain the media for this pool.
3) Define your normal backup media in the pool. 
4) Define a backup job that references the pool of media by Pool name.
5) Run the backup job.

Implementing migration (or replacing the current spooling code): 

1) If necessary, define a new Storage resource (if your migration pool
uses different hardware than the normal stuff)
2) Define a new Pool to hold the media you want to use for
migration/consolidation
3) Define new media volumes in the new Pool. 
4) Update the original Pool with a Next Pool = directive naming the
new Pool.
5) Define a migration job for the original Pool with appropriate
selection criteria.
6) Run your migration job periodically.

I think that's pretty clear (or could be made to be that way). The basic
job description specifies the Pool name and Bacula takes it from there. If
you need to go directly to tape for some reason, you override the Pool
resource in the job with the name of the Pool that is associated with
the Storage resource that has the tapes. The user doesn't get involved
with devices at all.
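
To make the pool-driven selection concrete, here is a minimal Python
sketch. It is not Bacula code, and the class and function names (Storage,
Pool, Job, migration_storages) and the resource names are made up for
illustration. The only things taken from this thread are the chain
job->pool->next_pool->storage for the write side and the idea that the
read side comes from the Pool being migrated.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Storage:
    name: str


@dataclass
class Pool:
    name: str
    storage: Storage                      # Storage resource holding this pool's media
    next_pool: Optional["Pool"] = None    # migration target, if any


@dataclass
class Job:
    name: str
    pool: Pool


def migration_storages(job: Job) -> Tuple[Storage, Storage]:
    """Return (read_storage, write_storage) derived only from Pool definitions."""
    if job.pool.next_pool is None:
        raise ValueError(f"pool {job.pool.name!r} has no Next Pool; nothing to migrate to")
    read_storage = job.pool.storage                 # job->pool->storage
    write_storage = job.pool.next_pool.storage      # job->pool->next_pool->storage
    return read_storage, write_storage


disk = Storage("DiskChanger")
tape = Storage("TapeLibrary")
tape_pool = Pool("TapePool", storage=tape)
disk_pool = Pool("DiskPool", storage=disk, next_pool=tape_pool)
job = Job("MigrateDiskToTape", pool=disk_pool)

read_sd, write_sd = migration_storages(job)
print(read_sd.name, "->", write_sd.name)
```

The job names only a Pool; both devices fall out of the Pool
definitions, which is the whole point of the setup steps above.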

It might be interesting to think about making a change to the default
configuration that gets initially installed with the .debs or .rpms to
configure a disk changer and disk pool by default, and then use the
enabling-migration process above to add tape capability. That would make
the Bacula installation immediately useable on install (and with
disk-based backup becoming commonplace, this would be a very handy
thing, particularly when the Windows daemons become mainstream), and
propagate the changes forward w/o breaking the world for the existing
users. One could easily write a tool to do the definition steps for the
tape enablement. 

Maybe 1.4-ish timeframe to change the initial config? 


> > > Backup Job:
> > >  write storage: job storage resource
> >
> > See above for discussion of pools. If pools drive the selection of
> > volumes, the pool definition will clearly define what SD should be used,
> > unambiguously and consistently across all actions in Bacula.
> 
> Unfortunately the above, however good it may be, is not compatible with the
> existing Bacula (if you force it on the user), and probably more than 50% of
> beginning users will not understand how pools work.   Any big enterprise
> backup expert will, but not the majority of Bacula users, IMO.

I understand that there are migration issues. Consider this a discussion
of future direction -- there will need to be some explaining done. 

> > > Restore Job:
> > >   read storage: who knows, probably the same as
> > >      Migration.  To be checked.
> >
> > In a normal case, the pool definition of the pool containing the volume
> > should determine the SD to use.
> This is now possible, but it is not *enforced* so you just need to use a
> little discipline in writing your .conf file.

OK. I'll see if I can come up with a cookbook install doc that reflects
this approach and we can kick it around here. As outlined above, it's
(IMHO) pretty straightforward, and would be more amenable to
autogeneration and long term manageability. 

> > See comments above wrt to Migration. The same cases exist; migration
> > just makes this problem more visible.
> 
> Yes, it is all a bit too complicated.  Perhaps over time, I can deprecate
> features that add to the complexity such as Run overrides, but that is not
> something that will happen any time soon.

Understood. I don't think I expected that you'd do it right now; I tend
to think about the "release after next", so I get ahead of the gory
details. 
I do think it's worth the discussion, though, especially if we end up
with a much simpler "default" configuration, or a way to autogenerate
same. 

-- db

