On Sunday 06 March 2011 13:37:31 James Harper wrote:
> > Hello James,
> >
> > On Sunday 06 March 2011 09:07:04 James Harper wrote:
> > > Is there any work in progress to detect a low disk space condition for
> > > disk based media?
> >
> > No mainly because detecting low disk space is very system dependent,
> > and I am not really sure what Bacula can do about it.  It isn't very
> > simple.
>
> That's what I figured.
>
> > > Currently when I run out of space I end up with Bacula
> > > using up tiny bits and pieces all over the place which mucks up my
> > > 'one volume = one job' assumption and requires a bit of cleaning up
> > > pain.
> >
> > I would need more information to be able to respond.  In general when
> > Bacula runs out of disk space all jobs would then fail.  Why is it
> > creating little pieces everywhere?
>
> I have several jobs running concurrently to different storage 'devices'
> which are just different directories on the same disk. Fragmentation is
> bad as we have discussed previously but does not impact performance to
> the point of being problematic, and it's useful because the disks are at
> least as fast as the combined data streams of the 3 concurrent jobs so
> the overall speed is faster.
>
> So if we consider the case where jobs A and B are running concurrently:
>
> . Job A tries to write but the api returns an error because the disk is
> full, so it marks the volume as full and moves on to the next one. The
> next volume is truncated, freeing up a small amount of space.
> . Jobs A and B are now competing over the available space, until one or
> the other errors and marks its volume as full, and moves onto the next
> one.
> . Each volume only gets a little bit of data written to it before
> becoming full, until all recyclable volumes are used up.

Thanks.  That makes it pretty clear what is happening -- not very good.
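
The sequence James describes can be reproduced with a toy model. This is only a sketch under simplifying assumptions, not Bacula code: the function name `simulate`, the unit-sized blocks, and the strict A/B alternation are all invented for illustration.

```python
from collections import deque


def simulate(free, recyclable, blocks_per_job):
    """Toy model: jobs A and B alternate one-unit writes on a shared disk.

    free        -- free units on the disk at the start
    recyclable  -- sizes of old volumes that are truncated when reused
    Returns (job, units_written) for every volume that was marked full.
    """
    pool = deque(recyclable)
    todo = {"A": blocks_per_job, "B": blocks_per_job}
    in_volume = {"A": 0, "B": 0}
    marked_full = []
    active = deque(["A", "B"])
    while active:
        job = active.popleft()
        if todo[job] == 0:
            continue  # job finished normally
        if free == 0:
            # The write fails: the job marks its current volume full.
            marked_full.append((job, in_volume[job]))
            in_volume[job] = 0
            if pool:
                # Truncating the next recyclable volume frees its space,
                # which both jobs then compete for.
                free += pool.popleft()
                active.append(job)
            # With no volume left to recycle, the job is dropped (fails).
            continue
        free -= 1
        in_volume[job] += 1
        todo[job] -= 1
        active.append(job)
    return marked_full
```

Calling `simulate(3, [2, 2, 2], 100)` in this model leaves five volumes marked full, each holding only one to three units, and uses up every recyclable volume.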

>
> > > Three ways I can think of solving this:
> > > 1. a "don't span volumes" option in the job resource
> > > 2. a "maximum volumes per job" option in the job resource
> > > 3. a way to get the storage daemon to hold the job if disk space is
> > > less than some amount, eg "minimum space = 5GB"
> > >
> > > #1 would work nicely in my case. #2 is just a more general version
> > > of #1. In this case the job would just fail if the media became full,
> > > a message would be sent to me, and I'd fix the problem and rerun
> > > whatever jobs were necessary.
> > >
> > > #3 would send a message and then just cause the sd to hang and stop
> > > accepting further data to that device until the disk space increased
> > > again. I could fix the problem and the job would continue again.
> >
> > I am not sure how one would implement any of the above.  There are a
> > lot of questions -- what does not spanning volumes do when it wants to
> > span volumes?  Fail the job, abort the SD?
>
> Failing the job would be sufficient in my case. I never want a job to
> occupy more than one volume. The volume sizes are limited only by the
> size of the disk, and there is only one disk. The only reason bacula
> would ever progress to another volume is if it ran out of space, and if
> it ran out of space then going to another volume is an exercise in
> futility anyway... a bit of space will be freed up when the next volume
> is purged, but that mucks up my retention as I have a finite number of
> volumes and they should only become ready for recycling a day or so
> before they will be required. Failing the jobs ensures the least amount
> of 'damage' is done.

That would be relatively simple to do.  It probably just needs a "Fail on disk
full" or some other Job directive and a little code to transmit the new
variable from the Dir to SD, and a small amount of code in the SD.  Instead of
a Job directive, it could also be a Device directive which would be *much*
simpler, but would give less control to the Bacula admin.
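
As a rough illustration of the "fail on disk full" behaviour (a sketch only, not the SD's actual code; `write_block`, `JobFailedDiskFull` and the `fail_on_disk_full` flag are hypothetical names), the writer could treat an out-of-space error as fatal to the job instead of as a cue to move to the next volume:

```python
import errno
import os


class JobFailedDiskFull(Exception):
    """Hypothetical error: fail the job rather than span volumes."""


def write_block(fd, data, fail_on_disk_full=True):
    """Write one block to an open volume file descriptor.

    Returns True when the whole block was written. Returns False when
    the disk is full and the caller should mark the volume full and
    move on (the behaviour described earlier in this thread).
    """
    try:
        written = 0
        while written < len(data):
            written += os.write(fd, data[written:])
        return True
    except OSError as exc:
        if exc.errno != errno.ENOSPC:
            raise  # unrelated I/O error, not handled in this sketch
        if fail_on_disk_full:
            raise JobFailedDiskFull("disk full; failing the job") from exc
        return False
```

Whether the flag comes from a Job or a Device directive only changes where `fail_on_disk_full` is read from; the check itself stays the same.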

>
> > The SD doesn't currently have the concept of suspending a device, nor
> > does it have the concept of "holding" jobs, so this would be a whole
> > new concept to design and implement.  I guess I would need a more
> > complete design to understand what is supposed to happen in every
> > case, then we could examine the pros and cons.
>
> I was imagining that it just wouldn't proceed. It would keep responding
> to heartbeats etc but otherwise just act like a device that has blocked.
> Maybe there are internal timeouts to prevent this sort of thing though.
>
> > A simpler solution might be to be able to reserve a specified space on
> > a newly created volume.  This would guarantee that the space was
> > available.  If the space could not be obtained, the Volume creation
> > process would fail, and Bacula would simply ask for a new Volume.  I
> > am not 100% sure how to reserve space for a volume without writing in
> > it, which would be a bit inefficient, but we could probably figure out
> > something.
>
> The question is how much space to reserve? I do 1 full backup a week and
> then 3 incremental backups per day. This all gets done to the same pool
> so the volume sizes vary from a few hundred kb to a few hundred gb.
>
> One thought... if I separated the full and incremental backups into
> different pools (something I had considered doing anyway) then the full
> backups at least are fairly consistent in size, each one being a little
> bit larger than the one before. What if when overwriting a volume,
> Bacula seeked to the start of the volume without truncating it, and only
> truncated it at the end of the last write of the job. That would solve a
> few problems (including fragmentation) and would ensure that there
> wasn't massive amounts of reallocation going on. It would need to be a
> job resource switch or something though as this option wouldn't be
> useful for anyone writing more than one job to a volume, or writing
> volumes that vary wildly in size. It also wouldn't completely solve my
> out of space problem.

Yes, we could do that.  The difficulty is to be able to mark the end of the 
Volume so that Bacula knows where it is.  Currently, it seeks to the end of 
the volume when it "opens" it and compares the address with what is in the 
catalog to ensure that they are synchronized.  So we would need a new way to 
seek to the end.  This is something that would be relatively trivial to do 
with the new block aligned Volume format I am planning to add ...
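
The file-level part of James's idea (overwrite from the start, truncate only after the last write) can be sketched as below. This is an assumption-laden illustration, not Bacula code: `rewrite_volume_in_place` is a hypothetical name, and the sketch does not solve the end-of-Volume marking problem mentioned above. It only returns the end offset that would have to agree with the catalog.

```python
import os


def rewrite_volume_in_place(path, blocks):
    """Overwrite an existing volume file from offset 0 without truncating
    it first; cut off any stale tail only after the last write.

    Returns the end offset of the newly written data.
    """
    # "r+b" opens for reading and writing without truncating;
    # the file must already exist.
    with open(path, "r+b") as vol:
        vol.seek(0)
        for block in blocks:
            vol.write(block)
        end = vol.tell()
        vol.truncate(end)  # drop leftover bytes from the previous job
        vol.flush()
        os.fsync(vol.fileno())
    return end
```

Until the final `truncate`, bytes beyond the current write position still belong to the previous job, which is why some separate record of the real end of data is needed.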

>
> > A variation on the above would be to require a given amount of free
> > space (again this is a very system dependent function to determine the
> > free space) before creating a volume.
>
> Not useful in my case unless we invent some fancy heuristics to figure
> out how large the next volume is probably going to be.
>
> Thanks for thinking about it anyway. The running out of space issue is
> something that should never have happened (nothing to do with bacula - I
> just wasn't monitoring the space closely enough), I was just concerned
> at the mess Bacula made of my volumes when it did run out of space.

OK.
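
For reference, the two space-related ideas discussed above (a minimum-free-space check and reserving space for a new volume) could look roughly like this on a POSIX system. Both function names are hypothetical, and `os.posix_fallocate` is not available on every platform, which matches the "very system dependent" caveat. The sketch also leaves open James's question of how much to reserve.

```python
import os
import shutil


def has_minimum_space(directory, minimum_bytes):
    """True if the filesystem holding `directory` has at least
    `minimum_bytes` free for unprivileged writers."""
    return shutil.disk_usage(directory).free >= minimum_bytes


def reserve_volume(path, size_bytes):
    """Create `path` and reserve `size_bytes` for it up front.

    Raises OSError (typically ENOSPC) if the space cannot be obtained,
    and removes the partly created file in that case.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o640)
    try:
        os.posix_fallocate(fd, 0, size_bytes)  # POSIX only
    except OSError:
        os.close(fd)
        os.remove(path)
        raise
    os.close(fd)
```

With the "minimum space = 5GB" example from earlier in the thread, the check would be `has_minimum_space(volume_dir, 5 * 1024**3)`, where `volume_dir` stands for the storage device's directory.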

Kern

>
> James



_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel
