Re: [Bacula-users] RunClientAfterJob behavior in bacula 2

Kern Sibbald Fri, 09 Feb 2007 01:27:16 -0800

On Friday 09 February 2007 05:21, Eric Bollengier wrote:
> Hi,
>
> > > > When I recently upgraded to 2.0 I found out the hard way that the
> > > > RunClientAfterJob directive semantics changed quite a bit with the
> > > > new version: while in 1.38 the script was run after all activities
> > > > requiring a connection with the client were over (including data and
> > > > attribute despooling), in 2.0 the script is run as soon as data
> > > > spooling is complete. I did not find any mention of this fact in the
> > > > release notes, but from a couple of posts on the devel list I got the
> > > > impression that this was a deliberate design decision made in order
> > > > to allow restarting applications on the client as soon as possible.
> > >
> > > Yes, exactly.
> >
> > Eric, I was not aware of this change.  Can you explain why it was done?
>
> We discussed that (not on bacula-devel) on 11/07/2006, and we
> thought that it was a good idea. I had never thought of using this kind
> of thing with this option (it's a good idea to save money).
>
> At that time, the update of the catalogue could take several hours, and
> so applications had to wait for nothing.


We discussed several things, but apparently we were not talking about the same 
things.  One, we discussed keeping RunClientScript compatible, and for that 
you modified the code.  I understood that after that everything with the old 
code would be compatible.

The second thing that I think you are mixing up is that we talked about the FD 
doing a quick disconnect.  This I have implemented, but NOT in version 2.0.x.  
It is currently in 2.1.2, and for the moment I have no intention of 
backporting it to 2.0.x as I have done with much of the other new code, simply 
because I want this new feature to be well tested so that it does not cause 
compatibility problems.

The definition of RunClientAfterJob (as well as RunScript) should be that the 
script is run AFTER the job, not while it is still waiting on the SD.  That 
means it is run when the File daemon has terminated the job, and for the FD 
the job is not terminated until the SD releases the FD.  So the 
RunClientAfterJob script should run only once the FD is in the final stages 
of termination.

Now one of the subtleties of the 2.1.2 code is that the FD is released 
sooner.  However, I think it is still quite acceptable to run the 
RunClientAfterJob script after the FD job, even if it then comes earlier in 
the overall evolution of the job from the standpoint of the Director and the 
SD.
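
To make the ordering concrete, here is a rough sketch of the sequence of 
events as seen from the FD.  It is only an illustrative model, not Bacula 
code: the event names, the ScriptPoint enum and the sd_quick_release flag are 
all hypothetical, and the model assumes that the only difference between 
2.0.x and 2.1.2 is when the SD sends its final OK.

```python
from enum import Enum


class ScriptPoint(Enum):
    """Where the client-side 'after job' script is called (hypothetical names)."""
    AFTER_DATA_SENT = "after_data_sent"      # behaviour reported for 2.0
    AFTER_SD_FINAL_OK = "after_sd_final_ok"  # definition described above


def fd_job_events(script_point, sd_quick_release=False):
    """Return the ordered list of events for one job in this simplified model.

    sd_quick_release=True models the 2.1.2 idea that the SD gives its final
    OK to the FD before the SD and Director have finished their own work.
    """
    events = ["fd_sends_data"]
    if script_point is ScriptPoint.AFTER_DATA_SENT:
        events.append("run_client_after_job_script")
    if not sd_quick_release:
        # Without quick release the FD waits while the SD/Dir finish up.
        events.append("sd_and_dir_finish_remaining_work")
    events.append("sd_final_ok")
    if script_point is ScriptPoint.AFTER_SD_FINAL_OK:
        events.append("run_client_after_job_script")
    events.append("fd_job_ends")
    if sd_quick_release:
        # With quick release the remaining work happens after the FD is done.
        events.append("sd_and_dir_finish_remaining_work")
    return events


if __name__ == "__main__":
    for point in ScriptPoint:
        for quick in (False, True):
            print(point.name, "quick_release=%s" % quick)
            print("   ", " -> ".join(fd_job_events(point, quick)))
```

In this model the script always follows the SD final OK when it is called at 
AFTER_SD_FINAL_OK; quick release only changes how early that OK arrives.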

>
> > I see no reason to modify the behavior of an existing directive when you
> > have provided a new mechanism that permits (or should permit) the user to
> > choose exactly when the RunScript is executed.
>
> I will look at how to do this. I think I could have some new options like
> RunScript {
>     RunsSoon = yes|no      # yes by default; set to no on ClientRunAfterJob
> }
> But with the new quick release FD feature, I don't know if it's very
> useful.
>
> Or I can add this in the 1.3X/2.X compatibility layer, i.e. if you are
> using the old protocol, your command will be run at the end of the job.

I'm not too worried about the details of how to "fix" this, since you are the 
programmer, but the default is that the old RunClientAfterJob should be run 
only after the SD signals that it got all the data.  I do think that there is 
no need to implement RunsSoon because the 2.1.2 code will cause the SD to 
give an OK to the FD sooner, and then the FD can terminate the job.  In 
other words, you can most likely simply move the call to the script to after 
the SD final OK.  In the short term, for 2.0.x the script will run a bit 
later, because the final SD OK comes later, but I don't see that as a big 
problem.

>
> > If we make things like RunClientAfterJob incompatible, it will break a
> > lot of programs.   It seems that this is not something to be documented,
> > but rather a bug to be fixed.
>
> ok
>

I think the documentation that you added can remain, but please change it 
slightly.  You make reference to "data spooling", but in the context of the 
File daemon, there is no such thing as data spooling (spooling is a term that 
is very similar to caching).  

The documentation should simply say that the script is run once the File 
daemon has sent all the data to the Storage daemon AND the SD has 
acknowledged that it successfully received that data.  If you want to clarify 
it further, you can say that the script is run at the end of the FD job, 
which may not be the final end of the job in the SD and Dir since they may 
have more work to do (i.e. put remaining data on the Volume and add 
attributes to the catalog).

Best regards,

Kern

