Hello,

For the list, we have been discussing how to handle releasing the new batch DB 
insert code, mostly offlist because a large part of the discussion was in 
French, which I don't consider appropriate on the regular devel list. 

However, since this is an important feature, and since I now have a proposal 
on how to proceed, I would like to do this final part on list to ensure that 
there are no misunderstandings.  I've copied the bacula-users list on this 
email, but I prefer that all future emails (any responses to this) be sent 
only to the bacula-devel list.

First, to explain, the new batch DB insert code is a totally new algorithm for 
inserting the attributes into the database.  The basic concept, without going 
into any detail, is that the attributes under the new code are inserted into 
a new "temporary" table in the database. This is done without global locking 
on the DB, which permits multiple simultaneous inserts.  The final step is 
to move the contents of this temporary table into the permanent catalog 
tables, which involves running a few SQL statements that deal with the 
filename and path records and then do a batch insert of the table's 
contents.  The overall performance may not be significantly improved for a 
single job.  However, multiple simultaneous jobs running on multi-processor 
systems (as most are today) can be significantly faster.
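
To give a rough idea of what this looks like, here is a sketch in SQL.  This 
is only an illustration of the principle, not the actual statements in the 
patch; the real table layout and queries may well differ:

   -- Each job fills its own temporary table, with no global lock,
   -- so several jobs can insert attributes at the same time.
   CREATE TEMPORARY TABLE batch (
      FileIndex INTEGER,
      JobId     INTEGER,
      Path      TEXT,
      Name      TEXT,
      LStat     TEXT,
      MD5       TEXT
   );

   -- At the end of the job, create any Path records that do not yet
   -- exist in the catalog (Filename is handled the same way) ...
   INSERT INTO Path (Path)
      SELECT a.Path FROM (SELECT DISTINCT Path FROM batch) AS a
       WHERE NOT EXISTS (SELECT Path FROM Path WHERE Path.Path = a.Path);

   -- ... then insert all the File records in one batch, joining the
   -- temporary table against Path and Filename.
   INSERT INTO File (FileIndex, JobId, PathId, FilenameId, LStat, MD5)
      SELECT b.FileIndex, b.JobId, Path.PathId, Filename.FilenameId,
             b.LStat, b.MD5
        FROM batch b
        JOIN Path ON (Path.Path = b.Path)
        JOIN Filename ON (Filename.Name = b.Name);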

Now the main purpose of this email is to list the items that I feel must be 
resolved prior to releasing this code in production, and to clearly define 
who is responsible for correcting each item and the timeframe for doing so.

The issues as I see them are the following:

1. With the batch insert code turned on there are a number of regression tests 
that fail.  They must all pass without errors prior to production release.
Responsible: Eric
Deadline: Roughly the end of March

2. I am very concerned that the new batch insert code will, under certain 
circumstances, fail with an out-of-memory condition because the load placed 
on the SQL engine will be too large.  Marc points out that the SQL engine 
should handle this, but my experience is that most users (including myself) 
install MySQL or PostgreSQL without understanding the details of tuning the 
engine, and in its default configuration the engine can fail catastrophically 
(out of memory).  This currently never happens within the Bacula code because 
of a very conservative design, but it does happen in dbcheck when *all* the 
data is maintained in the engine.  The particular command that fails is:

"SELECT JobMedia.JobMediaId,Job.JobId FROM JobMedia "
                "LEFT OUTER JOIN Job ON (JobMedia.JobId=Job.JobId) "
                "WHERE Job.JobId IS NULL LIMIT 300000";

and it fails only if I remove the "LIMIT 300000".
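
The remedy here is the general one I want applied to the batch inserts as 
well: work in bounded passes so that the engine never has to hold the full 
result set.  A sketch, for illustration only (not the actual dbcheck code):

   -- One bounded pass: find at most 300000 orphaned JobMedia records.
   SELECT JobMedia.JobMediaId FROM JobMedia
     LEFT OUTER JOIN Job ON (JobMedia.JobId = Job.JobId)
     WHERE Job.JobId IS NULL LIMIT 300000;
   -- Act on the records found, then repeat the SELECT until a pass
   -- returns fewer than 300000 rows.  The engine's working set stays
   -- bounded no matter how large the catalog is.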

The bottom line is that I believe there are three measures we should take to 
ensure that this does not happen, and that if it does, the user will have a 
way to easily work around the problem without needing two years of training 
as a DBA.

- Write some test code inside Bacula that will create 10 million batch insert 
records so that we can test the new code on "default" database installations 
(this item would be nice, but it is not required).
- First, have a default limit on the number of records that will be inserted 
in any one batch request.  This should guarantee that an out-of-memory 
problem will not normally occur.
- Second, add a new directive to the Catalog resource that allows the user to 
change the default batch size (by setting the value to 0 or very large, you 
can effectively allow the batch to be arbitrarily large).
- Third (this is optional, but very desirable), implement a new directive that 
allows the user to enable/disable the batch insert mechanism.  This would 
require a bit of change to the subroutine names in the cats directory so that 
the new code and the old code can both be present at the same time, with the 
one actually used selected at runtime for each DB based on the directive 
value.  If the first and second items are implemented, the batch insert would 
be enabled by default; otherwise it would be turned off.  (A rough sketch of 
how these directives might look in the Catalog resource follows below.)
Responsibility for the above three (four) points: Eric (and possibly Marc)
Deadline: Roughly the end of March
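
To make the proposal concrete, here is roughly how the directives proposed in 
the second and third items might look in the Catalog resource of 
bacula-dir.conf.  The directive names and the limit value are placeholders 
for illustration only; the final names and defaults are up to whoever 
implements this:

   Catalog {
     Name = MyCatalog
     dbname = bacula; user = bacula; password = ""
     # Placeholder directive names, not yet implemented:
     Batch Insert = yes           # enable/disable the new insert code
     Batch Insert Limit = 500000  # max records per batch request; 0 = no limit
   }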

3. Documentation
Before putting this into production, I would like to see a bit more 
documentation about the new algorithms -- this is documentation that would be 
placed in the code.  Marc has offered to answer questions and write some 
documentation, and I offer to integrate it into the code and to continue 
asking questions until it is clear.  This item should be no problem to 
resolve.
Responsible: Kern + Marc (with help from Eric if he wants)
Deadline: Kern understands the algorithm by mid-April.

Comments?

Best regards,

Kern

