Hello,

For the list: we have been discussing how to handle releasing the new batch DB insert code, mostly off-list because a large part of the discussion was in French, which I don't consider appropriate on the regular devel list.
However, since this is an important feature, and since I now have a proposal on how to proceed, I would like to do this final part on list to ensure that there are no misunderstandings. I've copied the bacula-users list on this email, but prefer that all future emails (any responses to this) are sent only to the bacula-devel list.

First, to explain, the new batch DB insert code is a totally new algorithm for inserting the attributes into the database. The basic concept, without going into any detail, is that the attributes under the new code are inserted into a new "temporary" table in the database. This is done without global locking on the DB, which permits multiple simultaneous inserts. The final part of inserting this temporary table into the database involves running a few SQL statements that deal with the filename and path records and then do a batch insert of the table. The overall performance may not be significantly improved for a single job. However, multiple simultaneous jobs running on multi-processor systems (as most are today) can be significantly faster.

Now the main purpose of this email is to list the items I feel are necessary to resolve prior to releasing this code in production, and to clearly define who is responsible for correcting those items and the timeframe.

The issues as I see them are the following:

1. With the batch insert code turned on there are a number of regression tests that fail. They must all pass without errors prior to production release.

Responsible: Eric
Deadline: Roughly the end of March

2. I am very concerned that the new batch insert code will under certain circumstances fail with an out of memory condition because the load placed on the SQL engine will be too large. Marc points out that the SQL engine should handle this, but my experience is that most users (including myself) install MySQL or PostgreSQL and do not understand the details of tuning the engine, and in its default mode, it can have a catastrophic failure (out of memory).
This currently never happens within Bacula code because of a very conservative design, but it does happen in dbcheck. This happens when *all* the data is maintained in the engine. The particular command that fails is:

"SELECT JobMedia.JobMediaId,Job.JobId FROM JobMedia "
"LEFT OUTER JOIN Job ON (JobMedia.JobId=Job.JobId) "
"WHERE Job.JobId IS NULL LIMIT 300000";

and it only fails if I remove the "LIMIT 300000".

The bottom line is that I believe that there are three measures that we should take to ensure that this does not happen, and if it does, the user will have a way to easily work around the problem without needing two years of training as a DBA.

- Write some test code inside Bacula that will create 10 million batch insert records (this item would be nice, but it is not required) so that we can test it on "default" database installations.

- First, have a default limit on the number of records that will be inserted in any one batch request. This should guarantee that an out of memory problem will not normally occur.

- Second, add a new directive to the Catalog resource that allows the user to change the default batch size (by setting the value to 0 or very large, you can effectively allow the batch to be arbitrarily large).

- Third (this is optional, but very desirable), implement a new directive that allows the user to enable/disable the batch insert mechanism. This would require a bit of change in the subroutine names in the cats directory so that we can enable the new code and the old code at the same time, but select which is used at runtime for each DB based on the directive value.

If the first and second items are implemented, the batch insert would be enabled by default, otherwise it will be turned off.

Responsibility for above 3 (4) points: Eric (and possibly Marc)
Deadline: roughly the end of March

3. Documentation

Before putting this into production, I would like to see a bit more documentation about the new algorithms -- this is documentation that would be placed in the code. Marc has offered to answer questions and write some documentation; I offer to integrate it into the code and to continue asking questions until it is clear. This item should be no problem to resolve.

Responsible: Kern + Marc (with help from Eric if he wants)
Deadline: Kern understands the algorithm by mid April.

Comments?

Best regards,

Kern
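
For illustration only, here is a rough sketch of the idea in item 2: stage attribute records in a temporary table, then fold them into the path, filename and file tables, never letting one batch exceed a configurable size. This is not Bacula code. It uses Python with SQLite so that it runs anywhere, and the schema, the column names, the function names and the `batch_size` parameter are all assumptions made up for the example. A `batch_size` of 0 stands in for "arbitrarily large".

```python
import sqlite3
from itertools import islice


def create_schema(conn):
    """Create a much-simplified, hypothetical catalog schema."""
    conn.executescript("""
        CREATE TABLE Path (PathId INTEGER PRIMARY KEY, Path TEXT UNIQUE);
        CREATE TABLE Filename (FilenameId INTEGER PRIMARY KEY, Name TEXT UNIQUE);
        CREATE TABLE File (JobId INTEGER, PathId INTEGER, FilenameId INTEGER);
    """)


def chunks(records, batch_size):
    """Yield lists of at most batch_size records; 0 or less means no limit."""
    it = iter(records)
    while True:
        chunk = list(it) if batch_size <= 0 else list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


def batch_insert(conn, job_id, records, batch_size=1000):
    """Insert (path, name) records for one job; return the number of batches."""
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS batch (PathText TEXT, NameText TEXT)")
    flushed = 0
    for chunk in chunks(records, batch_size):
        # Stage the raw attributes in the temporary table.
        conn.executemany("INSERT INTO batch VALUES (?, ?)", chunk)
        # Add any path and filename records that do not exist yet.
        conn.execute(
            "INSERT OR IGNORE INTO Path (Path) "
            "SELECT DISTINCT PathText FROM batch")
        conn.execute(
            "INSERT OR IGNORE INTO Filename (Name) "
            "SELECT DISTINCT NameText FROM batch")
        # Batch insert the file records, resolving the ids with joins.
        conn.execute(
            "INSERT INTO File (JobId, PathId, FilenameId) "
            "SELECT ?, Path.PathId, Filename.FilenameId FROM batch "
            "JOIN Path ON Path.Path = batch.PathText "
            "JOIN Filename ON Filename.Name = batch.NameText",
            (job_id,))
        # Empty the staging table so the next batch starts clean.
        conn.execute("DELETE FROM batch")
        conn.commit()
        flushed += 1
    return flushed


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    demo = [("/etc/", "passwd"), ("/etc/", "hosts"), ("/home/", "hosts")]
    print(batch_insert(conn, 1, demo, batch_size=2), "batches flushed")
```

The sketch uses a single connection, so it only shows the size limit, not the concurrency benefit. The point is that the amount of data the engine has to hold for one flush is bounded by `batch_size` rather than by the size of the job.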