On Fri, 7 Sep 2012, Reinier Olislagers wrote:
For my Dutch postcode program https://bitbucket.org/reiniero/postcode with an embedded Firebird 2.5 database, I allow users to read in a CSV file with new or updated postcode data. I use sqldb, FPC x86. I'd like to get your suggestions on speed improvements. I try to get the data into a temporary table as quickly as possible. Later on, a stored procedure will normalize the data and insert to/update various tables (with postcode, city, street information, etc). Because I also allow querying information, I set up 2 connections+transactions: for reading and writing in my database class constructor, and destroy them in the destructor. However, (currently) my application controls the database and I know that querying and bulk inserts at the same time is impossible. The write transaction has this code: FWriteTransaction.Params.Add('isc_tpb_concurrency'); FWriteTransaction.Params.Add('isc_tpb_write'); FWriteTransaction.Params.Add('isc_tpb_no_auto_undo'); //disable transaction-level undo log, handy for getting max throughput when performing a batch update My code loads an ANSI CSV file into a csvdocument in memory (about 50meg), then goes through it, and calls an insert procedure for each record (converting the field contents to UTF8): FDBLayer.BulkInsertUpdateRecord( SysToUTF8(Postcodes.Cells[ProvinceField,LineNum]), SysToUTF8(Postcodes.Cells[CityField,LineNum]), SysToUTF8(Postcodes.Cells[PostcodeField,LineNum]), SysToUTF8(Postcodes.Cells[StreetField,LineNum]), StrToInt(Postcodes.Cells[NumberLowestField,LineNum]), StrToInt(Postcodes.Cells[NumberHighestField,LineNum]), Even, Latitude, Longitude); Relevant snippets from the insert procedure: QuerySQL='INSERT INTO BULKINSERTDATA '+ '(PROVINCENAME,CITYNAME,POSTCODE,STREETNAME,LOW,HIGH,EVEN,LATITUDE,LONGITUDE) '+ 'VALUES ( '+ ':PROVINCENAME,:CITYNAME,:POSTCODE,:STREETNAME,:LOW,:HIGH,:EVEN,:LATITUDE,:LONGITUDE)'; then the transaction is started (if it is inactive) and the query parameters are filled (using Query.Params.ParamByName, but I don't suppose that would be a big slowdown??); finally the SQL is executed. The transaction is left open.
Do you prepare the query before you start the batch ? If not, it is prepared on every insert, which is inherently slower.
Currently, after every 100 records, the transaction is committed: if (linenum mod 100=0) then FDBLayer.BulkInsertCommit(false); IIRC, advice on the Firebird list is to play with this interval; any suggestions? Given the aggressive nature of the transaction parameters, I might even dispense with it.
I once did tests with that (600.000 records) and did not notice any influence of the transaction control. Michael. _______________________________________________ fpc-pascal maillist - fpc-pascal@lists.freepascal.org http://lists.freepascal.org/mailman/listinfo/fpc-pascal