Hi I've been doing some testing to see what effects might be seen from getting rid of the base64-encoded `lstat' field, instead storing the values as native database fields.
As I already posted, there's no significant change in table size under PostgreSQL. I needed to test this for the other bacula-supported databases too, though. Taking my `file' table from my working Bacula instance, I get the following table sizes in the various supported database systems for the original table with base64-encoded `lstat' field and the converted version with expanded lstat fields: DBMS Size (b64) Size (expanded) Increase % PostgreSQL 1667596288 1670381568 2785280 +0.15 MySQL5-MyISAM 1335287664 1236507328 -98780336 -7.39 MySQL5-InnoDB 1720713216 1774190592 53477376 +3.10 SQLite3 1403427840 1246750720 -156677120 -11.16 So ... in table size terms, it's actually a win for SQLite and MyISAM. There's no significant change for PostgreSQL, and a small loss for InnoDB. Details on testing methods used at end of post. I'll follow up shortly with INSERT times in a later post. I'm actually wondering, though, if Bacula shouldn't be using the various database's bulk-insert facilities - multi-valued inserts or csv bulk loading. For example, when using PostgreSQL you can use COPY to do bulk-inserts: PQexec(conn, "COPY foo FROM STDIN"); PQputline(conn, "3\thello world\t4.5\n"); PQputline(conn,"4\tgoodbye world\t7.11\n"); ... PQputline(conn,"\\.\n"); PQendcopy(conn); It's probably possible to do something similar with libmysqlclient, implementing LOAD DATA INFILE ... LOCAL to feed MySQL the data. and in both MySQL and PostgreSQL you can use multi-valued inserts, so for the table: CREATE TABLE a (x integer); you can insert (say) rows: INSERT INTO a (x) values (1),(2),(3),(4),(5); Unfortunately, multi-valued inserts are not supported by SQLite3. **** NOTES ON TESTING **** Pg table size obtained with "select pg_total_relation_size(file)". MySQL table sizes obtained with "SHOW TABLE STATUS". SQLite table size obtained with `ls -l' on the .sqlite file, which has only one table. The same schema definition was used for all tests; the only change was appending "ENGINE=MyISAM" or "ENGINE=InnoDB" for the MySQL tests. The schema was: CREATE TABLE file ( fileid bigint NOT NULL, fileindex integer DEFAULT 0 NOT NULL, jobid integer NOT NULL, pathid integer NOT NULL, filenameid integer NOT NULL, markid integer DEFAULT 0 NOT NULL, lstat text NOT NULL, md5 text NOT NULL ); Indexes were omitted for the purposes of this test; there are no indexes on any of the tables in any of the tested databases. Since none of the fields we're changing the format of are indexed, they don't matter, and getting rid of them makes dumps/loads a LOT quicker. All tables were populated from the same tab-separated dump generated with "\copy file to 'file.tsv'" in psql and loaded with "mysqlimport" (MySQL) and ".mode tabs; .import" (SQLite3). I converted the `lstat' field to native database types with the following command in Pg: SELECT fileid, fileindex, jobid, pathid, filenameid, markid, (decode_stat(lstat)).*, md5 INTO file2 FROM file; ( "decode_stat" being the C extension to PostgreSQL I posted here earlier, that gives PostgreSQL the ability to decode bacula's pseudo-base64-encoded lstat field ). -- Craig Ringer ------------------------------------------------------------------------------ This SF.net email is sponsored by: High Quality Requirements in a Collaborative Environment. Download a free trial of Rational Requirements Composer Now! http://p.sf.net/sfu/www-ibm-com _______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users