Well, the time is nigh. Our SQLite database has hit 1.3GB and produces a 900MB dump file, which is big enough to cause our Bacula Xen virtual machine to run out of space when running a catalog backup.
We have a Postgres (7.4) server with plenty of space, so I thought I'd try importing the catalog into that. Things didn't go so smoothly, so I ended up knocking up a script to do the import. I haven't actually finished it yet, and I know there are some optimisations which could be made, but I hope this is useful to someone.

What could be better:

* Automatic reorganisation of CREATE statements so that they execute in the correct order
* Command-line parameters
* Handling of arbitrary conversions (Postgres -> MySQL, MySQL -> Postgres, SQLite -> MySQL, MySQL -> SQLite, Postgres -> SQLite)

Note that it would be a good idea to turn off fsync on Postgres while you do the import - I haven't benchmarked it, but it should make things go a good bit quicker (there's a rough sketch of what I mean below my sig).

I haven't actually tested this yet, so it's probably broken and riddled with bugs. Still, enjoy :)

--
Russell Howe
[EMAIL PROTECTED]
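For the curious, this is roughly what I mean by turning fsync off. It's only a sketch - the config file location and the reload command below are assumptions (they depend on how your Postgres is packaged), so check your own setup before copying:

# Assumed paths/commands - adjust for your installation.
# In postgresql.conf, disable fsync for the duration of the import:
#     fsync = false
# then get the server to re-read its config, e.g. on a Debian-ish box:
/etc/init.d/postgresql reload
# ...run the import, then set fsync back to true and reload again.

Obviously you only want fsync off while the bulk load is running; leave it on for normal operation.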
#!/bin/bash
# Import an SQLite dump of a Bacula catalog into Postgres
# Designed for v1.63.3 (as found on Debian sarge)
#
# v0.5
#
# Copyright (c) 2006 Russell Howe <[EMAIL PROTECTED]>
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.

# The SQLite dump we are importing from
FILE=bacula.sql

# Tables, in order of size
TABLES=(File Filename Path Job Media Pool CDImages Counters Version Client FileSet JobMedia NextId UnsavedFiles BaseFiles)

# Tables, in insert order
TABLESINSERT=(Pool CDImages Client Counters FileSet Filename Job Media JobMedia NextId Path File UnsavedFiles Version BaseFiles)

DBNAME=bacula
LOGFILE="/var/tmp/sqlite2pgsql.$$.log"

# Load a file of SQL into $DBNAME via psql, aborting on failure
importdata() {
    if [ "x" == "x$1" ]; then
        echo "Error: importdata() called without an argument. Aborting."
        exit 1
    fi

    SQLFILE="$1"

    if [ ! -r "$SQLFILE" ]; then
        echo "Error: Cannot read from $SQLFILE. Aborting."
        exit 1
    fi

    echo -n "Loading $SQLFILE into database $DBNAME..."
    psql -d "$DBNAME" -f "$SQLFILE" || {
        echo "Failed to load $SQLFILE into database $DBNAME. psql exited with return code $?. Aborting."
        exit 1
    }
}

# Go through each of the table names, splitting the INSERT statements off
# into separate files, one per table, each wrapped in a transaction
for table in "${TABLES[@]}"; do
    SRC="$FILE.other"
    if [ ! -f "$FILE.other" ]; then
        SRC="$FILE"
    fi

    PATTERN="^INSERT INTO $table "

    if [ ! -f "$FILE.data.$table" ]; then
        echo -n "Separating $table table from database dump..."
        echo "BEGIN TRANSACTION;" > "$FILE.data.$table.tmp"
        grep "$PATTERN" "$SRC" >> "$FILE.data.$table.tmp"
        echo "COMMIT;" >> "$FILE.data.$table.tmp"
        mv "$FILE.data.$table.tmp" "$FILE.data.$table"
        echo "done. ($FILE.data.$table)"

        echo -n "Stripping matched lines from the source file to speed up the next round..."
        grep -v "$PATTERN" "$SRC" > "$FILE.other.tmp"
        mv "$FILE.other.tmp" "$FILE.other"
        echo "done."
    else
        echo "$FILE.data.$table already exists. Assuming this table has already been split"
        echo "off from the main dump. Not regenerating."
    fi
done

echo "Separating DDL statements from INSERT statements"
grep -v "^INSERT" "$FILE.other" > "$FILE.ddl"
echo "DDL statements are now in $FILE.ddl"
grep "^INSERT" "$FILE.other" > "$FILE.data.other"
echo "Any remaining INSERT statements are now in $FILE.data.other"

echo "Fixing up datatypes used in the DDL..."
# Map SQLite column types onto rough Postgres equivalents
sed -e 's/TINYINT/SMALLINT/g' \
    -e 's/DATETIME/TIMESTAMP/g' \
    -e 's/INTEGER UNSIGNED/INTEGER/g' \
    -e 's/BIGINT UNSIGNED/BIGINT/g' \
    -e 's/INTEGER AUTOINCREMENT/SERIAL/g' \
    -e "s/ DEFAULT \"\"/ DEFAULT ''/g" \
    -e "s# TIMESTAMP DEFAULT 0# TIMESTAMP DEFAULT '1/1/1970'#g" \
    "$FILE.ddl" > "$FILE.ddl.postgres"

echo "Fixing Pool table..."
sed -e 's/,0,0);$/,NULL,NULL);/' "$FILE.data.Pool" > "$FILE.data.Pool.fixed"

echo "OK, we should be ready to import data into PostgreSQL now. DDL first..."
echo "This will probably fail the first time. You will have to edit $FILE.other"
echo "and rearrange the CREATE TABLE statements so that the tables are created"
echo "in the correct order."
echo "After editing $FILE.other, simply rerun this script and it will carry on"
echo "where it left off."

importdata "$FILE.ddl.postgres"

for table in "${TABLESINSERT[@]}" other; do
    IMPORTFILE="$FILE.data.$table"
    if [ -f "$FILE.data.$table.fixed" ]; then
        IMPORTFILE="$FILE.data.$table.fixed"
    fi

    importdata "$IMPORTFILE" 2>&1 | tee -a "$LOGFILE"
done

echo "All done! Check $LOGFILE for errors."
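In case it's useful, this is roughly how I'd expect the thing to be driven. The script name, the catalog path and the sqlite binary are placeholders/assumptions (Debian normally keeps the SQLite catalog under /var/lib/bacula, and depending on which SQLite your Bacula was built against you may need the sqlite binary rather than sqlite3), so adjust to taste:

# Dump the SQLite catalog to the filename the script expects (FILE=bacula.sql).
# Catalog path is an assumption - adjust for your installation.
sqlite3 /var/lib/bacula/bacula.db .dump > bacula.sql
# Create the target database (run as a Postgres user that is allowed to)
createdb bacula
# Run the import from the same directory as bacula.sql
# ("sqlite2pgsql.sh" is just whatever you saved the script as)
bash sqlite2pgsql.sh

If the DDL import fails because of table ordering, rearrange the CREATE statements as the script suggests and just run it again - it skips any tables it has already split off.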