Now that I've managed to get the data saved to the DBs and the gnucash's exitted, it is time to check the data in the DBs.
As a check on the DB data, I created an awk script to count the numbers of accounts, transactions, and splits in the xml data file. Here is the awk script: BEGIN { nAccounts = 0; nTransactions = 0; nSplits = 0; } /^<gnc:account/ { nAccounts ++ } /^<gnc:transaction/ { nTransactions ++ } /<trn:split>/ { nSplits ++ } END { printf("Number of Accounts %d\n", nAccounts); printf("Number of Transactions %d\n", nTransactions); printf("Number of Splits %d\n", nSplits); } Here is the output from the script: Number of Accounts 523 Number of Transactions 3663 Number of Splits 11487 The following are the results for each DB: (editted for clarity) SQLITE: sqlite> select count(*) from accounts; 522 sqlite> select count(*) from transactions; 3665 sqlite> select count(*) from splits; 11494 sqlite> select count(*) from slots; 1637 MySQL: mysql> select count(*) from accounts; | 522 | mysql> select count(*) from transactions; | 3665 | mysql> select count(*) from splits; | 11494 | mysql> select count(*) from slots; | 1637 | PostgreSQL: gnucash_db=# select count(*) from accounts; 522 gnucash_db=# select count(*) from transactions; 3665 gnucash_db=# select count(*) from splits; 11494 gnucash_db=# select count(*) from slots; 1637 The good news is that now all three DBs have the same number of records. However, it does not agree with the output of my awk script. Since I have one extra account showing in the XML file, I checked to see if the DBs have the ROOT account in them. They do not. QUESTION: Is this a problem? The next important question I had to look at is: why do I have 2 extra transactions and 7 extra splits in the DBs? (Most likely those scheduled transactions from when I opened the XML files.) Comparison of the guids for transactions and splits in each DB revealed 3 transactions guids were different in each DB. The ones from MySql were not present in the XML. Similarly there were 7 split guids that did not match in the different DBs. I did not check for them in the XML. This is consistent with 3 scheduled transactions being created with 7 splits. Now, I wish I'd looked closer at that scheduled transaction dialog when I first opened the XML yesterday.... Despite this, I believe that these three are scheduled transactions. QUESTION: Why do two of them have a entered_date in 2005? Checking one against the XML shows that the last couple of entered_dates were in 2005, but before that they looked normal. However, there was an excess of TWO transactions in the DBs, but THREE differing guids. I was able to track this to a missing template transaction for a scheduled transaction. Here is the corresponding entry in the PostgreSQL error log: ERROR: null value in column "currency_guid" violates not-null constraint STATEMENT: INSERT INTO transactions (guid, currency_guid, num, post_date, enter_date, description) VALUES ('16097c3be93a538e66e1de61ad743b0a', NULL, '', '2005-04-27', '2005-04-27', 'Telus') Here is the first portion of the XML: <gnc:transaction version="2.0.0"> <trn:id type="guid">16097c3be93a538e66e1de61ad743b0a</trn:id> <trn:date-posted> <ts:date>2005-04-27 00:00:00 -0600</ts:date> </trn:date-posted> <trn:date-entered> <ts:date>2005-04-27 12:26:51 -0600</ts:date> </trn:date-entered> <trn:description>Telus</trn:description> <trn:splits> Here is the first portion of another template transaction: <gnc:transaction version="2.0.0"> <trn:id type="guid">df0c8dbca68109b354d2b5cb035bf623</trn:id> <trn:currency> <cmdty:space>ISO4217</cmdty:space> <cmdty:id>CAD</cmdty:id> </trn:currency> <trn:date-posted> <ts:date>2005-09-08 00:00:00 -0600</ts:date> </trn:date-posted> <trn:date-entered> <ts:date>2005-09-08 12:10:41 -0600</ts:date> </trn:date-entered> <trn:description>Primerica</trn:description> <trn:splits> NOTE the missing <trn:currency> tag in the former. PROBLEM: a missing XML tag has led to a failed data insert. While looking at the PostgreSQL error log, I also noted the following error: ERROR: relation "books" does not exist STATEMENT: INSERT INTO books (guid, root_account_guid, root_template_guid) VALUES ('be5d5b80c1a8f36ec33139cb27440ca7', '8f436420bffa2ba45515192c895b083b', '2c54584a7995ce449f2fd371771ef562') PROBLEM: the books table is not created. I checked all 3 DBs. Also, from the PostgreSQL error log: ERROR: null value in column "fullname" violates not-null constraint STATEMENT: INSERT INTO commodities (guid, namespace, mnemonic, fullname, cusip, fraction, quote_flag, quote_source, quote_tz) VALUES ('d6d52116cd6e092ea4f525ede20f38d0', 'template', 'template', NULL, NULL, 0, 0, NULL, '') There were many similar entries. I'm not sure what this means. The XML data does not match this guid. All three DBs have 30 commodities. The XML file has the following near the beginning: <gnc:count-data cd:type="commodity">29</gnc:count-data> PROBLEM: spurious commodities inserts. I found the following in the XML file: <gnc:account version="2.0.0"> <act:name/> <act:id type="guid">6b516dc592018e5b3968bcbd5625c7cf</act:id> <act:type>ROOT</act:type> <act:commodity> <cmdty:space>template</cmdty:space> <cmdty:id>template</cmdty:id> </act:commodity> <act:commodity-scu>1</act:commodity-scu> <act:parent type="guid">dd5fa0a84ddbdfc6fde7b6bcb810a5b7</act:parent> </gnc:account> This is inside the <gnc:template-transactions> tag, so I suppose it is related to scheduled transactions. I guess it is the source of the spurious commodities inserts. I found a commodity in the DBs with namespace, mnemonic and fullname all set to "template". This appears to the the reason for the differing counts between the XML and the DBs. QUESTION: should this be a commodity? If so, why does gnucash-gda attempt to insert a bunch more copies of it? The above is quite long. I've highlight "PROBLEM"s and "QUESTION"s with those strings respectively. This is now a tremendous improvement over the previous situation. Mark _______________________________________________ gnucash-devel mailing list gnucash-devel@gnucash.org https://lists.gnucash.org/mailman/listinfo/gnucash-devel