John Ralls writes:

"I think that the first step is to work through all of the code and make an ERD for the existing model, documenting the use and structure of KVP. (Pretend for the purpose of this exercise that every use of KVP is a separate entity). Then we can normalize it into a good relational model and work out a transition path."

This, to my ears, is great news!

Bringing the code objects into closer agreement with the business objects, and the database representation into closer agreement with the structure of those objects, is a valuable step in a direction that, in my opinion, means:

* A more comprehensible and testable code base -- which indirectly means that becoming a contributor is a lot easier
* A database structure that allows the SQL layer to optimise queries, rather than having all kinds of "custom" search/select logic in the code
* A database structure that reports can be generated against
* A database structure that can allow multi-user access (be it exclusive or opportunistic locking of objects, rather than the whole database)
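
To make the second and third bullets concrete, here is a minimal Python/SQLite sketch that fetches the same information once from a generic key/value table and once from a dedicated table. All table and column names (kvp_slots, split_memos, and so on) are made up for illustration; this is not the actual GnuCash schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE splits (guid TEXT PRIMARY KEY, value_num INTEGER NOT NULL);

    -- Hypothetical generic key/value storage: one row per (object, key) pair.
    CREATE TABLE kvp_slots (
        obj_guid   TEXT NOT NULL,
        name       TEXT NOT NULL,
        string_val TEXT
    );

    -- Hypothetical dedicated table: the relationship is stated in the schema.
    CREATE TABLE split_memos (
        split_guid TEXT PRIMARY KEY REFERENCES splits(guid),
        memo       TEXT NOT NULL
    );

    INSERT INTO splits VALUES ('s1', 1000), ('s2', 2500);
    INSERT INTO kvp_slots VALUES ('s1', 'memo', 'rent'), ('s1', 'color', 'red');
    INSERT INTO split_memos VALUES ('s1', 'rent');
""")

# KVP style: the join has to filter on the key name, and nothing in the
# schema tells a report writer that a 'memo' key exists at all.
kvp_rows = conn.execute("""
    SELECT s.guid, k.string_val
    FROM splits s
    LEFT JOIN kvp_slots k ON k.obj_guid = s.guid AND k.name = 'memo'
    ORDER BY s.guid
""").fetchall()

# Dedicated table: a plain foreign-key join that the schema documents.
table_rows = conn.execute("""
    SELECT s.guid, m.memo
    FROM splits s
    LEFT JOIN split_memos m ON m.split_guid = s.guid
    ORDER BY s.guid
""").fetchall()

print(kvp_rows)
print(table_rows)
```

Both queries return the same rows; the difference is that the second one can be written by anyone who reads the schema, without knowing which key names the application happens to use.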

To me, this is a critical "next step" even if/when the languages of choice are changed.

Perhaps another topic that will need to be discussed is the "end-user contract." As I understand it, previously it was "A data file from version Y will be usable by an older version X, although not all of the data may be understood." To me, that sort of locks one into the past (case in point: budgets, which have some significant structural problems right now, such as being unaware of the difference between data-storage sign and UI sign). The "way around" that, as I understand it, is to stuff anything new into KVPs, which, as John points out, doesn't work well with relational databases.

I think it is reasonable to:

1) Restrict any major non-backwards-compatible-for-read changes to objects or database representation to major releases (2.4, 2.6,...) -- Namely reporting and data-extraction tools that work against the database for 2.4.0 should work for 2.4.x without changes

2) Require database upgrade triggers to be run for any release, major or minor -- This means that once you upgrade to Version Y, you're done with Version X

3) Violate "Rule 1" when "critical bugs" related to data integrity or security dictate

4) Drop XML file support -- If people want a lightweight, single-file transport/backup approach, then SQLite is a great option. (Without this, GnuCash is either just using the database as a data store, or would need to maintain two versions of search/select logic.)

Part of the proposed end-user contract is that the end user "must" only use the matching version of the app to /write/ to the database. However, they may use other apps to /read/ from it for their own needs.

Yes, "Rule 2" means that you need to decide which releases you are going to take, and that adoption of the "latest and greatest" may be a bit slower. It also means that code and data structures can be refactored as appropriate. In the widely-used Open-Source world, WordPress does this with great success. My "day job" is administration and billing systems for large insurance carriers, and this approach works for them as well.
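
As a rough sketch of what "Rule 2" could look like in practice, the following Python uses SQLite's built-in PRAGMA user_version as a schema version stamp. The version numbers, the migration statements, and the open_for_write name are all hypothetical; the only real feature relied on is the user_version pragma itself.

```python
import sqlite3

SCHEMA_VERSION = 2  # hypothetical schema version expected by this release

# Hypothetical upgrade steps, keyed by the version they upgrade *from*.
MIGRATIONS = {
    0: "CREATE TABLE splits (guid TEXT PRIMARY KEY, value_num INTEGER NOT NULL)",
    1: "CREATE TABLE split_memos ("
       "split_guid TEXT PRIMARY KEY REFERENCES splits(guid), "
       "memo TEXT NOT NULL)",
}

def open_for_write(conn):
    """Upgrade the database to SCHEMA_VERSION, or refuse if it is newer."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    if current > SCHEMA_VERSION:
        # Written by a newer release: an older app must not write to it.
        raise RuntimeError("database was written by a newer release")
    while current < SCHEMA_VERSION:
        conn.execute(MIGRATIONS[current])
        current += 1
        # PRAGMA values cannot be bound as parameters; current is an int.
        conn.execute(f"PRAGMA user_version = {current}")
    conn.commit()
    return current

conn = sqlite3.connect(":memory:")
print(open_for_write(conn))
```

Once the upgrade has run, an older release sees a version number higher than the one it expects and declines to write, which is the "once you upgrade to Version Y, you're done with Version X" part of the contract. Read-only tools can still open the file.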



On 01/02/2011 05:23 PM, John Ralls wrote:
[...]
We need to re-think KVP entirely: It doesn't match up very well with the 
relational model.

A couple of examples:

Splits use KVP to store memos. Good, because not everyone uses them on every 
split, and there's no point wasting the space. But we can provide a split-memo 
table with a foreign key into the splits table (or vice-versa). That will be 
much faster to query (no WHERE name= clause in the join) and the data design 
will be clearer.

The HBCI (online banking) setup, on the other hand, is contained entirely in a 
hierarchy of KVPs. This makes some amount of sense in XML, but it's insane in 
an RDB. RDBs don't like recursion, and there's no way to do arbitrary 
hierarchies without recursion. HBCI needs its own tables.

I think that the first step is to work through all of the code and make an ERD 
for the existing model, documenting the use and structure of KVP. (Pretend for 
the purpose of this exercise that every use of KVP is a separate entity). Then 
we can normalize it into a good relational model and work out a transition path.

I have some more Gtk stuff to do over the next couple of weeks, but I'll start 
on the ERD after that.

Regards,
John Ralls

_______________________________________________
gnucash-devel mailing list
gnucash-devel@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
