[elephant-devel] Status of Elephant Unstable Branch

Ian Eslick Sun, 23 Mar 2008 07:53:06 -0700

My fellow Elephants,

Unstable isn't unstable anymore! All BDB tests, including migration, for BDB/Mac/Allegro and BDB/Mac/SBCL are green as of today's checkin.


All major new features are implemented, including:
- Instance map, class schema evolution and MOP compliance
- New slot types
  - Cached read, write-through slots
  - Hierarchical indexed slots
  - Virtual, hierarchical derived indices
  - Set-valued slots
  - Many-to-1 and many-to-many slot associations
- Trivial query interface example (query.lisp)
- Migration and upgrade

- Partial test suite (basic association, indexing, migration, basic schema evolution)

There are definitely holes in the test suite that need to be plugged and I'm sure that this will uncover bugs, particularly in the schema evolution, upgrade or association infrastructure. The steps needed to prepare this branch for the next release are:


- Integrate patches from the main repository
  (Leslie's patch is the only one that I haven't already integrated into unstable, I think)

- Evaluate multi-threading issues for schema evolution
  (only one thread should be able to manipulate class objects at a time)

- Upgrade Postmodern and CLSQL data stores
  - Support btrees with duplicate keys
  - Some minor API additions for upgrade & bootstrapping

- Testing
  - Expand testing for schema evolution (most complex/subtle bugs were there)
  - Validate upgrade procedure 0.9.1 -> 0.9.2
  - Verify referential integrity (delete object, what happens to stale refs?)
  - Standard tests for new features

- Documentation of new features

I am tied up with work for the next two weeks. I'm happy to support bug fixes, lisp compatibility issues, etc. - but progress will only be made for the remainder of March and early April if others step in to help.

Robert and I hope to integrate this work into another 0.9.x release in late April. I think this new functionality makes Elephant sufficiently feature-rich and robust that after some burn-in time we should consider packaging this into a 1.0 release that we can commit to support for the longer term. We can have a 1.1 development branch in which we add major new features like an all-lisp data store or a query compiler as longer term projects.

A few features could use attention and could, but need not, make it into the upcoming release:


- Online GC strategy

  Now that we have an oid table that maintains information for each object and is used to de-serialize a reference, we can implement facilities such as forwarding pointers, counts or marks that make it possible to build an online persistent heap GC facility without an overly significant cost or code impact.


- Query language/interpreter

  Daniel Salama is thinking about the query syntax and is motivated to help implement something there. I'd be psyched to see an interpreter that extends my sketch to take good advantage of indices and associations.


- System-level schema evolution

  Robert is thinking through some system-level schema versioning and evolution ideas akin to the PostgreSQL notion of schemas, but neither of us has the bandwidth to implement this right now. The basic idea is to group a set of class schemas into a version set and to use these version tags to dispatch a generic function that can override the default transformation of an instance from one schema version to the next. This would allow you to connect to an old DB with new code, call a global upgrade fn, and have everything converted in one go.

  This would be an independent application layer so would not impact an upcoming release either way.
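
The version-set idea above can be illustrated with a small sketch. It is written in Python only to stay self-contained and is not Elephant code: every name in it (`upgrade_override`, `default_transform`, `upgrade_all`, the `"v1"`/`"v2"` tags, the toy `person` and `pet` classes) is hypothetical, and records are plain dictionaries standing in for persistent instances. It assumes the default transformation keeps the slots that survive into the new schema and fills new ones with an empty value.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Upgrader = Callable[[Record], Record]

# (class name, old version tag, new version tag) -> override function.
# This table plays the role of the generic function dispatched on version tags.
_overrides: Dict[Tuple[str, str, str], Upgrader] = {}


def upgrade_override(cls_name: str, old: str, new: str):
    """Register an override for one class between two version tags."""
    def register(fn: Upgrader) -> Upgrader:
        _overrides[(cls_name, old, new)] = fn
        return fn
    return register


def default_transform(record: Record, new_slots: List[str]) -> Record:
    """Assumed default: keep slots that still exist, set new slots to None."""
    return {slot: record.get(slot) for slot in new_slots}


def upgrade_instance(cls_name: str, record: Record, old: str, new: str,
                     new_slots: List[str]) -> Record:
    """Use the registered override if there is one, else the default."""
    fn = _overrides.get((cls_name, old, new))
    if fn is not None:
        return fn(record)
    return default_transform(record, new_slots)


def upgrade_all(db, old: str, new: str, schemas):
    """Global upgrade: convert every (class name, record) pair in one pass."""
    return [
        (cls_name,
         upgrade_instance(cls_name, rec, old, new, schemas[new][cls_name]))
        for cls_name, rec in db
    ]


@upgrade_override("person", "v1", "v2")
def split_name(record: Record) -> Record:
    # Custom conversion: one "name" slot becomes "first" and "last".
    first, _, last = str(record["name"]).partition(" ")
    return {"first": first, "last": last}
```

In this sketch a class with no registered override (such as `pet`) falls through to the default transformation, while `person` gets the custom conversion; a real layer would also have to walk intermediate versions and run inside transactions.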


Regards,
Ian

PS - I did some profiling of the unstable branch on BDB/Mac to see what effects different query strategies might have. I thought some of you would be interested in this. This is preliminary and not well controlled, but the order of magnitude should be about right.

The objects described below are 5-slot objects with a mix of indexed, cached, transient, etc. slots.


Persistent object creation: 3000 objects per second

Persistent object reference deserialization w/ object instantiation: 10k per second

Persistent object reference deserialization of oids only: 40k per second

This last number would be the key factor in handling queries over large object databases. Since we can instantiate using only an oid, we only need to instantiate objects we need. This should make things like counts and paging pretty efficient for moderately sized databases. Indexing, of course, will have a significant impact on query performance by reducing the number of manipulated OIDs.
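
To make the counts-and-paging point concrete: taking the rough rates above at face value, walking 100,000 references as bare oids would take about 2.5 seconds (100,000 / 40k per second), against about 10 seconds if every object were instantiated (100,000 / 10k per second). The sketch below shows the access pattern in Python purely for illustration; `ToyStore`, `count_matches` and `fetch_page` are hypothetical names, not Elephant functions, and the "objects" are plain dictionaries.

```python
from typing import Callable, Dict, List, Sequence


def count_matches(oids: Sequence[int]) -> int:
    """Counting needs only the oids; no object is instantiated."""
    return len(oids)


def fetch_page(oids: Sequence[int],
               instantiate: Callable[[int], Dict[str, int]],
               page_number: int,
               page_size: int) -> List[Dict[str, int]]:
    """Instantiate only the objects that fall on the requested page."""
    start = page_number * page_size
    return [instantiate(oid) for oid in oids[start:start + page_size]]


class ToyStore:
    """Stand-in for a data store; it tracks how many objects were built."""

    def __init__(self, size: int) -> None:
        self.oids = list(range(size))
        self.instantiated = 0

    def instantiate(self, oid: int) -> Dict[str, int]:
        self.instantiated += 1
        return {"oid": oid}
```

With a 1000-oid `ToyStore`, a count instantiates nothing, and fetching page 2 with a page size of 25 builds exactly the 25 objects for oids 50 through 74.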

_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel
