On Nov 10, 2011, at 5:33 PM, pallav wrote:

> Thank you. I also appreciate you putting your app.yaml on the code
> repository - it helped me get started today (before seeing what you
> had done, my own had not been customized for wsgi, just Python 2.7).
> 
> I'm definitely interested in learning the relationship between the
> DAL, 2.7, and transactions. I would like to help improve GAE support
> in the DAL, so if there is anything I can learn from your experiences,
> maybe we can use it to improve web2py code. I believe this is the
> killer feature of web2py - being able to deploy to GAE easily while
> still being quick to start, and portable.
> 
> Any interest?

Yes.

The underlying SQL transaction mechanisms (caveat: I'm no expert on this
subject) are SELECT FOR UPDATE and BEGIN TRANSACTION. The Google Datastore has
neither of these as such. What it does have is run_in_transaction(), to which
you pass a function, and it runs that function as a transaction.

The Datastore doesn't lock, at least not at a level that's visible to us. The
way run_in_transaction() works is that if two such transactions collide, one
of them succeeds and the other is aborted and automatically retried. So
whatever you put in run_in_transaction() must be idempotent and, moreover, must
not have external side effects.
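
To make the retry behavior concrete, here is a toy simulation in plain Python.
It is not the GAE API: TinyStore, Collision and run_with_retries are names I
made up for illustration, and the version counter is only my assumption about
how a collision might be detected. The point is that the function body runs
once per attempt, so any side effect inside it is repeated.

```python
class Collision(Exception):
    """Raised when a commit finds that someone else wrote first."""


class TinyStore(object):
    """In-memory stand-in for a datastore: one value plus a version counter."""

    def __init__(self, value=0):
        self.value = value
        self.version = 0

    def commit(self, read_version, new_value):
        # Optimistic check: refuse the write if the data changed since it was read.
        if read_version != self.version:
            raise Collision()
        self.value = new_value
        self.version += 1


def run_with_retries(store, func, retries=3):
    """Call func(snapshot) and commit its result, retrying on collision."""
    for _ in range(retries + 1):
        read_version, snapshot = store.version, store.value
        new_value = func(snapshot)
        try:
            store.commit(read_version, new_value)
            return new_value
        except Collision:
            continue  # aborted: take a fresh snapshot and run func again
    raise Collision("gave up after %d retries" % retries)


store = TinyStore(10)
calls = []


def add_one(current):
    calls.append(current)  # external side effect: happens on every attempt
    if len(calls) == 1:
        # Simulate a competing writer sneaking in during the first attempt.
        store.commit(store.version, store.value + 5)
    return current + 1


result = run_with_retries(store, add_one)
```

Here add_one runs twice (calls ends up holding two entries) even though only
one update is committed, which is why the body has to tolerate being re-run.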

(Side note: it seems to me that this makes the relationship between GAE 
transactions and memcache problematic, since I assume that memcache puts inside 
a transaction are not visible to the transaction logic, and therefore 
constitute an undesirable side effect, leading to possible inconsistencies 
between the cache and the Datastore.)

Python 2.7 complicates this because Google has decreed that the use of 2.7 
requires the use of the High Replication Datastore. The HR Datastore says that 
any query in a transaction must be an ancestor query, and the DAL doesn't know 
anything about ancestors and entity groups.

So what do you do if you (like me) want to use GAE with 2.7 and need 
transaction support? I see three possibilities.

1. Bypass the DAL and use the Datastore API directly. I have a very simple 
model that maps into the Datastore entity-group model quite naturally, so 
that's what I did.

2. Use MySQL (via Google Cloud SQL) instead of the Datastore (assuming that this works from 2.7):
http://googleappengine.blogspot.com/2011/10/google-cloud-sql-your-database-in-cloud.html

3. Teach the DAL enough about entity groups that it can support transactions.

More about option (3) follows.

What I have in mind is a decorator for a function in your controller that you 
want run as a transaction. In the GAE case, it'd use run_in_transaction() to 
wrap the function; in the standard SQL case it'd execute BEGIN TRANSACTION (in 
whatever flavor the db requires) before calling the function.
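
A rough sketch of what I mean, under several assumptions: the names
transactional and gae_transactional are hypothetical (nothing like this exists
in web2py as far as I know), the SQL side is demonstrated with the standard
library's sqlite3 rather than the DAL, and the GAE side takes the
run_in_transaction callable as a parameter so the sketch runs without the SDK.
On GAE you would presumably pass in db.run_in_transaction.

```python
import functools
import sqlite3


def transactional(begin, commit, rollback):
    """Build a decorator from three backend hooks (hypothetical interface)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            begin()
            try:
                result = func(*args, **kwargs)
            except Exception:
                rollback()  # undo everything func did, then re-raise
                raise
            commit()
            return result
        return wrapper
    return decorator


def gae_transactional(run_in_transaction):
    """Same idea for GAE: hand the whole function to run_in_transaction()."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return run_in_transaction(func, *args, **kwargs)
        return wrapper
    return decorator


# SQL demonstration. isolation_level=None puts sqlite3 in autocommit mode,
# so the explicit BEGIN / COMMIT / ROLLBACK statements are the only ones issued.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")

sql_transaction = transactional(
    begin=lambda: conn.execute("BEGIN"),
    commit=lambda: conn.execute("COMMIT"),
    rollback=lambda: conn.execute("ROLLBACK"),
)


@sql_transaction
def withdraw(amount):
    conn.execute("UPDATE account SET balance = balance - ? WHERE id = 1",
                 (amount,))
    row = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()
    if row[0] < 0:
        raise ValueError("insufficient funds")
    return row[0]
```

With this, withdraw(30) commits, while an overdraft raises ValueError and
leaves the balance untouched. The controller function itself is the same in
both backends; only the decorator differs.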

This won't work today, partly because we don't have portable BEGIN TRANSACTION 
support, and partly because our GAE support doesn't have entity groups (I 
think). I propose to address *that* somewhat crudely, by creating an entity 
that represents the database, and is the parent of a second-level entity that 
represents each table. A table entity would be the parent of all its rows.

Then any query within a single table would use the table entity as the 
ancestor, and a query that used >1 table would use the db entity as the 
ancestor.
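
The ancestor choice could be sketched like this. Keys are modeled as plain
tuples of (kind, name) pairs purely for illustration; the kind names "db" and
"table" and the function ancestor_for are my inventions, and in a real
implementation these paths would have to become actual Datastore keys.

```python
def db_key(db_name):
    """Key path of the root entity that stands for the whole database."""
    return (("db", db_name),)


def table_key(db_name, table_name):
    """Key path of a table entity, whose parent is the database entity."""
    return db_key(db_name) + (("table", table_name),)


def ancestor_for(db_name, table_names):
    """Pick the narrowest ancestor that covers every table in the query."""
    tables = sorted(set(table_names))
    if not tables:
        raise ValueError("a query must touch at least one table")
    if len(tables) == 1:
        return table_key(db_name, tables[0])  # single table: table entity
    return db_key(db_name)                    # several tables: db entity


single = ancestor_for("main", ["person"])
joined = ancestor_for("main", ["person", "dog"])
```

Here single is the path to the person table entity, and joined is the path to
the database entity, since that is the only ancestor common to both tables.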

Three caveats. First, I don't actually know how the DAL maps SQL to the
Datastore (GQL, maybe?), so I don't know how well the structure I've described
could be supported by the DAL. Second, I'm enough of a novice using the
Datastore that I might be missing something about its mechanisms that could
make this more or less difficult than I'm implying. Third, I'm implying,
effectively, something like table-level or database-level locking; that could
have performance issues. (Maybe each row should have its own parent entity, so
we can "lock" a row as well.)

Last: I've addressed my own situation by bypassing the DAL, and while I'm more 
than happy to participate in the discussion, I don't have the time or expertise 
to contribute much to the implementation of (3).
