I've been thinking about something like this as well. Instead of a separate select_raw() method, maybe we can just add a raw=True|False argument to the existing select() method. I like the namedtuple idea as well (I think some adapters already provide that as an option -- e.g., psycopg2).
Anthony

On Thursday, February 9, 2012 3:04:41 PM UTC-5, nick name wrote:
>
> Yes, that is the basis of what I am suggesting.
>
> There is not currently such a thing; there is something called
> 'select_raw' implemented in the GoogleDataStore adapter, but not in
> anything else, and it isn't exactly what I am proposing.
>
> To elaborate:
>
> Assume the table is defined as follows:
>
> reftable = db.define_table('reftable', Field('a', 'string'))
> table = db.define_table('table', Field('b', reftable))
>
> In my case, I need to pull all the records (60,000) from the database to
> compute some aggregation which I cannot compute using sql. There are two
> alternatives here:
>
> r1 = db().select(table.ALL) # takes > 6 seconds
>
> r2 = db.executesql(db._select(table.ALL)) # takes ~0.1sec
>
> The records returned in the first instance are much richer; they have
> record chasing (e.g. I can do r1[0].b.a to select through the foreign key),
> they have methods like r1[0].update_record() and r1[0].delete_record(), and
> other nice stuff.
>
> However, for this use, I don't need the additional features, and I do need
> the speed, so I would rather use r2. However, r2 is not a direct
> replacement -- it doesn't have the column names. If I use
>
> r3 = db.executesql(db._select(table.ALL), as_dict=True) # still takes ~0.1sec
>
> I can do r3[0]['b'] but I cannot do r3[0].b; and it takes a lot more
> memory than r2.
>
> A suggestion: add another parameter, processor=..., which, if given,
> will be called with the db.connection.cursor and return a function
> through which each row will be passed; for example:
>
> def named_tuple_process(name, description):
>     from collections import namedtuple
>     fields = ' '.join([x[0] for x in description])
>     # _make builds one record from a single row tuple
>     return namedtuple(name, fields)._make
>
> r4 = db.executesql(db._select(table.ALL), processor=lambda cursor:
>     named_tuple_process('tablerec', cursor.description))
>
> r4[0].b # will now work; not a full replacement, but good enough for many
> uses.
>
> In fact, you can do that externally -
>
> r4 = db.executesql(db._select(table.ALL))
> f = named_tuple_process('tablerec', db._adapter.cursor.description)
> r4 = [f(x) for x in r4]
>
> But this requires reaching into the internals of the db adapter.
>
> Finally, I propose to define x.raw_select(*args) to do:
> db.executesql(x._select(*args))
>
> which would make this a relatively clean replacement.
>
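
For what it's worth, here is a minimal sketch of the namedtuple idea that does not depend on web2py at all. It assumes only a plain DB-API cursor; sqlite3 stands in for the real database, and namedtuple_rows and the tbl table are names made up for illustration, not anything in web2py.

```python
import sqlite3
from collections import namedtuple


def namedtuple_rows(cursor, name="Row"):
    """Wrap each row of an already-executed DB-API cursor in a namedtuple."""
    # cursor.description holds one 7-item tuple per column; item 0 is the name.
    fields = [col[0] for col in cursor.description]
    # rename=True swaps column names that are not valid identifiers for _0, _1, ...
    row_type = namedtuple(name, fields, rename=True)
    return [row_type._make(row) for row in cursor.fetchall()]


# Stand-in for the table from the post: an in-memory SQLite table (hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, b INTEGER)")
conn.executemany("INSERT INTO tbl (b) VALUES (?)", [(10,), (20,), (30,)])

cur = conn.execute("SELECT id, b FROM tbl ORDER BY id")
rows = namedtuple_rows(cur, "tablerec")

print(rows[0].b)   # attribute access, like r4[0].b in the post
print(rows[0][1])  # positional access still works
```

The rows stay plain tuples underneath, so memory use should be close to the raw result and well below the as_dict=True variant, though I have not measured it against web2py.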