Good to know. The difference between {{rows = dataSet.select()}} and {{rows = db.executesql(dataSet._select())}} is that the former calls the internal function "parse", which loops over the 175K rows and converts each tuple into a web2py Row object.
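Here is a rough, self-contained illustration of why that parse step dominates. No web2py or database needed: the fake tuples below just stand in for what the driver returns, and the dict conversion is a simplified stand-in for what parse does (real Row objects do even more work per row, so the real gap is larger).

```python
import time

# Simulate the subsampled result set: ~35k rows of 8 columns each
# (roughly what id % 5 == 0 returns from a 175k-row table).
raw_rows = [(i, '2011-08-25 12:00:00', 20.5, 1, -42.0, 1, 147.0, 1)
            for i in range(0, 175000, 5)]
colnames = ('id', 'UTCtime', 'waterTemp', 'waterTempQC',
            'latitude', 'latitudeQC', 'longitude', 'longitudeQC')

# What executesql effectively hands back: the raw tuples, no extra work.
tic = time.time()
as_tuples = list(raw_rows)
t_tuples = time.time() - tic

# A simplified stand-in for the parse step: one dict per row.
tic = time.time()
as_dicts = [dict(zip(colnames, r)) for r in raw_rows]
t_dicts = time.time() - tic

print('tuples: %.4fs  dicts: %.4fs' % (t_tuples, t_dicts))
```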
On Aug 26, 12:08 am, HughBarker <hbar...@gmail.com> wrote:
> Thanks for the reply.
>
> {{rows = db.executesql(dataSet._select())}} executes in 0.16s.
>
> On Aug 26, 2:09 pm, Massimo Di Pierro <massimo.dipie...@gmail.com> wrote:
> > Try replacing
> >
> > {{rows = dataSet.select()}}
> >
> > with
> >
> > {{rows = db.executesql(dataSet._select())}}
> >
> > and tell me what you get.
> >
> > Anyway, your raw SQL select is only fetching a third of the rows
> > that web2py is fetching. web2py is also doing a lot of extra work
> > converting the list of records from tuples to dictionaries.
> >
> > On Aug 25, 9:48 pm, HughBarker <hbar...@gmail.com> wrote:
> > > Hi,
> > > I'm new to web2py - first of all, thanks for such an excellent
> > > framework. It's certainly a breath of fresh air after writing J2EE
> > > webapps.
> > >
> > > Anyway, I'm building an application, part of which plots large
> > > datasets (using the JS Highstocks library). I've defined a table
> > > like this:
> > >
> > > db.define_table('Underway_Data',
> > >     Field('UTCtime', 'datetime'),
> > >     Field('waterTemp', 'double'),
> > >     Field('waterTempQC', 'integer'),
> > >     Field('latitude', 'double'),
> > >     Field('latitudeQC', 'integer'),
> > >     Field('longitude', 'double'),
> > >     Field('longitudeQC', 'integer'))
> > >
> > > and populated it with about 175k rows (Postgres on Ubuntu; web2py
> > > resides on the same server and was set up using the 'one step
> > > production deployment' script).
> > >
> > > Running the database query:
> > >
> > > {{resolution = 5}}
> > > {{query = (db.Underway_Data.id % resolution == 0)}}
> > > {{dataSet = db(query)}}
> > > {{tic = time.clock()}}
> > > {{rows = dataSet.select()}}
> > > {{toc = time.clock()}}
> > >
> > > takes about 4.5 seconds (the resolution variable subsamples the
> > > data for performance reasons).
> > >
> > > While it is a fair amount of data to be pushing around, 4.5 seconds
> > > seems slow to me.
> > > I wrote a simple benchmark in pure Python:
> > >
> > > import psycopg2, time
> > > tic = time.clock()
> > > conn = psycopg2.connect("host=ubuntu-geospatial-server user=postgres "
> > >                         "password= dbname=geospatial")
> > > cur = conn.cursor()
> > > cur.execute("SELECT Underway_Data.id, Underway_Data.UTCtime, "
> > >             "Underway_Data.waterTemp, Underway_Data.waterTempQC, "
> > >             "Underway_Data.latitude, Underway_Data.latitudeQC, "
> > >             "Underway_Data.longitude, Underway_Data.longitudeQC "
> > >             "FROM Underway_Data WHERE ((Underway_Data.id % 5) = 0);")
> > > rows = cur.fetchall()
> > > toc = time.clock()
> > > print toc - tic
> > > cur.close()
> > > conn.close()
> > >
> > > which runs in ~0.13 seconds.
> > >
> > > (The query I use is the output of db(query)._select().)
> > >
> > > Am I missing something here? Is there anything I can do to increase
> > > performance?
> > >
> > > Regards,
> > > Hugh.
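One possible middle ground, sketched below without any web2py specifics: keep the raw tuples that {{db.executesql(...)}} returns and wrap them lazily, so only the rows you actually touch pay the conversion cost. The function name `lazy_dicts` and the sample data are hypothetical, purely for illustration; the fake `raw` list stands in for the output of `cur.fetchall()`.

```python
colnames = ('id', 'UTCtime', 'waterTemp', 'waterTempQC',
            'latitude', 'latitudeQC', 'longitude', 'longitudeQC')

def lazy_dicts(raw_rows, names):
    """Yield one dict per raw tuple, built on demand rather than up front."""
    for r in raw_rows:
        yield dict(zip(names, r))

# Fake data standing in for cur.fetchall() output:
raw = [(5, '2011-08-25 12:00:00', 19.8, 1, -42.1, 1, 147.3, 1)]
first = next(lazy_dicts(raw, colnames))
print(first['waterTemp'])  # -> 19.8
```

If the plotting code only needs two or three columns per row, indexing the tuples directly (or selecting only those columns in the SQL) avoids the conversion entirely.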