Good to know.

The difference between
 {{rows = dataSet.select()}}
and
 {{rows = db.executesql(dataSet._select())}}
is that the former calls the function "parse", which loops over the
175K rows and converts them into objects, while the latter returns the
raw tuples straight from the database driver.
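
A rough sketch of timing the two paths side by side (this assumes the
db and Underway_Data table defined further down the thread, and uses
time.time() just for wall-clock timing):

import time

dataSet = db(db.Underway_Data.id % 5 == 0)

tic = time.time()
rows = dataSet.select()                  # DAL parses every record into a Row object
print time.time() - tic                  # ~4.5s reported below

tic = time.time()
raw = db.executesql(dataSet._select())   # plain list of tuples from the driver
print time.time() - tic                  # ~0.16s reported below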

On Aug 26, 12:08 am, HughBarker <hbar...@gmail.com> wrote:
> Thanks for the reply.
>
> {{rows = db.executesql(dataSet._select())}} executes in 0.16s.
>
> On Aug 26, 2:09 pm, Massimo Di Pierro <massimo.dipie...@gmail.com>
> wrote:
>
> > Try replacing
>
> >  {{rows = dataSet.select()}}
>
> > with
>
> >  {{rows = db.executesql(dataSet._select())}}
>
> > and tell me what you get.
>
> > Anyway, your raw SQL select is only fetching a third of the rows that
> > web2py is fetching.
> > web2py is also doing a lot of extra work converting the list of
> > records from tuples to dictionaries.
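
A possible middle ground, sketched here with the column order of the
SELECT that _select() generates (see the raw statement quoted further
down), if dict-style access is still wanted without the full parse:

fields = ['id', 'UTCtime', 'waterTemp', 'waterTempQC',
          'latitude', 'latitudeQC', 'longitude', 'longitudeQC']
raw = db.executesql(dataSet._select())           # plain tuples, no parsing
records = [dict(zip(fields, r)) for r in raw]    # lightweight dicts instead of Row objects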
>
> > On Aug 25, 9:48 pm, HughBarker <hbar...@gmail.com> wrote:
>
> > > Hi,
> > > I'm new to web2py - first of all, thanks for such an excellent
> > > framework. It's certainly a breath of fresh air after writing J2EE
> > > webapps.
>
> > > Anyway, I'm building an application, part of which plots large
> > > datasets (using the JS Highstocks library). I've defined a table like
> > > this:
>
> > > db.define_table('Underway_Data',
> > >     Field('UTCtime', 'datetime'),
> > >     Field('waterTemp', 'double'),
> > >     Field('waterTempQC', 'integer'),
> > >     Field('latitude', 'double'),
> > >     Field('latitudeQC', 'integer'),
> > >     Field('longitude', 'double'),
> > >     Field('longitudeQC', 'integer'))
>
> > > and populated it with about 175k rows of data (Postgres on Ubuntu;
> > > web2py resides on the same server and was set up using the 'one step
> > > production deployment' script)
>
> > > Running the database query:
>
> > >       {{resolution=5}}
> > >       {{query = (db.Underway_Data.id%resolution==0)}}
> > >       {{dataSet = db(query)}}
> > >       {{tic = time.clock()}}
> > >       {{rows = dataSet.select()}}
> > >       {{toc = time.clock()}}
>
> > > will take about 4.5 seconds (the resolution variable subsamples the
> > > data for performance reasons).
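
For reference, the raw SQL behind that query can be inspected with
_select(), which is how the statement in the benchmark below was
obtained; a quick sketch:

resolution = 5
query = (db.Underway_Data.id % resolution == 0)
print db(query)._select()
# SELECT Underway_Data.id, Underway_Data.UTCtime, ... FROM Underway_Data
# WHERE ((Underway_Data.id % 5) = 0);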
>
> > > While it is a fair amount of data to be pushing around, 4.5 seconds
> > > seems slow to me. I wrote a simple benchmark in pure python:
>
> > > import psycopg2, time
> > > tic = time.clock()
> > > conn = psycopg2.connect("host=ubuntu-geospatial-server "
> > >                         "user=postgres password= dbname=geospatial")
> > > cur = conn.cursor()
> > > cur.execute("SELECT Underway_Data.id, Underway_Data.UTCtime, "
> > >             "Underway_Data.waterTemp, Underway_Data.waterTempQC, "
> > >             "Underway_Data.latitude, Underway_Data.latitudeQC, "
> > >             "Underway_Data.longitude, Underway_Data.longitudeQC "
> > >             "FROM Underway_Data WHERE ((Underway_Data.id % 5) = 0);")
> > > rows = cur.fetchall()
> > > toc = time.clock()
> > > print toc-tic
> > > cur.close()
> > > conn.close()
>
> > > which runs in ~0.13 seconds.
>
> > > (the query I use is the output of db(query)._select() )
>
> > > Am I missing something here? Is there anything I can do to increase
> > > performance?
>
> > > Regards,
> > > Hugh.
