Re: paged query slow when fetching big lists

Andrus Adamchik Fri, 22 Jun 2007 08:10:39 -0700

Hi Marcin,

* fetch only Pk columns and create all ObjectIds at once, get rid of the iterating process if possible
* use already existing method resolveInterval() to fault the required range of records

This strategy was discussed in the May thread with Ari (the one that Michael Gentry mentioned). My vote is +0, meaning that before we make this change, I want to confirm first that it has a visible impact on performance. Could you possibly make such a change locally and see if it helps? (Look at SelectQuery.addCustomDbAttribute() to only include the PK; if you have problems making this change, ping me via the dev list - I'll try my best to help).
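To make the strategy quoted above concrete, here is a minimal sketch of the idea in plain Java, without any Cayenne classes: the list keeps only ids up front and resolves a page of full records the first time an element in that page is read. Everything in it is an assumption made for illustration (the class name LazyIdList, the String "records", the fetchByIds callback standing in for a query by PK). Its resolveInterval() only borrows the name of the IncrementalFaultList method and is not that implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; not Cayenne's IncrementalFaultList.
public class LazyIdList {
    private final List<Integer> ids;      // cheap: primary keys only
    private final String[] resolved;      // full records, filled in lazily
    private final int pageSize;
    // Stand-in for "fetch full rows for these PKs" (assumed callback).
    private final Function<List<Integer>, Map<Integer, String>> fetchByIds;
    public int fetchCount = 0;            // number of batch fetches issued

    public LazyIdList(List<Integer> ids, int pageSize,
                      Function<List<Integer>, Map<Integer, String>> fetchByIds) {
        this.ids = ids;
        this.resolved = new String[ids.size()];
        this.pageSize = pageSize;
        this.fetchByIds = fetchByIds;
    }

    // Resolve every unresolved element in [from, to) with a single batch fetch.
    public void resolveInterval(int from, int to) {
        List<Integer> missing = new ArrayList<>();
        for (int i = from; i < to; i++) {
            if (resolved[i] == null) {
                missing.add(ids.get(i));
            }
        }
        if (missing.isEmpty()) {
            return;
        }
        fetchCount++;
        Map<Integer, String> rows = fetchByIds.apply(missing);
        for (int i = from; i < to; i++) {
            if (resolved[i] == null) {
                resolved[i] = rows.get(ids.get(i));
            }
        }
    }

    // Reading an element resolves the page that contains it, if needed.
    public String get(int index) {
        if (resolved[index] == null) {
            int from = (index / pageSize) * pageSize;
            int to = Math.min(from + pageSize, ids.size());
            resolveInterval(from, to);
        }
        return resolved[index];
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            ids.add(i);
        }
        LazyIdList list = new LazyIdList(ids, 50, batch -> {
            Map<Integer, String> rows = new HashMap<>();
            for (Integer id : batch) {
                rows.put(id, "row-" + id);
            }
            return rows;
        });
        System.out.println(list.get(120));    // resolves page 100..149
        System.out.println(list.fetchCount);  // one batch fetch so far
    }
}
```

The sketch says nothing about where the 50 seconds go; it only shows why holding ids and faulting ranges on demand keeps the per-page cost small once the id list exists. Whether building the id list itself can be made faster is exactly what the local test should answer.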

If the creation of ObjectId and getting the results from ResultSet cannot be sped up (because it simply has to happen, and it does not depend on the way it is done), the only choice will be to implement some more complex solution using sql LIMIT statement.

I'd love to avoid that, as the data you get may as well be different the next time you resolve a page, so you may end up with duplicates or skipped records. If we ever go this way, we'll probably need to make it a user choice (use LIMIT vs. IncrementalFaultList).
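A small self-contained illustration of the duplicate problem, in plain Java with an in-memory list standing in for the table (the class PagingDrift and the limitPage() helper are made up for this example; no database or Cayenne API is involved). Page 1 is read, a row that sorts first is inserted, and page 2 is then read with the same offset arithmetic a LIMIT-based query would use:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; the list simulates an ordered table.
public class PagingDrift {

    // Simulates "SELECT id FROM t ORDER BY id LIMIT limit OFFSET offset".
    public static List<Integer> limitPage(List<Integer> table, int offset, int limit) {
        int from = Math.min(offset, table.size());
        int to = Math.min(offset + limit, table.size());
        return new ArrayList<>(table.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> table = new ArrayList<>(Arrays.asList(10, 20, 30, 40, 50, 60));

        // Ids captured once up front, as an id-holding list would do.
        List<Integer> snapshot = new ArrayList<>(table);

        List<Integer> page1 = limitPage(table, 0, 3);

        // Another user inserts a row that sorts before everything else.
        table.add(0, 5);

        List<Integer> page2WithLimit = limitPage(table, 3, 3);
        List<Integer> page2FromSnapshot = limitPage(snapshot, 3, 3);

        System.out.println(page1);              // [10, 20, 30]
        System.out.println(page2WithLimit);     // [30, 40, 50] -> 30 shown twice
        System.out.println(page2FromSnapshot);  // [40, 50, 60] -> stable
    }
}
```

A delete between the two page reads has the opposite effect: rows shift up and one record is skipped. Paging over a fixed id list avoids both, at the cost of building that list first.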


Andrus



On Jun 22, 2007, at 2:35 AM, Marcin Skladaniec wrote:

Hi
Recently we have found that fetching a list of 100,000 records using ROP with paging and no cache takes a long time, about 50 seconds in our case. We have profiled the CPU usage and the result shows that 99% of the time is spent in IncrementalFaultList, within the fillIn() method.
The fillIn method works (in my opinion) in a somewhat strange fashion: it executes the query at once, stores the query result in a java.sql.ResultSet, and then iterates through the result, creating either the whole DataRow or just an ObjectId. If needed, the DataRows are faulted at the end of the method.
Our testing showed that this bit of code:

while (it.hasNextRow()) {
        elements.add(it.nextObjectId(entity));
}
is where all the time is spent. Each iteration in this loop takes about 0.5 ms, which multiplied by 100,000 comes to almost 50 seconds. The nextObjectId method consists of two parts: fetching the next result from the ResultSet and creating an ObjectId, but I was unable to check which one takes the most time. Anyway, I think that this approach is somewhat wrong, since 99% of the records will always be fetched as ObjectIds and never faulted, so my ideas to enhance this are:
* fetch only Pk columns and create all ObjectIds at once, get rid of the iterating process if possible
* use already existing method resolveInterval() to fault the required range of records
If the creation of ObjectId and getting the results from ResultSet cannot be sped up (because it simply has to happen, and it does not depend on the way it is done), the only choice will be to implement some more complex solution using sql LIMIT statement.
I would like to mention that we are using some DataContext decorators and life-cycle callbacks, but I don't believe those are important factors in this case.
Whatever the solution is, I think it is pretty crucial that it is implemented soon, since the usability of ROP without fast paging is rather low.
With regards
Marcin
-------------------------->
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001   fax +61 2 9550 4001