Hi Marcin,

* fetch only Pk columns and create all ObjectIds at once, get rid of the iterating process if possible
* use already existing method resolveInterval() to fault the required range of records

This strategy was discussed in the May thread with Ari (the one that Michael Gentry mentioned). My vote is +0, meaning that before we make this change, I want to confirm that it has a visible impact on performance. Could you make such a change locally and see if it helps? (Look at SelectQuery.addCustomDbAttribute() to include only the PK; if you have problems making this change, ping me via the dev list and I'll try my best to help.)


If the creation of ObjectId and getting the results from ResultSet cannot be sped up (because it simply has to happen, and it does not depend on the way it is done), the only choice will be to implement some more complex solution using the SQL LIMIT statement.

I'd love to avoid that, as the data you get may as well be different the next time you resolve a page, so you may end up with duplicates or skipped records. If we ever go this way, we'll probably need to make it a user choice (use LIMIT vs. IncrementalFaultList).
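To make the duplicate/skip concern concrete, here is a minimal sketch that simulates LIMIT/OFFSET paging over an in-memory list instead of a real database. The class and method names (OffsetPagingDrift, fetchPage) are hypothetical and only stand in for a query such as "SELECT id FROM t ORDER BY id LIMIT n OFFSET m"; nothing here is Cayenne code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OffsetPagingDrift {

    // Hypothetical stand-in for "ORDER BY id LIMIT limit OFFSET offset"
    // evaluated against the table's current contents.
    static List<Integer> fetchPage(List<Integer> table, int offset, int limit) {
        int from = Math.min(offset, table.size());
        int to = Math.min(offset + limit, table.size());
        return new ArrayList<>(table.subList(from, to));
    }

    // A row that sorts before page 1 is inserted between the two page reads.
    static List<Integer> readWithInsert() {
        List<Integer> table = new ArrayList<>(Arrays.asList(10, 20, 30, 40, 50, 60));
        List<Integer> seen = new ArrayList<>(fetchPage(table, 0, 3));
        table.add(0, 5); // simulated concurrent INSERT
        seen.addAll(fetchPage(table, 3, 3));
        return seen;
    }

    // A row from page 1 is deleted between the two page reads.
    static List<Integer> readWithDelete() {
        List<Integer> table = new ArrayList<>(Arrays.asList(10, 20, 30, 40, 50, 60));
        List<Integer> seen = new ArrayList<>(fetchPage(table, 0, 3));
        table.remove(Integer.valueOf(10)); // simulated concurrent DELETE
        seen.addAll(fetchPage(table, 3, 3));
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(readWithInsert()); // row 30 shows up on both pages
        System.out.println(readWithDelete()); // row 40 is never returned
    }
}
```

With an insert between reads, row 30 is returned twice; with a delete, row 40 is never seen. A list that captures all ids up front does not drift this way, which is the trade-off a user-selectable option would expose.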

Andrus



On Jun 22, 2007, at 2:35 AM, Marcin Skladaniec wrote:

Hi
Recently we found that fetching a list of 100,000 records using ROP with paging and no cache takes a long time, about 50 seconds in our case. We profiled the CPU usage, and the result shows that 99% of the time is spent in IncrementalFaultList, within the fillIn() method.

The fillIn method works (in my opinion) in a somewhat strange fashion: it executes the query at once, stores the query result in a java.sql.ResultSet, and then iterates through the result, creating either a whole DataRow or just an ObjectId. If needed, the DataRows are faulted at the end of the method.
Our testing showed that this bit of code:

while (it.hasNextRow()) {
        elements.add(it.nextObjectId(entity));
}

is where all the time is spent. Each iteration of this loop takes about 0.5 ms, which multiplied by 100,000 comes to almost 50 seconds. The nextObjectId method consists of two parts: fetching the next result from the ResultSet and creating an ObjectId, but I was unable to check which one takes the most time. In any case, I think this approach is somewhat wrong, since 99% of the records will always be fetched as ObjectIds and never faulted, so my ideas to enhance this are:

* fetch only Pk columns and create all ObjectIds at once, get rid of the iterating process if possible
* use already existing method resolveInterval() to fault the required range of records

If the creation of ObjectId and getting the results from ResultSet cannot be sped up (because it simply has to happen, and it does not depend on the way it is done), the only choice will be to implement some more complex solution using the SQL LIMIT statement.
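The two ideas above (keep only primary keys in the list, then fault a range on demand) can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not Cayenne's IncrementalFaultList: the class IdOnlyPagedList, its demo() factory, and the Map standing in for the database table are all hypothetical, and its resolveInterval() only mimics the role of the existing method of that name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IdOnlyPagedList {
    private final Map<Integer, String> table;   // hypothetical stand-in for the database
    private final List<Integer> ids;            // primary keys only, captured once
    private final Map<Integer, String> resolved = new HashMap<>();
    private final int pageSize;

    IdOnlyPagedList(Map<Integer, String> table, int pageSize) {
        this.table = table;
        this.ids = new ArrayList<>(table.keySet()); // analogue of a PK-only fetch
        this.pageSize = pageSize;
    }

    // Loads full rows for indexes [from, to); rows already loaded are skipped.
    void resolveInterval(int from, int to) {
        for (int i = from; i < to; i++) {
            Integer id = ids.get(i);
            if (!resolved.containsKey(id)) {
                resolved.put(id, table.get(id));
            }
        }
    }

    // Returns the row at index, resolving only the page that contains it.
    String get(int index) {
        Integer id = ids.get(index);
        if (!resolved.containsKey(id)) {
            int from = (index / pageSize) * pageSize;
            resolveInterval(from, Math.min(from + pageSize, ids.size()));
        }
        return resolved.get(id);
    }

    int size() { return ids.size(); }

    int resolvedCount() { return resolved.size(); }

    // Builds a demo list over rowCount synthetic rows named "row-<id>".
    static IdOnlyPagedList demo(int rowCount, int pageSize) {
        Map<Integer, String> table = new HashMap<>();
        for (int i = 0; i < rowCount; i++) {
            table.put(i, "row-" + i);
        }
        return new IdOnlyPagedList(table, pageSize);
    }

    public static void main(String[] args) {
        IdOnlyPagedList list = demo(100000, 50);
        System.out.println(list.get(120) + ", resolved rows: " + list.resolvedCount());
    }
}
```

In this sketch, reading index 120 with a page size of 50 resolves only the 50 rows of that page, while the remaining 99,950 entries stay as bare ids. Whether the real per-row cost lies in the ResultSet read or in ObjectId creation is exactly what the profiling question leaves open, and this sketch does not answer it.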

I would like to mention that we are using some DataContext decorators and life-cycle callbacks, but I don't believe those are important factors in this case.

Whatever the solution is, I think it is crucial that it be implemented soon, since the usability of ROP without fast paging is rather low.

With regards
Marcin
-------------------------->
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001   fax +61 2 9550 4001



