Hi Joe, I've never benchmarked/explored it, but my hunch is the number of pages would have to be pretty low (say, less than 3-5). Of course, if you are trying to figure out in-memory vs hit the database again, that logic just adds a bit more complexity to your code and is more to maintain. I think I'd try always hitting the database and see how that plays out instead of trying to pre-optimize it.
mrg On Fri, Mar 30, 2012 at 8:57 AM, Joe Baldwin <jfbald...@earthlink.net> wrote: > OK, good analysis, that makes a lot of sense. So the thing to remember with > post-query ordering on large result sets: it is more efficient to re-submit > the query to the database. > > At which point (presumably list size) do you think that orderList() becomes > reasonable to use on a result set? > > Thanks, > Joe > > > On Mar 30, 2012, at 8:33 AM, Michael Gentry wrote: > >> Hi Joe, >> >> I think you are on the right path using setPageSize() to limit the >> result set, but even 50k results is a lot of PKs to return (although >> not a ludicrous amount, depending on your users). >> >> The first approach I would take is if they do a post-query sort then >> send another query to the database with the new sort ordering in place >> and let the database do it. >> >> Let's say you have 50,000 results and a page size of 20. 50,000 / 20 >> = 2500 pages. For you to to sort in-memory using that paginated list >> would require 2499 trips to the database to fetch the records and make >> Java objects out of them (I'm assuming you showed them the first page >> already and they sorted then, so the first page is "free"). 2499 >> queries will be a killer latency-wise, even if each query is quick. I >> think it'll be significantly faster to let the database do it, >> especially if you have good indexes in place. >> >> mrg >> >> >> On Thu, Mar 29, 2012 at 8:29 PM, Joe Baldwin <jfbald...@earthlink.net> wrote: >>> I am interested in the best practice wrt ordering in a certain context. >>> >>> Lets say the context is a store with 1000 - 50000 items. I understand that >>> if I can either include the "order by" directive pre-query or post query. >>> Also, I understand that if I have a opportunity, then passing the "order >>> by" to the DBMS is Cayenne-preferred (because, as I understand it) Cayenne >>> can retrieve the result set ordered, and minimize the triggering of faults >>> resulting in minimum memory usage. I think I understand this scenario, and >>> have tried to code to this strategy. >>> >>> The second scenario is a tad more complex and really the subject of my >>> concern. Note, if the result set is small, I believe that this is of no >>> concern. However, web-users do not often care about setting off "totally >>> insane queries". So if a query is initiated with 'give me everything you >>> have', then once they see the result set (which I am trying to manage with >>> setPageSize()), they might *then* want to sort the result set (i.e. >>> triggering a "orderList()" which is an in-memory ordering which is not as >>> efficient). >>> >>> So given the post-query sorting requirement, a large result set, using >>> setPageSize(), and possibly multiple different post-sorts being initiated, >>> what is the best-practice? Should I resubmit these queries with the >>> ordering, or simply use the orderList() method on the result set? Or, is >>> there another strategy for optimizing coding for these requirements? >>> >>> (My intuition, base on reading about Cayenne is that I just need to rely on >>> orderList() and buy more server memory. - but I am not sure) >>> >>> Thanks Joe >>> >