Re: Best Practice Question

Michael Gentry Fri, 30 Mar 2012 06:40:54 -0700

Hi Joe,

I've never benchmarked/explored it, but my hunch is the number of
pages would have to be pretty low (say, less than 3-5).  Of course, if
you are trying to figure out in-memory vs hit the database again, that
logic just adds a bit more complexity to your code and is more to
maintain.  I think I'd try always hitting the database and see how
that plays out instead of trying to pre-optimize it.


mrg


On Fri, Mar 30, 2012 at 8:57 AM, Joe Baldwin <jfbald...@earthlink.net> wrote:
> OK, good analysis, that makes a lot of sense. So the thing to remember with 
> post-query ordering on large result sets: it is more efficient to re-submit 
> the query to the database.
>
> At which point (presumably list size) do you think that orderList() becomes 
> reasonable to use on a result set?
>
> Thanks,
> Joe
>
>
> On Mar 30, 2012, at 8:33 AM, Michael Gentry wrote:
>
>> Hi Joe,
>>
>> I think you are on the right path using setPageSize() to limit the
>> result set, but even 50k results is a lot of PKs to return (although
>> not a ludicrous amount, depending on your users).
>>
>> The first approach I would take is if they do a post-query sort then
>> send another query to the database with the new sort ordering in place
>> and let the database do it.
>>
>> Let's say you have 50,000 results and a page size of 20.  50,000 / 20
>> = 2500 pages.  For you to to sort in-memory using that paginated list
>> would require 2499 trips to the database to fetch the records and make
>> Java objects out of them (I'm assuming you showed them the first page
>> already and they sorted then, so the first page is "free").  2499
>> queries will be a killer latency-wise, even if each query is quick.  I
>> think it'll be significantly faster to let the database do it,
>> especially if you have good indexes in place.
>>
>> mrg
>>
>>
>> On Thu, Mar 29, 2012 at 8:29 PM, Joe Baldwin <jfbald...@earthlink.net> wrote:
>>> I am interested in the best practice wrt ordering in a certain context.
>>>
>>> Lets say the context is a store with 1000 - 50000 items. I understand that 
>>> if I can either include the "order by" directive pre-query or post query.  
>>> Also, I understand that if I have a opportunity, then passing the "order 
>>> by" to the DBMS is Cayenne-preferred (because, as I understand it) Cayenne 
>>> can retrieve the result set ordered, and minimize the triggering of faults 
>>> resulting in minimum memory usage.  I think I understand this scenario, and 
>>> have tried to code to this strategy.
>>>
>>> The second scenario is a tad more complex and really the subject of my 
>>> concern.  Note, if the result set is small, I believe that this is of no 
>>> concern.  However, web-users do not often care about setting off "totally 
>>> insane queries".  So if a query is initiated with 'give me everything you 
>>> have', then once they see the result set (which I am trying to manage with 
>>> setPageSize()), they might *then* want to sort the result set (i.e. 
>>> triggering a "orderList()" which is an in-memory ordering which is not as 
>>> efficient).
>>>
>>> So given the post-query sorting requirement, a large result set, using 
>>> setPageSize(), and possibly multiple different post-sorts being initiated, 
>>> what is the best-practice?  Should I resubmit these queries with the 
>>> ordering, or simply use the orderList() method on the result set?  Or, is 
>>> there another strategy for optimizing coding for these requirements?
>>>
>>> (My intuition, base on reading about Cayenne is that I just need to rely on 
>>> orderList() and buy more server memory. - but I am not sure)
>>>
>>> Thanks Joe
>>>
>

Re: Best Practice Question

Reply via email to