regarding http://nucular.sourceforge.net
On Oct 11, 12:32 pm, Paul Rubin <http://[EMAIL PROTECTED]> wrote:

> How many items did each query return?  When I refer to large result
> sets, I mean you often get queries that return 10k items or more (a
> pretty small number: typing "python" into google gets almost 30
> million hits) and you need to actually examine each item, as opposed
> to displaying ten at a time or something like that (e.g. you want to
> present faceted results).

I can't give a detailed report.  I think 10k result sets were not
unusual or noticeably slower.

Of the online demos, looking at

   http://www.xfeedme.com/nucular/gut.py/go?FREETEXT=w

(w for "web") we get 6294 entries, which takes about 500ms on a cold
index and about 150ms on a warm index.  This is on a very active
shared hosting machine.

You are right that you might want to use more in-process memory for a
really smart, multi-faceted relevance ordering or whatever, but you
have to be willing to pay for it in terms of system resources,
configuration/development time, etc.  If you want cheap and easy,
nucular might be good enough, afaik.

Regarding the 30 million number -- I bet google does estimation and
culling of some kind (not really looking at all 30 million).  I'm
pretty sure of this because in some cases I've looked at all the
results available and it turned out to be a lot smaller than the
estimate on the first page.  I'm not interested in really addressing
google-sized data sets at the moment.

> http://www.newegg.com/Product/Product.aspx?Item=N82E16820147021

Holy rusty metal batman!  Way cool!

Thanks,
   -- Aaron Watters

===
less is more
--
http://mail.python.org/mailman/listinfo/python-list
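The "estimation and culling" idea above can be sketched in a few lines.
This is only an illustrative guess at the technique, not google's
actual method and not part of nucular: instead of fully intersecting
two posting lists, check a random sample of one list against the other
and scale the hit fraction up.  The names `estimated_hits`,
`postings_a`, and `postings_b` are all hypothetical.

```python
import random

def estimated_hits(postings_a, postings_b, sample_size=200, seed=0):
    """Estimate |A intersect B| by membership-testing a random sample
    of A against B and scaling up, instead of intersecting fully.
    The estimate's granularity is len(postings_a) / sample_size."""
    if not postings_a:
        return 0
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = [rng.choice(postings_a) for _ in range(sample_size)]
    b = set(postings_b)
    hit_fraction = sum(1 for doc in sample if doc in b) / sample_size
    return int(hit_fraction * len(postings_a))

# Toy data: word A appears in docs 0..99999, word B in every 3rd doc,
# so the true intersection size is 33334.  The estimate lands near
# that without ever walking all 100000 postings.
a = list(range(100000))
b = list(range(0, 100000, 3))
estimate = estimated_hits(a, b)
```

This is also consistent with the observation above that paging through
all the results can turn up far fewer items than the number shown on
the first page: a sampled estimate is cheap but has sampling error on
the order of a few percent.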