Marvin Humphrey wrote:
Greets,
If you needed to know not just the total number of hits, but the
number of hits in each "category", how would you handle that?
For instance, a search for "egg" would have to produce the 20 most
relevant documents for "egg", but also a list like this:
Holiday & Seasonal / Easter 75
Books / Cooking 52
Miscellaneous 44
Kitchen Collectibles 43
Hobbies / Crafts 17
[...]
It seems to me that you'd have to retrieve each hit's stored fields
and examine the contents of a "category" field. That's a lot of
overhead. Is there another way?
Statistical sampling of results and estimation, that's what I use ...
for large result sets works reasonably well, for small result sets I
just pay the penalty of retrieving all docs.
Also, take a look at the following:
http://www2005.org/cdrom/docs/p245.pdf, "Sampling Search-Engine Results",
A. Anagnostopoulos, A. Z. Broder, D. Carmel .
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]