Marvin Humphrey wrote:
Greets,

If you needed to know not just the total number of hits, but the number of hits in each "category", how would you handle that?

For instance, a search for "egg" would have to produce the 20 most relevant documents for "egg", but also a list like this:

    Holiday & Seasonal / Easter     75
    Books / Cooking                 52
    Miscellaneous                   44
    Kitchen Collectibles            43
    Hobbies / Crafts                17
    [...]

It seems to me that you'd have to retrieve each hit's stored fields and examine the contents of a "category" field. That's a lot of overhead. Is there another way?

Statistical sampling of results and estimation, that's what I use ... for large result sets works reasonably well, for small result sets I just pay the penalty of retrieving all docs.

Also, take a look at the following: http://www2005.org/cdrom/docs/p245.pdf, "Sampling Search-Engine Results",
A. Anagnostopoulos, A. Z. Broder, D. Carmel .

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to