Similarity

2007-08-14 Thread Enis Soztutar
Similarity (such as Similarity#lengthNorm()) but not all of them. Does anybody know the reason for this? Thanks. Enis Soztutar

Re: How to keep user search history and how to turn it into information?

2007-08-13 Thread Enis Soztutar
Behavior Improving Web Search Ranking by Incorporating User behaviour Information Learning User Interaction Models for Predicting Web Search Result Preferences Optimizing_search_engines_using_clickthrough_data Query Chains: Learning to Rank from Implicit Feedback Lukas On 8/10/07, Enis Soz

Re: How to keep user search history and how to turn it into information?

2007-08-10 Thread Enis Soztutar
Lukas Vlcek wrote: Hi Enis, Hi again, On 8/10/07, Enis Soztutar <[EMAIL PROTECTED]> wrote: Hi, Lukas Vlcek wrote: Hi, I would like to keep user search history data and I am looking for some ideas/advice/recommendations. In general I would like to talk about m

Re: How to keep user search history and how to turn it into information?

2007-08-10 Thread Enis Soztutar
Hi, Lukas Vlcek wrote: Hi, I would like to keep user search history data and I am looking for some ideas/advice/recommendations. In general I would like to talk about methods of storing such data, its structure and how to turn it into valuable information. As for the structure: ==

Re: multiple tokens at the same position

2007-05-25 Thread Enis Soztutar
On 5/25/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : Yes, indeed we could but it brings other problems, for example increasing : the index size, and extending the query to search for multiple fields, etc. 1) if you index both the raw and stemmed forms your index is going to grow to roughly

Re: multiple tokens at the same position

2007-05-25 Thread Enis Soztutar
Yes, indeed we could but it brings other problems, for example increasing the index size, and extending the query to search for multiple fields, etc. On 5/25/07, Steven Rowe <[EMAIL PROTECTED]> wrote: Hi Enis, Enis Soztutar wrote: > In nutch we have a use case in which we need to sto

multiple tokens at the same position

2007-05-25 Thread Enis Soztutar
Hi, In Nutch we have a use case in which we need to store tokens with their original text plus their stemmed form plus their canonical form (through some asciifization). From my understanding of Lucene, it makes sense to write a TokenStream which generates several tokens for each "word", but p
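
The idea in the message above can be sketched outside of Lucene. In Lucene, a token whose position increment is 0 occupies the same position as the token before it; the Python model below only imitates that bookkeeping and does not use the Lucene API. The names Token, toy_stem, asciify, expand and positions are hypothetical, and toy_stem is a deliberately crude stand-in for a real stemmer.

```python
import unicodedata
from typing import Callable, Iterable, Iterator, List, NamedTuple, Tuple


class Token(NamedTuple):
    text: str
    # 1 = advance to the next position, 0 = share the previous token's position
    position_increment: int


def toy_stem(word: str) -> str:
    """Hypothetical stand-in for a real stemmer: strips one trailing 's'."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def asciify(word: str) -> str:
    """Decompose characters and drop combining marks (e.g. e-acute becomes e)."""
    decomposed = unicodedata.normalize("NFKD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand(
    words: Iterable[str],
    stem: Callable[[str], str] = toy_stem,
    canonicalize: Callable[[str], str] = asciify,
) -> Iterator[Token]:
    """Emit the original word, then its distinct variants at the same position."""
    for word in words:
        yield Token(word, 1)
        seen = {word}
        for variant in (stem(word), canonicalize(word)):
            if variant not in seen:  # skip variants identical to an earlier form
                seen.add(variant)
                yield Token(variant, 0)


def positions(tokens: Iterable[Token]) -> List[Tuple[int, str]]:
    """Resolve position increments into absolute positions, starting at 0."""
    pos = -1
    resolved = []
    for token in tokens:
        pos += token.position_increment
        resolved.append((pos, token.text))
    return resolved
```

With this model, the word "caf\u00e9s" yields three tokens at position 0 (the original, the toy stem, and the ASCII form), while "open" yields a single token at position 1 because both variants equal the original. Skipping duplicate variants keeps the number of stacked tokens small, which bears on the index-size concern raised in the replies.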

Re: Indexing Open Office documents

2007-05-17 Thread Enis Soztutar
There is a parser for OpenOffice in Nutch. It is a plugin called parse-oo. You can find more information in the Nutch mailing lists. On 5/17/07, jim shirreffs <[EMAIL PROTECTED]> wrote: Anyone know how to add OpenOffice document to a Lucene index? Is there a parser for OpenOffice? thanks in