Re: Multi-field IDF

2016-11-18 Thread Will Martin
In this work, we aim to improve the field weighting for structured document retrieval. We first introduce the notion of field relevance as the generalization of field weights, and discuss how it can be estimated using relevant documents, which effectively implements relevance feedback for f

Re: Multi-field IDF

2016-11-18 Thread Ahmet Arslan
Hi Nicholas, Aha, I see that you are into field-based scoring, which is an unsolved problem. Then, you might find BlendedTermQuery and SynonymQuery relevant. Ahmet On Friday, November 18, 2016 12:22 AM, Nicolás Lichtmaier wrote: That depends on what you want. In this case I want to use a

Re: Multi-field IDF

2016-11-17 Thread Will Martin
Are you familiar with pivoted normalized document length practice or theory? Or Croft's recent work on relevance algorithms accounting for structured field presence? On 11/17/2016 5:20 PM, Nicolás Lichtmaier wrote: That depends on what you want. In this case I want to use a discrimination po

Re: Multi-field IDF

2016-11-17 Thread Nicolás Lichtmaier
That depends on what you want. In this case I want to use a discrimination power based on all the body text, not just the titles. Because otherwise terms that are really not that relevant end up being very high! On 17/11/16 at 18:25, Ahmet Arslan wrote: Hi Nicholas, IDF, among others,

Re: Multi-field IDF

2016-11-17 Thread Ahmet Arslan
Hi Nicholas, IDF, among others, is a measure of term specificity. If 'or' is not so common in titles, then it has some discrimination power in that domain. I think it's OK for 'or' to get a high IDF value in this case. Ahmet On Thursday, November 17, 2016 9:09 PM, Nicolás Lichtmaier wrote: IDF
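
The disagreement in this thread comes down to which field the document frequency is counted in. As a minimal sketch of that effect, the snippet below computes the textbook IDF, log(N / df), for the term 'or' separately over a title field and a body field. The toy corpus, the `idf` helper, and the whitespace tokenization are all assumptions made for illustration; this is not Lucene's similarity implementation, whose exact formula differs.

```python
import math


def idf(term, docs, field):
    """Textbook IDF, log(N / df), counted over a single field of a toy corpus."""
    n_docs = len(docs)
    # Document frequency: how many documents contain the term in this field.
    df = sum(1 for doc in docs if term in doc[field].lower().split())
    if df == 0:
        raise ValueError(f"term {term!r} does not occur in field {field!r}")
    return math.log(n_docs / df)


# Hypothetical corpus: 'or' appears in one title but in every body.
DOCS = [
    {"title": "Heaven or Hell", "body": "a story of one choice or another"},
    {"title": "Search Basics", "body": "match one term or both terms"},
    {"title": "Ranking Notes", "body": "use tf or idf or both"},
    {"title": "Field Weights", "body": "boost the title or the body"},
]

title_idf = idf("or", DOCS, "title")  # df = 1 of 4 documents
body_idf = idf("or", DOCS, "body")    # df = 4 of 4 documents

print(f"title IDF: {title_idf:.3f}")
print(f"body IDF:  {body_idf:.3f}")
```

In this toy corpus 'or' is rare in titles and present in every body, so its title-field IDF is high while its body-field IDF is zero. Which of the two values is the right one to use for title matches is the open question the thread discusses.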