RE: Custom lucene scoring - Dot product between field boost and query boost

2012-02-21 Thread Yuval Kesten
Hi Em, 1. Regarding performance: the similarity class (and my subtype as well) gets the IDF, TF and squared-sum calculations as inputs; they just factor them differently. Even though I ignore the values, they are still being computed. 2. I have written this code: static { Similari

Re: [Bulk] can I make incremental index/search more efficient?

2012-02-21 Thread Ganesh
You need to follow the second method. Loop over all the available docs and check whether each one is already in the index; if not, index it. Perform the search on the list of words you have. Add the document name and its modified date/time as part of the index. This lets you search only the particular document,
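
A minimal sketch of the bookkeeping suggested above, assuming each document is identified by its name and a last-modified timestamp. It is plain Java, not Lucene code: the class IncrementalIndexTracker and its methods are hypothetical, and an in-memory map stands in for looking up the stored name and modified time in the real index.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: remembers which documents were indexed and when
// they were last modified, so only new or changed documents are re-indexed.
public class IncrementalIndexTracker {
    // document name -> last-modified time recorded when it was indexed
    private final Map<String, Long> indexed = new HashMap<>();

    // True if the document was never indexed or has changed since then.
    public boolean needsIndexing(String name, long lastModified) {
        Long seen = indexed.get(name);
        return seen == null || seen < lastModified;
    }

    // Record that the document was indexed at this modification time.
    public void markIndexed(String name, long lastModified) {
        indexed.put(name, lastModified);
    }
}
```

In a real application the map lookup would presumably be replaced by a query against the name and modified-time fields stored in the index.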

can I make incremental index/search more efficient?

2012-02-21 Thread Ilya Zavorin
I have a fairly straightforward task: I have a collection of N documents and a set of "hot" words. I need to find all occurrences of these words in all the docs. The original use case was that I would get all the docs at once. In this case, I: 1. Create a single index for all the docs 2. Lo

JTRES 2012 Call for Paper

2012-02-21 Thread Martin Schoeberl
== CALL FOR PAPERS The 10th Workshop on Java Technologies for Real-Time and Embedded Systems JTRES 2009 Technical University of Den

Highlighting does not work with PayloadTermQueries

2012-02-21 Thread Nitin Arora
Hi, I'm using Solr and Lucene in my application for search. I'm facing an issue where highlighting with FastVectorHighlighter does not work when I use PayloadTermQueries as clauses of a BooleanQuery. After debugging, I found that in DefaultSolrHighlighter.java, fvh.getFieldQuery does not return any t

Re: Question about CustomScoreQuery

2012-02-21 Thread Dominika Puzio
Thanks for the help! I feel this performance advice about FieldCache in ctor saved me a lot of time :) I've done what you said and it works. -- Dominika On 21.02.2012 10:46, Uwe Schindler wrote: It looks like you already implemented a CustomScoreProvider. You are retrieving the FieldCache on e

Re: Custom lucene scoring - Dot product between field boost and query boost

2012-02-21 Thread Em
Hi Yuval, > 1. Performances: I am calculating all the TF/IDF stuff and NORMS for > nothing... You aren't calculating that much, since you declared all those values as constants. What are you worried about? > 2. The score I get from the TopScoreDocCollector is not the same as I get from the Explan

RE: Custom lucene scoring - Dot product between field boost and query boost

2012-02-21 Thread Yuval Kesten
The same question is formatted nicer here: http://stackoverflow.com/questions/9380188/custom-lucene-scoring-dot-product-between-field-boost-and-query-boost Thanks! -Original Message- From: Yuval Kesten [mailto:ykes...@yahoo-inc.com] Sent: Tuesday, February 21, 2012 5:18 PM To: java-user@

Custom lucene scoring - Dot product between field boost and query boost

2012-02-21 Thread Yuval Kesten
Hi, I want to use Lucene with the following scoring logic: When I index my documents I want to set for each field a score/weight. When I query my index I want to set for each query term a score/weight. I will NEVER index or query with many instances of the same field - In each query (document) th
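
A minimal sketch of the scoring described above, assuming the intended score is the sum, over the fields present in both the query and the document, of the query weight times the document weight. It is plain Java rather than Lucene code; the class name DotProductScore, the method score, and the field names are hypothetical, and it does not show how to plug this into a Lucene Similarity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a dot product between per-field document
// weights (set at index time) and per-term query weights (set at query time).
public class DotProductScore {

    // Sums queryWeight * docWeight over the fields present in both maps.
    public static float score(Map<String, Float> queryWeights,
                              Map<String, Float> docWeights) {
        float sum = 0f;
        for (Map.Entry<String, Float> e : queryWeights.entrySet()) {
            Float docWeight = docWeights.get(e.getKey());
            if (docWeight != null) {
                sum += e.getValue() * docWeight;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Float> doc = new HashMap<>();
        doc.put("sports", 0.5f);
        doc.put("music", 2.0f);

        Map<String, Float> query = new HashMap<>();
        query.put("sports", 4.0f);
        query.put("travel", 1.0f);

        // Only "sports" is shared: 4.0 * 0.5 = 2.0
        System.out.println(score(query, doc));
    }
}
```

Fields that appear on only one side contribute nothing, which matches the usual definition of a dot product over sparse vectors.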

Re: Can I just add ShingleFilter to my analyzer used for indexing and searching

2012-02-21 Thread Paul Taylor
On 21/02/2012 14:37, Steven A Rowe wrote: Hi Paul, Lucene QueryParser splits on whitespace and then sends individual words one-by-one to be analyzed. All analysis components that do their work based on more than one word, including ShingleFilter and SynonymFilter, are borked by this. (There

RE: Can I just add ShingleFilter to my analyzer used for indexing and searching

2012-02-21 Thread Steven A Rowe
Hi Paul, Lucene QueryParser splits on whitespace and then sends individual words one-by-one to be analyzed. All analysis components that do their work based on more than one word, including ShingleFilter and SynonymFilter, are borked by this. (There is a JIRA issue open for the QueryParser pr

Can I just add ShingleFilter to my analyzer used for indexing and searching

2012-02-21 Thread Paul Taylor
I'm trying out ShingleFilter, and the way it is documented implies that you can just add it to your analyzer and that's it, with no side effects except a larger index. But I have read other posts implying you have to modify the way you parse user queries. Could anyone confirm or deny? Also is there an easy way

RE: Question about CustomScoreQuery

2012-02-21 Thread Uwe Schindler
It looks like you already implemented a CustomScoreProvider. You are retrieving the FieldCache on every document, which slows things down immensely (it's a synchronized cache lookup). The correct way is: Override CSQ.getCustomScoreProvider and return your own CSP there. The CSP itself should get the FieldCa

Re: Question about CustomScoreQuery

2012-02-21 Thread Dominika Puzio
Thanks for your answer. I checked what explain() says about my queries, and: MatchAllDocsQuery: 1.0 = (MATCH) MatchAllDocsQuery, product of: 1.0 = queryNorm FieldScoreQuery: 0.5 = (MATCH) float(ratio), product of: 0.5 = float(ratio)=0.5 1.0 = boost 1.0 = queryNorm CustomScoreQuery: 0.24