too many boolean clauses

2012-01-31 Thread Praveen Yarlagadda
Hi all, I have been using Lucene with Hibernate to index the data. Each document is indexed with two fields: id and content. Each document corresponds to a record in the database. In my use case, search needs to work like this: 1. Fetch records from the database based on some criteria 2. Search fo

Re: Null scorer constructed by TermQuery

2012-01-31 Thread Michael Kazekin
Uwe, I looked them up in Luke, these fields are present, are named the same, and have proper values, so the problem seems to be somewhere else :(( But anyway, thanks for your help! On 01/31/2012 12:50 PM, Uwe Schindler wrote: Hi, As this was originally a Solr index, are you sure, that the ter

Re: Phrase Queries vs. SpanTermQueries exact phrases vs. stop words

2012-01-31 Thread Doron Cohen
Hi, The code here ignores the PhraseQuery (PQ)'s positions: int[] pp = PQ.getPositions(); These positions have extra gaps when stop words are removed. To accommodate this, the overall extra gap can be added to the slop: int gap = (pp[pp.length - 1] - pp[0]) - (pp.length - 1); // (+/- bounda
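
The gap computation described above can be sketched in plain Java, without any Lucene dependency. This is only an illustration: the class and method names (PhraseGap, extraGap) are hypothetical, the positions array stands in for what PQ.getPositions() returns, and it assumes the positions are sorted in ascending order. The last element is read at index length - 1.

```java
public final class PhraseGap {
    private PhraseGap() {}

    /**
     * Number of positions between the first and last term beyond what
     * strictly adjacent terms would occupy. Assumes ascending positions.
     */
    public static int extraGap(int[] positions) {
        if (positions.length < 2) {
            return 0; // a single term (or none) has no gap
        }
        int span = positions[positions.length - 1] - positions[0];
        return span - (positions.length - 1);
    }

    public static void main(String[] args) {
        // Three terms at positions 0, 2, 3: one stop word was removed
        // between the first and second term.
        System.out.println(extraGap(new int[] {0, 2, 3})); // prints 1
    }
}
```

The returned value is what would be added to the slop of the span query, so that a phrase with a removed stop word still matches.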

Re: using character '%' in queries (Lucene v3.1.0)

2012-01-31 Thread Gal Mainzer
I tried to use escaping but it didn't work either (and % is not in the list). My field analyzer is ngram (min=1 max=15) and I'm writing the query using the QueryBuilder API rather than a string, so it's not being parsed. Any ideas? Thanks, Gal On Tue, Jan 31, 2012 at 9:36 PM, Erick Erickson wrote: >

Re: Why read past EOF

2012-01-31 Thread superruiye
Does that mean I only need to ensure that readers are reopened before deletion? I use the default IndexDeletionPolicy, KeepOnlyLastCommitDeletionPolicy. There are two other policies, SnapshotDeletionPolicy and PersistentSnapshotDeletionPolicy, which I am looking at now. Are they useful for this problem?

Re: Lucene appears to use memory maps after unmapping them

2012-01-31 Thread Trejkaz
On Wed, Feb 1, 2012 at 1:14 PM, Robert Muir wrote: > > No, I don't think you should use close at all, because your problem is > you are calling close() when it's unsafe to do so (you still have other > threads that try to search the reader after you closed it). > > Instead of trying to fix the bugs

Re: Lucene appears to use memory maps after unmapping them

2012-01-31 Thread Robert Muir
On Tue, Jan 31, 2012 at 8:32 PM, Trejkaz wrote: > On Wed, Feb 1, 2012 at 11:30 AM, Robert Muir wrote: >> the problem is caused by searching indexreaders after you closed them. >> >> in general we can try to add more and more safety, but at the end of the day, >> if you close an indexreader while

Re: Lucene appears to use memory maps after unmapping them

2012-01-31 Thread Trejkaz
On Wed, Feb 1, 2012 at 11:30 AM, Robert Muir wrote: > the problem is caused by searching indexreaders after you closed them. > > in general we can try to add more and more safety, but at the end of the day, > if you close an indexreader while a search is running, you will have problems. > > So be

Re: Lucene appears to use memory maps after unmapping them

2012-01-31 Thread Robert Muir
the problem is caused by searching indexreaders after you closed them. in general we can try to add more and more safety, but at the end of the day, if you close an indexreader while a search is running, you will have problems. So be careful to only close indexreaders that are no longer in use!
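
One common way to close a resource only once nobody is using it is reference counting. The sketch below is a generic, self-contained illustration and not Lucene code: the class RefCounted and its methods are hypothetical names, and the Runnable stands in for whatever actually closes the reader. Lucene's IndexReader has incRef() and decRef() methods that serve a similar purpose.

```java
import java.util.concurrent.atomic.AtomicInteger;

public final class RefCounted {
    // Starts at 1: the reference held by whoever opened the resource.
    private final AtomicInteger refs = new AtomicInteger(1);
    private final Runnable closer;

    public RefCounted(Runnable closer) {
        this.closer = closer;
    }

    /** Takes a reference; returns false if the resource is already closed. */
    public boolean tryAcquire() {
        int n;
        do {
            n = refs.get();
            if (n == 0) {
                return false; // too late, the last reference is gone
            }
        } while (!refs.compareAndSet(n, n + 1));
        return true;
    }

    /** Drops a reference; the last release runs the closer. */
    public void release() {
        int n = refs.decrementAndGet();
        if (n == 0) {
            closer.run();
        } else if (n < 0) {
            throw new IllegalStateException("released more often than acquired");
        }
    }
}
```

Each search thread would call tryAcquire() before searching and release() in a finally block; the owner calls release() instead of closing directly, so the actual close happens only after the last running search has finished.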

Lucene appears to use memory maps after unmapping them

2012-01-31 Thread Trejkaz
Hi all. I've found a rather frustrating issue which I can't seem to get to the bottom of. Our application will crash with an access violation around the time when the index is closed, with various indications of what's on the stack, but the common things being SegmentTermEnum.next and MMapIndexIn

RE: best query for one-box search string over multiple types & fields

2012-01-31 Thread Paul Allan Hill
> -Original Message- > short of it: i want "queen bohemian rhapsody" to return that song named > "Bohemian Rhapsody" by > the artist named "Queen", rather than songs with titles like "Bohemian > Rhapsody (Queen Cover)". Have you looked in MultiFieldQueryParser and its use of extra boosts

RE: Searching a string using lucene

2012-01-31 Thread Dave Seltzer
Hi Stephen, That is precisely what I was looking for! Thanks very much! -Dave -Original Message- From: Stephen Howe Sent: Tuesday, January 31, 2012 11:00 AM To: java-user@lucene.apache.org Subject: Re: Searching a string using lucene Have you taken a look at the MemoryIndex? h

Phrase Queries vs. SpanTermQueries exact phrases vs. stop words

2012-01-31 Thread Paul Allan Hill
In Lucene 3.4, I recently implemented "Translating PhraseQuery to SpanNearQuery" (see Lucene in Action, page 220) because I wanted _order_ to matter. Here is my exact code called from getFieldsQuery once I know I'm looking at a PhraseQuery, but I think it is exactly from the book. static Q

Lucene 3.5 Payloads

2012-01-31 Thread Stephen Howe
Working with Lucene 3.5, I'd like to append a payload to a specific field in the index, at indexing time. To get that, I use the following code to produce a token stream with the Standard Analyzer for the field I want to place the payload on. public static TokenStream tokenStream(final String fiel

Re: using character '%' in queries (Lucene v3.1.0)

2012-01-31 Thread Erick Erickson
Depending on your analyzer, this could well be stripped from the input. Perhaps try using Luke to examine the actual values in the index to see if it's there. And the escape character for Lucene is the backslash. See: http://lucene.apache.org/java/2_9_1/queryparsersyntax.html#Escaping Special Cha
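
As a rough illustration of backslash escaping, here is a self-contained sketch that does not depend on Lucene. The class name QueryEscape is hypothetical, and the set of special characters is an assumption based on the query parser syntax page; in real code the static QueryParser.escape(String) method is the thing to use. Note that '%' is not among the special characters, so it passes through unchanged.

```java
public final class QueryEscape {
    // Characters assumed to be special in the query parser syntax.
    private static final String SPECIAL = "\\+-!(){}[]^\"~*?:&|";

    private QueryEscape() {}

    /** Puts a backslash in front of every special character. */
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

Escaping only matters when the query goes through the query parser. If '%' still finds no match after that, the more likely cause is that the analyzer dropped it at index time.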

using character '%' in queries (Lucene v3.1.0)

2012-01-31 Thread Gal Mainzer
Hi, I’m using Lucene on Hebrew MySQL tables. I used ngram (1-15 gram sizes) in my name analyzer and the only thing that doesn’t work for me is when I try to use ‘%’ in my parsing string (didn’t find any match). I tried escaping it, using double character (“%%”) but nothing worked. Thanks, Ga

Lucene Site Feedback (and gift cards)?

2012-01-31 Thread Abhishek Rakshit
Hey Guys, As you might know, we have been working hard on building a site that would help users use and understand Lucene. We have been playing around with a number of features and are looking for people interested in participating in a 30-minute user study about the effectiveness of the website f

RE: Searching a string using lucene

2012-01-31 Thread Uwe Schindler
MemoryIndex only allows *one* document! So it is mostly for lookup if a term is contained in a document and where (used internally by the highlighter). - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Ste

Re: Searching a string using lucene

2012-01-31 Thread Stephen Howe
Have you taken a look at the MemoryIndex? http://lucene.apache.org/java/3_5_0/api/all/index.html See in particular: > Typically, it is about 10-100 times faster than RAMDirectory. Note that > RAMDirectory has particularly large efficiency overheads for small to > medium sized texts, both in time a

Re: Find similar documents of different types

2012-01-31 Thread Pedro Lacerda
For the first strategy I'm using MoreLikeThis to generate one query (from Doc terms) for each analyzed field (from type1 and type2), applying boosts and searching with TermsFilter to select only documents of type2. For the second I construct a map where boost is the tf-idf of Doc (using searcher

Re: Boost term according to phonetic representation

2012-01-31 Thread Ian Lea
If all you have indexed are three identical terms there will be no way to make Markus come top. You could index the normalized version and the original (with maybe StandardAnalyzer to get downcasing etc) and do a search across both fields, boosting whichever makes sense for you. normalized:Markus

RE: Null scorer constructed by TermQuery

2012-01-31 Thread Uwe Schindler
Hi, As this was originally a Solr index, are you sure, that the term is exactly in *that* spelling (including case) in the index? You should open the index with the Luke desktop tool and inspect the term index! Solr uses an analyzer when indexing or searching, so depending on the Solr config, it m

Re: Null scorer constructed by TermQuery

2012-01-31 Thread Michael Kazekin
Uwe, thank you very much for such a verbose answer! I tried the code you mentioned ( searcher.createNormalizedWeight(query) ), but it doesn't work on Lucene 3.5 for me either :( My Solr server returns the document correctly for the specified term (field and value), and the field is indexed and stored. I'm rea