Fuzzy phrase matching using SpanQuery?

2009-09-28 Thread Viksit Gaur
Hi all, I'm trying to achieve the following, and wondered if I could get feedback on how best to achieve it. Given an example phrase P - "Squeamish Ossifrage Monster", I'd like to search a corpus such that in a list of results, - Docs with all 3 words in the phrase are ranked at the top -

IndexSearcher close() and search() called concurrently by different threads?

2009-09-28 Thread shaoxianyang
Hi, I am new to Lucene. Hope the question is not too naive. From the Lucene FAQ, I know that an IndexSearcher instance should be shared by threads, rather than opening one for each thread. However, after an index rebuild, we need to create a new IndexSearcher instance, and call close() on the old indexS
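
One common way to make the hand-over described above safe is to reference-count the shared instance, so that close() only runs once the last in-flight search has released it. The following is a minimal sketch of that idea using only the Java standard library. It does not use any Lucene classes; SearcherHolder, Lease, acquire, release and swap are hypothetical names chosen for illustration, and any Closeable stands in for the searcher.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Lucene: shares one Closeable between
// threads and closes an old instance only after its last user releases it.
final class SearcherHolder<T extends Closeable> {

    // A counted handle on one resource instance.
    static final class Lease<R extends Closeable> {
        private final R resource;
        private int refs = 1; // the holder itself owns one reference

        private Lease(R resource) {
            this.resource = resource;
        }

        R get() {
            return resource;
        }
    }

    private Lease<T> current;

    SearcherHolder(T initial) {
        current = new Lease<>(initial);
    }

    // Called by a search thread before it starts using the resource.
    synchronized Lease<T> acquire() {
        current.refs++;
        return current;
    }

    // Called by a search thread when it is done (ideally in a finally block).
    void release(Lease<T> lease) {
        boolean last;
        synchronized (this) {
            last = --lease.refs == 0;
        }
        if (last) {
            try {
                lease.resource.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Called after an index rebuild: install the new instance and drop the
    // holder's own reference to the old one.
    void swap(T replacement) {
        Lease<T> old;
        synchronized (this) {
            old = current;
            current = new Lease<>(replacement);
        }
        release(old);
    }

    public static void main(String[] args) {
        Closeable oldSearcher = () -> System.out.println("old instance closed");
        Closeable newSearcher = () -> System.out.println("new instance closed");

        SearcherHolder<Closeable> holder = new SearcherHolder<>(oldSearcher);
        Lease<Closeable> inFlight = holder.acquire(); // a search is running
        holder.swap(newSearcher);                     // rebuild finished
        System.out.println("search still running on old instance");
        holder.release(inFlight);                     // old instance closes here
    }
}
```

In this sketch a thread that acquired the old instance before the swap keeps a valid handle until it calls release, while threads that call acquire afterwards get the new instance.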

Re: LegthFilter

2009-09-28 Thread Erdinc Yilmazel
Thanks Simon, It turned out to be a simple mistake that I made... I was creating an index with the LengthFilter applied, but reading from another index directory because of a configuration error in my IDE. (The working directory paths were wrong in Run configs...) Sorry about wasting your time.

Re: Seattle / PNW Hadoop/Lucene/HBase Meetup, Wed Sep 30th

2009-09-28 Thread Bradford Stephens
Hello everyone! Don't forget that the Meetup is THIS Wednesday! I'm looking forward to hearing about Hive from the Facebook team ... and there might be a few other interesting talks as well. Here are the details in the wiki: http://wiki.apache.org/hadoop/PNW_Hadoop_%2B_Apache_Cloud_Stack_User_Group

Re: PrefixQuery vs wildcardquery

2009-09-28 Thread Simon Willnauer
Ha! I need to get used to the fact that 2.9 is out there already :) Thanks Mark for the addition. Simon On Mon, Sep 28, 2009 at 11:18 PM, Mark Miller wrote: > Though in 2.9 this is not much of a concern - the multi term queries are > smart - if it matches few enough terms it will rewrite to a c

Re: PrefixQuery vs wildcardquery

2009-09-28 Thread Mark Miller
Though in 2.9 this is not much of a concern - the multi term queries are smart - if it matches few enough terms it will rewrite to a constant score booleanquery - if it matches a lot of terms it will rewrite to a constantscore query - using a filter underneath. So maxclause issues should no

Re: LegthFilter

2009-09-28 Thread Simon Willnauer
I don't see a reason why this should not work though. Are you sure you have indexed all fields with this analyzer, or do you iterate over terms of another field not being analyzed with an analyzer using the length filter?! Simon On Mon, Sep 28, 2009 at 1:06 PM, Erdinc Yilmazel wrote: > Sorry i

Re: PrefixQuery vs wildcardquery

2009-09-28 Thread Simon Willnauer
Depending on your use case you might want to use the PrefixFilter instead of PrefixQuery, which can be way more efficient than a query. With a filter you have the possibility to cache it very easily and you are not exposed to issues related to the length of the prefix. If you have a very short prefix

Re: PrefixQuery vs wildcardquery

2009-09-28 Thread entdeveloper
John Seer wrote: > > Hello, > > Is there any benefit of using one or other for "start with query"? > > Which one is faster? > > > Regards > It seems that you've answered your own question. If you want a "start with query", this is exactly what a PrefixQuery is for. WildcardQuery gives yo

Re: PrefixQuery vs wildcardquery

2009-09-28 Thread Mark Miller
John Seer wrote: > Hello, > > Is there any benefit of using one or other for "start with query"? > > Which one is faster? > > > Regards > Prefix query is a bit more efficient - not sure what that amounts to in the real world, but prefix just checks if the terms start with the prefix - wildcard has a bi
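
The difference described above can be illustrated with a small sketch over a sorted set of terms. This uses only the Java standard library and is not Lucene's implementation; the class PrefixVsWildcard and its methods are hypothetical names, and the TreeSet merely stands in for a sorted term dictionary. The prefix check is a single startsWith per term after seeking to the prefix, while the wildcard check has to run a pattern matcher that handles '*' and '?'.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Illustration only: a TreeSet stands in for a sorted term dictionary.
class PrefixVsWildcard {

    // Prefix matching: seek to the first term >= prefix, then read terms
    // until one no longer starts with the prefix.
    static List<String> prefixMatches(NavigableSet<String> terms, String prefix) {
        List<String> out = new ArrayList<>();
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match
            }
            out.add(term);
        }
        return out;
    }

    // Wildcard matching: '*' matches any run of characters (including none),
    // '?' matches exactly one character. Uses backtracking on the last '*'.
    static boolean wildcardMatch(String pattern, String text) {
        int p = 0, t = 0, star = -1, mark = 0;
        while (t < text.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                star = p++;   // remember the '*' and try matching zero chars
                mark = t;
            } else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == text.charAt(t))) {
                p++;
                t++;
            } else if (star != -1) {
                p = star + 1; // let the last '*' absorb one more character
                t = ++mark;
            } else {
                return false;
            }
        }
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++;
        }
        return p == pattern.length();
    }

    // In this sketch the wildcard scan simply tests every term.
    static List<String> wildcardMatches(NavigableSet<String> terms, String pattern) {
        List<String> out = new ArrayList<>();
        for (String term : terms) {
            if (wildcardMatch(pattern, term)) {
                out.add(term);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableSet<String> terms = new TreeSet<>(List.of("lucene", "lucid", "luke", "solr"));
        System.out.println(prefixMatches(terms, "luc"));
        System.out.println(wildcardMatches(terms, "luc*"));
        System.out.println(wildcardMatches(terms, "l?ke"));
    }
}
```

For a pure "starts with" search both calls return the same terms; the wildcard form only becomes necessary when the pattern has '*' or '?' somewhere other than a single trailing '*'.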

PrefixQuery vs wildcardquery

2009-09-28 Thread John Seer
Hello, Is there any benefit of using one or other for "start with query"? Regards

LegthFilter

2009-09-28 Thread Erdinc Yilmazel
Sorry if this is a stupid question. I want my index to contain terms that are at least 4 characters long. So I wrote a simple analyzer and applied the LengthFilter. When I open the index and get a TermEnum from the directory, I can still see terms that are less than 4 characters... What do you thi
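
For reference, the behaviour the poster expects (only terms of at least 4 characters survive analysis) can be sketched in plain Java. This is not Lucene's LengthFilter and does not use any Lucene API; MinLengthTokens and keepAtLeast are hypothetical names, and a list of strings stands in for a token stream.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: drops tokens shorter than a minimum length.
class MinLengthTokens {

    static List<String> keepAtLeast(List<String> tokens, int minLength) {
        return tokens.stream()
                .filter(token -> token.length() >= minLength)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("the", "squeamish", "ossifrage", "is", "here");
        System.out.println(keepAtLeast(tokens, 4));
    }
}
```

If short terms still show up after a filter like this has been applied at index time, a natural first check is whether the index being inspected is the one that was written with that analyzer.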

Re: Search with wild-cards by words with forward-slash ("/")

2009-09-28 Thread coldserenity
Hello Eric, Thank you for the help. We did use Luke and all the examples I've provided are from there. The issue has been unexpectedly solved :) As I've already mentioned, we use Lucene with Compass and the following configuration was used to specify the field as "untokenized"