RE: Searching for keywords .net,c#,...

2013-02-25 Thread x10179
OK, got this working, with one small caveat: if the token starts with a comma, e.g. ,dummy, I'd like to remove the comma, like so: public override bool IncrementToken() { else if (bufferLength > 1 && buffer[0] == ',' ) { // strip the starting
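The comma-stripping step in the snippet above can be sketched in isolation. This is a minimal illustration in plain Java, not the poster's filter: the class and method names (LeadingCommaStripper, stripLeadingComma) are hypothetical, and writing the shortened length back to the token's term attribute is left out because the original code is cut off. The guard mirrors the snippet's `bufferLength > 1 && buffer[0] == ','` condition, so a token that is only a comma is left alone.

```java
public class LeadingCommaStripper {

    /**
     * Removes a single leading comma from the first `length` chars of `buffer`,
     * shifting the remaining chars left in place.
     * Returns the new logical length (unchanged if nothing was stripped).
     */
    public static int stripLeadingComma(char[] buffer, int length) {
        if (length > 1 && buffer[0] == ',') {
            // System.arraycopy copes with overlapping source and destination ranges.
            System.arraycopy(buffer, 1, buffer, 0, length - 1);
            return length - 1;
        }
        return length;
    }

    public static void main(String[] args) {
        char[] buffer = ",dummy".toCharArray();
        int newLength = stripLeadingComma(buffer, buffer.length);
        System.out.println(new String(buffer, 0, newLength)); // prints "dummy"
    }
}
```

In a real token filter the returned length would presumably be used to shorten the term, since the chars past the new length are stale.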

Re: Lucene filter questions

2013-02-25 Thread Wei Wang
Thank you, Mike. I will try it out. On Mon, Feb 25, 2013 at 4:01 PM, Michael McCandless wrote: > On Mon, Feb 25, 2013 at 2:19 PM, Wei Wang wrote: >> Cool. Thanks, Ian. >> >> I will try FieldCacheTermsFilter. >> >> A related question. Occasionally, we would like to use only filtering >> conditions i

Re: Lucene filter questions

2013-02-25 Thread Michael McCandless
On Mon, Feb 25, 2013 at 2:19 PM, Wei Wang wrote: > Cool. Thanks, Ian. > > I will try FieldCacheTermsFilter. > > A related question. Occasionally, we would like to use only filtering > conditions instead of ranking/scoring. Right now I use > MatchAllDocsQuery with TermsFilter. Based on the code of > M

RE: Searching for keywords .net,c#,...

2013-02-25 Thread x10179
I searched Google for "TokenFilter lucene example" and found this link http://sujitpal.blogspot.com/2011/07/lucene-token-concatenating-tokenfilter_30.html which seems to override incrementToken() (a guess, as I don't know Java); however, using Lucene.NET 3.0.3, I can override public overrid

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Paul Taylor
On 20/02/2013 11:28, Paul Taylor wrote: Just updating codebase from Lucene 3.6 to Lucene 4.1 and seems my tests that use NormalizeCharMap for replacing characters in the analyzers are not working. Below I've created a self-contained test case, this is the output when I run it --term=and

Re: Lucene filter questions

2013-02-25 Thread Wei Wang
Cool. Thanks, Ian. I will try FieldCacheTermsFilter. A related question. Occasionally, we would like to use only filtering conditions instead of ranking/scoring. Right now I use MatchAllDocsQuery with TermsFilter. Based on the code of MatchAllDocsQuery, it seems the class enumerates all doc IDs up t

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Thomas Matthijs
On Mon, Feb 25, 2013 at 12:19 PM, Thomas Matthijs wrote: > On Mon, Feb 25, 2013 at 11:30 AM, Thomas Matthijs wrote: > >> >> On Mon, Feb 25, 2013 at 11:24 AM, Paul Taylor wrote: >> >>> On 20/02/2013 11:28, Paul Taylor wrote: >>> Just updating codebase from Lucene 3.6 to Lucene 4.1 and seems m

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Thomas Matthijs
On Mon, Feb 25, 2013 at 11:30 AM, Thomas Matthijs wrote: > > On Mon, Feb 25, 2013 at 11:24 AM, Paul Taylor wrote: > >> On 20/02/2013 11:28, Paul Taylor wrote: >> >>> Just updating codebase from Lucene 3.6 to Lucene 4.1 and seems my tests >>> that use NormalizeCharMap for replacing characters in t

Re: Do you still have to override QueryParser to allow numeric range searches in Lucene 4.1

2013-02-25 Thread Michael McCandless
You do still need to override QP in 4.x, but you should create a NumericRangeQuery/Filter, not a TermRangeQuery (in 3.6 as well). NumericRangeQuery is much more efficient (visits far fewer terms) than TermRangeQuery. Mike McCandless http://blog.mikemccandless.com On Mon, Feb 25, 2013 at 5:

Do you still have to override QueryParser to allow numeric range searches in Lucene 4.1

2013-02-25 Thread Paul Taylor
In my 3.6 code I was adding numeric field to my index as follows: public void addNumericField(IndexField field, Integer value) { addField(field, NumericUtils.intToPrefixCoded(value)); } but I've changed it to (work in progress) public void addNumericField(IndexField field, Integer

Re: SpanQuery.getSpans() with document sorting

2013-02-25 Thread Michael McCandless
And DocValues (new in 4.x). In 4.2 DocValues can be loaded via the FieldCache API. Mike McCandless http://blog.mikemccandless.com On Mon, Feb 25, 2013 at 5:19 AM, Ian Lea wrote: > FieldCache? > > > -- > Ian. > > > On Sun, Feb 24, 2013 at 4:46 PM, Igor Shalyminov > wrote: >> A slightly more sp

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Thomas Matthijs
On Mon, Feb 25, 2013 at 11:24 AM, Paul Taylor wrote: > On 20/02/2013 11:28, Paul Taylor wrote: > >> Just updating codebase from Lucene 3.6 to Lucene 4.1 and seems my tests >> that use NormalizeCharMap for replacing characters in the analyzers are not >> working. >> >> bump, anybody I thought a s

Re: Min/max support in Lucene

2013-02-25 Thread Ian Lea
TermsEnum will give you the first, and the last if you loop through to the end. Generally pretty fast. Or skip through with seekCeil() - might be faster. -- Ian. On Wed, Feb 20, 2013 at 11:31 PM, Vitaly Funstein wrote: > I know that general questions about aggregate functions have been asked
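The idea in the reply above (terms come back in sorted order, so the first one is the minimum and the last one reached is the maximum) can be sketched without Lucene. This is an analogy only: a java.util.TreeSet stands in for the term dictionary, its iterator for a TermsEnum, and ceiling() for a seekCeil()-style jump. The class and method names are hypothetical, and plain strings are used, so the sketch does not address how numeric values are encoded as terms.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.TreeSet;

public class SortedTermsMinMax {

    /**
     * Walks an iterator that yields terms in sorted order.
     * Returns {min, max}, or null if there are no terms.
     */
    public static String[] minMax(Iterator<String> sortedTerms) {
        if (!sortedTerms.hasNext()) {
            return null;
        }
        String first = sortedTerms.next(); // smallest term
        String last = first;
        while (sortedTerms.hasNext()) {
            last = sortedTerms.next();     // ends up as the largest term
        }
        return new String[] { first, last };
    }

    public static void main(String[] args) {
        TreeSet<String> terms =
                new TreeSet<>(Arrays.asList("delta", "alpha", "charlie", "bravo"));
        String[] result = minMax(terms.iterator());
        System.out.println(result[0] + " .. " + result[1]); // prints "alpha .. delta"
        // Jump to the smallest term >= "c" instead of stepping through every term.
        System.out.println(terms.ceiling("c"));             // prints "charlie"
    }
}
```

The full walk costs one pass over all terms; a ceiling-style seek avoids visiting every term when a good guess at the upper end is available.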

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Paul Taylor
On 20/02/2013 11:28, Paul Taylor wrote: Just updating codebase from Lucene 3.6 to Lucene 4.1 and seems my tests that use NormalizeCharMap for replacing characters in the analyzers are not working. bump, anybody I thought a self contained testcase would be enough to pique somebody's interest, a

Re: SpanQuery.getSpans() with document sorting

2013-02-25 Thread Ian Lea
FieldCache? -- Ian. On Sun, Feb 24, 2013 at 4:46 PM, Igor Shalyminov wrote: > A slightly more specific question: > Is it possible to load in RAM a single stored field for all the documents in > the index via some Lucene data structures? > > -- > Best Regards, > Igor Shalyminov > > 23.02.2013,

Re: Lucene filter questions

2013-02-25 Thread Ian Lea
I'm sure that Filters are thread-safe. Lucene doesn't have a global caching mechanism as such. But see FieldCache - you might get better performance from FieldCacheTermsFilter than from TermsFilter. See also CachingWrapperFilter and QueryWrapperFilter. -- Ian. On Mon, Feb 25, 2013 at 1:16 AM