date:20121103

Re: "read past EOF" when merge

2012-11-03 Thread Mark Miller

Can you file a JIRA Markus? This is probably related to the new code that uses Directory for replication.

- Mark

On Nov 2, 2012, at 6:53 AM, Markus Jelsma wrote:

> Hi,
>
> For what it's worth, we have seen similar issues with Lucene/Solr from this
> week's trunk. The issue manifests itself w

using CharFilter to inject a space

2012-11-03 Thread Igal @ getRailo.org

hi,

I want to make sure that every comma (,) and semi-colon (;) is followed by a space prior to tokenizing.

the idea is to then use a WhitespaceTokenizer which will keep commas but still split the phrase in a case like:

"I bought red apples,green pears,and yellow oranges"

I'm thinking
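The transformation asked about above can be sketched in plain Java, without Lucene. This is only an illustration of the intended text behavior (a space injected after each comma or semicolon, then a whitespace split), not the CharFilter or WhitespaceTokenizer API itself. The class and method names (`CommaSpacer`, `injectSpaces`, `tokenize`) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: mimics "inject a space, then split on whitespace".
class CommaSpacer {
    // Insert a space after every ',' or ';' that is directly followed by a
    // non-whitespace character. A trailing comma at end of input is left alone.
    static String injectSpaces(String text) {
        return text.replaceAll("([,;])(?=\\S)", "$1 ");
    }

    // Split on runs of whitespace, so punctuation stays attached to its token.
    static List<String> tokenize(String text) {
        return Arrays.asList(injectSpaces(text).trim().split("\\s+"));
    }

    public static void main(String[] args) {
        // Prints the tokens of the example phrase from the message above.
        System.out.println(tokenize("I bought red apples,green pears,and yellow oranges"));
    }
}
```

With the example phrase, "apples,green" becomes the two tokens "apples," and "green", so the comma survives on the first token.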

Re: using CharFilter to inject a space

2012-11-03 Thread Robert Muir

On Sat, Nov 3, 2012 at 7:35 PM, Igal @ getRailo.org wrote:
> hi,
>
> I want to make sure that every comma (,) and semi-colon (;) is followed by a
> space prior to tokenizing.
>
> the idea is to then use a WhitespaceTokenizer which will keep commas but
> still split the phrase in a case like:
>
>

Re: using CharFilter to inject a space

2012-11-03 Thread Igal @ getRailo.org

I considered it, and it's definitely an option. but I read in the book "Lucene In Action" that MappingCharFilter is inefficient and I'm not sure that I need that. if implementing my own involves a lot of coding then I might resort to it as I don't have large data sets to index at this time.

Re: using CharFilter to inject a space

2012-11-03 Thread Robert Muir

On Sat, Nov 3, 2012 at 7:47 PM, Igal @ getRailo.org wrote:
> I considered it, and it's definitely an option.
>
> but I read in the book "Lucene In Action" that MappingCharFilter is
> inefficient and I'm not sure that I need that. if implementing my own
> involves a lot of coding then I might reso

Re: using CharFilter to inject a space

2012-11-03 Thread Robert Muir

On Sat, Nov 3, 2012 at 7:47 PM, Igal @ getRailo.org wrote:
> I considered it, and it's definitely an option.
>
> but I read in the book "Lucene In Action" that MappingCharFilter is
> inefficient and I'm not sure that I need that. if implementing my own
> involves a lot of coding then I might reso

Re: using CharFilter to inject a space

2012-11-03 Thread Igal @ getRailo.org

hi Robert, thank you for your replies. I couldn't find much documentation/examples of this, but this is what I came up with (below). is that the way I'm supposed to use the MappingCharFilter? also, if that is the correct way, wouldn't it make sense to return a reference to "this" from Norm

Re: using CharFilter to inject a space

2012-11-03 Thread Robert Muir

On Sat, Nov 3, 2012 at 8:32 PM, Igal @ getRailo.org wrote:
> hi Robert,
>
> thank you for your replies.
>
> I couldn't find much documentation/examples of this, but this is what I came
> up with (below). is that the way I'm supposed to use the MappingCharFilter?
>

You don't need to extend anythi

Re: Using new similarities in Lucene 4.0

2012-11-03 Thread Robert Muir

On Tue, Oct 30, 2012 at 10:20 AM, parnab kumar wrote:
> Hi all,
>
> Lucene 4 has introduced several state of the art ranking functions. I
> was wondering how could i make use of those similarities .

IndexSearcher.setSimilarity(new XYZSimilarity());

> These models
> obviously uses some more

Re: using CharFilter to inject a space

2012-11-03 Thread Igal Sapir

You're right. I'm not sure what I was thinking.

Thanks for all your help,

Igal

On Nov 3, 2012 5:44 PM, "Robert Muir" wrote:
> On Sat, Nov 3, 2012 at 8:32 PM, Igal @ getRailo.org
> wrote:
> > hi Robert,
> >
> > thank you for your replies.
> >
> > I couldn't find much documentation/examples of

Re: using CharFilter to inject a space

2012-11-03 Thread Erick Erickson

So I've gotta ask... _why_ do you want to inject the spaces? If it's just to break this up into tokens, wouldn't something like LetterTokenizer do? Assuming you aren't interested in leaving in numbers. Or even StandardTokenizer, unless you have e-mail & etc. Or what about PatternReplaceCharFilt

Re: using CharFilter to inject a space

2012-11-03 Thread Igal @ getRailo.org

well, my main goal is to use a ShingleFilter that will only take shingles that are not separated by commas etc. for example, the phrase: "red apples, green tomatoes, and brown potatoes" should yield the shingles "red apples", "green tomatoes", "and brown", "brown potatoes"; but not "apple
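The goal described above (two-word shingles that never cross a comma) can be sketched in plain Java as follows. This is not Lucene's ShingleFilter and makes no claim about how an analyzer chain would achieve it. It only shows the expected output for the example phrase. The class and method names (`SegmentShingles`, `bigrams`) are hypothetical, and treating semicolons as boundaries alongside commas is an assumption based on the "commas etc." wording.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: builds word bigrams inside each
// comma- or semicolon-delimited segment, so no shingle spans a separator.
class SegmentShingles {
    static List<String> bigrams(String text) {
        List<String> out = new ArrayList<>();
        // Each segment between separators is shingled independently.
        for (String segment : text.split("[,;]")) {
            String trimmed = segment.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty segments such as those produced by ",,"
            }
            String[] words = trimmed.split("\\s+");
            // Pair each word with its successor within the same segment only.
            for (int i = 0; i + 1 < words.length; i++) {
                out.add(words[i] + " " + words[i + 1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(bigrams("red apples, green tomatoes, and brown potatoes"));
    }
}
```

For the phrase from the message, this yields "red apples", "green tomatoes", "and brown" and "brown potatoes", and nothing that pairs words from opposite sides of a comma.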

12 matches
