date:20140609

Re: searching with stemming

2014-06-09 Thread Jack Krupansky

Please do file a Jira. I'm sure the discussion will be interesting. -- Jack Krupansky -Original Message- From: Jamie Sent: Monday, June 9, 2014 9:33 AM To: java-user@lucene.apache.org Subject: Re: searching with stemming Jack Thanks. I figured as much. I'm modifying each analyzer wi

Re: SpanQuery not working as expected

2014-06-09 Thread Darin McBeath

Hi Tim. Thanks for your help. I had a friend provide me some code (some snippets below) that could dump the supposed matching spans (this provided some more insight). Perhaps, some of my findings could help someone potentially fix the bug. So, I added my 2 documents public static String []

RE: SpanQuery not working as expected

2014-06-09 Thread Allison, Timothy B.

Darin, I confirmed the behavior you reported. This is probably the same bug that was reported in LUCENE-5331. The trigger there seems to be multiple examples of the same token (which you have plenty of). I tested with just this: [[darin fulford]~100 sauthor]!~0,0 darin fulford (non-directio

RE: Reading a v2 index in v4

2014-06-09 Thread Uwe Schindler

Hi, there is a way to make this work (which is the "official way" to do it): Your application software is already on Lucene 3.6, so why not simply use the IndexUpgrader class, which is shipped with Lucene 3.6? This class will upgrade the existing indexes (back to version 1.0) of your users to t

Re: searching with stemming

2014-06-09 Thread Jamie

Jack Thanks. I figured as much. I'm modifying each analyzer with constructors that take a Stem argument: public enum Stem { AGGRESSIVE, LIGHT, NONE }; This is obviously not ideal: 20 or more Lucene classes must be updated. I now need to maintain each analyzer. Regards Jamie On 2014/0

Re: searching with stemming

2014-06-09 Thread Jack Krupansky

I find the weak Javadoc even more troubling: http://lucene.apache.org/core/4_8_0/analyzers-common/org/apache/lucene/analysis/en/EnglishAnalyzer.html The real bottom line is that you are expected to "roll your own" in this area. Feel free to file a Jira suggesting your suggested improvement. -

Re: Reading a v2 index in v4

2014-06-09 Thread Trejkaz

On Mon, Jun 9, 2014 at 10:17 PM, Adrien Grand wrote: > Hi, > > It is not possible to read 2.x indices from Lucene 4, even with a > custom codec. For instance, Lucene 4 needs to hook into > SegmentInfos.read to detect old 3.x indices and force the use of the > Lucene3x codec since these indices don

Re: Reading a v2 index in v4

2014-06-09 Thread Adrien Grand

Hi, It is not possible to read 2.x indices from Lucene 4, even with a custom codec. For instance, Lucene 4 needs to hook into SegmentInfos.read to detect old 3.x indices and force the use of the Lucene3x codec since these indices don't expose what codec has been used to write them. On Mon, Jun 9

Re: searching with stemming

2014-06-09 Thread Jamie

Benson. Thanks. I was just hoping to avoid a whole bunch of boilerplate. On 2014/06/09, 1:07 PM, Benson Margulies wrote: Analyzer classes are optional; an analyzer is just a factory for a set of token stream components. You can usually do just fine with an anonymous class. Or in your case, the o

Re: searching with stemming

2014-06-09 Thread Benson Margulies

Analyzer classes are optional; an analyzer is just a factory for a set of token stream components. You can usually do just fine with an anonymous class. Or in your case, the only thing different for each language will be the stop words, so you can have one analyzer class with a language parameter.
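The suggestion above (one chain, parameterized by stop words and stemming) can be illustrated with a minimal plain-Java sketch. It deliberately does not use Lucene's API: the class ChainSketch, the method analyze, and the two toy suffix-stripping "stemmers" are hypothetical stand-ins invented for illustration. Only the Stem enum values come from Jamie's message in this thread. In a real application the same shape would be built from Lucene's tokenizer and token filter classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model of "one analysis chain with parameters"; not Lucene code.
public class ChainSketch {

    // Enum values as quoted in the thread.
    public enum Stem { AGGRESSIVE, LIGHT, NONE }

    // Hypothetical light stemmer: strips a trailing "s" from longer tokens.
    static String lightStem(String token) {
        if (token.endsWith("s") && token.length() > 3) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Hypothetical aggressive stemmer: strips the first matching suffix.
    static String aggressiveStem(String token) {
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (token.endsWith(suffix) && token.length() > suffix.length() + 2) {
                return token.substring(0, token.length() - suffix.length());
            }
        }
        return token;
    }

    // Tokenize, lowercase, drop stop words, then stem only if requested.
    public static List<String> analyze(String text, Set<String> stopWords, Stem stem) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // split() yields an empty first element if text starts with a delimiter
            }
            String token = raw.toLowerCase(Locale.ROOT);
            if (stopWords.contains(token)) {
                continue;
            }
            switch (stem) {
                case LIGHT:
                    token = lightStem(token);
                    break;
                case AGGRESSIVE:
                    token = aggressiveStem(token);
                    break;
                default:
                    break; // Stem.NONE: leave the token untouched
            }
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> englishStops = Set.of("the", "were");
        System.out.println(analyze("The dogs were running", englishStops, Stem.NONE));
        System.out.println(analyze("The dogs were running", englishStops, Stem.LIGHT));
    }
}
```

The point of the sketch is the design choice, not the stemming itself: the per-language difference is reduced to a stop-word set passed as an argument, and stemming becomes a switch at the end of the chain instead of a separate subclass per language.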

Re: searching with stemming

2014-06-09 Thread Jamie

I am not using Solr. I am using the default analyzers... On 2014/06/09, 12:59 PM, Benson Margulies wrote: Are you using Solr? If so you are on the wrong mailing list. If not, why do you need a non-anonymous analyzer at all? On Jun 9, 2014 6:55 AM, "Jamie" wrote: To me, it seems strange that

Re: searching with stemming

2014-06-09 Thread Benson Margulies

Are you using Solr? If so you are on the wrong mailing list. If not, why do you need a non-anonymous analyzer at all? On Jun 9, 2014 6:55 AM, "Jamie" wrote: > To me, it seems strange that these default analyzers don't provide > constructors that enable one to override stemming, etc? > > On 201

Re: searching with stemming

2014-06-09 Thread Jamie

To me, it seems strange that these default analyzers don't provide constructors that enable one to override stemming, etc? On 2014/06/09, 12:39 PM, Trejkaz wrote: On Mon, Jun 9, 2014 at 7:57 PM, Jamie wrote: Greetings Our app currently uses language specific analysers (e.g. EnglishAnalyzer,

Reading a v2 index in v4

2014-06-09 Thread Trejkaz

Hi all. The inability to read people's existing indexes is essentially the only thing stopping us upgrading to v4, so we're stuck indefinitely on v3.6 until we find a way around this issue. As I understand it, Lucene 4 added the notion of codecs which can precisely choose how to read and write th

Re: searching with stemming

2014-06-09 Thread Trejkaz

On Mon, Jun 9, 2014 at 7:57 PM, Jamie wrote: > Greetings > > Our app currently uses language specific analysers (e.g. EnglishAnalyzer, > GermanAnalyzer, etc.). We need an option to disable stemming. What's the > recommended way to do this? These analyzers do not include an option to > disable stem

Re: searching with stemming

2014-06-09 Thread Jamie

Benson Yes, I can of course do this, but as far as I can see I would have to override each analyzer. This is a pain. Regards Jamie On 2014/06/09, 12:29 PM, Benson Margulies wrote: You should construct an analysis chain that does what you need. Read the source of the relevant analyzer and pick the to

Re: searching with stemming

2014-06-09 Thread Benson Margulies

You should construct an analysis chain that does what you need. Read the source of the relevant analyzer and pick the tokenizer and filter(s) that you need, and don't include stemming. On Mon, Jun 9, 2014 at 5:57 AM, Jamie wrote: > Greetings > > Our app currently uses language specific analyser

searching with stemming

2014-06-09 Thread Jamie

Greetings Our app currently uses language specific analysers (e.g. EnglishAnalyzer, GermanAnalyzer, etc.). We need an option to disable stemming. What's the recommended way to do this? These analyzers do not include an option to disable stemming, only a parameter to specify a list of words for w

18 matches
