RE: How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2

2014-11-11 Thread Martin O'Shea
Ahmet, Yes that is quite true. But as this is only a proof of concept application, I'm prepared for things to be 'imperfect'. Martin O'Shea. -Original Message- From: Ahmet Arslan [mailto:iori...@yahoo.com.INVALID] Sent: 11 Nov 2014 18 26 To: java-user@lucene.ap

RE: How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2

2014-11-11 Thread Martin O'Shea
t this one allows to handle that: You should make stop-filter case insensitive (there is a boolean to do this): StopFilter(boolean enablePositionIncrements, TokenStream input, Set stopWords, boolean ignoreCase) Uwe > Martin O'Shea. > -Original Message- > From: Uwe Schindler [mail
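To illustrate what the ignoreCase flag mentioned above is meant to achieve, here is a minimal plain-Java sketch. It does not use Lucene; the class and method names (CaseInsensitiveStopSketch, removeStopWords) are hypothetical and exist only for this example. It drops stop words regardless of their case while leaving the case of the surviving tokens untouched, which is the behaviour wanted when LowerCaseFilter is left out of the chain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only: plain Java, not Lucene's StopFilter.
final class CaseInsensitiveStopSketch {

    // Removes tokens that match a stop word when case is ignored.
    // Tokens that are kept retain their original case.
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        // Normalise the stop words once so each lookup is a simple set check.
        Set<String> lowered = new HashSet<>();
        for (String word : stopWords) {
            lowered.add(word.toLowerCase(Locale.ROOT));
        }
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            // Lower-case only for the comparison, never for the output.
            if (!lowered.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "and"));
        List<String> tokens = Arrays.asList("The", "Quick", "AND", "Lucene");
        System.out.println(removeStopWords(tokens, stopWords));
    }
}
```

In Lucene itself the equivalent effect would come from passing true for ignoreCase in the StopFilter constructor quoted above, rather than from code like this.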

RE: How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2

2014-11-10 Thread Martin O'Shea
ilter(boolean enablePositionIncrements, TokenStream input, Set stopWords, boolean ignoreCase) Uwe > Martin O'Shea. > -Original Message- > From: Uwe Schindler [mailto:u...@thetaphi.de] > Sent: 10 Nov 2014 14 06 > To: java-user@lucene.apache.org > Subject: RE: How to disa

RE: How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2

2014-11-10 Thread Martin O'Shea
dTokenizer,...); stopFilter = new StopFilter(standardFilter,...); snowballFilter = new SnowballFilter(stopFilter,...); But ignore LowerCaseFilter. Does this make sense? Thanks Martin O'Shea. -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: 10 Nov 2014 14 0

How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2

2014-11-10 Thread Martin O'Shea
I realise that 3.0.2 is an old version of Lucene but if I have Java code as follows: int nGramLength = 3; Set<String> stopWords = new HashSet<String>(); stopWords.add("the"); stopWords.add("and"); ... SnowballAnalyzer snowballAnalyzer = new SnowballAnalyzer(Version.LUCENE_30, "English", stopWords);

RE: Using stop words with snowball analyzer and shingle filter

2012-09-20 Thread Martin O'Shea
add in the code to invoke StopFilter.setEnablePositionIncrements the way StopFilterFactory does. -- Jack Krupansky -Original Message- From: Martin O'Shea Sent: Wednesday, September 19, 2012 4:24 AM To: java-user@lucene.apache.org Subject: Using stop words with snowball analyzer and shingle fil

RE: Using a Lucene ShingleFilter to extract frequencies of bigrams in Lucene

2012-09-06 Thread Martin O'Shea
: 05 Sep 2012 01 53 To: java-user@lucene.apache.org Subject: Re: Using a Lucene ShingleFilter to extract frequencies of bigrams in Lucene On Tue, Sep 4, 2012 at 12:37 PM, Martin O'Shea wrote: > > Does anyone know if this can be used in conjunction with other > analyzers to return the

Using a Lucene ShingleFilter to extract frequencies of bigrams in Lucene

2012-09-04 Thread Martin O'Shea
If a Lucene ShingleFilter can be used to tokenize a string into shingles, or ngrams, of different sizes, e.g.: "please divide this sentence into shingles" Becomes: shingles "please divide", "divide this", "this sentence", "sentence into", and "into shingles" Does anyone know
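The shingling described above, and the bigram frequency count named in the subject line, can be sketched in plain Java. This is an illustration of the idea only, not Lucene's ShingleFilter: the class and method names (ShingleSketch, shingles, frequencies) are hypothetical, and splitting on whitespace is an assumption standing in for a real tokenizer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: plain Java, not Lucene's ShingleFilter.
final class ShingleSketch {

    // Splits the text on whitespace and joins every run of `size`
    // adjacent words with a single space.
    static List<String> shingles(String text, int size) {
        List<String> result = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty() || size < 1) {
            return result;
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i + size <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + size)));
        }
        return result;
    }

    // Counts how often each shingle occurs, keeping first-seen order.
    static Map<String, Integer> frequencies(List<String> shingles) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String shingle : shingles) {
            counts.merge(shingle, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> bigrams = shingles("please divide this sentence into shingles", 2);
        System.out.println(bigrams);
        System.out.println(frequencies(bigrams));
    }
}
```

With a size of 2 the example sentence yields the five bigrams listed in the message; each occurs once, so every frequency is 1.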

Combining analyzers in Lucene

2011-03-05 Thread Martin O'Shea
Hello I have a situation where I'm using two methods in a Java class to implement a StandardAnalyzer in Lucene to index text strings and return their word frequencies as follows: public void indexText(String suffix, boolean includeStopWords) { StandardAnalyzer analyzer = null;

FW: Use of hyphens in StandardAnalyzer

2010-10-24 Thread Martin O'Shea
t the version to 3.1 or higher in the constructor. Steve > -Original Message- > From: Martin O'Shea [mailto:app...@dsl.pipex.com] > Sent: Sunday, October 24, 2010 3:59 PM > To: java-user@lucene.apache.org > Subject: Use of hyphens in StandardAnalyzer > > Hello >

RE: Use of hyphens in StandardAnalyzer

2010-10-24 Thread Martin O'Shea
n the constructor. Steve > -Original Message- > From: Martin O'Shea [mailto:app...@dsl.pipex.com] > Sent: Sunday, October 24, 2010 3:59 PM > To: java-user@lucene.apache.org > Subject: Use of hyphens in StandardAnalyzer > > Hello > > > > I have a Standar

Use of hyphens in StandardAnalyzer

2010-10-24 Thread Martin O'Shea
I've tried combinations of: addDoc(w, "lucene \"Lawton-Browne\" Lucene"); And single quotes but without success. Thanks Martin O'Shea.

RE: Using a TermFreqVector to get counts of all words in a document

2010-10-20 Thread Martin O'Shea
0, at 2:53 PM, Martin O'Shea wrote: > Uwe > > Thanks - I figured that bit out. I'm a Lucene 'newbie'. > > What I would like to know though is if it is practical to search a single > document of one field simply by doing this: > > IndexReader trd

RE: Using a TermFreqVector to get counts of all words in a document

2010-10-20 Thread Martin O'Shea
-- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Martin O'Shea [mailto:app...@dsl.pipex.com] > Sent: Wednesday, October 20, 2010 8:23 PM > To: java-user@lucene.apache.org > Subject: U

Using a TermFreqVector to get counts of all words in a document

2010-10-20 Thread Martin O'Shea
"Lucene for Dummies"); And the queryString being used is simply "dummies". Thanks Martin O'Shea.

RE: Use of Lucene to store data from RSS feeds

2010-10-15 Thread Martin O'Shea
ed upon the length of time required. > > This can be done as a database table and hashmaps used to calculate word > frequencies. But can I do this in Lucene to > this degree of granularity at all? If so, would each feed form a Lucene > doc