Re: analyzer context during search

2018-04-13 Thread Chris Tomlinson
Hi, Thanks for the thoughts. I agree a combinatorial explosion of fields and index size would “solve” the problem but the cost is rather absurd. Hence, I posed the problem to prompt some discussion about what a plausible/reasonable solution might be. It has seemed to me for some time that ther

Re: analyzer context during search

2018-04-12 Thread Michael Sokolov
I think you can achieve what you are asking by having a field for every possible combination of pairs of input and output. Obviously this would explode the size of your index, so it's not ideal. Another alternative would be indexing all variants into a single field, using different analyzers for d

Re: Analyzer is not called upon executing addDocument()

2018-01-09 Thread Armins Stepanjans
Thanks a lot for the help. Regards, Armīns On Tue, Jan 9, 2018 at 6:46 PM, Uwe Schindler wrote: > Hi, > > StringField is not analyzed. You need to use TextField. StringField is > indexed as is as a single unmodified token in index (e.g., to be used for > identifiers or facets). > > Uwe > >

RE: Analyzer is not called upon executing addDocument()

2018-01-09 Thread Uwe Schindler
Hi, StringField is not analyzed. You need to use TextField. StringField is indexed as is as a single unmodified token in index (e.g., to be used for identifiers or facets). Uwe - Uwe Schindler Achterdiek 19, D-28357 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Mes
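The difference Uwe describes can be pictured with a plain-Java sketch. It does not use Lucene at all: the two methods are hypothetical stand-ins, and the "analyzer" is assumed to split on non-alphanumeric characters and lowercase, which is only one possible analysis chain.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only; these methods are hypothetical stand-ins, not Lucene classes.
final class StringVsTextFieldSketch {

    // StringField-like behaviour: the whole value is kept as one unmodified token.
    static List<String> asSingleToken(String value) {
        return Collections.singletonList(value);
    }

    // TextField-like behaviour, ASSUMING a simple analyzer that splits on
    // non-alphanumeric characters and lowercases each piece.
    static List<String> asAnalyzedTokens(String value) {
        List<String> tokens = new ArrayList<>();
        for (String part : value.split("[^\\p{Alnum}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(asSingleToken("Order ID-42"));    // one token, unchanged
        System.out.println(asAnalyzedTokens("Order ID-42")); // several lowercased tokens
    }
}
```

With the single-token form, only an exact match on the full value can hit, which is why it suits identifiers and facets.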

Re: Analyzer for supporting hyphenated words

2015-07-23 Thread Diego Socaceti
Hi Alessandro, after talking to our customer: Yes, it needs to be a mix of classic and quoted queries in one userCriteria. Before we look into the details of the QueryParser. I'm currently using org.apache.lucene.queryparser.classic.QueryParser of 5.2.1. Is this the right QueryParser to use? Th

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Alessandro Benedetti
Yes what I meant is that you actually can use your analyser when the query is not in the quotes. When in the quotes you can directly build a term Query out of it. Now of course it is not so simple scenario, do you think quoted query and not quoted query parts are 2 different set of queries, which

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Diego Socaceti
sorry little code refactoring typo: curTokenProcessed should be userCriteriaProcessed ... public static final String EXACT_SEARCH_FORMAT = "\"%s\""; public static final String MULTIPLE_CHARACTER_WILDCARD = "*"; ... if (isExactCriteriaString(userCriteria)) { String userCriteriaEscaped = St

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Diego Socaceti
Hi Alessandro, sorry, that i forgot the important part. Here it is: ... public static final String EXACT_SEARCH_FORMAT = "\"%s\""; public static final String MULTIPLE_CHARACTER_WILDCARD = "*"; ... if (isExactCriteriaString(userCriteria)) { String userCriteriaEscaped = String.format(EXACT

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Alessandro Benedetti
I read briefly, correct me if I am wrong, but that is to parse the content within the quotes " . But we are still at a String level. I want to see how you build the phraseQuery :) Taking a look to the code the PhraseQuery allow you to add as many terms you want. What you need to do, it's to not to

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Diego Socaceti
Hi Alessandro, I guess code says more than words :) ... public static final String EXACT_SEARCH_FORMAT = "\"%s\""; public static final String MULTIPLE_CHARACTER_WILDCARD = "*"; ... if (isExactCriteriaString(userCriteria)) { String userCriteriaEscaped = String.format(EXACT_SEARCH_FORMAT,

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Alessandro Benedetti
As a start Diego, how do you currently parse the user query to build the Lucene queries ? Cheers 2015-07-22 8:35 GMT+01:00 Diego Socaceti : > Hi Alessandro, > > yes, i want the user to be able to surround the query with "" to run the > phrase query with a NOT tokenized phrase. > > What do i have

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Diego Socaceti
Hi Alessandro, yes, i want the user to be able to surround the query with "" to run the phrase query with a NOT tokenized phrase. What do i have to do? Thanks and Kind regards On Tue, Jul 21, 2015 at 2:47 PM, Alessandro Benedetti < benedetti.ale...@gmail.com> wrote: > Hey Jack, reading the doc

Re: Analyzer for supporting hyphenated words

2015-07-21 Thread Alessandro Benedetti
Hey Jack, reading the doc : " Set to true if phrase queries will be automatically generated when the analyzer returns more than one term from whitespace delimited text. NOTE: this behavior may not be suitable for all languages. Set to false if phrase queries should only be generated when surround

Re: Analyzer for supporting hyphenated words

2015-07-21 Thread Jack Krupansky
If you don't explicitly enable automatic phrase queries, the Lucene query parser will assume an OR operator on the sub-terms when a white space-delimited term analyzes into a sequence of terms. See: https://lucene.apache.org/core/5_2_0/queryparser/org/apache/lucene/queryparser/classic/QueryParserB
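As a rough picture of the two behaviours Jack contrasts, the sketch below renders the sub-terms of one whitespace-delimited term either as an OR of term clauses or as a single phrase. It is plain Java that only builds query-syntax strings; the class and method names are hypothetical and nothing here calls the Lucene query parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: shows the shape of the resulting query, not the parser's real code path.
final class AutoPhraseSketch {

    // subTerms: what the analyzer produced for ONE whitespace-delimited term, e.g. "wi-fi" -> [wi, fi].
    static String toQuery(String field, List<String> subTerms, boolean autoPhrase) {
        if (subTerms.isEmpty()) {
            return "";
        }
        if (subTerms.size() == 1) {
            return field + ":" + subTerms.get(0);
        }
        if (autoPhrase) {
            // Automatic phrase queries enabled: the sub-terms must appear adjacent and in order.
            return field + ":\"" + String.join(" ", subTerms) + "\"";
        }
        // Default described in the thread: sub-terms are combined with OR.
        List<String> clauses = new ArrayList<>();
        for (String term : subTerms) {
            clauses.add(field + ":" + term);
        }
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        List<String> subTerms = Arrays.asList("wi", "fi");
        System.out.println(toQuery("title", subTerms, false)); // title:wi OR title:fi
        System.out.println(toQuery("title", subTerms, true));  // title:"wi fi"
    }
}
```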

Re: Analyzer for supporting hyphenated words

2015-07-21 Thread Alessandro Benedetti
Hi Diego, let me try to help : I find this a little bit confused : "For our customer it is important to find the word - *wi-fi* by wi, *fi*, wifi, wi-fi - jean-pierre by jean, pierre, jean-pierre, jean-*" But : " The (exact) query "*FD-A320-REC-SIM-1*" returns FD-A320-REC-SIM-1 MIA-*FD-A320-REC-

Re: Analyzer: Access to document?

2015-02-04 Thread Ahmet Arslan
Hi Ralf, Does following code fragment work for you? /** * Modified from : http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/analysis/package-summary.html */ public List getAnalyzedTokens(String text) throws IOException { final List list = new ArrayList<>(); try (TokenStream ts = analy

Re: Analyzer Does Not Works As Accepted

2014-02-28 Thread Furkan KAMACI
Hi; I have put the StopFilter after the lowercase filter. Thanks; Furkan KAMACI On 28 Feb 2014 at 12:06, "pravesh" wrote: > >>I fixed the problems. StopFilter was not working as expected because of > letter case. > > Alternatively, you could have moved StopFilter above the > WordDelimiterFilter >

Re: Analyzer Does Not Works As Accepted

2014-02-28 Thread pravesh
>>I fixed the problems. StopFilter was not working as expected because of letter case. Alternatively, you could have moved StopFilter above the WordDelimiterFilter in your analysis chain. Regards Pravesh -- View this message in context: http://lucene.472066.n3.nabble.com/Analyzer-Does-Not-W

Re: Analyzer Does Not Works As Accepted

2014-02-26 Thread Furkan KAMACI
Hi; I fixed the problems. StopFilter was not working as expected because of letter case. I've changed the flags of WordDelimiter. Also I've changed TokenStream to TokenFilter. Thanks; Furkan KAMACI 2014-02-26 20:05 GMT+02:00 Furkan KAMACI : > Hi; > > I have implemented that custom Analyzer: >
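The letter-case problem reported here is easy to reproduce outside Lucene. The sketch below is plain Java with a hypothetical lowercase stop-word set; it only shows why a case-sensitive stop filter misses "The" when it runs before lowercasing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java illustration of filter ordering; not the poster's analyzer and not Lucene code.
final class StopFilterOrderSketch {

    // Assumed stop-word set, stored in lowercase as stop lists usually are.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the", "and", "of"));

    // Stop filtering first: "The" does not equal "the", so it survives and is lowercased afterwards.
    static List<String> stopThenLowercase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (!STOP_WORDS.contains(token)) {
                out.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    // Lowercasing first: every token is compared in the same case as the stop list.
    static List<String> lowercaseThenStop(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(lower)) {
                out.add(lower);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("The", "Quick", "and", "Lazy");
        System.out.println(stopThenLowercase(tokens)); // [the, quick, lazy]
        System.out.println(lowercaseThenStop(tokens)); // [quick, lazy]
    }
}
```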

Re: Analyzer classes versus the constituent components

2013-10-08 Thread Michael Sokolov
There are some Analyzer methods you might want to override (initReader for inserting a CharFilter, stuff about gaps), but if you don't need that, it seems to be mostly about packaging neatly, as you say. -Mike On 10/8/13 10:30 AM, Benson Margulies wrote: Is there some advice around about when

Re: Analyzer in QueryParser behaves differently from IndexWriter

2013-01-13 Thread Igal @ getRailo.org
thanks Erik. I tried putting the query in "double quotes" and it made some difference but still not exactly what I'm looking for. so what's my best solution? to avoid using the QueryParser and instead "parse" the query myself? is there a different (better) query parser for this situation?

Re: Analyzer in QueryParser behaves differently from IndexWriter

2013-01-13 Thread Erik Hatcher
The analyzer through QueryParser is invoked for each "clause" and thus in your example it's invoked 4 times and thus each invocation only sees one word/term. Erik On Jan 13, 2013, at 2:13, "Igal @ getRailo.org" wrote: > hi, > > I've created an Analyzer that performs a few filtering tasks
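Erik's point can be illustrated with a toy analyzer that needs to see neighbouring words. The sketch below is plain Java under that assumption: `analyze` is a hypothetical analyzer that emits word pairs, and `analyzePerClause` mimics a parser that splits on whitespace and analyzes each clause separately.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration; the "analyzer" here is hypothetical and unrelated to the poster's.
final class PerClauseAnalysisSketch {

    // Toy analyzer: emits every word and, for adjacent words, a joined pair.
    static List<String> analyze(String text) {
        String[] words = text.trim().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            out.add(words[i]);
            if (i + 1 < words.length) {
                out.add(words[i] + "_" + words[i + 1]);
            }
        }
        return out;
    }

    // Query-parser-like behaviour: each whitespace-delimited clause is analyzed on its own,
    // so the analyzer never sees two words together and cannot emit any pair.
    static List<String> analyzePerClause(String query) {
        List<String> out = new ArrayList<>();
        for (String clause : query.trim().split("\\s+")) {
            out.addAll(analyze(clause));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(analyze("wi fi router"));          // [wi, wi_fi, fi, fi_router, router]
        System.out.println(analyzePerClause("wi fi router")); // [wi, fi, router]
    }
}
```

The index side sees the pairs and the query side does not, which matches the mismatch described in the thread subject.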

Re: Analyzer on query question

2012-08-03 Thread Jack Krupansky
-- Jack Krupansky -Original Message- From: Bill Chesky Sent: Friday, August 03, 2012 5:35 PM To: java-user@lucene.apache.org Subject: RE: Analyzer on query question Thanks for the help everybody. We're using 3.0.1 so I couldn't do exactly what Simon and Jack suggested. Bu

RE: Analyzer on query question

2012-08-03 Thread Bill Chesky
Jack Krupansky [mailto:j...@basetechnology.com] Sent: Friday, August 03, 2012 4:03 PM To: java-user@lucene.apache.org Subject: Re: Analyzer on query question Simon gave sample code for analyzing a multi-term string. Here's some pseudo-code (hasn't been compiled to check it) to analyze a

Re: Analyzer on query question

2012-08-03 Thread Ian Lea
>>BytesRef bytes = termAtt.getBytesRef(); >>return new Term(BytesRef.deepCopyOf(bytes)); >> } else >>return null; >> // TODO: Close the StringReader >> // TODO: Handle terms that analyze into multiple terms (e.g., embedded >> punctuation) >> }

Re: Analyzer on query question

2012-08-03 Thread Robert Muir
ew Term(BytesRef.deepCopyOf(bytes)); > } else >return null; > // TODO: Close the StringReader > // TODO: Handle terms that analyze into multiple terms (e.g., embedded > punctuation) > } > > -- Jack Krupansky > > -----Original Message- From: Bill Chesky > Sent

Re: Analyzer on query question

2012-08-03 Thread Jack Krupansky
ll Chesky Sent: Friday, August 03, 2012 2:55 PM To: java-user@lucene.apache.org Subject: RE: Analyzer on query question Ian/Jack, Ok, thanks for the help. I certainly don't want to take a cheap way out, hence my original question about whether this is the right way to do this. Jack, you

RE: Analyzer on query question

2012-08-03 Thread Bill Chesky
this I'd greatly appreciate it. regards, Bill -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Friday, August 03, 2012 1:22 PM To: java-user@lucene.apache.org Subject: Re: Analyzer on query question Bill, the re-parse of Query.toString will work pro

Re: Analyzer on query question

2012-08-03 Thread Jack Krupansky
cause term analysis such as stemming) becomes unnecessary and risky if you are not very careful or very lucky. -- Jack Krupansky -Original Message- From: Ian Lea Sent: Friday, August 03, 2012 1:12 PM To: java-user@lucene.apache.org Subject: Re: Analyzer on query question Bill

Re: Analyzer on query question

2012-08-03 Thread Ian Lea
> So I don't see the advantage to doing it this way over the original method. > I just don't know if the original way I described is wrong or will give me > bad results. > > thanks for the help, > > Bill > > -Original Message- > From: Ian Lea [mailto:i

RE: Analyzer on query question

2012-08-03 Thread Bill Chesky
PhraseQuery, etc. So I don't see the advantage to doing it this way over the original method. I just don't know if the original way I described is wrong or will give me bad results. thanks for the help, Bill -Original Message- From: Ian Lea [mailto:ian@gmail.com] Sent:

RE: Analyzer on query question

2012-08-03 Thread Bill Chesky
rong to do it the way I described in my original email? Will it give me incorrect results? Bill -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Friday, August 03, 2012 9:33 AM To: java-user@lucene.apache.org Subject: Re: Analyzer on query question Bill,

Re: Analyzer on query question

2012-08-03 Thread Jack Krupansky
ease. We really do need a wiki page for Lucene term analysis. -- Jack Krupansky -Original Message- From: Bill Chesky Sent: Friday, August 03, 2012 9:19 AM To: simon.willna...@gmail.com ; java-user@lucene.apache.org Subject: RE: Analyzer on query question Thanks Simon, Unfortunately,

Re: Analyzer on query question

2012-08-03 Thread Ian Lea
Term("title", "foo")); > phraseQuery.add(new > Term("title", "bar")); > > Is there really no easier way to associate the correct analyzer with these > types of queries? > > Bill > >

RE: Analyzer on query question

2012-08-03 Thread Bill Chesky
types of queries? Bill -Original Message- From: Simon Willnauer [mailto:simon.willna...@gmail.com] Sent: Friday, August 03, 2012 3:43 AM To: java-user@lucene.apache.org; Bill Chesky Subject: Re: Analyzer on query question On Thu, Aug 2, 2012 at 11:09 PM, Bill Chesky wrote: > Hi, >

Re: Analyzer on query question

2012-08-03 Thread Simon Willnauer
On Thu, Aug 2, 2012 at 11:09 PM, Bill Chesky wrote: > Hi, > > I understand that generally speaking you should use the same analyzer on > querying as was used on indexing. In my code I am using the SnowballAnalyzer > on index creation. However, on the query side I am building up a complex > Bo

Re: analyzer per document

2012-02-09 Thread Paul Libbrecht
I would use a different field per language and use PerFieldAnalyzerWrapper indeed. This is also important for queries whose language is not always clear. paul On 9 Feb 2012 at 13:01, Vinaya Kumar Thimmappa wrote: > Hello All, > > I have a requirement of using different analyzer per document. How

Re: analyzer per document

2012-02-09 Thread Francisco A. Lozano
Why don't you store each "file" in a single document, add a field for each "line" and use a PerFieldAnalyzerWrapper? Francisco A. Lozano On Thu, Feb 9, 2012 at 13:01, Vinaya Kumar Thimmappa wrote: > Hello All, > > I have a requirement of using different analyzer per document. How can > we do t

Re: Analyzer which creates terms of one to n words

2011-04-07 Thread Israel Tsadok
Take a look at http://lucene.apache.org/java/3_0_3/api/contrib-analyzers/org/apache/lucene/analysis/shingle/package-summary.html On Thu, Apr 7, 2011 at 11:30 AM, Clemens Wyss wrote: > Is there an analyzer which takes a text and creates search terms based on > the following rules: > - all single
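For readers unfamiliar with the term, a shingle is a run of adjacent words emitted as one token. The sketch below is a plain-Java approximation of "terms of one to n words"; it is not the linked shingle package, and the method name and the space separator are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java approximation of word shingles; hypothetical helper, not Lucene's implementation.
final class ShingleSketch {

    // Emits every run of 1..maxSize adjacent words, grouped by starting word.
    static List<String> shingles(String text, int maxSize) {
        String[] words = text.trim().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int start = 0; start < words.length; start++) {
            StringBuilder current = new StringBuilder();
            for (int len = 1; len <= maxSize && start + len <= words.length; len++) {
                if (len > 1) {
                    current.append(' ');
                }
                current.append(words[start + len - 1]);
                out.add(current.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // [please, please divide, divide, divide this, this]
        System.out.println(shingles("please divide this", 2));
    }
}
```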

Re: Analyzer enquiry

2011-03-14 Thread Vasiliki Gkouta
Thank you for your help! Best Regards, Vicky Quoting Erick Erickson : Nope, that should do it. Best Erick On Mon, Mar 14, 2011 at 9:35 AM, Vasiliki Gkouta wrote: Sorry for the confusion. I have two analyzers(of StandardAnalyzer) and use no stemmers. At the one analyzer I passed a german st

Re: Analyzer enquiry

2011-03-14 Thread Erick Erickson
Nope, that should do it. Best Erick On Mon, Mar 14, 2011 at 9:35 AM, Vasiliki Gkouta wrote: > Sorry for the confusion. I have two analyzers(of StandardAnalyzer) and use > no stemmers. At the one analyzer I passed a german stop words set to the > constructor and at the other one I passed an engli

Re: Analyzer enquiry

2011-03-14 Thread Vasiliki Gkouta
Sorry for the confusion. I have two analyzers(of StandardAnalyzer) and use no stemmers. At the one analyzer I passed a german stop words set to the constructor and at the other one I passed an english stop words set. My question was if I have to call any other function of the german analyze

Re: Analyzer enquiry

2011-03-14 Thread Erick Erickson
I don't understand what you're saying here. If you put a stemmer in the constructor, you *are* using it. If you don't specify any stemmer at all, you still have to define different analyzers to use different stop word lists. Can you restate your question? Best Erick On Mon, Mar 14, 2011 at 8:21

Re: Analyzer enquiry

2011-03-14 Thread Vasiliki Gkouta
Thanks a lot for your help Erick! About the fields you mentioned: If I don't use stemmers, except for the constructor argument related to the stop words, is there anything else that I have to modify? Thanks, Vicky Quoting Erick Erickson : StandardAnalyzer works well for most European lang

Re: Analyzer enquiry

2011-03-13 Thread Erick Erickson
StandardAnalyzer works well for most European languages. The problem will be stemming. Applying stemming via English rules to non-English languages produces...er...interesting results. You can go ahead and create language-specific fields for each language and use StandardAnalyzer with the appropri

Re: Analyzer

2010-12-02 Thread Ahmet Arslan
> By the way, is there an analyzer > which splits each letter of a word? > e.g. > hello world => h/e/l/l/o/w/o/r/l/d There are classes under the package org.apache.lucene.analysis.ngram
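What the question asks for amounts to character n-grams of size 1. The sketch below shows the idea in plain Java; it is a hypothetical helper rather than the ngram package, and it works on UTF-16 `char` units, so supplementary characters would need extra care.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of character n-grams; not the Lucene ngram classes.
final class CharGramSketch {

    // Splits on whitespace, then emits every substring of length n within each word.
    static List<String> charGrams(String text, int n) {
        List<String> out = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            for (int i = 0; i + n <= word.length(); i++) {
                out.add(word.substring(i, i + n));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(String.join("/", charGrams("hello world", 1))); // h/e/l/l/o/w/o/r/l/d
        System.out.println(charGrams("hello", 2));                         // [he, el, ll, lo]
    }
}
```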

Re: Analyzer

2010-12-02 Thread Christoph Hermann
On Thursday, 2 December 2010, 11:11:03, Sean wrote: Hi, > By the way, is there an analyzer which splits each letter of a word? > e.g. > hello world => h/e/l/l/o/w/o/r/l/d There is a CharTokenizer, that should help you. regards Christoph Hermann -- Christoph Hermann Institut für Informati

Re: Analyzer

2010-12-02 Thread Sean
By the way, is there an analyzer which splits each letter of a word? e.g. hello world => h/e/l/l/o/w/o/r/l/d Regards, Sean -- Original -- From: "Erick Erickson"; Date: Tue, Nov 30, 2010 09:07 PM To: "java-user";

Re: Analyzer

2010-12-02 Thread manjula wijewickrema
Dear Erick, Thanx for your information. Manjula. On Tue, Nov 30, 2010 at 6:37 PM, Erick Erickson wrote: > WhitespaceAnalyzer does just that, splits the incoming stream on > white space. > > From the javadocs for StandardAnalyzer: > > A grammar-based tokenizer constructed with JFlex > > This sho

Re: Analyzer

2010-11-30 Thread Erick Erickson
WhitespaceAnalyzer does just that, splits the incoming stream on white space. From the javadocs for StandardAnalyzer: A grammar-based tokenizer constructed with JFlex This should be a good tokenizer for most European-language documents: - Splits words at punctuation characters, removing pun

Re: Analyzer

2010-11-29 Thread manjula wijewickrema
Hi Steve, Thanx a lot for your reply. Yes there are only two classes, and you have understood the problem correctly. As you have instructed, I checked WhitespaceAnalyzer for querying (instead of StandardAnalyzer) and it seems to me that it gives better results rather than StandardAnal

RE: Analyzer

2010-11-29 Thread Steven A Rowe
Hi Manjula, It's not terribly clear what you're doing here - I got lost in your description of your (two? or maybe four?) classes. Sometimes things are easier to understand if you provide more concrete detail. I suspect that you could benefit from reading the book Lucene in Action, 2nd editio

Re: analyzer not working properly when indexing

2010-04-21 Thread jm
ok, got this. I upgraded my analyzer to new api but it was not correct... thanks On Wed, Apr 21, 2010 at 11:45 AM, Ian Lea wrote: > OK, so it does indeed look like a problem with your analyzer, as you > suspected. > > You could confirm that by using e.g. WhitespaceAnalyzer instead.  Then > mayb

Re: analyzer not working properly when indexing

2010-04-21 Thread Ian Lea
OK, so it does indeed look like a problem with your analyzer, as you suspected. You could confirm that by using e.g. WhitespaceAnalyzer instead. Then maybe post the code for your custom analyzer, or step through in a debugger or however you prefer to debug code. -- Ian. On Wed, Apr 21, 2010 a

Re: analyzer not working properly when indexing

2010-04-21 Thread jm
I am using a TermQuery so no analyzer used... protected static int getHitCount(Directory directory, String fieldName, String searchString) throws IOException { IndexSearcher searcher = new IndexSearcher(directory, true); //5 Term t = new Term(fieldName, searchString); Query

Re: analyzer not working properly when indexing

2010-04-20 Thread Ian Lea
Are you using the same analyzer for searching, in your unshown getHitCount() method? There is lots of good advice in the FAQ under "Why am I getting no hits / incorrect hits?". And/or write the index to disk and use Luke to check that the correct content is being indexed. -- Ian. On Tue, Apr

Re: Analyzer for stripping non alpha-numeric characters?

2010-02-04 Thread Jason Rutherglen
Answering my own question... PatternReplaceFilter doesn't output multiple tokens... Which means messing with capture state... On Thu, Feb 4, 2010 at 2:16 PM, Jason Rutherglen wrote: > Transferred partially to solr-user... > > Steven, thanks for the reply! > > I wonder if PatternReplaceFilter can

Re: Analyzer for stripping non alpha-numeric characters?

2010-02-04 Thread Jason Rutherglen
Transferred partially to solr-user... Steven, thanks for the reply! I wonder if PatternReplaceFilter can output multiple tokens? I'd like to progressively strip the non-alphanums, for example output: apple!&* apple!& apple! apple On Thu, Feb 4, 2010 at 12:18 PM, Steven A Rowe wrote: > Hi Jaso
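The output Jason lists (apple!&* apple!& apple! apple) can be produced by peeling one trailing non-alphanumeric character at a time. The sketch below does that in plain Java; it is a hypothetical helper that returns strings, not a TokenFilter, so positions and capture state are not modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of progressive stripping; not a Lucene TokenFilter.
final class ProgressiveStripSketch {

    // Returns the token followed by each shorter form obtained by removing one
    // trailing non-alphanumeric character at a time. Empty forms are skipped.
    static List<String> variants(String token) {
        List<String> out = new ArrayList<>();
        out.add(token);
        int end = token.length();
        while (end > 0 && !Character.isLetterOrDigit(token.charAt(end - 1))) {
            end--;
            if (end > 0) {
                out.add(token.substring(0, end));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(variants("apple!&*")); // [apple!&*, apple!&, apple!, apple]
    }
}
```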

RE: Analyzer for stripping non alpha-numeric characters?

2010-02-04 Thread Steven A Rowe
Hi Jason, Solr's PatternReplaceFilter(ts, "\\P{Alnum}+$", "", false) should work, chained after an appropriate tokenizer. Steve On 02/04/2010 at 12:18 PM, Jason Rutherglen wrote: > Is there an analyzer that easily strips non alpha-numeric from the end > of a token? > >
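The regular expression in Steve's suggestion can be tried on its own with `java.util.regex`, independent of Solr. The sketch below applies the same pattern to a single string; the class and method names are hypothetical, and in Java `\P{Alnum}` covers ASCII letters and digits only unless Unicode character classes are enabled.

```java
import java.util.regex.Pattern;

// Applies the pattern from the message to one string; not the Solr filter itself.
final class TrailingStripSketch {

    // One or more non-alphanumeric characters anchored at the end of the input.
    private static final Pattern TRAILING_NON_ALNUM = Pattern.compile("\\P{Alnum}+$");

    static String stripTrailing(String token) {
        return TRAILING_NON_ALNUM.matcher(token).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripTrailing("apple!&*")); // apple
        System.out.println(stripTrailing("a-b!"));     // a-b (inner punctuation is kept)
    }
}
```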

Re: Analyzer

2008-11-26 Thread Erick Erickson
rching. Is there any way to do this? > > I am using the way which you suggested. > > Warm Regards, > Allahbaksh > > ____ > From: Erick Erickson [EMAIL PROTECTED] > Sent: Tuesday, November 25, 2008 9:38 PM > To: java-user@lucene.apache.org > Subject: Re: Analyzer >

RE: Analyzer

2008-11-26 Thread Allahbaksh Mohammedali Asadullah
e different analyzer for extracting and searching. Is there any way to do this? I am using the way which you suggested. Warm Regards, Allahbaksh From: Erick Erickson [EMAIL PROTECTED] Sent: Tuesday, November 25, 2008 9:38 PM To: java-user@lucene.ap

Re: Analyzer

2008-11-25 Thread Erick Erickson
H, how would you do this without open/closing your IndexWriter around different types of documents? And as far as querying is concerned, I doubt the input would be a file, so one of the canned analyzers should do. Although "care should be taken " Best Erick On Tue, Nov 25, 2008 at 10:57 A

Re: Analyzer

2008-11-25 Thread Erick Erickson
I'm assuming that you want a different analyzer to handle extracting the relevant information to put into a "text" field of the Lucene document. I know of no way you can attach different analyzers to a single field. You can certainly attach different analyzers to *different* fields... The first th

Re: Analyzer

2008-11-25 Thread Ian Lea
Yes, you can. But it is generally best to use the same analyzer for indexing and for searching so, assuming that you want searches to find matches in files of whatever type, I'd recommend pre-processing the files to a common text format before indexing and then using the same analyzer for all of t

Re: Analyzer at Query time

2008-08-28 Thread Yonik Seeley
On Thu, Aug 28, 2008 at 10:32 AM, Dino Korah <[EMAIL PROTECTED]> wrote: > If I am to completely avoid the query parser and use the BooleanQuery along > with TermQuery, RangeQuery, PrefixQuery, PhraseQuery, etc, does the search > words still get to the Analyzer, before actually doing the real search

Re: Analyzer at Query time

2008-08-28 Thread Mark Miller
Dino Korah wrote: Hi All, If I am to completely avoid the query parser and use the BooleanQuery along with TermQuery, RangeQuery, PrefixQuery, PhraseQuery, etc, does the search words still get to the Analyzer, before actually doing the real search? Many thanks, Dino Answer: no The Q

Re: Analyzer for WikipediaTokenizer

2008-04-16 Thread Yonik Seeley
On Wed, Apr 16, 2008 at 3:13 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > LOL. That would probably be useful, eh? :-). Not sure why it completely > slipped my mind other than I use it in Solr. I suppose it would make sense > to create a variation of the StandardAnalyzer that uses the > Wiki

Re: Analyzer for WikipediaTokenizer

2008-04-16 Thread Grant Ingersoll
LOL. That would probably be useful, eh? :-). Not sure why it completely slipped my mind other than I use it in Solr. I suppose it would make sense to create a variation of the StandardAnalyzer that uses the WikipediaTokenizer instead. Care to crank out a patch? -Grant On Apr 16, 2008,

Re: Analyzer to use with MultiSearcher using various indexes for multiple languages

2007-12-18 Thread Daniel Naber
On Tuesday, 18 December 2007, Jay Hill wrote: > We > have a requirement to search across multiple languages, so I'm planning > to use MultiSearcher, passing an array of all IndexSearchers for each > language. You will need to analyze the query once per language and then build a new BooleanQuer

Re: Analyzer sharing

2007-06-22 Thread Jiye Yu
I see. I guess those Filters (e.g. PorterStemFilter) that make up the analyzer are not thread safe or cannot be shared. Thanks for your quick response! Jay Yonik Seeley wrote: On 6/22/07, Jiye Yu <[EMAIL PROTECTED]> wrote: I guess an Analyzer (built in ones such as StandardAnalyzer, POrterSt

Re: Analyzer sharing

2007-06-22 Thread Yonik Seeley
On 6/22/07, Jiye Yu <[EMAIL PROTECTED]> wrote: I guess an Analyzer (built in ones such as StandardAnalyzer, POrterStemAnalyer and etc) is not thread safe. Analyzers *are* thread-safe. Multiple threads can all call analyzer.tokenStream() without any synchronization. -Yonik

Re: analyzer to populate more that one field of Lucene document

2006-09-21 Thread Boris Galitsky
Thanks a lot Erick Boris * Erick Erickson <[EMAIL PROTECTED]> [Thu, 21 Sep 2006 20:53:42 -0400]: I think you want a PerFieldAnalyzerWrapper. It allows you to make a different analyzer for each field in your document. You'll have to write the code to extract the file contents in your desired

Re: analyzer to populate more that one field of Lucene document

2006-09-21 Thread Erick Erickson
I think you want a PerFieldAnalyzerWrapper. It allows you to make a different analyzer for each field in your document. You'll have to write the code to extract the file contents in your desired formats for each field, but you probably do that already ... You can instantiate your IndexWriter with
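The idea behind a per-field wrapper is a lookup from field name to analyzer with a fallback. The sketch below models that in plain Java with functions standing in for analyzers; every name in it is hypothetical and it does not reproduce the Lucene class's API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of per-field analyzer dispatch; not Lucene's PerFieldAnalyzerWrapper.
final class PerFieldSketch {

    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    PerFieldSketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    void register(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    // Uses the analyzer registered for the field, or the default when none is registered.
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(text);
    }

    // Two toy analyzers used as examples.
    static List<String> keyword(String text) {
        return Collections.singletonList(text);
    }

    static List<String> whitespaceLower(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    public static void main(String[] args) {
        PerFieldSketch wrapper = new PerFieldSketch(PerFieldSketch::whitespaceLower);
        wrapper.register("id", PerFieldSketch::keyword);
        System.out.println(wrapper.analyze("id", "AB 12"));   // [AB 12]
        System.out.println(wrapper.analyze("body", "AB 12")); // [ab, 12]
    }
}
```

Using the same wrapper at index time and at query time keeps each field's terms consistent on both sides.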

Re: Analyzer question

2006-05-23 Thread AsifTheManRahman
Thanks Jeff. :) -- View this message in context: http://www.nabble.com/Analyzer+question-t1650271.html#a4524125 Sent from the Lucene - Java Users forum at Nabble.com. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional c

Re: Analyzer question

2006-05-19 Thread Jeff Rodenburg
The Keyword analyzer does no stemming or input modification of any sort: think of it as WYSIWYG for index population. The Whitespace analyzer simply splits your input on whitespace (still no stemming), so the tokens are the individual words. I don't have the code in front of me, so I'm not sure

Re: Analyzer which distributes tokens to many fields

2006-05-16 Thread Erik Hatcher
On May 16, 2006, at 3:02 AM, Mathias Keilbach wrote: I'm going to create a small application with Lucene, which analyzes different strings. While analyzing the strings, patterns (like emails or urls) shall be sorted out and saved in a separate index field. I'm not sure if I can handle this with

Re: Analyzer

2006-01-19 Thread Stéphane Lagraulet
Yonik, You're right, it's unnecessary unless you want to search on both (so index both), as you said in your other message SL Yonik Seeley wrote: On 1/19/06, Stéphane Lagraulet <[EMAIL PROTECTED]> wrote: Hi, You'd better use 2 fields, one analysed and not stored, and the other one only

Re: Analyzer

2006-01-19 Thread Yonik Seeley
On 1/19/06, Stéphane Lagraulet <[EMAIL PROTECTED]> wrote: > Hi, > You'd better use 2 fields, one analysed and not stored, and the other > one only stored. There is no need for that. A single field that is both indexed and stored will give you the same thing. -Yonik --

Re: Analyzer

2006-01-19 Thread Stéphane Lagraulet
Hi, You'd better use 2 fields, one analysed and not stored, and the other one only stored. So you perform the query on the analysed field and present the other field (not stemmed) in the result. Stephan Lagraulet Klaus wrote: Hi, Is there a way to get the unstemmed term out of the lucene

Re: Analyzer

2006-01-19 Thread Yonik Seeley
Do you want to search for the unstemmed term, or just be able to retrieve it? When you retrieve a document, you get the un-analyzed original fields. If you want to index both the stemmed and unstemmed terms, the easiest way is to add the field twice (the second time using a different field name)

Re: Analyzer question

2005-08-08 Thread Erik Hatcher
On Aug 8, 2005, at 10:43 AM, Dan Armbrust wrote: It is my understanding that the StandardAnalyzer will remove underscores - so "some_word" will be indexed as 'some' and 'word'. I want to keep the underscores, so I was thinking of changing over to an Analyzer that uses the WhitespaceTokenizer, Low

RE: Analyzer or QueryParser problem?

2005-07-26 Thread Indu Abeyaratna
-T1". -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Wednesday, 27 July 2005 10:35 AM To: java-user@lucene.apache.org Subject: Re: Analyzer or QueryParser problem? On Jul 26, 2005, at 7:29 PM, Indu Abeyaratna wrote: > I have a question related to thi

Re: Analyzer or QueryParser problem?

2005-07-26 Thread Erik Hatcher
On Jul 26, 2005, at 7:29 PM, Indu Abeyaratna wrote: I have a question related to this. When I search for wildcard "*11" IndexSearcher throws an exception but when I try "\**11" it works. I'm guessing QueryParser actually throws an exception, not IndexSearcher, correct? Wildcards at t

RE: Analyzer or QueryParser problem?

2005-07-26 Thread Indu Abeyaratna
And the query it generates looks like : "+orgId:9146 +isRegistered:1 +docNo:**11" Regards, Indu -Original Message- From: Zhang, Lisheng [mailto:[EMAIL PROTECTED] Sent: Wednesday, 27 July 2005 3:25 AM To: 'java-user@lucene.apache.org' Subject: RE: Analyzer or QueryParser pro

RE: Analyzer or QueryParser problem?

2005-07-26 Thread Derek Westfall
D'OH! That was it! -Original Message- From: Zhang, Lisheng [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 26, 2005 10:25 AM To: java-user@lucene.apache.org Subject: RE: Analyzer or QueryParser problem? Hi Derek, My guess is that ":" is special, QueryParser may re

RE: Analyzer or QueryParser problem?

2005-07-26 Thread Zhang, Lisheng
TECTED] Sent: Tuesday, July 26, 2005 9:11 AM To: java-user@lucene.apache.org Subject: Re: Analyzer or QueryParser problem? You can use Luke to see what got indexed. This will tell you what the Analyzer did. You can then use QueryParser from the command line (it's got a main method), give it

Re: Analyzer or QueryParser problem?

2005-07-26 Thread Otis Gospodnetic
You can use Luke to see what got indexed. This will tell you what the Analyzer did. You can then use QueryParser from the command line (it's got a main method), give it your input, and see what it returns. This will tell you what QueryParser+Analyzer did. Oh, you use MFQP. It may have a main me

Re: Analyzer don't work with wildcard queries, snowball analyzer.

2005-03-31 Thread Morus Walter
Ernesto De Santis writes: > Hi Erik > > Ok, in PrefixQuery cases, non analyze is right. > It creates the same problems. 'example*' should find 'example' but does not if 'example' is stemmed to 'exampl' and you don't analyze the prefix query. > > You search "example" and obtain x results. > You
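The mismatch Morus describes comes down to a string-prefix test against the stemmed term. The sketch below shows it in plain Java with a deliberately crude toy stemmer (it only drops a trailing "e"), which is an assumption made to mimic 'example' becoming 'exampl' and is not the Snowball algorithm.

```java
// Plain-Java illustration; toyStem is a hypothetical stand-in for a real stemmer.
final class PrefixVsStemSketch {

    // Toy stemmer: drops one trailing 'e', so "example" becomes "exampl".
    static String toyStem(String term) {
        return term.endsWith("e") ? term.substring(0, term.length() - 1) : term;
    }

    // A prefix query matches an indexed term only if the term starts with the prefix.
    static boolean prefixMatches(String indexedTerm, String prefix) {
        return indexedTerm.startsWith(prefix);
    }

    public static void main(String[] args) {
        String indexed = toyStem("example");                    // what ends up in the index
        System.out.println(prefixMatches(indexed, "example"));  // false: unanalyzed prefix is longer than the term
        System.out.println(prefixMatches(indexed, toyStem("example"))); // true: stemmed prefix
    }
}
```

Stemming the prefix fixes this one case, but a short prefix such as "a" has no meaningful stem, which is the objection raised elsewhere in the thread.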

Re: Analyzer don't work with wildcard queries, snowball analyzer.

2005-03-31 Thread Erik Hatcher
On Mar 31, 2005, at 12:26 PM, Ernesto De Santis wrote: Hi Erik Finally, my name spelled correctly. :)) Ok, in PrefixQuery cases, non analyze is right. But you think that non analyze in WildcardQuery is right? Do I think its right? That's just the way it is. Whether that is right or not I don

Re: Analyzer don't work with wildcard queries, snowball analyzer.

2005-03-31 Thread Ernesto De Santis
Hi Erik Ok, in PrefixQuery cases, non analyze is right. But you think that non analyze in WildcardQuery is right? You search "example" and obtain x results. You search "ex?mple" and don't obtain any result. This is correct for you? It is difficult to analyze wildcard queries in lucene code? Ernesto

Re: Analyzer don't work with wildcard queries, snowball analyzer.

2005-03-31 Thread Erik Hatcher
Wildcard terms simply are not analyzed. How could it be possible to do this? What if I search for "a*" - how could you stem that? Erik On Mar 31, 2005, at 9:51 AM, Ernesto De Santis wrote: Hi I get an unexpected behavior when use wildcards in my queries. I use a EnglishAnalyzer develope