Re: ClassicAnalyzer Behavior on accent character

2017-11-28 Thread Michael Sokolov
That's expected. Non-letters are not mapped to letters, correctly.

On Oct 19, 2017 9:38 AM, "Chitra" wrote:
> Hi,
> I indexed a term 'ⒶeŘꝋꝒɫⱯŋɇ' (aeroplane) and the term was
> indexed as "er l n", some characters were trimmed while indexing.
>
> Here is my code
>
> protected Analyz
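
To check the "non-letters" point on your own JVM, here is a minimal sketch (not from the thread) that prints how java.lang.Character classifies each code point of the term. The class name CodePointInspector and its methods are hypothetical, chosen only for illustration. The JVM's Unicode tables may also differ from the ones built into a given tokenizer grammar, so treat the output as a hint, not as a prediction of what ClassicTokenizer keeps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: it only reports the JVM's view of each code point.
public class CodePointInspector {

    /** Formats one code point as: hex value, isLetter, isAlphabetic, general category id. */
    public static String describe(int codePoint) {
        return String.format(Locale.ROOT, "U+%04X letter=%b alphabetic=%b type=%d",
                codePoint,
                Character.isLetter(codePoint),
                Character.isAlphabetic(codePoint),
                Character.getType(codePoint));
    }

    /** Describes every code point of the text, in order. */
    public static List<String> describeAll(String text) {
        List<String> rows = new ArrayList<>();
        text.codePoints().forEach(cp -> rows.add(describe(cp)));
        return rows;
    }

    public static void main(String[] args) {
        // The term from the original message, written with escapes to avoid source-encoding issues:
        // U+24B6, e, U+0158, U+A74B, U+A752, U+026B, U+2C6F, U+014B, U+0247
        String term = "\u24B6e\u0158\uA74B\uA752\u026B\u2C6F\u014B\u0247";
        describeAll(term).forEach(System.out::println);
    }
}
```

The type value is the integer constant from java.lang.Character (for example Character.LOWERCASE_LETTER or Character.OTHER_SYMBOL), so it can be compared against those constants directly.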

Re: ClassicAnalyzer Behavior on accent character

2017-10-26 Thread Chris Hostetter
on it. Standard is ... "standard" ... it implements the Unicode Standard text segmentation rules.

: Date: Fri, 20 Oct 2017 18:58:35 +0530
: From: Chitra
: Reply-To: java-user@lucene.apache.org
: To: Lucene Users
: Subject: Re: ClassicAnalyzer Behavior on accent character
:
: Hi,
:

Re: ClassicAnalyzer Behavior on accent character

2017-10-20 Thread Chitra
Hi,
I found the difference and now understand the behavior of both tokenizers. Could you please suggest which one is better to use, ClassicTokenizer or StandardTokenizer?

--
Regards,
Chitra

Re: ClassicAnalyzer Behavior on accent character

2017-10-20 Thread Chitra
Hi Robert,
Yes, StandardTokenizer solves my case... Could you please explain the difference between ClassicTokenizer and StandardTokenizer? How does StandardTokenizer solve my case? I searched the web but was unable to understand it... Any help is greatly appreciated.

On Fri, Oct 2

Re: ClassicAnalyzer Behavior on accent character

2017-10-19 Thread Robert Muir
Easy: don't use ClassicTokenizer; use StandardTokenizer instead.

On Thu, Oct 19, 2017 at 9:37 AM, Chitra wrote:
> Hi,
> I indexed a term 'ⒶeŘꝋꝒɫⱯŋɇ' (aeroplane) and the term was
> indexed as "er l n", some characters were trimmed while indexing.
>
> Here is my code
>
> protected Ana

ClassicAnalyzer Behavior on accent character

2017-10-19 Thread Chitra
Hi,
I indexed a term 'ⒶeŘꝋꝒɫⱯŋɇ' (aeroplane) and the term was indexed as "er l n", some characters were trimmed while indexing.

Here is my code

protected Analyzer.TokenStreamComponents createComponents(final String fieldName, final Reader reader)
{
final ClassicTok