RE: Length of the filed does not affect the doc score accurately for chinese analyzer(SmartChineseAnalyzer)

2014-02-12 Thread andy
iginal Message- >> From: andy [mailto: > yhlweb@ > ] >> Sent: Wednesday, February 12, 2014 10:53 AM >> To: > java-user@.apache >> Subject: RE: Length of the filed does not affect the doc score accurately >> for >> chinese analyzer(SmartChin

RE: Length of the filed does not affect the doc score accurately for chinese analyzer(SmartChineseAnalyzer)

2014-02-12 Thread Uwe Schindler
affect the doc score accurately for > chinese analyzer(SmartChineseAnalyzer) > > Thanks Uwe,could you please give me a more detail example about how to > change the lucene behavior > > > Uwe Schindler wrote > > Hi Erick, > > > > a statement like " Adding &a

RE: Length of the filed does not affect the doc score accurately for chinese analyzer(SmartChineseAnalyzer)

2014-02-12 Thread andy
> uwe@ > > >> -Original Message- >> From: Erick Erickson [mailto: > erickerickson@ > ] >> Sent: Wednesday, January 15, 2014 1:30 PM >> To: java-user >> Subject: Re: Length of the filed does not affect the doc score accurately >> f

RE: Length of the filed does not affect the doc score accurately for chinese analyzer(SmartChineseAnalyzer)

2014-02-12 Thread Uwe Schindler
will show you if this is the case. > > Best > Erick > > On Wed, Jan 15, 2014 at 3:39 AM, andy wrote: > > Hi guys, > > > > As the topic,it seems that the length of filed does not affect the doc > > score accurately for chinese analyzer in my source c

Re: Length of the filed does not affect the doc score accurately for chinese analyzer(SmartChineseAnalyzer)

2014-02-12 Thread andy
Thanks for your reply, Erick. This is the case, but how can I keep the precision of the fields' length? -- View this message in context: http://lucene.472066.n3.nabble.com/Length-of-the-filed-does-not-affect-the-doc-score-accurately-for-chinese-analyzer-SmartChineseAnalyz-tp4111390p4116832
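
As background for the precision question above: Lucene versions of this era, with the default similarity, store each field's length normalization factor in a single byte, so nearby field lengths can end up with the same stored value. The sketch below is a simplified, stand-alone illustration and does not call Lucene. `quantize` is a hypothetical helper that keeps only three mantissa bits of a float, an assumption chosen to mimic that lossy encoding, and `lengthNorm` assumes the classic 1/sqrt(numTerms) formula.

```java
public class NormPrecisionSketch {
    // Assumed classic length norm: 1 / sqrt(number of terms in the field).
    public static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Hypothetical lossy encoding (not Lucene's code): keep the sign bit,
    // the 8 exponent bits, and only the top 3 of the 23 mantissa bits.
    public static float quantize(float f) {
        int bits = Float.floatToRawIntBits(f);
        return Float.intBitsToFloat(bits & 0xFFF00000);
    }

    public static void main(String[] args) {
        // Print the exact norm and its quantized form for a few field lengths.
        for (int len = 8; len <= 11; len++) {
            float norm = lengthNorm(len);
            System.out.println(len + " terms -> norm " + norm
                    + " -> stored as " + quantize(norm));
        }
    }
}
```

Under these assumptions, fields of 9 and 10 terms quantize to the same stored value while a field of 11 terms does not, which is the kind of effect that makes short length differences invisible in the score.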

Re: Length of the filed does not affect the doc score accurately for chinese analyzer(SmartChineseAnalyzer)

2014-01-15 Thread Erick Erickson
Hi guys, > > As the topic,it seems that the length of filed does not affect the doc score > accurately for chinese analyzer in my source code > > index source code > > private static Directory DIRECTORY; > > > @BeforeClass > public static void before() throws I

Length of the filed does not affect the doc score accurately for chinese analyzer(SmartChineseAnalyzer)

2014-01-15 Thread andy
Hi guys, As the topic says, it seems that the length of the field does not affect the doc score accurately for the Chinese analyzer in my source code. Index source code: private static Directory DIRECTORY; @BeforeClass public static void before() throws IOException { DIRECTORY = new

Reply: Smart Chinese Analyzer Performance

2013-09-06 Thread Oliver Xu (Aigine Co)
Erickson Sent: September 6, 2013 21:07 To: java-user Subject: Re: Smart Chinese Analyzer Performance Well, various people have measured between a 50% and 70+% reduction in memory used for identical data, so I'd say so. The CHANGES.txt is where I'd look to see if anything mentioned is worth your time. Not

Re: Reply: Smart Chinese Analyzer Performance

2013-09-06 Thread Darren Hoffman
er.xu=aigine@lucene.apache.org >[mailto:java-user-return-56896-oliver.xu=aigine@lucene.apache.org] On Behalf Of >Erick Erickson >Sent: September 6, 2013 21:07 >To: java-user >Subject: Re: Smart Chinese Analyzer Performance > >Well, various people have measured between a 50% and 70+% reduction

Re: Smart Chinese Analyzer Performance

2013-09-06 Thread Erick Erickson
Well, various people have measured between a 50% and 70+% reduction in memory used for identical data, so I'd say so. The CHANGES.txt is where I'd look to see if anything mentioned is worth your time. Not to mention SolrCloud... Erick On Fri, Sep 6, 2013 at 3:41 PM, Darren Hoffman wrote: > I

Re: Smart Chinese Analyzer Performance

2013-09-06 Thread Darren Hoffman
Thanks for the feedback. I'll keep pressing on then. BTW, I'm not using solr; I am building an Android app. On 9/6/13 1:06 PM, "Erick Erickson" wrote: >Well, various people have measured between a 50% and 70+% reduction in >memory used for identical data, so I'd say so. The CHANGES.txt is where

Smart Chinese Analyzer Performance

2013-09-06 Thread Darren Hoffman
I am using the SmartChineseAnalyzer in v3.6 but accessing or instantiating it for the first time takes 10 to 15 seconds before it does anything. I do not see this huge delay with StandardAnalyzer. Is it loading a cache? Is there some way to speed it up? I am currently using Lucene 3.6 and am tryin

Re: Chinese analyzer

2013-01-24 Thread Jerome Lanneluc
Thanks Robert. Is there another analyzer I should use? Jerome From: Robert Muir To: java-user@lucene.apache.org, Date: 01/24/2013 06:20 PM Subject:Re: Chinese analyzer On Thu, Jan 24, 2013 at 10:53 AM, Jerome Lanneluc wrote: > It looks like my attachment was lost.

Re: Chinese analyzer

2013-01-24 Thread Robert Muir
On Thu, Jan 24, 2013 at 10:53 AM, Jerome Lanneluc wrote: > It looks like my attachment was lost. It referred to > org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer. > I think this analyzer will not properly tokenize text outside of the BMP: it pretty much only works for simplified text (e.

Re: Chinese analyzer

2013-01-24 Thread Jerome Lanneluc
System.out.print(")"); } } From: Robert Muir To: java-user@lucene.apache.org, Date: 01/24/2013 04:31 PM Subject:Re: Chinese analyzer On Thu, Jan 24, 2013 at 9:25 AM, Jerome Lanneluc wrote: > Note the 2 tokens in the second sample when I would expe

Re: Chinese analyzer

2013-01-24 Thread Robert Muir
On Thu, Jan 24, 2013 at 9:25 AM, Jerome Lanneluc wrote: > Note the 2 tokens in the second sample when I would expect to have only one > token with the (55401 57046) characters. > > I could not figure out if I'm doing something wrong, or if this is a bug in > the Chines
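
The two values quoted above, 55401 and 57046, are the UTF-16 high and low surrogates of a single supplementary code point. The sketch below uses only standard `java.lang` calls (no Lucene) to show that the pair forms one character in CJK Unified Ideographs Extension B. The class name `SurrogatePairSketch` and the `combine` helper are made up for illustration.

```java
public class SurrogatePairSketch {
    // Combine a high and low surrogate into the code point they encode.
    public static int combine(char high, char low) {
        return Character.toCodePoint(high, low);
    }

    public static void main(String[] args) {
        char high = (char) 55401; // 0xD869
        char low = (char) 57046;  // 0xDED6
        String s = new String(new char[] {high, low});

        System.out.println(Character.isSurrogatePair(high, low)); // true
        System.out.println(s.length());                           // 2 UTF-16 code units
        System.out.println(s.codePointCount(0, s.length()));      // 1 code point
        System.out.println(Integer.toHexString(combine(high, low))); // 2a6d6
        System.out.println(Character.UnicodeBlock.of(combine(high, low)));
    }
}
```

A tokenizer that walks the text one `char` at a time would see two units here rather than one character, which is consistent with the two tokens reported in the thread.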

Chinese analyzer

2013-01-24 Thread Jerome Lanneluc
Hi, I'm using the 3.6.1 Chinese analyzer and when tokenizing some Chinese words containing CJK Unified Ideographs Extension B characters, the resulting tokens do not contain the original words. Instead it seems that the CJK Unified Ideographs Extension B characters are split in two chara

Re: Chinese Analyzer evaluation

2008-12-10 Thread Grant Ingersoll
Have you tried the Chinese options in the contrib/analysis JAR? I can't speak to their quality, so you will need to test. On Dec 9, 2008, at 10:02 PM, Cooper Geng wrote: I found these libraries from the google engine. But I have no experience on using these classes. Do you any suggestion o

Re: Chinese Analyzer evaluation

2008-12-09 Thread Cooper Geng
I found these libraries through a Google search, but I have no experience using these classes. Do you have any suggestions on Asian-language analyzers, especially for Chinese? Thanks in advance. On Wed, Dec 10, 2008 at 9:17 AM, John Wang <[EMAIL PROTECTED]> wrote: > Hi Cooper: >Where are thes

Re: Chinese Analyzer evaluation

2008-12-09 Thread John Wang
Hi Cooper: Where are these classes? Thanks -John On Tue, Dec 9, 2008 at 2:27 AM, Cooper Geng <[EMAIL PROTECTED]> wrote: > Hi all, > > My application will provide Chinese search engine. I got some analyzer on > Chinese language. > Any suggestion about these: > > IK_CAnalyzer > IKAnalyzer > >

Chinese Analyzer evaluation

2008-12-09 Thread Cooper Geng
Hi all, My application will provide a Chinese search engine. I have found some analyzers for the Chinese language. Any suggestions about these: IK_CAnalyzer IKAnalyzer or more? -- Best Regards Cooper Geng

Re: The best Chinese Analyzer?

2006-05-08 Thread Ray Tsang
Hi Bob, In short, I use a slightly modified ChineseAnalyzer to index Chinese text. They differ mainly in the way they tokenize the text. StandardAnalyzer is intended for use with Latin-based languages, in which each word is composed of multiple characters and words are separated by special markers such

The best Chinese Analyzer?

2006-05-07 Thread Bob Cheung
I have a question for those who have used Lucene to index and search Chinese characters: what is the best Analyzer for the job? I know all three of these can do the job: 1. StandardAnalyzer 2. CJKAnalyzer 3. ChineseAnalyzer What are the differences between these 3 analyzers? TIA. Regards, Bob