iginal Message-
>> From: andy [mailto:
> yhlweb@
> ]
>> Sent: Wednesday, February 12, 2014 10:53 AM
>> To:
> java-user@.apache
>> Subject: RE: Length of the filed does not affect the doc score accurately
>> for
>> chinese analyzer(SmartChin
affect the doc score accurately for
> chinese analyzer(SmartChineseAnalyzer)
>
> Thanks Uwe, could you please give me a more detailed example of how to
> change the Lucene behavior?
>
>
> Uwe Schindler wrote
> > Hi Erick,
> >
> > a statement like " Adding &a
> uwe@
>
>
>> -Original Message-
>> From: Erick Erickson [mailto:
> erickerickson@
> ]
>> Sent: Wednesday, January 15, 2014 1:30 PM
>> To: java-user
>> Subject: Re: Length of the filed does not affect the doc score accurately
>> f
will show you if this is the case.
>
> Best
> Erick
>
> On Wed, Jan 15, 2014 at 3:39 AM, andy wrote:
> > Hi guys,
> >
> > As the topic says, it seems that the length of the field does not affect the doc
> > score accurately for the Chinese analyzer in my source c
Thanks for your reply, Erick. This is the case, but how can I keep the
precision of the fields' length?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Length-of-the-filed-does-not-affect-the-doc-score-accurately-for-chinese-analyzer-SmartChineseAnalyz-tp4111390p4116832
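A side note on the "precision of the fields' length" question above. As far as I know, the default similarity in Lucene of that era packed the length norm into a single byte with only a few mantissa bits, so fields of similar length can end up with the same stored norm. The sketch below is not Lucene's code; `quantize` and `lengthNorm` are hypothetical helpers that only mimic that kind of loss by keeping the top 3 mantissa bits of a float, assuming the norm is 1/sqrt(number of terms).

```java
public class NormPrecisionSketch {
    // Hypothetical stand-in for a lossy norm encoding: keep the sign, the
    // exponent, and only the top 3 mantissa bits of a 32-bit float.
    // This is NOT Lucene's code; it only mimics the loss of precision.
    static float quantize(float value) {
        int bits = Float.floatToIntBits(value);
        return Float.intBitsToFloat(bits & 0xFFF00000);
    }

    // Assumed length normalization: 1 / sqrt(number of terms in the field).
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        // Print the exact norm next to the value that survives quantization.
        for (int numTerms = 9; numTerms <= 12; numTerms++) {
            float exact = lengthNorm(numTerms);
            System.out.println(numTerms + " terms: exact=" + exact
                    + " stored=" + quantize(exact));
        }
    }
}
```

In this sketch, fields of 11 and 12 terms share the stored value 0.28125 even though their exact norms differ, which is the kind of effect that makes length look like it does not influence the score.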
Hi guys,
>
> As the topic says, it seems that the length of the field does not affect the doc score
> accurately for the Chinese analyzer in my source code
>
> index source code
>
> private static Directory DIRECTORY;
>
>
> @BeforeClass
> public static void before() throws I
Hi guys,
As the topic says, it seems that the length of the field does not affect the doc score
accurately for the Chinese analyzer in my source code
index source code
private static Directory DIRECTORY;
@BeforeClass
public static void before() throws IOException {
DIRECTORY = new
Erickson
Sent: September 6, 2013 21:07
To: java-user
Subject: Re: Smart Chinese Analyzer Performance
Well, various people have measured between a 50% and 70+% reduction in
memory used for identical data, so I'd say so. The CHANGES.txt is where I'd
look to see if anything mentioned is worth your time.
Not
er.xu=aigine@lucene.apache.org
>[mailto:java-user-return-56896-oliver.xu=aigine@lucene.apache.org] On Behalf Of
>Erick Erickson
>Sent: September 6, 2013 21:07
>To: java-user
>Subject: Re: Smart Chinese Analyzer Performance
>
>Well, various people have measured between a 50% and 70+% reduction
Well, various people have measured between a 50% and 70+% reduction in
memory used for identical data, so I'd say so. The CHANGES.txt is where I'd
look to see if anything mentioned is worth your time.
Not to mention SolrCloud...
Erick
On Fri, Sep 6, 2013 at 3:41 PM, Darren Hoffman wrote:
> I
Thanks for the feedback. I'll keep pressing on then.
BTW, I'm not using solr; I am building an Android app.
On 9/6/13 1:06 PM, "Erick Erickson" wrote:
>Well, various people have measured between a 50% and 70+% reduction in
>memory used for identical data, so I'd say so. The CHANGES.txt is where
I am using the SmartChineseAnalyzer in v3.6 but accessing or instantiating
it for the first time takes 10 to 15 seconds before it does anything. I do
not see this huge delay with StandardAnalyzer.
Is it loading a cache? Is there some way to speed it up?
I am currently using Lucene 3.6 and am tryin
Thanks Robert. Is there another analyzer I should use?
Jerome
From: Robert Muir
To: java-user@lucene.apache.org,
Date: 01/24/2013 06:20 PM
Subject: Re: Chinese analyzer
On Thu, Jan 24, 2013 at 10:53 AM, Jerome Lanneluc
wrote:
> It looks like my attachment was lost.
On Thu, Jan 24, 2013 at 10:53 AM, Jerome Lanneluc
wrote:
> It looks like my attachment was lost. It referred to
> org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer.
>
I think this analyzer will not properly tokenize text outside of the
BMP: it pretty much only works for simplified text (e.
System.out.print(")");
}
}
From: Robert Muir
To: java-user@lucene.apache.org,
Date: 01/24/2013 04:31 PM
Subject: Re: Chinese analyzer
On Thu, Jan 24, 2013 at 9:25 AM, Jerome Lanneluc
wrote:
> Note the 2 tokens in the second sample when I would expe
On Thu, Jan 24, 2013 at 9:25 AM, Jerome Lanneluc
wrote:
> Note the 2 tokens in the second sample when I would expect to have only one
> token with the (55401 57046) characters.
>
> I could not figure out if I'm doing something wrong, or if this is a bug in
> the Chines
Hi,
I'm using the 3.6.1 Chinese analyzer and when tokenizing some Chinese
words containing CJK Unified Ideographs Extension B characters, the
resulting tokens do not contain the original words. Instead it seems that
the CJK Unified Ideographs Extension B characters are split in two
chara
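A note on the (55401 57046) values quoted in this thread: they form a UTF-16 surrogate pair, which is how Java strings represent a character outside the BMP such as those in CJK Unified Ideographs Extension B. The sketch below uses only the JDK (no Lucene) to show that the two `char` values make up a single code point; the class and method names are made up for illustration.

```java
public class SurrogatePairSketch {
    // Combine the two UTF-16 code units quoted in the thread into one code point.
    static int combine(int high, int low) {
        return Character.toCodePoint((char) high, (char) low);
    }

    public static void main(String[] args) {
        char high = (char) 55401; // 0xD869, a high surrogate
        char low = (char) 57046;  // 0xDED6, a low surrogate
        String text = new String(new char[] {high, low});

        // Two chars in the string, but only one code point.
        System.out.println("surrogate pair: " + Character.isSurrogatePair(high, low));
        System.out.println("chars: " + text.length());
        System.out.println("code points: " + text.codePointCount(0, text.length()));

        // Iterating by code point keeps the pair together.
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase());
            i += Character.charCount(cp);
        }
    }
}
```

A tokenizer that walks the text one `char` at a time will see two separate units here, which would match the two tokens reported in the thread; one that walks by code point sees a single character.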
Have you tried the Chinese options in the contrib/analysis JAR? I
can't speak to their quality, so you will need to test.
On Dec 9, 2008, at 10:02 PM, Cooper Geng wrote:
I found these libraries through the Google search engine, but I have no
experience
using these classes. Do you have any suggestion o
I found these libraries through the Google search engine, but I have no experience
using these classes. Do you have any suggestions on Asian-language analyzers?
Especially for Chinese.
Thanks in advance.
On Wed, Dec 10, 2008 at 9:17 AM, John Wang <[EMAIL PROTECTED]> wrote:
> Hi Cooper:
>Where are thes
Hi Cooper:
Where are these classes?
Thanks
-John
On Tue, Dec 9, 2008 at 2:27 AM, Cooper Geng <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> My application will provide a Chinese search engine. I found some analyzers for
> the Chinese language.
> Any suggestions about these:
>
> IK_CAnalyzer
> IKAnalyzer
>
>
Hi all,
My application will provide a Chinese search engine. I found some analyzers for
the Chinese language.
Any suggestions about these:
IK_CAnalyzer
IKAnalyzer
or more?
--
Best Regards
Cooper Geng
Hi Bob,
In short, I use a slightly modified ChineseAnalyzer to index Chinese text.
They differ mainly in the way they tokenize the text.
StandardAnalyzer is intended for use with Latin-based languages, in which each
word is composed of multiple characters and words are separated by
special markers such
I have a question for those who have used Lucene to index and search for
Chinese characters: what is the best analyzer for the job?
I know all three of these can do the job:
1. StandardAnalyzer
2. CJKAnalyzer
3. ChineseAnalyzer
What are the differences between these three analyzers?
TIA.
Regards,
Bob
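On the difference between these analyzers, a rough illustration may help. As far as I know, ChineseAnalyzer emits one token per Chinese character, while CJKAnalyzer emits overlapping two-character tokens (bigrams). The sketch below does not call Lucene at all; `unigrams` and `bigrams` are hypothetical helpers that only imitate those two tokenization styles on a plain string.

```java
import java.util.ArrayList;
import java.util.List;

public class CjkTokenSketch {
    // One token per code point (the single-character style).
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            tokens.add(new String(Character.toChars(cp)));
            i += Character.charCount(cp);
        }
        return tokens;
    }

    // Overlapping pairs of adjacent code points (the bigram style).
    static List<String> bigrams(String text) {
        List<String> single = unigrams(text);
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + 1 < single.size(); i++) {
            tokens.add(single.get(i) + single.get(i + 1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "中华人民";
        System.out.println("unigrams: " + unigrams(text));
        System.out.println("bigrams:  " + bigrams(text));
    }
}
```

With single-character tokens any query character matches, which tends to favor recall; bigrams are more selective but produce more distinct terms in the index. Which trade-off is better depends on the data, so it is worth testing both.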
23 matches