Not really sure what to tell you other than you need to dig in and
look at how the other Query classes are implemented. I would start
with TermQuery/TermScorer.
One thing I did to get to know the scoring was to go through and
document it the best I could (given the time I had) as pseudocode.
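Another quick way in, if reading code straight through feels heavy: ask
Lucene itself to show the arithmetic via Searcher.explain(). A minimal
sketch against the 2.x API (index path and field name are placeholders):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    public class ExplainScoring {
      public static void main(String[] args) throws Exception {
        IndexSearcher searcher = new IndexSearcher("/path/to/index");
        Query q = new TermQuery(new Term("contents", "lucene"));
        Hits hits = searcher.search(q);
        for (int i = 0; i < hits.length(); i++) {
          // explain() prints the tf/idf/fieldNorm breakdown that
          // TermScorer computes for each matching document.
          System.out.println(searcher.explain(q, hits.id(i)));
        }
        searcher.close();
      }
    }

Cross-checking that output against TermScorer.score() is a fast way to map
your pseudocode onto the real implementation.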
On Nov 10, 2007 2:08 AM, Steven A Rowe <[EMAIL PROTECTED]> wrote:
> Hi Cedric,
>
> On 11/08/2007, Cedric Ho wrote:
> > For a sentence containing characters ABC, it may be segmented into AB, C
> > or A, BC.
> [snip]
> > In these cases we would like to index both segmentations into the index:
> >
> > AB offset (0,1) position 0
The CJKAnalyzer is too simple for our needs. But thanks for the suggestion anyway.
Cheers,
Cedric
On Nov 9, 2007 10:43 PM, Open Study <[EMAIL PROTECTED]> wrote:
> Hi Cedric
>
> You may try the CJKAnalyzer within the lucene sandbox. It doesn't give
> a perfect solution for Chinese word segmentation, but will solve the
> problem in your case.
Mike Streeton wrote:
> I have just tried this again using the index I built with lucene 2.1 but
> running the test using lucene 2.2 and it works okay, so it seems to be
> something related to an index built using lucene 2.2.
>
> Mike
>
Hi Mike,
does this also happen with the current trunk version?
Thanks for your response. I will probably use this class in my project.
Hi Cedric,
On 11/08/2007, Cedric Ho wrote:
> For a sentence containing characters ABC, it may be segmented into AB, C or A, BC.
[snip]
> In these cases we would like to index both segmentations into the index:
>
> AB offset (0,1) position 0
> A offset (0,0) position 0
> C offset (2,2) position 1
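The usual trick for getting both segmentations into one field is to give the
alternative tokens a position increment of 0, so they occupy the same
position as the token they overlap. A rough sketch against the 2.x analysis
API (the stream class is made up for illustration, not an existing Lucene
class):

    import java.io.IOException;
    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenStream;

    // Replays a pre-segmented token list; alternative segmentations carry
    // a position increment of 0 so they share a position.
    public class MultiSegmentationStream extends TokenStream {
      private final Token[] tokens;
      private int i = 0;

      public MultiSegmentationStream(Token[] tokens) { this.tokens = tokens; }

      public Token next() throws IOException {
        return i < tokens.length ? tokens[i++] : null;
      }

      public static void main(String[] args) throws IOException {
        // "ABC" segmented both ways. Lucene end offsets are exclusive,
        // so "AB" is (0,2) where the listing above writes (0,1).
        Token ab = new Token("AB", 0, 2);   // position 0
        Token a  = new Token("A",  0, 1);
        a.setPositionIncrement(0);          // also position 0
        Token bc = new Token("BC", 1, 3);   // position 1
        Token c  = new Token("C",  2, 3);
        c.setPositionIncrement(0);          // also position 1
        TokenStream ts =
            new MultiSegmentationStream(new Token[] { ab, a, bc, c });
        for (Token t = ts.next(); t != null; t = ts.next()) {
          System.out.println(t.termText() + " +" + t.getPositionIncrement());
        }
      }
    }

If your Lucene version has the Field(String, TokenStream) constructor you can
feed such a stream straight into a Document; otherwise wire it in through a
custom Analyzer.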
Hi,
I want to compare two indexes. Please recommend an algorithm that takes
all the relevant factors into account, such as the versions of Lucene and
of the application used to build each index, since these affect the index
that gets created. We would also like to be able to compare specific
fields and their text.
Regards
--
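A rough sketch of one possible starting point: walk the two term
dictionaries in parallel with the 2.x TermEnum API. This only diffs terms
and document frequencies; any per-field or full-text comparison would build
on top of it (paths come from the command line):

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.TermEnum;

    public class IndexDiff {
      public static void main(String[] args) throws Exception {
        IndexReader a = IndexReader.open(args[0]);
        IndexReader b = IndexReader.open(args[1]);
        // Term dictionaries are stored sorted, so a merge-style walk
        // finds terms that exist in only one of the two indexes.
        TermEnum ta = a.terms();
        TermEnum tb = b.terms();
        boolean hasA = ta.next(), hasB = tb.next();
        while (hasA || hasB) {
          int cmp = !hasA ? 1 : !hasB ? -1 : ta.term().compareTo(tb.term());
          if (cmp < 0) {
            System.out.println("only in A: " + ta.term());
            hasA = ta.next();
          } else if (cmp > 0) {
            System.out.println("only in B: " + tb.term());
            hasB = tb.next();
          } else {
            if (ta.docFreq() != tb.docFreq())
              System.out.println("docFreq differs: " + ta.term());
            hasA = ta.next();
            hasB = tb.next();
          }
        }
        a.close();
        b.close();
      }
    }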
Hi,
Is there a way to get the number of segments in an index?
I looked at the API's for the reader, writer and searcher, but didn't
find anything.
Thanks,
Lucifer
I have just tried this again using the index I built with lucene 2.1 but
running the test using lucene 2.2 and it works okay, so it seems to be
something related to an index built using lucene 2.2.
Mike
-Original Message-
From: Mike Streeton [mailto:[EMAIL PROTECTED]
Sent: 09 November 2007
I have tried this again using Lucene 2.1 and, as Erick found, it works okay. I
have also tried it on JDK 1.6 u1 and u3: both work, but both fail when using
Lucene 2.2.
Mike
-Original Message-
From: Mike Streeton [mailto:[EMAIL PROTECTED]
Sent: 09 November 2007 16:05
To: java-user@lucene.apache.org
I see you do the wrapping in a RuntimeException trick. Perhaps you can
introduce a special exception derived from RuntimeException that you
would throw in that case. It would basically mean "The underlying FS
does something we cannot tolerate so we fail fast."
--Nikolay
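Something along these lines, perhaps — a sketch only, and the exception
class here is invented for illustration (it is not an existing Lucene
class):

    import java.io.File;

    // Thrown when the underlying FS refuses to delete the lock file.
    class LockReleaseFailedException extends RuntimeException {
      LockReleaseFailedException(String message) { super(message); }
    }

    class FailFastFSLock {
      private final File lockFile;

      FailFastFSLock(File lockFile) { this.lockFile = lockFile; }

      // Fail fast: if the delete does not happen (e.g. a virus scanner
      // holds the file open on Windows), surface it instead of silently
      // leaving a stale lock behind.
      public void release() {
        if (!lockFile.delete()) {
          throw new LockReleaseFailedException(
              "failed to delete lock file " + lockFile.getAbsolutePath());
        }
      }
    }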
Michael McCandless wrote:
Erick,
Sorry the numbers are just printed out for debugging when it is building the
index. I will try it with lucene 2.1 and see what happens
Thanks
Mike
-Original Message-
From: Erick Erickson [mailto:[EMAIL PROTECTED]
Sent: 09 November 2007 15:59
To: java-user@lucene.apache.org
FWIW, running Lucene 2.1 on Java 1.5, all I get is some numbers being printed
out:
0
1
2
.
.
.
90,000
and it ran through the above 4 times or so.
Erick
On Nov 9, 2007 5:51 AM, Mike Streeton <[EMAIL PROTECTED]>
wrote:
> I have posted before about a problem with TermDocs.skipTo() but never
> managed to reproduce it.
I agree, we should not ignore the return value here. I think throwing an
exception if it returns false is the right thing to do? Though, if it's
a checked exception, that's not a backwards compatible change...
Mike
"Nikolay Diakov" <[EMAIL PROTECTED]> wrote:
> I have briefly reviewed the SimpleFSLock of Lucene 2.1 and 2.2.
I have briefly reviewed the SimpleFSLock of Lucene 2.1 and 2.2. I see
that the lock release mechanism does not check the return value of delete:
  public void release() {
    lockFile.delete();
  }
On most Linuxes this can never return false; however, under some Windows file
systems, if someone (a virus scanner, for example) still has the lock file
open, the delete will fail.
Hi Cedric
You may try the CJKAnalyzer within the lucene sandbox. It doesn't give
a perfect solution for Chinese word segmentation, but will solve the
problem in your case.
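A minimal usage sketch, assuming the sandbox/contrib CJKAnalyzer is on your
classpath (index path and field name are placeholders). CJKAnalyzer turns a
run of CJK characters into overlapping character bigrams, which is exactly
the AB/BC style of segmentation:

    import org.apache.lucene.analysis.cjk.CJKAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    public class CjkIndexDemo {
      public static void main(String[] args) throws Exception {
        IndexWriter writer =
            new IndexWriter("/tmp/cjk-index", new CJKAnalyzer(), true);
        Document doc = new Document();
        // Three CJK characters; CJKAnalyzer indexes the two overlapping
        // bigrams (plain Latin text would be kept as whole word tokens).
        doc.add(new Field("contents", "\u4e2d\u6587\u8bcd",
                          Field.Store.YES, Field.Index.TOKENIZED));
        writer.addDocument(doc);
        writer.close();
      }
    }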
On Nov 9, 2007 10:59 AM, Cedric Ho <[EMAIL PROTECTED]> wrote:
> Hi,
>
> We are having an issue while indexing Chinese documents.
Grant Ingersoll-6 wrote:
>
> When you are indexing the file and adding the Document, you will need
> to parse out your filename per your regular expression, and then
> create the appropriate field:
>
> Document doc = new Document();
> String cat = getCategoryFromFileName(inputFileName);
> doc.add(new Field("category", cat, Field.Store.YES, Field.Index.UN_TOKENIZED));
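For the getCategoryFromFileName(...) helper Grant mentions, something like
the following would do — entirely hypothetical, assuming file names such as
sports_article42.txt carry the category before the underscore:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class CategoryExtractor {
      // Assumed convention: "<category>_<rest>", e.g. "sports_article42.txt".
      private static final Pattern CATEGORY = Pattern.compile("^([^_]+)_.*");

      public static String getCategoryFromFileName(String fileName) {
        Matcher m = CATEGORY.matcher(fileName);
        return m.matches() ? m.group(1) : "unknown";
      }

      public static void main(String[] args) {
        // Prints "sports".
        System.out.println(getCategoryFromFileName("sports_article42.txt"));
      }
    }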
I have posted before about a problem with TermDocs.skipTo () but never managed
to reproduce it. I have now got it to fail using the following program, please
can someone try it and see if they get the stack trace:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Array index
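The repro program itself does not survive in this excerpt; a rough sketch of
the general shape (field name, term, index size, and skip stride are guesses,
not Mike's actual code):

    import org.apache.lucene.analysis.SimpleAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.index.TermDocs;

    public class SkipToRepro {
      public static void main(String[] args) throws Exception {
        String path = "/tmp/skipto-test";
        // Build a sizable index so skipTo() has to cross skip intervals.
        IndexWriter writer = new IndexWriter(path, new SimpleAnalyzer(), true);
        for (int i = 0; i < 100000; i++) {
          Document doc = new Document();
          doc.add(new Field("body", "common",
                            Field.Store.NO, Field.Index.TOKENIZED));
          writer.addDocument(doc);
        }
        writer.close();

        IndexReader reader = IndexReader.open(path);
        TermDocs td = reader.termDocs(new Term("body", "common"));
        // Jump around with skipTo(); the reported failure is an
        // ArrayIndexOutOfBoundsException on an index built with 2.2.
        for (int target = 0; target < reader.maxDoc(); target += 997) {
          if (!td.skipTo(target)) break;
        }
        td.close();
        reader.close();
      }
    }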
It should work. See the following FAQ:
http://wiki.apache.org/lucene-java/LuceneFAQ#head-6c56b0449d114826586940dcc6fe51582676a36e
regards,
Koji
Matt Magoffin wrote:
> Hello, I tried finding information about this from past mailing list
> emails, but couldn't find anything. I'm using Lucene 1.4 se