Hi,
I'm getting an ArrayIndexOutOfBoundsException within the highlighter:
java.lang.ArrayIndexOutOfBoundsException: 50
    at org.apache.lucene.search.highlight.TokenGroup.addToken(TokenGroup.java:47)
    at org.apache.lucene.search.highlight.Highlighter.getBestDocFragments(Highlighter.java:
Lucene's scalability is not in question. The simple solution of
rebuilding the string of terms is what I referred to as not being
scalable. For instance, consider the following term vector:
termFreqVector (freq {myTermField: red/69, green/79, blue/899})
Recreating a string with 69
I don't think you need to parse the toString, you have the
TermFreqVector object which lets you access the appropriate pieces of
information (string, freq). You could then turn around and delete/index
the new document based on the vector with the increments. I don't know
whether it would scale or
Hi:
In the current Lucene sorting implementation, FieldCache is used to
retrieve 2 arrays, the lookup array and the order array. The order
array at load time stores the position of the term in the lookup
array. The lookup array is already sorted because it is read in from
the index.
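To make the two arrays concrete, here is a minimal sketch in plain Java. It is not the FieldCache code itself: the class and method names (OrderArraySketch, buildLookup, buildOrder, sortDocs) are hypothetical, and the per-document values are made up for illustration. It only shows the idea described above, that documents can be sorted by comparing ints from the order array instead of comparing strings.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

public class OrderArraySketch {

    // Sorted, de-duplicated terms; stands in for the lookup array
    // that arrives already sorted from the index.
    static String[] buildLookup(String[] docValues) {
        return new TreeSet<>(Arrays.asList(docValues)).toArray(new String[0]);
    }

    // order[doc] is the position of that document's term in lookup.
    static int[] buildOrder(String[] docValues, String[] lookup) {
        int[] order = new int[docValues.length];
        for (int doc = 0; doc < docValues.length; doc++) {
            order[doc] = Arrays.binarySearch(lookup, docValues[doc]);
        }
        return order;
    }

    // Sort document ids by their int position in lookup (no string compares).
    static Integer[] sortDocs(int[] order) {
        Integer[] docs = new Integer[order.length];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = i;
        }
        Arrays.sort(docs, Comparator.comparingInt(d -> order[d]));
        return docs;
    }

    public static void main(String[] args) {
        // Hypothetical field value for documents 0..3.
        String[] docValues = {"pear", "apple", "fig", "apple"};
        String[] lookup = buildLookup(docValues);
        int[] order = buildOrder(docValues, lookup);
        System.out.println(Arrays.toString(lookup));          // [apple, fig, pear]
        System.out.println(Arrays.toString(order));           // [2, 0, 1, 0]
        System.out.println(Arrays.toString(sortDocs(order))); // [1, 3, 2, 0]
    }
}
```

Because lookup is sorted, comparing two entries of order gives the same result as comparing the underlying terms, which is why the sort only needs the int array.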
My ques
I was profiling our webapp/lucene with JProfiler last week and of
course a large part of our app's time is spent in RandomAccessFile.readBytes ...
This is called by InputStream.readByte which internally uses a
BUFFER_SIZE of 1024 (which is the default).
This value seems too small for a default
Hi Erik,
Here is the bug report with the test case:
http://issues.apache.org/bugzilla/show_bug.cgi?id=35157
The scoring algorithm doesn't seem to work correctly when
SpanTermQuerys are in a BooleanQuery. I will look for the
problem. Any advice on what I should look for?
Thanks,
Reece
--- Erik
Responses inline prefixed with
-Original Message-
From: Dawid Weiss <[EMAIL PROTECTED]>
Sent: Jun 1, 2005 3:24 AM
To: java-user@lucene.apache.org
Subject: Re: Clustering Carrot2 vs TermVector Analysis
Hi Andrew,
Coming up with an answer... sorry for the delay.
> By using the carro
If your stemmer worked on indexing, then won't the "breath" entry
automatically pick up all of these? So, isn't the project unnecessary
and otiose?
On 5/31/05, Daniel Naber <[EMAIL PROTECTED]> wrote:
> On Monday 30 May 2005 18:54, Andrew Boyd wrote:
>
> > Now that the QueryParser knows about pos
On May 31, 2005, at 8:38 PM, Reece Wilton wrote:
Hi,
Using a BooleanQuery to combine two SpanTermQuery objects causes
unexpected results on Lucene 1.9 RC1. Is this a problem that is
already known about or has already been fixed?
I have a test case and more info if this is a new issue.
Inte
Hi,
Using a BooleanQuery to combine two SpanTermQuery objects causes
unexpected results on Lucene 1.9 RC1. Is this a problem that is
already known about or has already been fixed?
I have a test case and more info if this is a new issue.
Thanks.
---
Andrew Boyd wrote:
The numbers look impressive. If I build from the 1.9 trunk will I get the
patch?
Funny... I went ahead and implemented this myself and it didn't work.
Of course I may have implemented it incorrectly. I'll look at the patch
source and try it out!
Something fun to
Hi Andrew,
Coming up with an answer... sorry for the delay.
By using the carrot demo:
http://www.newsarch.com/archive/mailinglist/jakarta/lucene/user/msg03928.html
I was able to easily cluster search results based on the fields used
by carrot (url, title, and summary). However I was wonderi
On June 1, 2005, at 01:12, Erik Hatcher wrote:
1/ one index for all languages
2/ one index for all languages, with an extra language field so
searches can be constrained to a particular language
3/ separate indices for each language?
I would vote for option #2 as it gives the most flexibility - y