I think this problem will happen for all sorted fields. I am sorting on an
integer field.
I ran a small test and found that after closing all the databases, the
WeakHashMap and int[] are not released. Please find the profiler screenshot
attached.
Is there any way to release this memory / how to fix it ex
Hi Max,
In 3.0.0 (actually in 2.9.0 already), Lucene moved to execute its searches
one sub-reader at a time. As a consequence, absolute docIDs are not passed
to the collect method anymore, but instead the relative docIDs of that
reader. As an example, suppose you have 2 segments, with 6 documents tot
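The per-segment docID arithmetic can be sketched in plain Java. This is an illustration only: the segment sizes (6 and 4) are assumptions, and in real Lucene 2.9/3.0 code the searcher hands the `docBase` to `Collector.setNextReader`.

```java
public class DocBaseDemo {
    // Map a (segment, relative docID) pair to an absolute docID, given the
    // number of documents in each earlier segment.
    static int toAbsolute(int[] segmentSizes, int segment, int relativeDoc) {
        int docBase = 0;
        for (int i = 0; i < segment; i++) {
            docBase += segmentSizes[i];   // docBase = sum of earlier segment sizes
        }
        return docBase + relativeDoc;
    }

    public static void main(String[] args) {
        int[] sizes = {6, 4};  // assumed split: 6 docs in segment 0, 4 in segment 1
        // Relative doc 0 of segment 1 is absolute doc 6 across the whole index.
        System.out.println(toAbsolute(sizes, 1, 0));
    }
}
```

So a collector that records absolute docIDs just adds the current segment's docBase to every relative docID it receives.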
Thanks for the heads-up, TCK. The Dietz & Sleator article I found at
http://www.cs.cmu.edu/~sleator/papers/maintaining-order.pdf
looks very interesting.
String sorting in Lucene is indeed fairly expensive and we've experimented with
two solutions to this, neither of which is a silver bullet.
1) Sto
It's not that it's "necessary" -- this is just how Lucene's sorting
has always worked ;) But, it's just software! You could whip up a
patch...
I'm not familiar with the order-maintenance problem & solutions
offhand, but it certainly sounds interesting.
One issue is that loading only certain val
Yes, you found it! Is that what you're hitting?
I don't know of a workaround though... this is just how SpanQuery
currently works...
Mike
On Wed, Dec 9, 2009 at 4:56 PM, Jason Rutherglen wrote:
> Mike,
>
> Is this the thread?
>
> http://www.lucidimagination.com/search/document/1e87d488a904b89f
This is a bug in InstantiatedIndex. The termDocs(null) call was added to get
all documents. This was never implemented in InstantiatedIndex. Can you open an
issue?
There may be other queries that fail because of this (e.g.
FieldCacheRangeFilter, ...).
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
Thanks, Mike, for opening this JIRA ticket and for your patch. Explicitly
removing the entry from the WeakHashMap definitely does reduce the number of GC
cycles taken to free the huge StringIndex objects that get created when
doing a sort by a string field.
But I'm still trying to figure out why it is neces
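The benefit of explicit removal can be shown with a plain `java.util.WeakHashMap`, no Lucene required. This is a stand-in sketch: the large `int[]` plays the role of a cached StringIndex, and the plain `Object` key stands in for a reader's cache key.

```java
import java.util.Map;
import java.util.WeakHashMap;

// A WeakHashMap entry's value stays strongly reachable (via the map) until GC
// actually clears the weak key. Explicitly removing the entry releases the
// value immediately, without waiting for one or more GC cycles.
public class WeakCacheDemo {
    public static void main(String[] args) {
        Map<Object, int[]> cache = new WeakHashMap<Object, int[]>();
        Object key = new Object();            // stands in for a reader's cache key
        cache.put(key, new int[1000000]);     // stands in for a huge sort cache
        System.out.println(cache.size());     // 1: entry alive while key is referenced
        cache.remove(key);                    // explicit removal: value freed right away
        System.out.println(cache.size());     // 0
    }
}
```

Without the `remove`, the entry only disappears after `key` becomes unreachable and the collector processes the weak reference, which can take several cycles for large values.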
Mike,
Is this the thread?
http://www.lucidimagination.com/search/document/1e87d488a904b89f/spannearquery_s_spans_payloads#8103efdc9705a763
Maybe we need a recommended workaround for this?
Jason
On Wed, Dec 9, 2009 at 1:17 PM, Michael McCandless wrote:
> That sounds familiar... try to track do
That sounds familiar... try to track down the last thread maybe?
I think it was this: if the payload was already retrieved for a prior
span then the current span won't be able to retrieve it, so even
though you know a payload falls within the span you're looking at, you
won't get it back, if it al
Hi,
I have a HitCollector that processes all hits from a query. I want all
hits, not the top N hits. I am converting my HitCollector to a Collector
for Lucene 3.0.0, and I'm a little confused by the new interface.
I assume that I can implement my new Collector much like the code on the API
Docs:
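A minimal, self-contained sketch of that pattern follows. This is a stand-in, not the real API: an actual implementation would extend org.apache.lucene.search.Collector and receive these callbacks from the searcher; the method shapes here are simplified for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for Lucene 3.0's Collector contract when you want ALL hits, not the
// top N: the searcher calls setNextReader(docBase) once per segment, then
// collect(relativeDoc) once per matching doc. We keep no score heap, just
// rebase each relative docID to an absolute one and store it.
public class AllHitsCollector {
    private int docBase;
    private final List<Integer> hits = new ArrayList<Integer>();

    public void setNextReader(int docBase) {   // called at each segment boundary
        this.docBase = docBase;
    }

    public void collect(int relativeDoc) {     // called for each hit in the segment
        hits.add(docBase + relativeDoc);       // absolute docID across the index
    }

    public boolean acceptsDocsOutOfOrder() {   // no ordering requirement here
        return true;
    }

    public List<Integer> getHits() {
        return hits;
    }
}
```

The key difference from the old HitCollector is that `collect` no longer receives absolute docIDs, so the `docBase` bookkeeping is what keeps the results index-wide.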
I'm trying to upgrade our application from Lucene 2.4.1 to Lucene 2.9.1.
I've been using an InstantiatedIndex to do a bunch of unit testing, but am
running into some problems with Lucene 2.9.1.
In particular, when I try to run a MatchAllDocsQuery on my InstantiatedIndex
(which worked fine on 2.4.
On Fri, Oct 02, 2009 at 11:40:09PM -0700, m.harig wrote:
>
> Thanks Uwe Schindler ,
>
> If I use an IndexReader[] to use MultiReader, will it be thread
> safe? Because I have to reopen my IndexReader to check whether my index is
> updated or not. In this case how do I handle it? Please sug
Right we're getting the spans, however it's just the payloads that are
missing, randomly...
On Wed, Dec 9, 2009 at 2:23 AM, Michael McCandless wrote:
> There was a thread a while back about how span queries don't enumerate
> every possible span, but I can't remember if that included sometimes
> m
> Don't you have a playground to properly test your changes
Yes, I'll be doing a practice run in a DEV cluster. It is the practice run that
I'm planning at this point.
Many thanks for your pointers, Danil.
-----Original Message-----
From: Danil ŢORIN [mailto:torin...@gmail.com]
Sent: 09 Dece
There is a LOT of deprecated stuff in 2.9.1 (but it's still there),
and your code should run as it is
(however, there are some changes in behavior, so read CHANGES.txt carefully).
In 3.0 this old stuff is removed.
Your production readers may not even start (which I guess is more
painful than 2 step
COMPRESS is supported (though deprecated) in 2.9.1, so I'm expecting them to be
supported:
http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/document/Field.Store.html#COMPRESS
I guess I should expect optimize() to increase the size of the index as
compressed fields are expanded as it s
The 2nd point can simply be achieved by an optimize (which will read the old
segments and create a new one).
But I'm not sure how it handles compressed fields.
On Wed, Dec 9, 2009 at 16:50, Rob Staveley (Tom) wrote:
> Thanks, Danil. I think you've saved me a lot of time. Weiwei too - converting
> r
Thanks, Danil. I think you've saved me a lot of time. Weiwei too - converting
rather than reindexing everything, which will save a lot of time.
So, I should do this:
1. Convert readers to 2.9.1, which should be able to read any 2.x index
including the existing 2.3.1 indexes
2. Convert writers t
You NEED to update your readers first, or else they will be unable to
read files created by the newer version.
And trust me, there are changes in index format from 2.3 -> 2.9
On Wed, Dec 9, 2009 at 15:11, Weiwei Wang wrote:
> Hi, Rob,
> I read
> http://wiki.apache.org/lucene-java/BackwardsCompatibili
Hi, Rob,
I read
http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats and
found no compatibility guarantee for IndexWriter between different versions.
You can run your idea as a test and see the output.
If it doesn't work, I suggest you convert your index to the new version as I
said i
Thanks for the swift response, Weiwei.
In my deployment, my index readers are in a data centre and therefore more
difficult to upgrade than the writers. That's why I wanted to start with the
writers rather than the readers. I realise that it looks the wrong way round
and http://wiki.apache.org/
I've finished an upgrade from 2.4.1 to 3.0.0.
What I did is this:
1. Upgrade my user-defined analyzer, tokenizer and filter to 3.0.0
2. Use a 3.0.0 IndexReader to read the old version index and then use a
3.0.0 IndexWriter to write all the documents into a new index
3. Update QueryParser to 3.0.0
I have Lucene 2.3.1 code and indexes deployed in production in a distributed
system and would like to bring everything up to date with 3.0.0 via 2.9.1.
Here's my migration plan:
1. Add an index writer which generates a 2.9.1 "test" index
2. Have that "test" index writer push that 2.9.1 "test" ind
There was a thread a while back about how span queries don't enumerate
every possible span, but I can't remember if that included sometimes
missing payloads...
Mike
On Tue, Dec 8, 2009 at 7:34 PM, Jason Rutherglen wrote:
> Howdy,
>
> I am wondering if anyone has seen
> NearSpansUnordered.getPayl
OK thanks for bringing closure!
Accidentally allowing 2 writers to write to the same index quickly
leads to corruption. They are like Betta fish: they fight to the
death, removing each other's files, if you put them in the same cage.
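The mechanism that normally prevents this is an exclusive write lock on the index directory (Lucene's IndexWriter holds a write.lock and a second writer fails with a LockObtainFailedException). A simplified stand-in for the idea, using an atomically created lock file, looks like this; the class and file names are illustrative, not Lucene's own:

```java
import java.io.File;
import java.io.IOException;

// Sketch of a file-based write lock: File.createNewFile is atomic, so only the
// first "writer" to call tryLock succeeds; a second writer is refused instead
// of being allowed to delete the first one's files.
public class WriteLockDemo {
    public static boolean tryLock(File lockFile) {
        try {
            return lockFile.createNewFile();  // true only for the first caller
        } catch (IOException e) {
            return false;                     // treat I/O failure as lock refusal
        }
    }

    public static void main(String[] args) {
        File lock = new File(System.getProperty("java.io.tmpdir"), "demo-write.lock");
        lock.delete();                        // start clean for the demo
        System.out.println(tryLock(lock));    // first writer: true
        System.out.println(tryLock(lock));    // second writer: false
        lock.delete();                        // release the lock
    }
}
```

The corruption described above typically happens when that lock is bypassed, e.g. by deleting a stale lock file while another writer is still alive.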
Mike
On Wed, Dec 9, 2009 at 1:56 AM, Max Lynch wrote:
> H
Hi all,
The missing maven artifacts for the fast-vector-highlighter contrib of
Lucene Java in version 2.9.1 and 3.0.0 are now available at:
http://repo1.maven.org/maven2/org/apache/lucene/
http://repo2.maven.org/maven2/org/apache/lucene/
Uwe
-
Uwe Schindler
uschind...@apache.org
Apache Luc