become smaller.
> - The optimized index has practically the same size as the not optimized one.
>
> Yuliya
>
>> -Original Message-
>> From: Michael McCandless [mailto:luc...@mikemccandless.com]
>> Sent: Friday, January 8, 2010 14:38
>> To: java-user@l
sparsely"?
>
> Thanks,
> Yuliya
>
>> -Original Message-
>> From: Michael McCandless [mailto:luc...@mikemccandless.com]
>> Sent: Thursday, January 7, 2010 18:00
>> To: java-user@lucene.apache.org
>> Subject: Re: Lucene 2.9 and 3.0: Optimi
>> -Original Message-
>> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
>> Sent: Thursday, January 7, 2010 17:35
>> To: java-user@lucene.apache.org
>> Subject: Re: Lucene 2.9 and 3.0: Optimized index is thrice as
>> large as the not optimize
Do you have a reader open on the index which was opened before
your index was optimized? Maybe there is a reader around holding on
to the references to the merged segments.
simon
On Thu, Jan 7, 2010 at 5:23 PM, Yuliya Palchaninava wrote:
> Hi,
>
> According to the api documentation: "In genera
Yuliya,
The index *directory* will be larger *while* you are optimizing. After the
optimization is completed successfully, the index directory will be smaller.
It is possible that your index directory is large(r) because you have some
left-over segments (e.g. from some earlier failed/interrup
On Thu, Dec 31, 2009 at 12:34 PM, Kumaravel Kandasami
wrote:
> Identified the problem.
>
> reader.close() was not getting called in a specific logic flow.
Phew :) Thanks for bringing closure.
Mike
-
To unsubscribe, e-mail: jav
Identified the problem.
reader.close() was not getting called in a specific logic flow.
Thank You.
Kumar_/|\_
www.saisk.com
ku...@saisk.com
"making a profound difference with knowledge and creativity..."
On Thu, Dec 31, 2009 at 11:11 AM, Kumaravel Kandasami <
kumaravel.kandas...@gmail.co
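
A minimal sketch of the kind of bug described above, where close() is skipped on one logic path. FakeReader is a hypothetical stand-in that only counts open handles; it is not Lucene's IndexReader, and the method names are made up for illustration. The idea is simply that a try/finally guarantees close() on every exit path, including early returns and exceptions.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an index reader (NOT Lucene's IndexReader).
// It only tracks how many instances are currently open.
class FakeReader implements Closeable {
    static final AtomicInteger OPEN = new AtomicInteger();
    private boolean closed;

    FakeReader() {
        OPEN.incrementAndGet();
    }

    // Pretend search: rejects empty terms, otherwise returns a dummy hit count.
    int search(String term) {
        if (term.isEmpty()) {
            throw new IllegalArgumentException("empty term");
        }
        return term.length();
    }

    @Override
    public void close() {
        if (!closed) {
            closed = true;
            OPEN.decrementAndGet();
        }
    }
}

class ReaderCloseSketch {
    // Leaky variant: the early return (and any exception) skips close().
    static int leakySearch(String term) {
        FakeReader reader = new FakeReader();
        if (term.startsWith("skip")) {
            return -1; // close() is never called on this path
        }
        int hits = reader.search(term);
        reader.close();
        return hits;
    }

    // Safe variant: finally runs on normal return, early return, and exception.
    static int safeSearch(String term) {
        FakeReader reader = new FakeReader();
        try {
            if (term.startsWith("skip")) {
                return -1;
            }
            return reader.search(term);
        } finally {
            reader.close();
        }
    }
}
```

With the leaky variant, every call that takes the early-return path leaves one handle open, which is one way a process can slowly run out of file descriptors.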
Thanks Mike.
I think it is something to do with the merge factor.
After I modified the code to call optimize in the finally block, the
following error message was thrown.
Code Snippet:
nameWriter.optimize(); // errors here
nameWriter.close();
valueWriter.optimize(); //I am using mult
It sounds like you may be running out of file descriptors -- how many
segments are in your index?
The reopen logic looks correct (you are closing the old reader). Is
there anything else that may be holding files open?
Have you changed any of IW's settings, eg mergeFactor?
Mike
On Wed, Dec 30,
That's indeed strange. The problem has nothing to do with
NumericField/NumericUtils and corresponding FieldCache parsing at all, it is
more the autodetection falling back to NumericField parser, if the first
term is not parseable as old-style numeric. Because of that you get this
error message, bec
Hello Rajiv2,
The LocalLucene from Sourceforge is not index-compatible with the recently
added spatial contrib in Lucene. You have to reindex your spatial values
(because the index format now makes use of the new Lucene 2.9 NumericField,
which is now the standard for numeric fields).
Uwe
-
Uwe
The required format for contrib/spatial has changed to NumericField,
as of 2.9. Are you building your index with NumericField?
Mike
On Fri, Oct 2, 2009 at 2:04 PM, Rajiv2 wrote:
>
> Hello, I was using Lucene 2.4 and locallucene in my app and upgraded to
> lucene 2.9 and I'm using the new spatia
Per-segment search over many segments is actually a bit faster for
non-sort cases and many sort cases, but an optimized index will still
be fastest. The speed benefit of many segments comes when reopening, so
say for realtime search, in that case you may want to sacrifice the
optimized performance for a segment
> Mark Miller wrote:
> > Hello Lucene users,
> >
> > ...
> >
> > We let out a bug in the lock factory changes we made in RC3 -
> > making a new SimpleFSDirectory with a String param would throw
> > an illegal state exception - a fix for this is in RC4.
>
> My apologies - not SimpleFSDirectory, but
Mark Miller wrote:
> Hello Lucene users,
>
> ...
>
> We let out a bug in the lock factory changes we made in RC3 -
> making a new SimpleFSDirectory with a String param would throw
> an illegal state exception - a fix for this is in RC4.
My apologies - not SimpleFSDirectory, but SimpleFSLockFactory
> http://svn.apache.org/viewvc?view=rev&revision=630698
This may be it. The scorer is sparse and usually in a conjunction with a
dense scorer.
Does the index format matter? I haven't yet built it with 2.9.
Peter
On Wed, Sep 9, 2009 at 10:17 AM, Yonik Seeley wrote:
> On Wed, Sep 9, 2009 at 9:40 AM
>Is it possible that skipTo is very costly with your custom scorer?
It's no more expensive than 'next'. The scorer's 'skipTo' and 'next' methods
call termdocs.skipTo or termdocs.next to get the next 'candidate' doc. This
just checks a BitVector to find the next non-deleted doc. But the scorer
mus
On Wed, Sep 9, 2009 at 9:40 AM, Peter Keegan wrote:
> IndexSearcher.search is calling my custom scorer's 'next' and 'doc' methods
> 64% fewer times. I see no 'advance' method in any of the hot spots. I am
> getting the same number of hits from the custom scorer.
> Has the BooleanScorer2 logic chan
Right, BooleanQuery will now try to use BooleanScorer (does "out of
order" collection, which does not use skipTo/advance at all, I think)
when possible, instead of BooleanScorer2.
This only applies for boolean queries that have only SHOULD clauses,
and up to 32 MUST_NOT clauses (if there's even 1
How about the new in-order/out-of-order scoring stuff? It was an option
before, but I think now it uses what's best by default? And pairs with
the collector? I didn't follow any of that closely though.
- Mark
Peter Keegan wrote:
> IndexSearcher.search is calling my custom scorer's 'next' and 'doc' me
IndexSearcher.search is calling my custom scorer's 'next' and 'doc' methods
64% fewer times. I see no 'advance' method in any of the hot spots. I am
getting the same number of hits from the custom scorer.
Has the BooleanScorer2 logic changed?
Peter
On Wed, Sep 9, 2009 at 9:17 AM, Yonik Seeley <
On Wed, Sep 9, 2009 at 9:17 AM, Yonik
Seeley wrote:
> On Wed, Sep 9, 2009 at 8:57 AM, Peter Keegan wrote:
>> Using JProfiler, I observe that the improvement
>> is due to a huge reduction in the number of calls to TermDocs.next and
>> TermDocs.skipTo (about 65% fewer calls).
>
> Indexes are searched
On Wed, Sep 9, 2009 at 8:57 AM, Peter Keegan wrote:
> Using JProfiler, I observe that the improvement
> is due to a huge reduction in the number of calls to TermDocs.next and
> TermDocs.skipTo (about 65% fewer calls).
Indexes are searched per-segment now (i.e. MultiTermDocs isn't normally used).
O
I've been testing 2.9 RC2 lately and comparing query performance to 2.3.2.
I'm seeing a huge increase in throughput (2x-10x) on an index that was built
with 2.3.2. The queries have a lot of BoostingTermQuerys and boolean clauses
containing a custom scorer. Using JProfiler, I observe that the improv
Hi All:
I have already integrated Lucene 2.9 RC2 with Lucene Domain Index:
http://docs.google.com/Doc?id=ddgw7sjp_54fgj9kg
As usual, a new Lucene version makes for a faster product :)
All my internal tests run OK and I only need to re-test on a 10g database.
Once Lucene 2.9 is ready for produ
Mark Miller wrote:
>
> Download release candidate 1 here:
> http://people.apache.org/~markrmiller/staging-area/lucene2.9rc2/
>
In case anyone catches it - yes, that is a cut and paste typo - it should
read release candidate 2 (obvious, but just to cross my t's).
--
- Mark
http://www.lucidimagination.co
The dist build issues have been addressed and RC2 will include the
missing analyzer and db contrib binaries.
Unfortunately, people.apache.org is not up at the moment
(https://blogs.apache.org/infra/entry/apache_org_downtime_initial_report),
but I will put up Lucene 2.9 RC2 when it comes back up.
Apologies - you are correct - contrib/analyzers is in src but not the
jar distrib. I will address whatever is up with the build process and
put up another RC when apache servers are back up.
Thanks for pointing this out,
- Mark
Bogdan Ghidireac wrote:
> Thank you, Lucene 2.9 is a great release..
Thank you, Lucene 2.9 is a great release...
I have one issue so far - I cannot find the contrib/analyzers jars,
only the sources are present.
Bogdan
On Fri, Aug 28, 2009 at 1:17 AM, Mark Miller wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Hello Lucene users,
>
> On behalf of the
I hope July. Could easily be August though. I'm kicking and screaming to get
it out soon though. It's been hurting my high brow reputation.
On Tue, Jun 30, 2009 at 2:41 PM, Siraj Haider wrote:
> is there an ETA for Lucene 2.9 release?
>
> -siraj
>
> ---
Also, conversely, if you know of important issues that should be fixed
for 2.9, please go and check that the "Fix Version" in Jira is in fact
set to 2.9...
Mike
On Thu, Jun 11, 2009 at 8:41 AM, Mark Miller wrote:
> Okay, it's only been a short time and we have already whittled the list down
> from
Okay, it's only been a short time and we have already whittled the list
down from 56 to 42. I think we have covered most of the easy calls. If
you know an issue you're involved in won't likely be done soon, please
help us out and take off the version or push it to 3.1.
Next time I go through, I'm
Darned that Google; they need to do better ;)
Here's the entry from CHANGES.txt on Lucene's trunk:
2. LUCENE-1382: Add an optional arbitrary String "commitUserData" to
IndexWriter.commit(), which is stored in the segments file and is
then retrievable via IndexReader.getCommitUserData ins
On Thu, May 21, 2009 at 1:12 PM, Michael McCandless
wrote:
> Sorry for the slow response.
>
> It's really not clear when 2.9 will be released. We have accumulated
> a number of good improvements -- higher performance field sorting, new
> higher performance Collector (replaces HitCollector) API,
>
Sorry for the slow response.
It's really not clear when 2.9 will be released. We have accumulated
a number of good improvements -- higher performance field sorting, new
higher performance Collector (replaces HitCollector) API,
segment-based searching, attaching a String label to each commit from
Mark Miller wrote:
Hmmm - you can probably get qsol to do it:
http://myhardshadow.com/qsol. I think you can set up any token to
expand to anything with a regex matcher and use group capturing in the
replacement (I don't fully remember though, been a while since I've
used it).
So you could do
Yonik Seeley wrote:
On Mon, Mar 9, 2009 at 2:02 PM, Michael McCandless
wrote:
Once added, something inside the index (a "write once" schema)
records
that this field is an IntField and then it's an error to ever use a
different type field by that same name.
I dunno... coupling functionalit
Hmmm - you can probably get qsol to do it: http://myhardshadow.com/qsol.
I think you can set up any token to expand to anything with a regex
matcher and use group capturing in the replacement (I don't fully
remember though, been a while since I've used it).
So you could do a regex of something
Allahbaksh Mohammedali Asadullah wrote:
For example I want to search amount >= 15 rather than doing it
amount:[ 15] or something?
Is there any open source query parser which converts something like
amount >=15 into a Lucene number-format query?
I don't know of any effort to change Lucene's
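
As a rough illustration of the question above, here is a small sketch that parses an expression such as "amount >= 15" into a field name plus numeric bounds. Everything in it (NumericBound, ComparisonSketch, the regex) is hypothetical and independent of Lucene; it does not build a Lucene query. An application would still have to turn the bounds into whatever range query it uses, for example inside an overridden QueryParser.getRangeQuery as suggested elsewhere in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical holder for the parsed result; a null bound means "open-ended".
class NumericBound {
    final String field;
    final Long lower;
    final Long upper;
    final boolean includeLower;
    final boolean includeUpper;

    NumericBound(String field, Long lower, Long upper,
                 boolean includeLower, boolean includeUpper) {
        this.field = field;
        this.lower = lower;
        this.upper = upper;
        this.includeLower = includeLower;
        this.includeUpper = includeUpper;
    }
}

class ComparisonSketch {
    // field, operator, integer value; two-character operators are tried first.
    private static final Pattern COMPARISON =
        Pattern.compile("\\s*(\\w+)\\s*(>=|<=|>|<|=)\\s*(-?\\d+)\\s*");

    static NumericBound parse(String expr) {
        Matcher m = COMPARISON.matcher(expr);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a comparison: " + expr);
        }
        String field = m.group(1);
        String op = m.group(2);
        long value = Long.parseLong(m.group(3));
        switch (op) {
            case ">=": return new NumericBound(field, value, null, true, false);
            case ">":  return new NumericBound(field, value, null, false, false);
            case "<=": return new NumericBound(field, null, value, false, true);
            case "<":  return new NumericBound(field, null, value, false, false);
            default:   return new NumericBound(field, value, value, true, true); // "="
        }
    }
}
```

For example, ComparisonSketch.parse("amount >= 15") yields the field "amount" with an inclusive lower bound of 15 and no upper bound.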
On Mon, Mar 9, 2009 at 2:02 PM, Michael McCandless
wrote:
> Once added, something inside the index (a "write once" schema) records
> that this field is an IntField and then it's an error to ever use a
> different type field by that same name.
I dunno... coupling functionality to restrictions seem
markharw00d wrote:
>>(a "write once" schema)
I like this idea. Enforcing consistent field-typing on instances of
fields with the same name does not seem like an unreasonable
restriction - especially given the upsides to this.
And also when it's "opt-in", ie, you can continue to use untyp
>>(a "write once" schema)
I like this idea. Enforcing consistent field-typing on instances of
fields with the same name does not seem like an unreasonable restriction
- especially given the upsides to this.
It doesn't dispense with all the full schema logic in Solr but seems
like a useful ba
mark harwood wrote:
Time for some standardised index metadata?
OK, thinking out loud...
What if we created IntField, subclassing Field. It holds a single
int, and you can add it to Document just like any other field.
Once added, something inside the index (a "write once" schema) records
th
mark harwood wrote:
This trie/parser issue is an example of a broader issue for me.
Yeah I agree.
There was also a new Document impl attached in Jira somewhere to more
strongly type fields (can't find it now), ie IntField, DateField, etc.
And it also ties into refactoring AbstractField/Field
On Mon, Mar 9, 2009 at 8:10 AM, Michael McCandless
wrote:
> Could we add APIs to QueryParser so the application can state the
> disposition
> toward certain fields?
overriding QueryParser.getRangeQuery() seems the most powerful and
flexible (and it's already there).
-Yonik
http://www.lucidimagin
e.apache.org
Sent: Monday, 9 March, 2009 13:10:32
Subject: Re: Lucene 2.9
Uwe Schindler wrote:
>> Or perhaps we should move Trie* into core Lucene, and then build a
>> real (ootb) integration with QueryParser.
>
> The problem is that the query parser does not know if a fiel
he.org
Subject: Re: Lucene 2.9
Uwe Schindler wrote:
>> Or perhaps we should move Trie* into core Lucene, and then build a
>> real (ootb) integration with QueryParser.
>
> The problem is that the query parser does not know if a field is
> encoded as
> trie or is just a norma
Uwe Schindler wrote:
Or perhaps we should move Trie* into core Lucene, and then build a
real (ootb) integration with QueryParser.
The problem is that the query parser does not know if a field is
encoded as
trie or is just a normal text token. Furthermore, the new trie API
does not
differe
> -Original Message-
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Monday, March 09, 2009 12:51 PM
> To: java-user@lucene.apache.org
> Subject: Re: Lucene 2.9
>
>
> Uwe Schindler wrote:
>
> >>> Is there any plans to ha
query.
Regards,
Allahbaksh
-Original Message-
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Monday, March 09, 2009 4:26 PM
To: java-user@lucene.apache.org
Subject: RE: Lucene 2.9
> > Are there any plans to have simpler queries for Numbers and Data?
>
> With the
Uwe Schindler wrote:
Are there any plans to have simpler queries for Numbers and Data?
With the recent addition of TrieRangeQuery (in 2.9), I think Lucene's
range querying is actually very strong, though you'd have to subclass
QueryParser and override getRangeQuery to have it create
TrieRang
> > Are there any plans to have simpler queries for Numbers and Data?
>
> With the recent addition of TrieRangeQuery (in 2.9), I think Lucene's
> range querying is actually very strong, though you'd have to subclass
> QueryParser and override getRangeQuery to have it create TrieRangeQuery.
The add
Allahbaksh Mohammedali Asadullah wrote:
When is Lucene 2.9 due? I am eagerly waiting for the new Lucene to
come out.
There have been some discussions on java-dev, but there's no clear
consensus/date yet. We do have quite a few Jira issues marked as 2.9
at this point, which we need to make p