System specification:
Processor speed: 2 GHz
RAM: 3 GB
Index size: 5 GB
Total documents indexed: 5.8 million
To collect hits, I have replaced the Hits object with TopFieldDocs, which has
improved search performance. Sorting is faster on a date / long field, but it
is very slow on a string field. In a standalone application it took
10 - 20 secs to display the results sorted on a string field. [I am not
opening an IndexSearcher every time.]
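
For readers following along, one way to picture why collecting only the top N hits is cheaper than sorting every hit is a bounded heap that never holds more than N entries. This is a plain-Java sketch of that idea under stated assumptions, not Lucene's implementation: `TopNSketch`, `Hit`, `BY_KEY` and `topN` are hypothetical names, and the string `sortKey` stands in for a stored field value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Plain-Java illustration of top-N selection; not Lucene code. */
final class TopNSketch {

    /** Hypothetical hit: a document id plus the string value it sorts on. */
    static final class Hit {
        final int docId;
        final String sortKey;

        Hit(int docId, String sortKey) {
            this.docId = docId;
            this.sortKey = sortKey;
        }
    }

    /** Orders hits by sort key, breaking ties by document id. */
    static final Comparator<Hit> BY_KEY =
            Comparator.comparing((Hit h) -> h.sortKey).thenComparingInt(h -> h.docId);

    /** Returns the n smallest hits in sorted order, holding at most n in memory. */
    static List<Hit> topN(Iterable<Hit> hits, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Reversed order makes the heap's head the worst hit kept so far.
        PriorityQueue<Hit> heap = new PriorityQueue<>(n, BY_KEY.reversed());
        for (Hit hit : hits) {
            if (heap.size() < n) {
                heap.add(hit);
            } else if (BY_KEY.compare(hit, heap.peek()) < 0) {
                heap.poll(); // evict the current worst
                heap.add(hit);
            }
        }
        List<Hit> result = new ArrayList<>(heap);
        result.sort(BY_KEY);
        return result;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit(0, "pear"), new Hit(1, "apple"), new Hit(2, "fig"),
                new Hit(3, "kiwi"), new Hit(4, "banana"));
        for (Hit hit : topN(hits, 3)) {
            System.out.println(hit.docId + " " + hit.sortKey);
        }
    }
}
```

Every comparison in this sketch is a string comparison, which is one plausible reason a string sort costs more than a long sort; it does not model the cost of loading the field values in the first place.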
Regards
Ganesh
----- Original Message -----
From: "Erick Erickson" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>
Sent: Monday, September 22, 2008 6:29 PM
Subject: Re: Exception while doing sorting
Sure, your Tomcat instance is assigning some amount of memory
to the JVM that your searcher is running in. Of course, now you're
going to ask me how to increase that number... I have no idea, but
I've seen this question multiple times in the mail archive,
so a search there or in the Tomcat docs should let you know.
But 12 seconds is still a long time to wait for a search to complete.
Can you tell us more about your search?
For instance, are you opening a searcher for each request? That's bad.
Are you sorting? That can take a long time, but again the first sort
will have a performance penalty as things are cached.
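
The first-sort penalty can be pictured as a lazily filled per-field cache that lives as long as the shared searcher does. The sketch below is plain Java with hypothetical names (`SortValueCache`, `valuesFor`, and the `loader` function that stands in for reading one value per document); it is not Lucene's FieldCache, only the same pattern.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Lazily loads per-field sort values once and reuses them; not Lucene code. */
final class SortValueCache {
    private final Map<String, String[]> valuesByField = new ConcurrentHashMap<>();
    private final Function<String, String[]> loader;
    private final AtomicInteger loadCount = new AtomicInteger();

    /** The loader is a hypothetical expensive step that reads one value per document. */
    SortValueCache(Function<String, String[]> loader) {
        this.loader = loader;
    }

    /** Returns the cached values for a field, running the loader on first use only. */
    String[] valuesFor(String field) {
        return valuesByField.computeIfAbsent(field, f -> {
            loadCount.incrementAndGet();
            return loader.apply(f);
        });
    }

    /** Number of times the expensive load actually ran. */
    int loadCount() {
        return loadCount.get();
    }

    public static void main(String[] args) {
        Function<String, String[]> loader = f -> new String[] {f + "-0", f + "-1"};

        // One long-lived cache: the second sort on the same field is free.
        SortValueCache shared = new SortValueCache(loader);
        shared.valuesFor("subject");
        shared.valuesFor("subject");
        System.out.println("loads with a shared cache: " + shared.loadCount());

        // A fresh cache per request pays the load every time.
        int loads = 0;
        for (int request = 0; request < 2; request++) {
            SortValueCache perRequest = new SortValueCache(loader);
            perRequest.valuesFor("subject");
            loads += perRequest.loadCount();
        }
        System.out.println("loads with a cache per request: " + loads);
    }
}
```

In this model, opening a searcher for each request corresponds to building a new `SortValueCache` each time, so every request pays the load again.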
There are a number of tips here:
http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
Best
Erick
On Mon, Sep 22, 2008 at 7:45 AM, Ganesh - yahoo
<[EMAIL PROTECTED]> wrote:
My index has crossed 5 GB, with 5 million documents indexed.
My query, which includes both searching and sorting, returns 40000 hits.
If I run the search from a standalone application, the results are returned in
12 seconds. If I run the same search from a web application running inside
Tomcat, an out-of-memory exception occurs.
Could anyone clarify this?
Regards
Ganesh
----- Original Message -----
From: "Ganesh - yahoo" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>
Sent: Friday, September 19, 2008 10:56 AM
Subject: Re: Exception while doing sorting
OK. If I distribute the index, will sorting be faster?
On the Lucene user mailing list, most emails suggest using a single
index. Won't searching across multiple indexes be slower?
> Lucene uses FieldCache for sorting on a non-tokenized field and tries to
> hold that field's values for all 4 million of your documents, even if
> you only need to sort 4000 docs.
I don't understand why Lucene keeps all terms in the FieldCache for sorting.
It is supposed to sort only the hits. Could you please clarify?
Regards
Ganesh
----- Original Message -----
From: "Otis Gospodnetic" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>
Sent: Thursday, September 18, 2008 12:17 PM
Subject: Re: Exception while doing sorting
If your index is increasing in size so fast, you should start thinking
about sharding your index (breaking it into multiple smaller indices
that each fit on their own server) and searching across them (aka
distributed search).
Yes, Lucene can handle millions of records if run on adequate hardware
and if used correctly.
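
Searching across shards usually means each shard returns its own sorted top N and the caller merges those lists. The sketch below merges already sorted per-shard results in plain Java. `ShardMergeSketch` and `mergeTopN` are hypothetical names, strings stand in for sort keys, and this is not a Lucene API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Merges sorted per-shard results into one global top-N list; not Lucene code. */
final class ShardMergeSketch {

    /** Position inside one shard's sorted result list. */
    private static final class Cursor {
        final List<String> keys;
        int index;

        Cursor(List<String> keys) {
            this.keys = keys;
        }

        String current() {
            return keys.get(index);
        }
    }

    /** Each inner list must already be sorted in ascending order. */
    static List<String> mergeTopN(List<List<String>> shardResults, int n) {
        // The heap always exposes the shard whose next key is smallest.
        PriorityQueue<Cursor> heads =
                new PriorityQueue<>((a, b) -> a.current().compareTo(b.current()));
        for (List<String> shard : shardResults) {
            if (!shard.isEmpty()) {
                heads.add(new Cursor(shard));
            }
        }
        List<String> merged = new ArrayList<>();
        while (merged.size() < n && !heads.isEmpty()) {
            Cursor smallest = heads.poll();
            merged.add(smallest.current());
            smallest.index++;
            if (smallest.index < smallest.keys.size()) {
                heads.add(smallest); // re-queue with its next key
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> shards = Arrays.asList(
                Arrays.asList("apple", "kiwi", "pear"),
                Arrays.asList("banana", "fig"),
                Collections.<String>emptyList());
        System.out.println(mergeTopN(shards, 3));
    }
}
```

The merge itself is cheap because it touches at most N entries plus one per shard. Under this model each shard still has to hold sort values for all of its own documents, so sharding spreads the memory cost across machines rather than removing it.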
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message ----
From: Ganesh - yahoo <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Thursday, September 18, 2008 12:53:19 AM
Subject: Re: Exception while doing sorting
My index is growing by 1 million records per day. How much memory do I
need to add?
What kind of sorting algorithm does Lucene use? Is it efficient
enough to handle millions of records?
Could we do the sorting using our own algorithm?
Regards
Ganesh
----- Original Message -----
From: "Fuad Efendi"
To:
Sent: Wednesday, September 17, 2008 7:28 PM
Subject: Re: Exception while doing sorting
Increase memory.
Lucene uses FieldCache for sorting on a non-tokenized field and tries to
hold that field's values for all 4 million of your documents, even if
you only need to sort 4000 docs.
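
A back-of-the-envelope estimate shows why the cache size follows the index size rather than the hit count. The sketch assumes one 8-byte slot per document for a long field, and for a string field one 4-byte ordinal per document plus the distinct values themselves. The 40-byte per-String overhead and the 20-character average length are guesses for illustration, not measured values, and `SortCacheEstimate` is a hypothetical name.

```java
/** Rough memory estimate for a per-document sort cache; all constants are assumptions. */
final class SortCacheEstimate {

    /** One 8-byte long per document in the index. */
    static long longFieldBytes(long docCount) {
        return docCount * 8L;
    }

    /**
     * One 4-byte ordinal per document, plus each distinct value stored once
     * as a String (assumed fixed overhead plus 2 bytes per character).
     */
    static long stringFieldBytes(long docCount, long distinctValues,
                                 int avgChars, int perStringOverhead) {
        long ordinals = docCount * 4L;
        long values = distinctValues * (perStringOverhead + 2L * avgChars);
        return ordinals + values;
    }

    public static void main(String[] args) {
        long docs = 4_000_000L; // index size from the thread

        System.out.println("long field:   " + longFieldBytes(docs) + " bytes");
        // Worst case: every document has a different string value.
        System.out.println("string field: "
                + stringFieldBytes(docs, docs, 20, 40) + " bytes");
    }
}
```

Under these assumptions the long field needs about 32 MB and an all-distinct string field about 336 MB, and the number of hits (4000 or 300,000) does not appear anywhere in the calculation.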
==============
http://www.tokenizer.org/bot.html
Quoting Ganesh - yahoo :
> Hello all,
>
> I have indexed more than 4 million documents. My query fetches
> 300,000 hits. If I sort on any field, Tomcat reports an
> out-of-memory exception.
> Sometimes the query returns only around 1000 results, but sorting on any
> field can still take more than 30 - 50 secs.
>
> I don't know what's going wrong.
>
> My index searcher is a static object and it is refreshed every
> minute. The JSP pages call the index searcher object directly to
> perform the search.
>
> Regards
> Ganesh
>
>
> Send instant messages to your online friends
> http://in.messenger.yahoo.com
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]