-08-27 at 05:34 +0200, Shelly_Singh wrote:
> I have a lucene index of 100 million documents. [...] total index size is
> 7GB.
[...]
> I get a response time of over 2 seconds.
How many documents match such a query and how many of those documents do
you process (i.e. extract a
Hi,
I have a lucene index of 100 million documents. But the document size is very
small - 5 fields with 1 or 2 terms each. Only 1 field is analyzed and the
others are simply indexed. The index is optimized to 2 segments and the total
index size is 7GB.
I open a searcher with a termsInfoDiviso
work has been put into making Lucene fast by very bright
people. See
if they've already solved your problem for you...
Best
Erick.
On Thu, Aug 19, 2010 at 1:51 AM, Shelly_Singh wrote:
> Hi Anshum,
>
> I require sorted results for all my queries and the field on which I need
> s
time or is it a presumption?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Aug 18, 2010 at 5:12 PM, Shelly_Singh wrote:
> Hi,
>
> I have a Lucene index that contains a numeric field along with certain
> other fields. The order of incoming documents is random and unpredictable.
[...]
Hi,
In my Lucene index, I want to search on a field, but the score or order of
the returned documents is not important. What is important is which documents
are returned.
As I do not need scoring or even the default sort (order by doc ID), what is
the best way to write such a query?
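When only the set of matching documents matters, the usual answer on this list is to skip scoring entirely (in Lucene via a custom Collector, or by wrapping the query in a ConstantScoreQuery). A minimal pure-Java sketch of the idea, with a hypothetical document list and match predicate standing in for the real Lucene types:

```java
import java.util.BitSet;
import java.util.List;
import java.util.function.Predicate;

// Sketch: collect matching doc IDs into a BitSet and never compute a score.
// In Lucene this is the job of a custom Collector; the List<String> "index"
// and Predicate "query" here are illustrative stand-ins, not Lucene API.
final class MatchOnlyCollector {
    static BitSet collect(List<String> docs, Predicate<String> matches) {
        BitSet hits = new BitSet(docs.size());
        for (int docId = 0; docId < docs.size(); docId++) {
            if (matches.test(docs.get(docId))) {
                hits.set(docId);  // record the match; no score, no sorting
            }
        }
        return hits;
    }
}
```

Because no per-hit score is computed and no priority queue is maintained, this kind of match-only collection avoids the main per-document costs of a scored, sorted search.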
I compared perf
Hi,
I have a Lucene index that contains a numeric field along with certain other
fields. The order of incoming documents is random and unpredictable. As a
result, while creating an index, I end up adding docs in random order with
respect to the numeric field value.
For example, documents may
So, you didn't really use the setRamBuffer.. ?
Any reasons for that?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Aug 11, 2010 at 10:28 AM, Shelly_Singh wrote:
> My final settings are:
> 1. 1.5 gig RAM to the jvm out of 2GB available for my desktop
> 2. 100GB disk space.
> 3. Index creation and searching tuning factor
> http://blog.anshumgupta.net
>
> Sent from BlackBerry®
>
> -Original Message-
> From: Shelly_Singh
> Date: Tue, 10 Aug 2010 19:11:11
> To: java-user@lucene.apache.org
> Reply-To: java-user@lucene.apache.org
> Subject: RE: Scaling Lucene to 1bln docs
>
> Hi f
[...] large datasets it's a lot of tuning, custom code, and no
one-size-fits-all solution.
Lucene is just a tool (a fine one) but you need to use it wisely to
achieve great results.
On Tue, Aug 10, 2010 at 15:55, Shelly_Singh wrote:
> Hmm.. I get the point. But, in my application, the document is basically a
> descriptive name of a particular thing. The user will search
[...]ate, otherwise random assignment is fine.
- have a pool of IndexSearchers for each index
- when a search comes in, allocate a Searcher from each index to the search.
- perform the search in parallel across all indices.
- merge the results in your own code using an efficient merging algorithm.
Regards,
Dan
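The final merge step Dan describes is a standard k-way merge: each shard returns its own hits already sorted, and a heap combines them into a global order in O(total * log(shards)) time. A sketch under the assumption that each shard's result is a sorted list of long sort keys (the class and method names are illustrative, not Lucene API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Heap-based k-way merge of per-shard sorted results (ascending sort keys).
// Each heap entry is {value, shardIndex, positionInShard}.
final class ShardMerge {
    static List<Long> merge(List<List<Long>> shardResults, int topN) {
        PriorityQueue<long[]> heap =
            new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0]));
        for (int s = 0; s < shardResults.size(); s++) {
            if (!shardResults.get(s).isEmpty()) {
                heap.add(new long[] {shardResults.get(s).get(0), s, 0});
            }
        }
        List<Long> merged = new ArrayList<>();
        while (!heap.isEmpty() && merged.size() < topN) {
            long[] top = heap.poll();
            merged.add(top[0]);                       // next-smallest key overall
            int shard = (int) top[1], next = (int) top[2] + 1;
            if (next < shardResults.get(shard).size()) {
                heap.add(new long[] {shardResults.get(shard).get(next), shard, next});
            }
        }
        return merged;
    }
}
```

Only the heads of the shard lists sit in the heap at any time, so the merge touches each hit once regardless of how many shards there are.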
-Original Message-
From: Shelly_Singh [mailto:shelly_si...@infosys.com]
Sent: Tuesday, August 10, 2010 8:20 AM
To: java-user@lucene.apache.org
Subject: RE: Scaling Lucene to
ex on timeline, and as a query
would be associated with a particular period, you would only query the
indexes containing data for that period.
This would make the data manageable and searchable within reasonable time.
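The routing step this suggests is simple to sketch: keep one index per time period and, given a query's time range, touch only the shards whose period overlaps it. A minimal pure-Java sketch, assuming shards are described by their inclusive start timestamps (the names are illustrative, not Lucene API):

```java
import java.util.ArrayList;
import java.util.List;

// Time-based shard routing: shardStarts[i] is the inclusive start timestamp
// of shard i; shard i covers [shardStarts[i], shardStarts[i+1]) and the last
// shard is open-ended. A query range [from, to] only searches overlapping shards.
final class TimeRouter {
    static List<Integer> shardsFor(long[] shardStarts, long from, long to) {
        List<Integer> hit = new ArrayList<>();
        for (int i = 0; i < shardStarts.length; i++) {
            long start = shardStarts[i];
            long end = (i + 1 < shardStarts.length) ? shardStarts[i + 1] : Long.MAX_VALUE;
            if (from < end && to >= start) {
                hit.add(i);  // this shard's period overlaps the query range
            }
        }
        return hit;
    }
}
```

A query for a narrow recent window then opens Searchers on one or two shards instead of all of them, which is where the "manageable and searchable within reasonable time" claim comes from.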
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 10, 2010 at 5:49 PM, Shelly_
like to know, are you using a particular type of sort? Do you need to
sort on relevance? Can you shard and restrict your search to a limited set of
indexes functionally?
--
Anshum
http://blog.anshumgupta.net
Sent from BlackBerry®
-Original Message-
From: Shelly_Singh
Date: Tue, 10
.
>
> --
> Anshum Gupta
> http://ai-cafe.blogspot.com
>
>
> On Tue, Aug 10, 2010 at 12:24 PM, Shelly_Singh
> wrote:
>
>> Hi,
>>
>> I am developing an application which uses Lucene for indexing and searching
>> 1 bln documents. (the document size is
intermittently.
You may also use a multithreaded approach in case reading the source takes
time in your case, though, the indexwriter would have to be shared among all
threads.
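Lucene's IndexWriter is thread-safe, so "shared among all threads" means several readers of the source can feed one writer instance concurrently. A pure-Java sketch of that shape, with a concurrent queue standing in for the shared IndexWriter (the stand-in and the class name are assumptions, not Lucene API):

```java
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Multithreaded indexing sketch: worker threads read their source slice and
// feed ONE shared writer. The ConcurrentLinkedQueue is only a stand-in for
// the single shared IndexWriter; in real code each add(...) would be
// IndexWriter.addDocument(...).
final class ParallelIndexer {
    static int indexAll(List<List<String>> sources, int threads) throws InterruptedException {
        ConcurrentLinkedQueue<String> writer = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (List<String> source : sources) {
            pool.submit(() -> {
                for (String doc : source) {
                    writer.add(doc);  // shared writer; no per-thread index to merge later
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return writer.size();  // total documents handed to the writer
    }
}
```

This only pays off when reading or preparing the source is the bottleneck, as the advice above notes; the writer itself serializes where it must.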
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 10, 2010 at 12:24 PM, Shelly_Singh wrote:
> Hi,
>
> I
Hi,
I am developing an application which uses Lucene for indexing and searching 1
bln documents. (the document size is very small though. Each document has a
single field of 5-10 words; so I believe that my data size is within the tested
limits).
I am using the following configuration:
1.