Re: Lucene index sizes and performance

2009-04-16 Thread Michael Stoppelman
On Sat, Jul 7, 2007 at 8:19 PM, Chun Wei Ho wrote: > We are currently running a search service with a single Lucene index > of about 10 GB. We would like to find out: > > (a) What is the usual index size of everyone else? How large have > Lucene indexes gone in production environments, and is there

Re: Lucene index sizes and performance

2009-03-31 Thread sunnyfr
h guidance on Lucene all this time :)

Re: Lucene index sizes and performance

2007-07-07 Thread Chris Lu
Not really a suggestion, but some points to consider. (a) It depends greatly on your hardware, especially hard drive speed. (b) Do you use SortBy? Each SortBy field will need an array in memory. If there is no SortBy, reserving memory for about 10~15% of the index size will be enough. (c) Maybe try to split by index c
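The rule of thumb in point (b) can be turned into a rough calculation. The sketch below is only an illustration: the function name, the 4-byte array entry, the one-entry-per-document layout, and the idea of adding the sort arrays on top of the 10~15% baseline are all assumptions, not something stated in the thread.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def estimate_search_memory_bytes(index_size_bytes, base_percent=15,
                                 num_docs=0, sort_fields=0,
                                 bytes_per_sort_entry=4):
    """Rough memory estimate for searching one index.

    base_percent: share of the index size to reserve (10~15 per the thread).
    bytes_per_sort_entry: ASSUMPTION, one fixed-size entry per document
    for every sorted field; the real size depends on the field type.
    """
    # Baseline: a percentage of the on-disk index size.
    base = index_size_bytes * base_percent // 100
    # ASSUMPTION: each sorted field adds one array with an entry per document.
    sort_arrays = num_docs * sort_fields * bytes_per_sort_entry
    return base + sort_arrays


# A 10 GB index with no sorting, low and high end of the range.
low = estimate_search_memory_bytes(10 * GIB, base_percent=10)
high = estimate_search_memory_bytes(10 * GIB, base_percent=15)
print(low / GIB, high / GIB)
```

For the 10 GB index in the original question, this gives a range of roughly 1 to 1.5 GB before any sort arrays are counted.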

Lucene index sizes and performance

2007-07-07 Thread Chun Wei Ho
We are currently running a search service with a single Lucene index of about 10 GB. We would like to find out: (a) What is the usual index size of everyone else? How large have Lucene indexes gone in production environments, and is there a sort of optimal size that Lucene indexes should be? (b)

Re: large index sizes

2005-09-19 Thread Richard Littin
Hi Edward, We have indexed the MedLine data. We used the default StopAnalyzer on the full text fields (fields that are more than just dates or ids) and the default Keyword for the other fields. So the index has the short fields stored in it and just indexing for the larger fields. In our a

large index sizes

2005-09-19 Thread Edward Summers
I'm investigating possible alternatives for indexing/searching a very large dataset (2TB) of xml data from the pubmed database[1]. Does anyone have any experience working with indexes of this size? Granted the actual index size would be smaller than the source files, but I'm just curious h

Re: Index Sizes

2005-05-17 Thread Vince Taluskie
We're using a single dual-3Ghz Xeon box, Sun vx65 - indexes stored on Netapp nearstore R100. I think you can either try to investigate if there's a way your users will naturally group their searches and build indexes around that to minimize individual index size or prototype a distributed index

Re: Index Sizes

2005-05-17 Thread Dan Funk
Lucene is an excellent choice. If I were you I would not store the un-searched fields in the index. There's no clear benefit. Where you store the data depends on your needs - I use flat files for what I'm doing - as I need them just for display. If you need the functionality of a relational
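The split described here (searched fields in the index, everything else kept elsewhere for display) might look like the following sketch. It does not use the Lucene API; `split_record`, the `id` key, and the in-memory `display_store` dict standing in for flat files or a database are all hypothetical names chosen for illustration.

```python
def split_record(record, searchable_fields, id_field="id"):
    """Split one record into an index document and a display record.

    The index document holds only the searched fields plus the id;
    the display record keeps every field so it can be shown in full.
    """
    index_doc = {k: v for k, v in record.items() if k in searchable_fields}
    index_doc[id_field] = record[id_field]  # key used to look up display data
    display_record = dict(record)           # full copy, stored outside the index
    return index_doc, display_record


# Hypothetical stand-in for flat files or a relational database.
display_store = {}

record = {"id": 42, "title": "Index Sizes", "author": "rk", "notes": "display only"}
index_doc, display_record = split_record(record, {"title", "author"})
display_store[display_record["id"]] = display_record

# A search hit carries only the id; the full record is fetched for display.
print(index_doc)
print(display_store[index_doc["id"]]["notes"])
```

The index stays small because it carries only the searched fields, at the cost of one extra lookup per displayed hit.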

Re: Index Sizes

2005-05-16 Thread Richard Krenek
Unfortunately our indexes will be performance sensitive. Is Lucene still a good choice? What kind of hardware are you using? Also what are the performance implications for having the additional 80 records in the index for just display purposes? Thanks, Richard Krenek On 5/13/05, Vince Taluski

Re: Index Sizes

2005-05-13 Thread Vince Taluskie
Yes, you'll be fine with 100 million. I've got a couple of non-performance-sensitive indexes that are more than double that (280M), with about 20 searchable fields as well. We get results back in the 10-20 second range, which is fine for our end users. Vince On 5/13/05, Richard Krenek <[EMAIL PRO

Index Sizes

2005-05-13 Thread Richard Krenek
Hypothetically I have 100 million records. Each record has 100+ fields. Only 20 of those fields need to be searched on, the rest (including the 20) are just for display purposes. Would it be best to just add the 20 fields to the index and keep the rest in a relational database? What effect does all