We're using a single dual 3 GHz Xeon box, Sun vx65, with indexes stored on a
NetApp NearStore R100. I think you can either investigate whether your users
naturally group their searches, and build indexes around those groups to
minimize individual index size, or prototype a distributed index.
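
A rough sketch of the "group the searches" idea, in case it helps. This is
plain Python rather than Lucene code: TinyIndex is a toy in-memory stand-in
for one on-disk index, and the group key (a year here) is purely an
assumption about how searches might cluster. A query that names a group only
touches that group's small index; anything else fans out over all partitions
and merges the hits.

```python
import heapq
from collections import defaultdict


class TinyIndex:
    """Toy in-memory inverted index standing in for one on-disk index."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of doc ids

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, term):
        # .get() avoids creating an empty entry for unknown terms.
        return sorted(self.postings.get(term.lower(), ()))


class PartitionedIndex:
    """One small index per group; the grouping key is hypothetical."""

    def __init__(self):
        self.partitions = {}  # group -> TinyIndex

    def add(self, group, doc_id, text):
        self.partitions.setdefault(group, TinyIndex()).add(doc_id, text)

    def search(self, term, group=None):
        if group is not None:
            # Only the index this search naturally belongs to is consulted.
            part = self.partitions.get(group)
            return part.search(term) if part else []
        # No group given: query every partition and merge the sorted hits.
        hits = [p.search(term) for p in self.partitions.values()]
        return list(heapq.merge(*hits))


idx = PartitionedIndex()
idx.add("2004", 1, "lucene index tuning")
idx.add("2005", 2, "distributed lucene index")
idx.add("2005", 3, "flat files for display")
print(idx.search("lucene", group="2005"))  # single small index
print(idx.search("lucene"))                # fan-out over all partitions
```

Whether this pays off depends on how often a search can actually be pinned
to one group; if most queries end up fanning out, the distributed route is
probably the one to prototype.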
Lucene is an excellent choice.
If I were you I would not store the un-searched fields in the index.
There's no clear benefit. Where you store the data depends on your needs
- I use flat files for what I'm doing - as I need them just for
display. If you need the functionality of a relational
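
For what it's worth, the flat-file approach can be sketched like this. It is
a minimal Python illustration, not the actual setup described above: the
FlatFileStore class, the JSON-lines layout, and the field names are all
assumptions. The search index would keep only the searchable fields plus a
document id, and the display-only fields are fetched by id after the search
returns.

```python
import json
import os
import tempfile


class FlatFileStore:
    """Append-only file of JSON lines, with a byte offset kept per doc id."""

    def __init__(self, path):
        self.path = path
        self.offsets = {}  # doc id -> byte offset of its record

    def put(self, doc_id, fields):
        line = (json.dumps(fields) + "\n").encode("utf-8")
        with open(self.path, "ab") as f:
            # Seek explicitly so tell() reports the end of the file.
            f.seek(0, os.SEEK_END)
            self.offsets[doc_id] = f.tell()
            f.write(line)

    def get(self, doc_id):
        with open(self.path, "rb") as f:
            f.seek(self.offsets[doc_id])
            return json.loads(f.readline().decode("utf-8"))


# Hypothetical usage: ids 42 and 43 would come back from the search index.
path = os.path.join(tempfile.mkdtemp(), "display.jsonl")
store = FlatFileStore(path)
store.put(42, {"title": "First record", "summary": "display-only text"})
store.put(43, {"title": "Second record", "summary": "more display text"})
print(store.get(43)["title"])
```

The offsets live in memory in this sketch; a real setup would need to
persist them or rebuild them by scanning the file at startup.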
Unfortunately our indexes will be performance sensitive. Is Lucene
still a good choice? What kind of hardware are you using?
Also, what are the performance implications of having the additional
80 records in the index just for display purposes?
Thanks,
Richard Krenek
On 5/13/05, Vince Taluski
Yes, you'll be fine with 100 million. I've got a couple of non-performance
sensitive indexes that are more than double that (280M), with about 20
searchable fields as well. We get results back in the 10-20 second range,
which is fine for our end users.
Vince
On 5/13/05, Richard Krenek <[EMAIL PRO