Re: Lucene vs RDBMS indexing at scale

2013-02-06 Thread Andrew Gilmartin
Drew Kutcharian wrote:
> I'm trying to figure out what would be a better approach to indexing when it comes to a large number of records (say 1 billion)

A rule of thumb is that if you want a list of exact matches, use a database. If you want a ranked list of matches, use Lucene.

-- Andrew

Re: Lucene vs RDBMS indexing at scale

2013-02-05 Thread David Pilato
So you should probably ask your question on the Elasticsearch mailing list. I think that some ES users already scale to x billion docs. Even though ES is Lucene based, it adds features to scale out (sharding, routing...). HTH -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs Le 5

Re: Lucene vs RDBMS indexing at scale

2013-02-05 Thread Drew Kutcharian
The records are mostly logging events where they will have:
1. a timestamp
2. the type of the event
3. potentially a set of key/value properties
Then I would want to be able to slice and dice the records based on time (required), type and/or the key/values. In addition, I would want to have sta
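
For illustration only, here is a minimal sketch of the record shape and the "slice by time, then optionally by type and key/values" query described above, on the RDBMS side of the comparison. It uses Python's built-in sqlite3 as a stand-in for a real database. The table names (`events`, `event_props`), the index names, and the helper functions are all hypothetical and not taken from the thread, and nothing here says anything about behaviour at a billion rows.

```python
import sqlite3


def build_event_store(conn):
    # Hypothetical schema: one row per event, plus one row per key/value property.
    conn.executescript("""
        CREATE TABLE events (
            id   INTEGER PRIMARY KEY,
            ts   INTEGER NOT NULL,   -- timestamp, e.g. epoch seconds
            type TEXT    NOT NULL    -- type of the event
        );
        CREATE TABLE event_props (
            event_id INTEGER NOT NULL REFERENCES events(id),
            key      TEXT    NOT NULL,
            value    TEXT    NOT NULL
        );
        -- Time is the required filter, so it leads the index.
        CREATE INDEX idx_events_ts_type ON events (ts, type);
        CREATE INDEX idx_props_kv ON event_props (key, value, event_id);
    """)


def add_event(conn, ts, type_, props=None):
    cur = conn.execute("INSERT INTO events (ts, type) VALUES (?, ?)", (ts, type_))
    event_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO event_props (event_id, key, value) VALUES (?, ?, ?)",
        [(event_id, k, v) for k, v in (props or {}).items()],
    )
    return event_id


def find_events(conn, start_ts, end_ts, type_=None, props=None):
    # Range match on time (start inclusive, end exclusive), then optional
    # exact matches on the type and on each requested key/value pair.
    sql = "SELECT e.id FROM events e WHERE e.ts >= ? AND e.ts < ?"
    params = [start_ts, end_ts]
    if type_ is not None:
        sql += " AND e.type = ?"
        params.append(type_)
    for k, v in (props or {}).items():
        sql += (" AND EXISTS (SELECT 1 FROM event_props p"
                " WHERE p.event_id = e.id AND p.key = ? AND p.value = ?)")
        params += [k, v]
    sql += " ORDER BY e.ts"
    return [row[0] for row in conn.execute(sql, params)]
```

A call such as `find_events(conn, 150, 1000, type_="login")` returns the ids of login events in that time window. Every filter is either an exact match or a range match, and the results carry no relevance ranking.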

Re: Lucene vs RDBMS indexing at scale

2013-02-05 Thread Stephen Howe
Part of the answer depends on what kind of records you have. For instance, are you dealing with a lot of numeric data? If you need all those functions and only want to support exact matches and basic boolean comparisons, then I'd go with an RDBMS instead of Lucene. You'll get better support for the

Lucene vs RDBMS indexing at scale

2013-02-05 Thread Drew Kutcharian
Hey Guys, I'm trying to figure out what would be a better approach to indexing when it comes to a large number of records (say 1 billion). As far as queries: 1. Only support exact matches (a field is equal to some constant value) or range matches (a field is larger/smaller than some constant va