I've personally indexed over 1,000,000 documents and Lucene doesn't even
breathe hard.
We are in the hundreds of millions and growing, and Lucene does tend
to sweat a little bit, although it can certainly handle it.
You're going to have to understand the internals of Lucene a bit more.
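The internals that matter most for a continuously growing index are segment buffering and merging. A minimal sketch of the relevant knobs, assuming the Lucene 2.x-era API (the class name WriterTuningSketch and the specific values are made up for illustration, not recommendations):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.RAMDirectory;

public class WriterTuningSketch {
    public static void main(String[] args) throws Exception {
        // RAMDirectory here only to keep the sketch self-contained;
        // a real log index would use FSDirectory on disk.
        IndexWriter writer =
            new IndexWriter(new RAMDirectory(), new StandardAnalyzer(), true);
        // Buffer more documents in RAM before flushing a new segment
        // (fewer, larger segments for a steady trickle of log lines).
        writer.setMaxBufferedDocs(1000);
        // Merge less eagerly: fewer merge pauses as the index grows.
        writer.setMergeFactor(20);
        writer.close();
        System.out.println("tuned writer closed cleanly");
    }
}
```

Larger buffers and merge factors trade open-file handles and RAM for less frequent merging, which is the usual tension when documents arrive continuously rather than in batches.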
Original Message
From: Andreas Moroder <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Sunday, June 18, 2006 12:58:16 PM
Subject: Lucene as syslog storage
Hello,
I would like to write an application to browse around and search the log
files of linux machines, like www.splunk.org does
There's somebody on the mailing list who's talking about indexing a Billion
(with a "B") documents. I don't know how far they've gotten, but at least
*somebody* has contemplated a huge archive ... If memory serves, s/he had
indexed a significant number of documents; you might try searching for
"bi
Ray Tsang wrote:
> I think it ultimately depends on what you would like to do with the
> stored data? Would you need more of full text searches on the log or
> more of statistical analysis?
>
> ray,
Hello Ray,
possibly both, but the full text filtering is more important.
Bye
Andreas
Hello,
I would like to write an application to browse around and search the log
files of linux machines, like www.splunk.org does.
Would Lucene be the right db to store such text information?
Because the log info should be stored in the db continuously and not as a
batch, this would create many tho
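The kind of full-text log indexing discussed above can be sketched as follows, assuming the Lucene 2.x-era API (RAMDirectory, Hits, QueryParser). The class name SyslogIndexSketch, the field name "line", and the sample log lines are made up for illustration; a real deployment would index to an FSDirectory and add fields for host, timestamp, etc.:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.RAMDirectory;

public class SyslogIndexSketch {

    // Build a tiny in-memory index of syslog lines and count the
    // hits for a full-text query over the "line" field.
    static int countHits(String queryText) throws Exception {
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
        String[] lines = {
            "Jun 18 12:58:16 host1 sshd[1234]: Accepted password for andreas",
            "Jun 18 12:59:01 host1 kernel: eth0: link up"
        };
        // Each syslog line becomes one small Lucene Document.
        for (int i = 0; i < lines.length; i++) {
            Document doc = new Document();
            doc.add(new Field("line", lines[i],
                              Field.Store.YES, Field.Index.TOKENIZED));
            writer.addDocument(doc);
        }
        writer.close();

        IndexSearcher searcher = new IndexSearcher(dir);
        Query q = new QueryParser("line", new StandardAnalyzer()).parse(queryText);
        int n = searcher.search(q).length();
        searcher.close();
        return n;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("hits for 'sshd': " + countHits("sshd"));
    }
}
```

One document per log line keeps documents tiny, which is why continuous (non-batch) ingestion produces so many segments and makes the merge settings matter.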