Yes, I think it is. The only catch will be the log timestamps: how fine-grained you really need them to be, and, if you make them very fine, what happens when you run range queries against them. If you have a pile of log files lying around, it should be pretty easy to get them indexed. You don't even have to write a client for searching the resulting index; just point something like Luke at it, or even Solr.
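To make the timestamp catch concrete, here is a minimal sketch against the Lucene 2.4 API (current as of this thread). The field name "timestamp" and the one-hour window are placeholders, not anything from Jeff's setup. Indexing timestamps at millisecond precision can mean tens of millions of distinct terms per day, and a plain RangeQuery rewrites to one BooleanQuery clause per matching term, so it can blow up with TooManyClauses; rounding with DateTools and using ConstantScoreRangeQuery sidesteps both problems.

import org.apache.lucene.document.DateTools;
import org.apache.lucene.search.ConstantScoreRangeQuery;
import org.apache.lucene.search.Query;

public class TimestampRanges {
    public static void main(String[] args) {
        long now = System.currentTimeMillis();

        // Millisecond precision can mean up to ~86M distinct terms per day;
        // rounding to SECOND (or MINUTE) keeps the term dictionary small
        // enough for range queries to stay cheap.
        String upper = DateTools.timeToString(now, DateTools.Resolution.SECOND);
        String lower = DateTools.timeToString(now - 3600L * 1000,
                                              DateTools.Resolution.SECOND);

        // A plain RangeQuery rewrites to one BooleanQuery clause per term in
        // the range and can throw TooManyClauses on a fine-grained timestamp
        // field; ConstantScoreRangeQuery matches via a filter instead.
        Query lastHour = new ConstantScoreRangeQuery("timestamp",
                                                     lower, upper,
                                                     true, true);
        System.out.println(lastHour);
    }
}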
Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

________________________________
From: Jeff Capone <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Monday, November 10, 2008 6:51:20 PM
Subject: Feasibility question

Has anyone deployed Lucene to index log files? I have seen some articles about how Rackspace used Lucene and Hadoop for log processing, but I have not seen any details on the implementation.

To get my required analytics, I think I would need to treat each line of the Apache log files as a document, and I thought I would treat each field as a keyword to minimize processing. Assuming you have clusters operating on independent datasets (so I guess it would scale linearly) and you want to process terabytes of logs per day, is such a solution even feasible?

Thank you,
Jeff Capone
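For concreteness, here is a minimal sketch of the document-per-line, keyword-field scheme Jeff describes, again against the Lucene 2.4 API. The field names (ip, method, path, status) and the index path are illustrative placeholders for whatever you parse out of each Apache log line; the parsing itself is omitted.

import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.document.DateTools;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class LogLineIndexer {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter(
            FSDirectory.getDirectory("/tmp/log-index"),
            new KeywordAnalyzer(),              // never runs: fields below skip analysis
            true,                               // create a new index
            IndexWriter.MaxFieldLength.UNLIMITED);

        // One document per log line; every field is a single untokenized
        // term (NOT_ANALYZED), so there is no analysis cost at index time.
        Document doc = new Document();
        doc.add(new Field("ip", "127.0.0.1", Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("method", "GET", Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("path", "/index.html", Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("status", "200", Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("timestamp",
            DateTools.timeToString(System.currentTimeMillis(), DateTools.Resolution.SECOND),
            Field.Store.YES, Field.Index.NOT_ANALYZED));
        writer.addDocument(doc);

        writer.close();
    }
}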