Re: Using Hadoop for near real-time processing of log data

Vadim Zaliva Wed, 25 Feb 2009 09:39:47 -0800

On Wed, Feb 25, 2009 at 05:59, Ryan LeCompte <[email protected]> wrote:
> Hello all,
>
> Is anyone using Hadoop as more of a near/almost real-time processing
> of log data for their systems to aggregate stats, etc? I know that
> Hadoop has generally been good at off-line processing of large amounts
> of data, but I've wondered if anyone has tried using it for processing
> of near real-time log data as it appears in your systems with any
> success? My gut feeling is that Hadoop isn't suitable for this yet
> given redundancy issues around the JobTracker/NameNode, as well as the
> overhead of moving blocks around in HDFS. Thoughts?


Ryan,

Several people (myself included) have asked similar questions. You may
want to search the mailing list archives for previous discussions on
the topic.

In short, you are right: Hadoop is not perfectly suited for real-time
processing.

Vadim
