Thanks Jeff, your explanation was very useful.
On Mon, Feb 25, 2013 at 12:37 PM, Jeff Lord <[email protected]> wrote:

> Daniel,
>
> Flume was designed as a configurable pipeline for discrete events, in
> order to get them reliably from a source (e.g. a web server application)
> to a destination (e.g. into HDFS).
> Flume provides the facility to write the same event to multiple
> destinations (e.g. HDFS and HBase, or HDFS and Cassandra).
> There is also a third-party Cassandra plugin (sink) for Flume NG that
> will write events into Cassandra:
> https://github.com/btoddb/flume-ng-cassandra-sink
> Whether or not you process the log "on the fly" is going to depend on
> your use case and resources, but if it is feasible, then writing directly
> into Cassandra is probably going to be the most efficient.
>
> I am not personally familiar with the logprocessing plugin you mention,
> but it appears to be built on top of the old Flume.
> We highly recommend using Flume NG going forward, so it sounds like you
> might want to try Flume NG with the Cassandra sink.
>
> Hope this helps.
>
> -Jeff
>
>
> On Sun, Feb 24, 2013 at 8:39 PM, Daniel Bruno <[email protected]> wrote:
>
>> Hello everyone,
>>
>> I'm researching Flume as a solution for web analytics.
>>
>> I have read some texts about it, and my idea is to use Flume to collect
>> the logs and put them into a Cassandra database. But first I have some
>> doubts that I want to share.
>>
>> Is it a good approach to process the log "on the fly" and insert it into
>> the database already processed?
>>
>> Or is it better to collect the logs, store them (e.g. in HDFS), run
>> scheduled jobs with Pig, and then insert the results into a database
>> like HBase or Cassandra?
>>
>> I found an interesting solution made by Gemini (now Cloudian) called
>> logprocessing. Has anyone used it?
>>
>> Thanks
>> --
>> Daniel Bruno
>> http://danielbruno.eti.br

--
Daniel Bruno
http://danielbruno.eti.br
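
For anyone finding this thread later, here is a minimal sketch of the fan-out
setup Jeff describes: a single Flume NG agent tailing a web server log and
replicating every event to both an HDFS sink and the third-party Cassandra
sink, so you can write into Cassandra on the fly and still keep a raw copy in
HDFS for scheduled Pig jobs. All names here (agent1, the source/channel/sink
names, the paths) are illustrative, and the Cassandra sink's fully qualified
class name and parameters are assumptions; check them against the plugin's
README before using this.

    # Hypothetical Flume NG agent: one source fanned out to HDFS and Cassandra.
    agent1.sources  = weblog
    agent1.channels = hdfsChan cassChan
    agent1.sinks    = hdfsSink cassSink

    # Tail the access log. An exec source is the simplest option; a
    # spooling-directory source is more reliable if the log files rotate.
    agent1.sources.weblog.type = exec
    agent1.sources.weblog.command = tail -F /var/log/httpd/access_log
    # The default channel selector is "replicating", so every event is
    # written to both channels.
    agent1.sources.weblog.channels = hdfsChan cassChan

    # Memory channels are fine for a sketch; a file channel is more durable.
    agent1.channels.hdfsChan.type = memory
    agent1.channels.hdfsChan.capacity = 10000
    agent1.channels.cassChan.type = memory
    agent1.channels.cassChan.capacity = 10000

    # Standard HDFS sink: raw events land in HDFS for later batch jobs.
    agent1.sinks.hdfsSink.type = hdfs
    agent1.sinks.hdfsSink.channel = hdfsChan
    agent1.sinks.hdfsSink.hdfs.path = hdfs://namenode:8020/flume/weblogs
    agent1.sinks.hdfsSink.hdfs.fileType = DataStream

    # Third-party Cassandra sink. The class name below is an assumption;
    # take the real one from https://github.com/btoddb/flume-ng-cassandra-sink
    agent1.sinks.cassSink.type = com.btoddb.flume.sinks.cassandra.CassandraSink
    agent1.sinks.cassSink.channel = cassChan

With the plugin jar on the agent's classpath, this would be started with
something like: flume-ng agent --conf conf --conf-file agent1.conf --name agent1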
