Hey Andrew,

For your reference, we have a lot of developer information in our wiki:

https://cwiki.apache.org/confluence/display/FLUME/Developer+Section
https://cwiki.apache.org/confluence/display/FLUME/Developers+Quick+Hack+Sheet

cheers,
 Alex

On Jan 14, 2013, at 6:37 PM, Hari Shreedharan <[email protected]> wrote:

> Hi Andrew, 
> 
> Really happy to hear Wikimedia Foundation is considering Flume. I am fairly 
> sure that if you find such a source useful, there would definitely be others 
> who find it useful too. I'd recommend filing a jira and starting a 
> discussion, and then submitting the patch. We would be happy to review and 
> commit it. 
> 
> 
> Thanks,
> Hari
> 
> -- 
> Hari Shreedharan
> 
> 
> On Monday, January 14, 2013 at 9:29 AM, Andrew Otto wrote:
> 
>> Hi all,
>> 
>> I'm a Systems Engineer at the Wikimedia Foundation, and we're investigating 
>> using Flume for our web request log HDFS imports. We've previously been 
>> using Kafka, but have had to change short-term architecture plans in order 
>> to get data into HDFS reliably and regularly soon.
>> 
>> Our current web request logs are available for consumption over a multicast 
>> UDP stream. I could hack something together to try and pipe this into Flume 
>> using the existing sources (SyslogUDPSource, or maybe some combination of 
>> socat + NetcatSource), but I'd rather reduce the number of moving parts. I'd 
>> like to consume directly from the multicast UDP stream as a Flume source.
>> 
>> I coded up a proof of concept based on the SyslogUDPSource, mainly just 
>> stripping out the syslog event header extraction, and adding in multicast 
>> Datagram connection code. I plan on cleaning this up, and making this a 
>> generic raw UDP source, with multicast being a configuration option.
>> 
>> My question to you guys is, is this something the Flume community would find 
>> useful? If so, should I open up a JIRA to track this? I've got a fork of the 
>> Flume git repo over on github and will be doing my work there. I'd love to 
>> share it upstream if it would be useful.
>> 
>> Thanks!
>> -Andrew Otto
>> Systems Engineer
>> Wikimedia Foundation
>> 
>> 
> 
> 
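
For anyone following the thread, the "raw UDP source, with multicast being a
configuration option" idea quoted above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not Andrew's proof of
concept and not Flume API: it uses plain JDK networking classes, and the class
name RawUdpReceiver and its methods open, isMulticast and toBody are
hypothetical. A real Flume source would wrap each payload in an event and hand
it to the channel instead of printing it.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.net.UnknownHostException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: receive raw UDP datagrams, joining a multicast
// group only when the configured address is a multicast address.
public class RawUdpReceiver {

    // True if the given IP literal or host name is a multicast address.
    static boolean isMulticast(String host) {
        try {
            return InetAddress.getByName(host).isMulticastAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve " + host, e);
        }
    }

    // Opens a socket for the configured address. For a multicast address the
    // socket joins the group; a null interface defers to the system default.
    static DatagramSocket open(String host, int port, NetworkInterface nif)
            throws IOException {
        InetAddress addr = InetAddress.getByName(host);
        if (addr.isMulticastAddress()) {
            MulticastSocket socket = new MulticastSocket(port);
            socket.joinGroup(new InetSocketAddress(addr, port), nif);
            return socket;
        }
        return new DatagramSocket(new InetSocketAddress(addr, port));
    }

    // Copies exactly the received payload out of the packet buffer, with no
    // header parsing (unlike a syslog source, the bytes are kept as-is).
    static byte[] toBody(DatagramPacket packet) {
        int start = packet.getOffset();
        return Arrays.copyOfRange(packet.getData(), start,
                start + packet.getLength());
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: RawUdpReceiver <host-or-group> <port>");
            return;
        }
        byte[] buffer = new byte[65535]; // upper bound for a UDP datagram
        try (DatagramSocket socket =
                open(args[0], Integer.parseInt(args[1]), null)) {
            while (true) {
                DatagramPacket packet =
                        new DatagramPacket(buffer, buffer.length);
                socket.receive(packet); // blocks until a datagram arrives
                // Placeholder for handing the body to a channel.
                System.out.println(
                        new String(toBody(packet), StandardCharsets.UTF_8));
            }
        }
    }
}
```

Choosing the socket type from the address itself is just one possible design;
an explicit configuration flag for multicast would work equally well.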

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF
