Re: Need for UDP / Multicast Source

Andrew Otto Mon, 14 Jan 2013 10:02:07 -0800

Thanks guys!  I've opened up a JIRA here:

https://issues.apache.org/jira/browse/FLUME-1838



On Jan 14, 2013, at 12:43 PM, Alexander Alten-Lorenz <[email protected]> 
wrote:

> Hey Andrew,
> 
> for your reference, we have a lot of developer information in our wiki:
> 
> https://cwiki.apache.org/confluence/display/FLUME/Developer+Section
> https://cwiki.apache.org/confluence/display/FLUME/Developers+Quick+Hack+Sheet
> 
> cheers,
> Alex
> 
> On Jan 14, 2013, at 6:37 PM, Hari Shreedharan <[email protected]> 
> wrote:
> 
>> Hi Andrew, 
>> 
>> Really happy to hear Wikimedia Foundation is considering Flume. I am fairly 
>> sure that if you find such a source useful, there would definitely be others 
>> who find it useful too. I'd recommend filing a jira and starting a 
>> discussion, and then submitting the patch. We would be happy to review and 
>> commit it. 
>> 
>> 
>> Thanks,
>> Hari
>> 
>> -- 
>> Hari Shreedharan
>> 
>> 
>> On Monday, January 14, 2013 at 9:29 AM, Andrew Otto wrote:
>> 
>>> Hi all,
>>> 
>>> I'm a Systems Engineer at the Wikimedia Foundation, and we're 
>>> investigating using Flume for our web request log HDFS imports. We've 
>>> previously been using Kafka, but have had to change short-term architecture 
>>> plans in order to get data into HDFS reliably and regularly soon.
>>> 
>>> Our current web request logs are available for consumption over a multicast 
>>> UDP stream. I could hack something together to try and pipe this into Flume 
>>> using the existing sources (SyslogUDPSource, or maybe some combination of 
>>> socat + NetcatSource), but I'd rather reduce the number of moving parts. 
>>> I'd like to consume directly from the multicast UDP stream as a Flume 
>>> source.
>>> 
>>> I coded up a proof of concept based on the SyslogUDPSource, mainly just 
>>> stripping out the syslog event header extraction, and adding in multicast 
>>> Datagram connection code. I plan on cleaning this up, and making this a 
>>> generic raw UDP source, with multicast being a configuration option.
>>> 
>>> My question to you guys is, is this something the Flume community would 
>>> find useful? If so, should I open up a JIRA to track this? I've got a fork 
>>> of the Flume git repo over on github and will be doing my work there. I'd 
>>> love to share it upstream if it would be useful.
>>> 
>>> Thanks!
>>> -Andrew Otto
>>> Systems Engineer
>>> Wikimedia Foundation
>>> 
>>> 
>> 
>> 
> 
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>
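
A minimal sketch of the receive path described in the original message: a raw
UDP listener with multicast as an optional setting. This is not the FLUME-1838
patch and it does not use any Flume classes, only the JDK. The class name
RawUdpReceiver, its constructor parameters, and the timeout setter are
hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Flume: receives raw UDP datagrams,
// optionally joining a multicast group first.
final class RawUdpReceiver implements AutoCloseable {
    private final DatagramSocket socket;
    private final byte[] buffer;

    // multicastGroup == null means plain unicast UDP (assumed convention).
    RawUdpReceiver(int port, String multicastGroup, int maxDatagramSize)
            throws IOException {
        if (multicastGroup == null) {
            socket = new DatagramSocket(port);
        } else {
            MulticastSocket ms = new MulticastSocket(port);
            InetAddress group = InetAddress.getByName(multicastGroup);
            // A null interface defers to the system default interface.
            ms.joinGroup(new InetSocketAddress(group, port), null);
            socket = ms;
        }
        buffer = new byte[maxDatagramSize];
    }

    // Makes receive() throw SocketTimeoutException instead of blocking forever.
    void setTimeoutMillis(int millis) throws IOException {
        socket.setSoTimeout(millis);
    }

    int getLocalPort() {
        return socket.getLocalPort();
    }

    // Blocks until one datagram arrives and returns a copy of its payload,
    // with no header parsing of any kind.
    byte[] receive() throws IOException {
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        socket.receive(packet);
        int start = packet.getOffset();
        return Arrays.copyOfRange(packet.getData(), start, start + packet.getLength());
    }

    @Override
    public void close() {
        socket.close();
    }
}
```

In a real source, each payload returned by receive() would be wrapped in an
event and handed to the channel; that wiring is left out here. Passing a group
address as the second constructor argument switches the same code path to
multicast, which is the "configuration option" idea from the message above.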
