I just submitted the patch on https://issues.apache.org/jira/browse/FLUME-1838.
Would love some reviews, thanks!

-Andrew

On Jan 14, 2013, at 1:01 PM, Andrew Otto <[email protected]> wrote:

> Thanks guys! I've opened up a JIRA here:
>
> https://issues.apache.org/jira/browse/FLUME-1838
>
>
> On Jan 14, 2013, at 12:43 PM, Alexander Alten-Lorenz <[email protected]>
> wrote:
>
>> Hey Andrew,
>>
>> for your reference, we have a lot of developer information in our wiki:
>>
>> https://cwiki.apache.org/confluence/display/FLUME/Developer+Section
>> https://cwiki.apache.org/confluence/display/FLUME/Developers+Quick+Hack+Sheet
>>
>> cheers,
>> Alex
>>
>> On Jan 14, 2013, at 6:37 PM, Hari Shreedharan <[email protected]>
>> wrote:
>>
>>> Hi Andrew,
>>>
>>> Really happy to hear Wikimedia Foundation is considering Flume. I am fairly
>>> sure that if you find such a source useful, there would definitely be
>>> others who find it useful too. I'd recommend filing a jira and starting a
>>> discussion, and then submitting the patch. We would be happy to review and
>>> commit it.
>>>
>>>
>>> Thanks,
>>> Hari
>>>
>>> --
>>> Hari Shreedharan
>>>
>>>
>>> On Monday, January 14, 2013 at 9:29 AM, Andrew Otto wrote:
>>>
>>>> Hi all,
>>>>
>>>> I'm a Systems Engineer at the Wikimedia Foundation, and we're
>>>> investigating using Flume for our web request log HDFS imports. We've
>>>> previously been using Kafka, but have had to change short term
>>>> architecture plans in order to get data into HDFS reliably and regularly
>>>> soon.
>>>>
>>>> Our current web request logs are available for consumption over a
>>>> multicast UDP stream. I could hack something together to try and pipe this
>>>> into Flume using the existing sources (SyslogUDPSource, or maybe some
>>>> combination of socat + NetcatSource), but I'd rather reduce the number of
>>>> moving parts. I'd like to consume directly from the multicast UDP stream
>>>> as a Flume source.
>>>>
>>>> I coded up a proof of concept based on the SyslogUDPSource, mainly just
>>>> stripping out the syslog event header extraction, and adding in multicast
>>>> Datagram connection code. I plan on cleaning this up, and making this a
>>>> generic raw UDP source, with multicast being a configuration option.
>>>>
>>>> My question to you guys is, is this something the Flume community would
>>>> find useful? If so, should I open up a JIRA to track this? I've got a fork
>>>> of the Flume git repo over on github and will be doing my work there. I'd
>>>> love to share it upstream if it would be useful.
>>>>
>>>> Thanks!
>>>> -Andrew Otto
>>>> Systems Engineer
>>>> Wikimedia Foundation
>>>>
>>>
>>
>> --
>> Alexander Alten-Lorenz
>> http://mapredit.blogspot.com
>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>
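
For anyone following the thread, here is a rough sketch of the receive path described in the original message: a raw UDP listener with multicast as an option. This is not the code from the FLUME-1838 patch and it does not touch Flume's source API. It only uses plain java.net sockets, and the names RawUdpReceiver, open, decode and receiveOne are made up for illustration.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Flume: receives raw UDP datagrams,
// optionally after joining a multicast group.
final class RawUdpReceiver {

    // Bind to `port`. If `multicastGroup` is null this is a plain unicast
    // socket; otherwise the socket also joins the given group address.
    static DatagramSocket open(int port, String multicastGroup) throws IOException {
        if (multicastGroup == null) {
            return new DatagramSocket(port);
        }
        MulticastSocket socket = new MulticastSocket(port);
        InetAddress group = InetAddress.getByName(multicastGroup);
        // A null interface defers to the socket's default multicast interface.
        socket.joinGroup(new InetSocketAddress(group, port), null);
        return socket;
    }

    // Turn the received bytes into a string body with no header parsing,
    // which is the "raw" part (no syslog header extraction).
    static String decode(DatagramPacket packet) {
        return new String(packet.getData(), packet.getOffset(),
                packet.getLength(), StandardCharsets.UTF_8);
    }

    // Block until one datagram arrives and return its body.
    static String receiveOne(DatagramSocket socket, int maxSize) throws IOException {
        byte[] buf = new byte[maxSize];
        DatagramPacket packet = new DatagramPacket(buf, buf.length);
        socket.receive(packet);
        return decode(packet);
    }
}
```

With placeholder values, open(9999, "239.0.0.1") would listen on a multicast group and open(9999, null) would fall back to ordinary UDP. In a real source, each decoded body would presumably become one event, but that wiring is left out here because it depends on the Flume version.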
