I have a similar use case that cropped up yesterday. I looked through the archive and found a recommendation to build it as Sharninder suggested.
For now, I went down the route of writing a Python script that downloads the files from S3 and puts them in a directory that a spooldir source is configured to pick up. I would prefer a direct S3 source, and maybe we could collaborate on one and open-source it. Let me know if you would like that, and we can work on it directly by creating a JIRA.

Thanks,
Viral

On Thu, Jul 31, 2014 at 10:26 AM, Hari Shreedharan <hshreedha...@cloudera.com> wrote:

> In both cases, Sharninder is right :)
>
> Sharninder wrote:
>
> As far as I know, there is no (open source) implementation of an S3
> source, so yes, you'll have to implement your own. You'll have to
> implement a Pollable source and the dev documentation has an outline
> that you can use. You can also look at the existing Execsource and
> work your way up.
>
> As far as I know, there is no way to configure Flume without using the
> configuration file.
>
> On Thu, Jul 31, 2014 at 7:57 PM, Paweł <pro...@gmail.com> wrote:
>
>     Hi,
>     I'm wondering if Flume is able to read directly from S3.
>
>     I'll describe my case. I have log files stored in AWS S3. I have
>     to periodically fetch new S3 objects and read log lines from them.
>     Then the log lines (events) are processed in Flume's standard way
>     (as with other sources).
>
>     *1) Is there any way to fetch S3 objects or do I have to write my own
>     Source?*
>
>     There is also a second case. I want the Flume configuration to be
>     dynamic. Flume sources can change over time. A new AWS key and S3
>     bucket can be added or deleted.
>
>     *2) Is there any other way to configure Flume than by a static
>     configuration file?*
>
>     --
>     Paweł Róg
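
For anyone landing on this thread: the S3-to-spooldir workaround described at the top of this message could look roughly like the sketch below. It is not the script mentioned above, only an illustration under stated assumptions. The function name, its parameters, the in-memory "seen" set and the staging directory are all hypothetical. The s3 argument is assumed to behave like boto3.client("s3"), using only get_paginator("list_objects_v2") and download_file. Files are downloaded to a staging directory first and then renamed into the spool directory, because the spooling directory source expects files to be complete and unchanged once they appear there.

```python
import os


def sync_to_spooldir(s3, bucket, prefix, spool_dir, staging_dir, seen):
    """Copy not-yet-seen objects under bucket/prefix into spool_dir.

    s3 is assumed to behave like boto3.client("s3"). seen is a set of keys
    already delivered; keeping it in memory is a simplification, a real
    script would persist it between runs.
    Returns the list of keys delivered in this pass.
    """
    delivered = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page without matching objects has no "Contents" entry.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            # Skip keys handled earlier and "folder" placeholder keys.
            if key in seen or key.endswith("/"):
                continue
            # Flatten the key into a file name (assumption: the flattened
            # names do not collide and are never reused).
            name = key.replace("/", "_")
            staged = os.path.join(staging_dir, name)
            s3.download_file(bucket, key, staged)
            # Rename only after the download is complete. staging_dir must
            # be on the same filesystem as spool_dir for the rename to work.
            os.rename(staged, os.path.join(spool_dir, name))
            seen.add(key)
            delivered.append(key)
    return delivered
```

It would be called periodically (from cron or a sleep loop) with something like sync_to_spooldir(boto3.client("s3"), "my-bucket", "logs/", "/var/flume/spool", "/var/flume/staging", seen), where the bucket name and paths are placeholders. A proper S3 source inside Flume would remove the need for the intermediate directory altogether.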