I have a similar use case that cropped up yesterday. I searched the archive
and found a recommendation to build it the way Sharninder suggested.

For now, I went down the route of writing a Python script that downloads
the files from S3 and puts them in a directory that is configured to be
picked up by a spooling directory (spooldir) source.
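
For anyone who wants to try the same workaround, here is a minimal sketch of
that kind of script. It is an illustration, not the actual script: the
function name, the `seen` set, and the flattened file naming are hypothetical,
and it assumes `client` behaves like a boto3 S3 client (`list_objects_v2` and
`download_file`). Each object is downloaded into a staging directory first
and then renamed into the spool directory, since as far as I understand the
spooldir source expects files to be complete and unchanged once they appear.

```python
import os
from pathlib import Path


def sync_s3_to_spooldir(client, bucket, prefix, staging_dir, spool_dir, seen):
    """Download not-yet-seen objects under `prefix` and move them into `spool_dir`.

    `client` is any object exposing boto3-style list_objects_v2() and
    download_file(). `seen` is a set of keys that were already delivered;
    it is updated in place. Returns the keys delivered by this call.
    """
    staging = Path(staging_dir)
    spool = Path(spool_dir)
    staging.mkdir(parents=True, exist_ok=True)
    spool.mkdir(parents=True, exist_ok=True)

    delivered = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = client.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            key = obj["Key"]
            # Skip keys handled earlier and "directory" placeholder keys.
            if key in seen or key.endswith("/"):
                continue
            # Flatten the key into one file name (hypothetical naming scheme).
            name = key.replace("/", "_")
            tmp_path = staging / name
            client.download_file(bucket, key, str(tmp_path))
            # The rename is atomic when staging and spool directories are on
            # the same filesystem, so a partial file never shows up in spool.
            os.replace(tmp_path, spool / name)
            seen.add(key)
            delivered.append(key)
        # Follow pagination until the listing is exhausted.
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
    return delivered
```

With boto3 installed, `client` would be something like `boto3.client("s3")`,
and the function could be called from cron or a simple loop. The `seen` set
would need to be persisted between runs (for example in a small state file),
which this sketch leaves out.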

I would prefer a direct S3 source, and maybe we could collaborate on it
and open-source it. Let me know if you'd like that, and we can create a
JIRA and work on it directly.

Thanks,
Viral



On Thu, Jul 31, 2014 at 10:26 AM, Hari Shreedharan <
hshreedha...@cloudera.com> wrote:

> In both cases, Sharninder is right :)
>
> Sharninder wrote:
>
>
> As far as I know, there is no (open source) implementation of an S3
> source, so yes, you'll have to implement your own. You'll have to
> implement a PollableSource, and the developer documentation has an
> outline that you can use. You can also look at the existing ExecSource
> and work your way up from there.
>
> As far as I know, there is no way to configure Flume without using the
> configuration file.
>
>
>
> On Thu, Jul 31, 2014 at 7:57 PM, Paweł <pro...@gmail.com> wrote:
>
>     Hi,
>     I'm wondering if Flume is able to read directly from S3.
>
>     I'll describe my case. I have log files stored in AWS S3. I have
>     to periodically fetch new S3 objects and read log lines from them.
>     The log lines (events) are then processed in the standard Flume way
>     (as with other sources).
>
>     *1) Is there any way to fetch S3 objects, or do I have to write my
>     own Source?*
>
>
>     There is also a second case. I want the Flume configuration to be
>     dynamic. Flume sources can change over time: new AWS keys and S3
>     buckets can be added or deleted.
>
>     *2) Is there any way to configure Flume other than through a static
>     configuration file?*
>
>     --
>     Paweł Róg
>
>
