Re: how flume identifies a file transfer is complete or not

Jonathan Natkins Fri, 25 Jul 2014 10:54:12 -0700

Hi Anand,

What you're doing is a slightly odd way to use Flume. With the exec source,
Flume will execute that command, and consume the output as events. Often
the exec source is used to tail -F a file, which allows you to pipe more
data to the file and ingest additional events. By using cat, Flume will cat
the file, but then the source will become useless, because the command will
have finished, and there's no way that I'm aware of to get an agent to
start a new command. By using tail -F, the command persists, and if you do
`ps aux | grep flume`, you would see a running tail -F command.
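
For example, a minimal sketch of the same source following the file instead
of cat-ing it. It reuses the agent name, source name, and path from your
config. The `-n +1` is my own assumption, added so that tail starts from the
first line of the file rather than only its last ten lines:

```properties
# Exec source that keeps running and picks up lines appended to the file.
myAgent.sources.mySource.type = exec
# -F keeps following the file by name; -n +1 (an assumption, not from the
# original config) makes tail emit the existing contents from line 1 first.
myAgent.sources.mySource.command = tail -n +1 -F /home/haas/file2.txt
```

With this, the tail process stays alive for as long as the agent does, so
the source never reaches a "finished" state the way it does with cat.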


As for figuring out when the transfer is complete, I don't think there's a
really good way other than checking the file itself, or looking to see if
the cat command is still running.

Does that help?

Thanks,
Natty


On Thu, Jul 24, 2014 at 2:00 AM, Anandkumar Lakshmanan <an...@orzota.com>
wrote:

> Hi,
>
> I am new to Flume.
>
> I am using the exec source to cat a file into HDFS.
> When I run it manually, I can see that the file has been transferred
> completely, but Flume is still in a running state.
> How do I find out when the transfer is complete?
>
> Example:
>
> My flume.conf
>
> myAgent.sources.mySource.type = exec
> myAgent.sources.mySource.command = cat /home/haas/file2.txt
>
>
> At the moment I can only check whether the transfer is complete by running
> the following command manually and comparing the file size:
>
> hadoop fs -ls /user/flumedata/
>
> Is there a way to know when the transfer has completed?
>
> Thanks.
> Anand
>
