Thanks, Sharninder, for the suggestions.

Let me try the spooling directory source. I will let you know how it works for me.

But can anyone tell me whether there is a way to find out when the transfer is complete?

Thanks,
Anand.


On 07/26/2014 01:38 PM, Sharninder wrote:
If you really want to add files to HDFS, use the spooling directory source, which is much more reliable. If you do want to use the exec source, there is no point in using cat, since that's as good as cp'ing the file to HDFS; use tail -f instead.
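
A minimal sketch of what a spooling directory source could look like, reusing the agent and source names from the original question. The directory /home/haas/spool is a hypothetical path chosen only for illustration, and the channel and sink definitions are omitted. By default this source renames each fully ingested file with a .COMPLETED suffix, which may serve as a rough completion signal: it indicates that the source has finished reading the file, not necessarily that the sink has already written everything to HDFS.

```properties
# Sketch only: /home/haas/spool is an assumed, hypothetical directory.
myAgent.sources.mySource.type = spooldir
myAgent.sources.mySource.spoolDir = /home/haas/spool
# .COMPLETED is the default suffix; it is set explicitly here for clarity.
myAgent.sources.mySource.fileSuffix = .COMPLETED
```

Files should be complete before they are moved into the spool directory, since this source expects them not to change after they appear there.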

--
Sharninder



On Sat, Jul 26, 2014 at 9:34 AM, Anandkumar Lakshmanan <an...@orzota.com> wrote:

    Hi Natty,

    Thanks for the reply.

    So far, I have only been verifying whether the transfer is complete
    by checking the file at the destination, as you mentioned.

    Thanks
    Anand.

    On 07/25/2014 11:22 PM, Jonathan Natkins wrote:
    Hi Anand,

    What you're doing is a slightly odd way to use Flume. With the
    exec source, Flume will execute that command, and consume the
    output as events. Often the exec source is used to tail -F a
    file, which allows you to pipe more data to the file and ingest
    additional events. By using cat, Flume will cat the file, but
    then the source will become useless, because the command will
    have finished, and there's no way that I'm aware of to get an
    agent to start a new command. By using tail -F, the command
    persists, and if you do `ps aux | grep flume`, you would see a
    running tail -F command.
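
    As a sketch of the tail -F approach described above, the exec source
    from the original question could be changed as follows. The file path
    is the one from the question; everything else about the agent is
    assumed to stay the same.

    ```properties
    # Sketch only: same agent and source names as in the original question.
    myAgent.sources.mySource.type = exec
    # tail -F keeps running and follows the file as new lines are appended.
    myAgent.sources.mySource.command = tail -F /home/haas/file2.txt
    ```

    With this setup the command never exits on its own, so there is no
    natural "transfer complete" moment to detect.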

    As for figuring out when the transfer is complete, I don't think
    there's a really good way other than checking the file itself, or
    looking to see if the cat command is still running.

    Does that help?

    Thanks,
    Natty


    On Thu, Jul 24, 2014 at 2:00 AM, Anandkumar Lakshmanan
    <an...@orzota.com> wrote:

        Hi,

        I am new to Flume.

        I am using the exec source to cat a file into HDFS.
        When I run it manually, I can see that the file is
        transferred completely, but Flume is still in a running state.
        How do I find out when the transfer is complete?

        Example:

        My flume.conf

        myAgent.sources.mySource.type = exec
        myAgent.sources.mySource.command = cat /home/haas/file2.txt


        I am checking whether the transfer is complete only by manually
        running the following command and comparing the file size:

        hadoop fs -ls /user/flumedata/

        Is there a way to know when the transfer has completed?

        Thanks.
        Anand




