I believe Hadoop supports S3, so it may work, but I've never tried using
S3 for that; the easiest way to find out is to try it.  You'll likely have
to set oozie.service.HadoopAccessorService.supported.filesystems to
"hdfs,s3" in oozie-site.xml.
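
As a rough, untested sketch (the property name is the one above; the value
format, bucket, and paths below are my assumptions), the oozie-site.xml
entry and a pig action pointing at a script on S3 could look like:

  <!-- oozie-site.xml: whitelist S3 alongside HDFS -->
  <property>
    <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
    <value>hdfs,s3</value>
  </property>

  <!-- hypothetical pig action referencing a script on S3; whether Oozie
       resolves an s3:// script path like this is exactly what you'd be
       testing, and the cluster still needs S3 credentials configured -->
  <action name="pig-from-s3">
    <pig>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <script>s3://my-bucket/scripts/my-script.pig</script>
    </pig>
    <ok to="end"/>
    <error to="fail"/>
  </action>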

- Robert


On Fri, Mar 29, 2013 at 12:19 PM, Panshul Whisper <ouchwhis...@gmail.com> wrote:

> Hello,
>
> Thank you for the responses.
> I get the point that it is not possible to load a pig script file from
> the local file system from within an Oozie workflow, not even when using
> the Hue Oozie interface.
>
> Is it possible to load a pig script stored on S3 into a workflow if the
> cluster is an EC2 cluster running CDH4?
>
> Can I specify the pig script file the same way I would refer to any
> other file on S3 from within HDFS?
>
> Thank you for the help,
>
> Regards,
>
>
> On Thu, Mar 28, 2013 at 10:44 PM, Robert Kanter <rkan...@cloudera.com> wrote:
>
> > DistCp (and thus the DistCp action itself) is meant for copying large
> > amounts of data and files between two Hadoop clusters or within a single
> > cluster.  As far as I know, it won't accept a local filesystem path or an
> > ftp/sftp path.
> >
> > - Robert
> >
> >
> > On Thu, Mar 28, 2013 at 10:02 AM, Harish Krishnan <
> > harish.t.krish...@gmail.com> wrote:
> >
> > > Can we use the distcp action to copy from the local file system to HDFS?
> > > Use sftp:// for files on the local file system and hdfs:// for the
> > > destination dir.
> > >
> > > Thanks & Regards,
> > > Harish.T.K
> > >
> > >
> > > On Thu, Mar 28, 2013 at 9:35 AM, Ryota Egashira <egash...@yahoo-inc.com> wrote:
> > >
> > > > Hi, Panshul
> > > >
> > > > >1)
> > > > You might need to upload the pig script to HDFS (e.g., using the
> > > > hadoop dfs command) before running the workflow.
> > > > >2)
> > > > AFAIK, it is not common to do copyFromLocal as part of a workflow,
> > > > since a workflow action runs on a tasktracker node as an M/R job.
> > > > Once the pig script is uploaded to HDFS, Oozie takes care of copying
> > > > it from HDFS to the tasktracker node using the Hadoop distributed
> > > > cache mechanism before running the pig action, so we don't have to
> > > > worry about it.
> > > >
> > > > I guess the Cloudera folks have an answer for 3).
> > > >
> > > > Hope it helps.
> > > > Ryota
> > > >
> > > > On 3/28/13 5:35 AM, "Panshul Whisper" <ouchwhis...@gmail.com> wrote:
> > > >
> > > > >Hello,
> > > > >
> > > > >Sorry for a novice question, but I have the following questions:
> > > > >
> > > > >1. How do I give a pig script file to a workflow if the file is
> > > > >stored on the local filesystem?
> > > > >2. If I need to perform a copyFromLocal before I execute the pig
> > > > >script, what action type should I use? Please give an example if
> > > > >possible.
> > > > >3. I am using the CDH4 Hue interface for creating the workflow. Any
> > > > >pointers from that perspective will also help.
> > > > >
> > > > >Thanking You,
> > > > >--
> > > > >Regards,
> > > > >Ouch Whisper
> > > > >010101010101
> > > >
> > > >
> > >
> >
>
>
>
> --
> Regards,
> Ouch Whisper
> 010101010101
>
