I believe Hadoop supports S3, so it may work, but I've never tried using S3 for that; the easiest way to find out is to try it. You'll likely have to set oozie.service.HadoopAccessorService.supported.filesystems to "hdfs,s3" in oozie-site.xml.
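For example, the property would look something like this (untested on my end, so treat it as a sketch):

    <property>
        <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
        <value>hdfs,s3</value>
    </property>

If that works, you should then be able to point the Pig action at the script with an S3 URI, e.g. <script>s3://your-bucket/scripts/my_script.pig</script> (the bucket and path here are just placeholders).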
- Robert

On Fri, Mar 29, 2013 at 12:19 PM, Panshul Whisper <ouchwhis...@gmail.com> wrote:

> Hello,
>
> Thank you for the responses. I get the point that it is not possible to
> load a Pig script file from the local file system from within an Oozie
> workflow, not even while using the Hue-Oozie interface.
>
> Is it possible to load a Pig script stored on S3 into a workflow, if the
> cluster is an EC2 cluster running CDH4?
>
> Can I specify the Pig script file in the normal way, as I would refer to
> any normal file on S3 from within HDFS?
>
> Thank you for the help.
>
> Regards,
>
>
> On Thu, Mar 28, 2013 at 10:44 PM, Robert Kanter <rkan...@cloudera.com> wrote:
>
> > DistCp (and thus the DistCp action itself) is meant for copying large
> > amounts of data and files between two Hadoop clusters or within a single
> > cluster. As far as I know, it won't accept a local filesystem path or an
> > ftp/sftp path.
> >
> > - Robert
> >
> >
> > On Thu, Mar 28, 2013 at 10:02 AM, Harish Krishnan <
> > harish.t.krish...@gmail.com> wrote:
> >
> > > Can we use the DistCp action to copy from the local file system to
> > > HDFS? Use sftp:// for files in the local file system and hdfs:// for
> > > the destination dir.
> > >
> > > Thanks & Regards,
> > > Harish.T.K
> > >
> > >
> > > On Thu, Mar 28, 2013 at 9:35 AM, Ryota Egashira <egash...@yahoo-inc.com> wrote:
> > >
> > > > Hi, Panshul
> > > >
> > > > > 1)
> > > > You might need to upload the Pig script to HDFS (e.g., using the
> > > > hadoop dfs command) before running the workflow.
> > > > > 2)
> > > > AFAIK, it is not common to do a copyFromLocal as part of the
> > > > workflow, since a workflow action runs on a TaskTracker node as an
> > > > M/R job. Once the Pig script is uploaded to HDFS, Oozie takes care
> > > > of copying it from HDFS to the TaskTracker node using Hadoop's
> > > > distributed cache mechanism before running the Pig action, so we
> > > > don't have to worry about it.
> > > >
> > > > I guess the Cloudera folks have an answer on 3).
> > > >
> > > > Hope it helps.
> > > > Ryota
> > > >
> > > > On 3/28/13 5:35 AM, "Panshul Whisper" <ouchwhis...@gmail.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > Sorry for a novice question, but I have the following questions:
> > > > >
> > > > > 1. How do I give a Pig script file to a workflow if the file is
> > > > > stored on the local filesystem?
> > > > > 2. If I need to perform a copyFromLocal before I execute the Pig
> > > > > script, what action type should I use? Please give an example if
> > > > > possible.
> > > > > 3. I am using the CDH4 Hue interface for creating the workflow.
> > > > > Any pointers from that perspective will also help.
> > > > >
> > > > > Thanking you,
> > > > > --
> > > > > Regards,
> > > > > Ouch Whisper
> > > > > 010101010101
>
> --
> Regards,
> Ouch Whisper
> 010101010101
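For what it's worth, here is a minimal sketch of the upload-then-reference flow Ryota describes above (the app directory, action name, and script name are placeholders). First push the script into the workflow's application directory on HDFS, e.g. hadoop fs -put my_script.pig /user/panshul/apps/pig-demo/, and then reference it from the Pig action:

    <!-- workflow.xml (sketch): my_script.pig sits in the workflow app
         directory on HDFS; Oozie ships it to the task node via the
         distributed cache before the action runs. -->
    <action name="pig-node">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>my_script.pig</script>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>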