Hi Lin,

We are working on getting Pig on Spark functional with 0.8.0. Have you got it working on any Spark version? Also, which functionality works on it?

Regards,
Mayur

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>

On Mon, Mar 10, 2014 at 11:00 PM, Xiangrui Meng <men...@gmail.com> wrote:

> Hi Sameer,
>
> Lin (cc'ed) could also give you some updates about Pig on Spark
> development on her side.
>
> Best,
> Xiangrui
>
> On Mon, Mar 10, 2014 at 12:52 PM, Sameer Tilak <ssti...@live.com> wrote:
> > Hi Mayur,
> > We are planning to upgrade our distribution MR1 -> MR2 (YARN) and the goal
> > is to get Spork set up next month. I will keep you posted. Can you please
> > keep me informed about your progress as well?
> >
> > ________________________________
> > From: mayur.rust...@gmail.com
> > Date: Mon, 10 Mar 2014 11:47:56 -0700
> > Subject: Re: Pig on Spark
> > To: user@spark.apache.org
> >
> > Hi Sameer,
> > Did you make any progress on this? My team is also trying it out and would
> > love to know some details on your progress.
> >
> > Mayur Rustagi
> > Ph: +1 (760) 203 3257
> > http://www.sigmoidanalytics.com
> > @mayur_rustagi
> >
> > On Thu, Mar 6, 2014 at 2:20 PM, Sameer Tilak <ssti...@live.com> wrote:
> >
> > Hi Aniket,
> > Many thanks! I will check this out.
> >
> > ________________________________
> > Date: Thu, 6 Mar 2014 13:46:50 -0800
> > Subject: Re: Pig on Spark
> > From: aniket...@gmail.com
> > To: user@spark.apache.org; tgraves...@yahoo.com
> >
> > There is some work to make this work on YARN at
> > https://github.com/aniket486/pig. (So, compile Pig with ant
> > -Dhadoopversion=23)
> >
> > You can look at https://github.com/aniket486/pig/blob/spork/pig-spark to
> > find out what sort of env variables you need (sorry, I haven't been able
> > to clean this up; in-progress). There are a few known issues with this; I
> > will work on fixing them soon.
> >
> > Known issues:
> > 1. Limit does not work (spork-fix)
> > 2. Foreach requires turning off schema-tuple-backend (should be a
> >    pig-jira)
> > 3. Algebraic UDFs don't work (spork-fix in-progress)
> > 4. Group by rework (to avoid OOMs)
> > 5. UDF classloader issue (requires SPARK-1053, then you can put
> >    pig-withouthadoop.jar as SPARK_JARS in SparkContext along with UDF jars)
> >
> > ~Aniket
> >
> > On Thu, Mar 6, 2014 at 1:36 PM, Tom Graves <tgraves...@yahoo.com> wrote:
> >
> > I had asked a similar question on the dev mailing list a while back (Jan
> > 22nd).
> >
> > See the archives:
> > http://mail-archives.apache.org/mod_mbox/spark-dev/201401.mbox/browser ->
> > look for spork.
> >
> > Basically Matei said:
> >
> > Yup, that was it, though I believe people at Twitter picked it up again
> > recently. I'd suggest asking Dmitriy if you know him. I've seen interest
> > in this from several other groups, and if there's enough of it, maybe we
> > can start another open source repo to track it. The work in that repo you
> > pointed to was done over one week, and already had most of Pig's operators
> > working. (I helped out with this prototype over Twitter's hack week.) That
> > work also calls the Scala API directly, because it was done before we had
> > a Java API; it should be easier with the Java one.
> >
> > Tom
> >
> > On Thursday, March 6, 2014 3:11 PM, Sameer Tilak <ssti...@live.com> wrote:
> > Hi everyone,
> >
> > We are using Pig to build our data pipeline. I came across Spork -- Pig
> > on Spark at: https://github.com/dvryaboy/pig and am not sure if it is
> > still active.
> >
> > Can someone please let me know the status of Spork or any other effort
> > that will let us run Pig on Spark? We can significantly benefit by using
> > Spark, but we would like to keep using the existing Pig scripts.
> >
> > --
> > "...:::Aniket:::... Quetzalco@tl"