Hi Mayur,
I wondered if you could share your findings in some way (GitHub, blog post, etc.). I expect your experience would be interesting and useful to many people.
sent from Lenovo YogaTablet

On Apr 8, 2014 8:48 PM, "Mayur Rustagi" <mayur.rust...@gmail.com> wrote:

> Hi Aniket,
> Thanks for all the work on Pig.
> Finally got it working. Couple of high level bugs right now:
>
> - Getting it working on Spark 0.9.0
> - Getting UDF working
> - Getting generate functionality working
> - Exhaustive test suite on Spark on Pig
>
> Are you maintaining a Jira somewhere?
>
> I am currently trying to deploy it on 0.9.0.
>
> Regards
> Mayur
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>
> On Fri, Mar 14, 2014 at 1:37 PM, Aniket Mokashi <aniket...@gmail.com> wrote:
>
>> We will post fixes from our side at - https://github.com/twitter/pig.
>>
>> Top on our list are-
>> 1. Make it work with pig-trunk (execution engine interface) (with 0.8 or 0.9 spark).
>> 2. Support for algebraic udfs (this mitigates the group by oom problems).
>>
>> Would definitely love more contribution on this.
>>
>> Thanks,
>> Aniket
>>
>> On Fri, Mar 14, 2014 at 12:29 PM, Mayur Rustagi <mayur.rust...@gmail.com> wrote:
>>
>>> Damn, I am off to NY for Structure Conf. Would it be possible to meet anytime after 28th March?
>>> I am really interested in making it stable & production quality.
>>>
>>> Regards
>>> Mayur Rustagi
>>> Ph: +1 (760) 203 3257
>>> http://www.sigmoidanalytics.com
>>> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>>>
>>> On Fri, Mar 14, 2014 at 11:53 AM, Julien Le Dem <jul...@twitter.com> wrote:
>>>
>>>> Hi Mayur,
>>>> Are you going to the Pig meetup this afternoon?
>>>> http://www.meetup.com/PigUser/events/160604192/
>>>> Aniket and I will be there.
>>>> We would be happy to chat about Pig-on-Spark
>>>>
>>>> On Tue, Mar 11, 2014 at 8:56 AM, Mayur Rustagi <mayur.rust...@gmail.com> wrote:
>>>>
>>>>> Hi Lin,
>>>>> We are working on getting Pig on spark functional with 0.8.0, have you got it working on any spark version?
>>>>> Also what all functionality works on it?
>>>>> Regards
>>>>> Mayur
>>>>>
>>>>> Mayur Rustagi
>>>>> Ph: +1 (760) 203 3257
>>>>> http://www.sigmoidanalytics.com
>>>>> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>>>>>
>>>>> On Mon, Mar 10, 2014 at 11:00 PM, Xiangrui Meng <men...@gmail.com> wrote:
>>>>>
>>>>>> Hi Sameer,
>>>>>>
>>>>>> Lin (cc'ed) could also give you some updates about Pig on Spark development on her side.
>>>>>>
>>>>>> Best,
>>>>>> Xiangrui
>>>>>>
>>>>>> On Mon, Mar 10, 2014 at 12:52 PM, Sameer Tilak <ssti...@live.com> wrote:
>>>>>> > Hi Mayur,
>>>>>> > We are planning to upgrade our distribution MR1> MR2 (YARN) and the goal is to get Spork set up next month. I will keep you posted. Can you please keep me informed about your progress as well?
>>>>>> >
>>>>>> > ________________________________
>>>>>> > From: mayur.rust...@gmail.com
>>>>>> > Date: Mon, 10 Mar 2014 11:47:56 -0700
>>>>>> > Subject: Re: Pig on Spark
>>>>>> > To: user@spark.apache.org
>>>>>> >
>>>>>> > Hi Sameer,
>>>>>> > Did you make any progress on this? My team is also trying it out and would love to know some details of your progress.
>>>>>> >
>>>>>> > Mayur Rustagi
>>>>>> > Ph: +1 (760) 203 3257
>>>>>> > http://www.sigmoidanalytics.com
>>>>>> > @mayur_rustagi
>>>>>> >
>>>>>> > On Thu, Mar 6, 2014 at 2:20 PM, Sameer Tilak <ssti...@live.com> wrote:
>>>>>> >
>>>>>> > Hi Aniket,
>>>>>> > Many thanks! I will check this out.
>>>>>> >
>>>>>> > ________________________________
>>>>>> > Date: Thu, 6 Mar 2014 13:46:50 -0800
>>>>>> > Subject: Re: Pig on Spark
>>>>>> > From: aniket...@gmail.com
>>>>>> > To: user@spark.apache.org; tgraves...@yahoo.com
>>>>>> >
>>>>>> > There is some work to make this work on yarn at https://github.com/aniket486/pig. (So, compile pig with ant -Dhadoopversion=23)
>>>>>> >
>>>>>> > You can look at https://github.com/aniket486/pig/blob/spork/pig-spark to find out what sort of env variables you need (sorry, I haven't been able to clean this up- in-progress). There are a few known issues with this, I will work on fixing them soon.
>>>>>> >
>>>>>> > Known issues-
>>>>>> > 1. Limit does not work (spork-fix)
>>>>>> > 2. Foreach requires to turn off schema-tuple-backend (should be a pig-jira)
>>>>>> > 3. Algebraic udfs don't work (spork-fix in-progress)
>>>>>> > 4. Group by rework (to avoid OOMs)
>>>>>> > 5. UDF Classloader issue (requires SPARK-1053, then you can put pig-withouthadoop.jar as SPARK_JARS in SparkContext along with udf jars)
>>>>>> >
>>>>>> > ~Aniket
>>>>>> >
>>>>>> > On Thu, Mar 6, 2014 at 1:36 PM, Tom Graves <tgraves...@yahoo.com> wrote:
>>>>>> >
>>>>>> > I had asked a similar question on the dev mailing list a while back (Jan 22nd).
>>>>>> >
>>>>>> > See the archives:
>>>>>> > http://mail-archives.apache.org/mod_mbox/spark-dev/201401.mbox/browser-> look for spork.
>>>>>> >
>>>>>> > Basically Matei said:
>>>>>> >
>>>>>> > Yup, that was it, though I believe people at Twitter picked it up again recently. I'd suggest asking Dmitriy if you know him. I've seen interest in this from several other groups, and if there's enough of it, maybe we can start another open source repo to track it. The work in that repo you pointed to was done over one week, and already had most of Pig's operators working. (I helped out with this prototype over Twitter's hack week.) That work also calls the Scala API directly, because it was done before we had a Java API; it should be easier with the Java one.
>>>>>> >
>>>>>> > Tom
>>>>>> >
>>>>>> > On Thursday, March 6, 2014 3:11 PM, Sameer Tilak <ssti...@live.com> wrote:
>>>>>> > Hi everyone,
>>>>>> >
>>>>>> > We are using Pig to build our data pipeline. I came across Spork -- Pig on Spark at: https://github.com/dvryaboy/pig and not sure if it is still active.
>>>>>> >
>>>>>> > Can someone please let me know the status of Spork or any other effort that will let us run Pig on Spark? We can significantly benefit by using Spark, but we would like to keep using the existing Pig scripts.
>>>>>> >
>>>>>> > --
>>>>>> > "...:::Aniket:::... Quetzalco@tl"
>>
>> --
>> "...:::Aniket:::... Quetzalco@tl"