Re: Parquet-SPARK-PIG integration.

2014-04-26 Thread suman bharadwaj
IHadoopFile("/part-m-0.parquet", classOf[ParquetInputFormat[Tuple]], classOf[Void], classOf[Tuple], conf).map(x => (x._2.get(0), x._2.get(1))).collect Regards, SB On Sat, Apr 26, 2014 at 3:31 PM, suman bharadwaj wrote: > Hi All, > > We have written PIG Jobs which outp

Parquet-SPARK-PIG integration.

2014-04-26 Thread suman bharadwaj
Hi All, We have written PIG Jobs which output the data in parquet format. For example: register parquet-column-1.3.1.jar; register parquet-common-1.3.1.jar; register parquet-format-2.0.0.jar; register parquet-hadoop-1.3.1.jar; register parquet-pig-1.3.1.jar; register parquet-encoding-1.3.1.jar; A =

Re: Pig on Spark

2014-04-24 Thread suman bharadwaj
Hey Mayur, We use HiveColumnarLoader and XMLLoader. Are these working as well? I will try a few things regarding porting Java MR. Regards, Suman Bharadwaj S On Thu, Apr 24, 2014 at 3:09 AM, Mayur Rustagi wrote: > Right now UDF is not working. It's in the top list though. You should be > a

Re: Pig on Spark

2014-04-23 Thread suman bharadwaj
> I am translating the ones we need, would be happy to get help on others. > Will host a jira to track them if you are interested. > > > Mayur Rustagi > Ph: +1 (760) 203 3257 > http://www.sigmoidanalytics.com > @mayur_rustagi <https://twitter.com/mayur_rustagi>

Re: Pig on Spark

2014-04-23 Thread suman bharadwaj
Are all the features available in PIG working in SPORK? For example, UDFs? Thanks. On Thu, Apr 24, 2014 at 1:54 AM, Mayur Rustagi wrote: > There are two benefits I get as of now > 1. Most of the time a lot of customers don't want the full power but they > want something dead simple with which t

Re: PIG to SPARK

2014-03-06 Thread suman bharadwaj
Thanks Mayur. I don't have a clear idea of how pipe works and wanted to understand it better. When do we use pipe(), and how does it work? Can you please share some sample code if you have any (even pseudo-code is fine)? It will really help. Regards, Suman Bharadwaj S On Thu, Mar 6, 2014 at 3:

PIG to SPARK

2014-03-05 Thread suman bharadwaj
Hi, How can I call a Pig script using SPARK? Can I use rdd.pipe() here? Can anyone share a sample implementation of rdd.pipe()? If you can explain how rdd.pipe() works, it would really help. Regards, SB