Hi, Paul

You are right.

The story is that we have many Pig load functions for loading our different data sets, and we now want to use Spark to read and process that data. So we want to figure out a way to reuse our existing load functions in Spark to read it. Any ideas?

Best Regards,
Kevin.

From: Paul Brown [mailto:p...@mult.ifario.us]
Sent: March 24, 2015 4:11
To: Dai, Kevin
Subject: Re: Use pig load function in spark

The answer is "Maybe, but you probably don't want to do that."

A typical Pig load function is devoted to bridging external data into Pig's type system, but you don't really need to do that in Spark because it is (thankfully) not encumbered by Pig's type system. What you probably want to do is figure out a way to use native Spark facilities (e.g., textFile) coupled with whatever logic from your Pig load function is necessary to turn your external data into an RDD.

--
p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/

On Mon, Mar 23, 2015 at 2:29 AM, Dai, Kevin <yun...@ebay.com> wrote:

Hi, all

Can Spark use Pig's load function to load data?

Best Regards,
Kevin.
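
As a rough illustration of Paul's suggestion in this thread (native Spark facilities such as textFile plus the parsing logic lifted out of a load function), here is a minimal PySpark-style sketch. The record layout (tab-separated user id, event name, and integer timestamp) and the names parse_line and load_events are hypothetical and are not taken from the thread; a real load function's parsing logic would replace parse_line.

```python
from typing import Tuple


def parse_line(line: str) -> Tuple[str, str, int]:
    """Turn one raw text line into a typed record.

    Hypothetical layout: user_id<TAB>event<TAB>timestamp.
    This stands in for the parsing logic a Pig load function would contain.
    """
    user_id, event, ts = line.rstrip("\n").split("\t")
    return user_id, event, int(ts)


def load_events(sc, path):
    """Build an RDD of parsed records.

    `sc` is assumed to be an existing pyspark.SparkContext; textFile reads
    the file(s) line by line and map applies the parser to each line.
    """
    return sc.textFile(path).map(parse_line)
```

Keeping the parser as a plain function means it can be unit-tested without a cluster and reused across jobs, which covers most of what the load function provided on the Pig side.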