Objects being transformed need to be one of these (Java serializable or Kryo serializable) while in flight. Source data can just use the MapReduce input formats, so anything you can read with mapred you can read here. To do an Avro one for this, you probably want something like one of: https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/*ProtoBuf*
or just whatever you're using at the moment to open them in an MR job could probably be repurposed.

On Thu, Apr 3, 2014 at 7:11 AM, Ron Gonzalez <zlgonza...@yahoo.com> wrote:
>
> Hi,
> I know that sources need to either be java serializable or use kryo
> serialization.
> Does anyone have sample code that reads, transforms and writes avro
> files in spark?
>
> Thanks,
> Ron
>
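
For what it's worth, here is a rough sketch of the "reuse the MapReduce input format" route for the read/transform/write question above. It is not tested against your setup and makes several assumptions: the avro-mapred artifact (built for the new Hadoop API) is on the classpath, the records have a single string field called "name" (a made-up schema), and the input and output paths come in as args(0) and args(1). The main idea is to pull plain values out of the Avro records right after reading, so that only serializable objects are in flight, and to rebuild records just before writing.

```scala
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.{AvroJob, AvroKeyInputFormat, AvroKeyOutputFormat}
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits; needed on older Spark versions

object AvroRoundTrip {
  def main(args: Array[String]): Unit = {
    // Hypothetical schema, kept as a String because a String is always serializable.
    val schemaJson =
      """{"type":"record","name":"User","fields":[{"name":"name","type":"string"}]}"""

    val sc = new SparkContext(new SparkConf().setAppName("avro-round-trip"))

    // Read: reuse the MapReduce Avro input format. Copy the field out immediately,
    // since the record objects themselves should not travel through a shuffle.
    val names = sc
      .newAPIHadoopFile[AvroKey[GenericRecord], NullWritable, AvroKeyInputFormat[GenericRecord]](args(0))
      .map { case (key, _) => key.datum().get("name").toString }

    // Transform: ordinary RDD operations on plain Strings.
    val upper = names.map(_.toUpperCase)

    // Write: tell the Avro output format which schema to use.
    // Job.getInstance(conf) assumes a Hadoop 2 style API.
    val job = Job.getInstance(sc.hadoopConfiguration)
    AvroJob.setOutputKeySchema(job, new Schema.Parser().parse(schemaJson))

    upper
      .mapPartitions { it =>
        // Parse the schema once per partition, on the executor.
        val schema = new Schema.Parser().parse(schemaJson)
        it.map { n =>
          val rec = new GenericData.Record(schema)
          rec.put("name", n)
          (new AvroKey[GenericRecord](rec), NullWritable.get())
        }
      }
      .saveAsNewAPIHadoopFile(
        args(1),
        classOf[AvroKey[GenericRecord]],
        classOf[NullWritable],
        classOf[AvroKeyOutputFormat[GenericRecord]],
        job.getConfiguration)

    sc.stop()
  }
}
```

If you do need whole records in flight (for example across a groupBy), that is where registering them with Kryo or converting to your own serializable case class comes in.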