Thanks. I actually looked up foreachPartition() in this context yesterday,
but couldn't find where it's documented in the Javadocs or elsewhere...
probably for some silly reason. Can you please point me in the right direction?
Many thanks!
By the way, I realize the solution should rather be to concatenate
You can call any API you like in a Spark job, as long as the libraries
are available, and the Hadoop HDFS APIs are available on the
cluster. You could write a foreachPartition() that appends partitions
of data to files, yes.
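
For what it's worth, here is a minimal sketch of that idea in Scala. The
output path, the per-partition naming scheme, and the assumption that your
HDFS version/config supports append() are all mine, not anything from this
thread; treat it as an illustration rather than a recommended pattern.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.spark.{SparkConf, SparkContext, TaskContext}

    object AppendPartitionsSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("append-partitions"))
        val rdd = sc.parallelize(1 to 1000, numSlices = 4)

        rdd.foreachPartition { iter =>
          // This closure runs on the executors, so the Hadoop Configuration
          // and FileSystem handle are created there, not serialized from the driver.
          val fs = FileSystem.get(new Configuration())
          // Hypothetical output location: one file per partition id.
          val path = new Path("/tmp/appended-output/part-" + TaskContext.get().partitionId())

          // Append if the file already exists (assumes your HDFS supports append),
          // otherwise create it fresh.
          val out = if (fs.exists(path)) fs.append(path) else fs.create(path)
          try {
            iter.foreach(record => out.writeBytes(record.toString + "\n"))
          } finally {
            out.close()
          }
        }

        sc.stop()
      }
    }

The point is just that foreachPartition() runs its closure once per
partition on the executors, so each task can open, write to, and close its
own HDFS file independently.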
Spark itself does not use appending. I think the biggest reason is
that