That means the save has not finished yet. Are you sure it did? Spark writes to _temporary while the save is in progress.
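A quick way to tell the difference from outside Spark: Hadoop's output committer writes part files under _temporary/ first, then moves them to the output path and drops an empty _SUCCESS marker once the job commits. A minimal sketch of that check (the demo directory setup below is hypothetical, just simulating the two states):

```java
import java.io.File;
import java.nio.file.Files;

public class CheckSaveComplete {

    /** True once the output committer has finished: the _SUCCESS marker
     *  exists and the _temporary working directory is gone. */
    static boolean saveFinished(File outputDir) {
        return new File(outputDir, "_SUCCESS").exists()
            && !new File(outputDir, "_temporary").exists();
    }

    public static void main(String[] args) throws Exception {
        // Simulate the two states in a scratch directory (not a real Spark run).
        File dir = Files.createTempDirectory("spark-out").toFile();

        new File(dir, "_temporary").mkdir();        // job still running
        System.out.println(saveFinished(dir));      // false

        new File(dir, "_temporary").delete();
        new File(dir, "_SUCCESS").createNewFile();  // job committed
        System.out.println(saveFinished(dir));      // true
    }
}
```

If _temporary is still there after context.stop(), the job was shut down before the commit happened.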
On Fri, Sep 4, 2015 at 10:10 AM, Chirag Dewan <chirag.de...@ericsson.com> wrote:
> Hi,
>
> I have a 2-node Spark cluster and I am trying to read data from a Cassandra
> cluster and save the data as a CSV file. Here is my code:
>
>     JavaRDD<String> mapPair = cachedRdd.map(new Function<CassandraRow, String>() {
>
>         private static final long serialVersionUID = 1L;
>
>         @Override
>         public String call(CassandraRow v1) throws Exception {
>             StringBuilder sb = new StringBuilder();
>             sb.append(v1.getString(0));
>             sb.append(",");
>             sb.append(v1.getBytes(1));
>             sb.append(",");
>             sb.append(v1.getString(2));
>             sb.append(",");
>             sb.append(v1.getString(3));
>             sb.append(",");
>             sb.append(v1.getString(4));
>             sb.append(",");
>             sb.append(v1.getString(5));
>             return sb.toString();
>         }
>     });
>
>     JavaRDD<String> cachedRdd1 = mapPair.cache();
>
>     JavaRDD<String> coalescedRdd = cachedRdd1.coalesce(1);
>     coalescedRdd.saveAsTextFile("file:///home/echidew/cassandra/test-100.txt");
>
>     context.stop();
>
> The problem is that the part-00000 file is created, with all the records, in the
> _temporary/task-UUID folder. As I have read and understood, this file should
> be stored at my output path and the temporary directory deleted. Is there
> anything I need to change in my code or environment? What could be the reason
> for that?
>
> Any help appreciated.
>
> P.S.: Posting only the relevant code. Sorry for the formatting.
>
> Thanks,
>
> Chirag

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org