++Deepak, There is also an option to use saveAsHadoopFile & saveAsNewAPIHadoopFile, with which you can customize the way you want to save it (file name and many other things). :)
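For example, here is a minimal sketch of custom file naming via the old-API saveAsHadoopFile with a MultipleTextOutputFormat subclass. The class name KeyAsFileNameFormat, the sample records, and the output path are made up for illustration:

import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical OutputFormat that derives each output file's name from the record key
class KeyAsFileNameFormat extends MultipleTextOutputFormat[Any, Any] {
  override def generateFileNameForKeyValue(key: Any, value: Any, name: String): String =
    key.toString + "-" + name                  // e.g. "events-part-00000"
  override def generateActualKey(key: Any, value: Any): Any =
    NullWritable.get()                         // keep the key out of the file contents
}

object CustomFileNames {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("CustomFileNames"))
    // Pair RDD keyed by the desired file-name prefix (illustrative data)
    val records = sc.parallelize(Seq(("events", "a|b|c"), ("alerts", "x|y|z")))
    records.saveAsHadoopFile(
      "hdfs://localhost:8020/foo/",            // output directory, as in the Storm example below
      classOf[String], classOf[String],
      classOf[KeyAsFileNameFormat])
    sc.stop()
  }
}

The same idea works with saveAsNewAPIHadoopFile by extending a new-API OutputFormat instead.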
Happy Sparking !!!!

Regards,
Rabin Banerjee

On Wed, Jul 20, 2016 at 10:01 AM, Deepak Sharma <deepakmc...@gmail.com> wrote:

> In Spark Streaming, you have to decide the duration of the micro batches to run.
> Once you get the micro batch, transform it as per your logic, and then you can use saveAsTextFiles on your final RDD to write it to HDFS.
>
> Thanks
> Deepak
>
> On 20 Jul 2016 9:49 am, <rajesh_kall...@dellteam.com> wrote:
>
> *Dell - Internal Use - Confidential*
>
> While writing to HDFS from Storm, the HDFS bolt provides a nice way to batch the messages, rotate files, set the file name convention, etc., as shown below.
>
> Do you know of something similar in Spark Streaming, or do we have to roll our own? If anyone has attempted this, can you share some pointers?
>
> Every other streaming solution, like Flume and NiFi, handles logic like the below:
>
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.6/bk_storm-user-guide/content/writing-data-with-storm-hdfs-connector.html
>
> // Use "|" instead of "," as the field delimiter
> RecordFormat format = new DelimitedRecordFormat()
>         .withFieldDelimiter("|");
>
> // Synchronize the filesystem after every 1000 tuples
> SyncPolicy syncPolicy = new CountSyncPolicy(1000);
>
> // Rotate data files when they reach 5 MB
> FileRotationPolicy rotationPolicy = new FileSizeRotationPolicy(5.0f, Units.MB);
>
> // Use default, Storm-generated file names
> FileNameFormat fileNameFormat = new DefaultFileNameFormat()
>         .withPath("/foo/");
>
> // Instantiate the HdfsBolt
> HdfsBolt bolt = new HdfsBolt()
>         .withFsUrl("hdfs://localhost:8020")
>         .withFileNameFormat(fileNameFormat)
>         .withRecordFormat(format)
>         .withRotationPolicy(rotationPolicy)
>         .withSyncPolicy(syncPolicy);
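For reference, a minimal sketch of Deepak's suggestion above; the 60-second batch interval, the socket source, and the output prefix are placeholder assumptions, not recommendations:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamToHdfs {
  def main(args: Array[String]): Unit = {
    // Decide the micro-batch duration up front (hypothetical: 60 seconds)
    val ssc = new StreamingContext(new SparkConf().setAppName("StreamToHdfs"), Seconds(60))

    val lines = ssc.socketTextStream("localhost", 9999)   // placeholder source
    val piped = lines.map(_.replace(",", "|"))            // per-batch transformation logic

    // Writes one directory per micro batch: <prefix>-<batch time in ms>.txt
    piped.saveAsTextFiles("hdfs://localhost:8020/foo/out", "txt")

    ssc.start()
    ssc.awaitTermination()
  }
}

Note there is no built-in equivalent of Storm's size-based FileSizeRotationPolicy here: rotation falls out of the batch interval, with each micro batch producing its own set of part files.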