Thanks very much for the pointers, Vinay. That helps ☺
-Raja.

From: vinay patil <vinay18.pa...@gmail.com>
Date: Monday, August 7, 2017 at 1:56 AM
To: "user@flink.apache.org" <user@flink.apache.org>
Subject: Re: [EXTERNAL] Re: Help required - "BucketingSink" usage to write HDFS Files

Hi Raja,

That is why they are in the pending state. You can enable checkpointing by calling env.enableCheckpointing(<duration>). After doing this, the files will not remain in the pending state.

Check this out: https://ci.apache.org/projects/flink/flink-docs-release-1.3/api/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.html

Regards,
Vinay Patil

On Mon, Aug 7, 2017 at 9:15 AM, Raja.Aravapalli [via Apache Flink User Mailing List archive.] <[hidden email]> wrote:

Hi Vinay,

Thanks for the response.

I have NOT enabled any checkpointing. Files are rolling over correctly every 2 MB, but they remain in the pending state, as shown below:

-rw-r--r-- 3 2097424 2017-08-06 21:10 /xxxx/xxxx/xxxx/Test/part-0-0.pending
-rw-r--r-- 3 1431430 2017-08-06 21:12 /xxxx/xxxx/xxxx/Test/part-0-1.pending

Regards,
Raja.

From: vinay patil <[hidden email]>
Date: Sunday, August 6, 2017 at 10:40 PM
To: "[hidden email]" <[hidden email]>
Subject: [EXTERNAL] Re: Help required - "BucketingSink" usage to write HDFS Files

Hi Raja,

Have you enabled checkpointing? The files will be rolled to complete state when the batch size is reached (in your case 2 MB) or when the bucket is inactive for a certain amount of time.

Regards,
Vinay Patil

On Mon, Aug 7, 2017 at 7:53 AM, Raja.Aravapalli [via Apache Flink User Mailing List archive.] <[hidden email]> wrote:

Hi,

I am working on a PoC to write to HDFS files using the BucketingSink class. Even though the data is being written to HDFS files, the files are left with a ".pending" suffix on HDFS.

Below is the code I am using.
Can someone please help me identify the issue and fix it?

BucketingSink<String> HdfsSink = new BucketingSink<String>("hdfs://xxxx/xxxx/xxxx/Test/");
HdfsSink.setBucketer(new DateTimeBucketer<String>("yyyy-MM-dd--HHmm"));
HdfsSink.setBatchSize(1024 * 1024 * 2); // this is 2 MB
HdfsSink.setInactiveBucketCheckInterval(10000L);
HdfsSink.setInactiveBucketThreshold(10000L);

Thanks a lot.

Regards,
Raja.

________________________________
If you reply to this email, your message will be added to the discussion below:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Help-required-BucketingSink-usage-to-write-HDFS-Files-tp14714.html

View this message in context: Re: Help required - "BucketingSink" usage to write HDFS Files <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Help-required-BucketingSink-usage-to-write-HDFS-Files-tp14714p14715.html>
Sent from the Apache Flink User Mailing List archive <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/> at Nabble.com.
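
To illustrate why the part files stay in the pending state when checkpointing is off, here is a minimal sketch in plain Java. It is a toy model, not Flink code: the class PartFileModel and all of its methods are hypothetical names invented for illustration, and inactive-bucket handling is left out. It assumes that a part file is rolled to pending once the batch size is reached and only moves to its final state when a checkpoint completes, which is the behaviour Vinay points to above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (NOT Flink code) of the part-file states discussed in this thread. */
class PartFileModel {
    enum State { IN_PROGRESS, PENDING, FINISHED }

    private final long batchSize;                 // roll threshold in bytes, analogous to setBatchSize(...)
    private final List<State> files = new ArrayList<>();
    private long bytesInCurrentFile = 0;

    PartFileModel(long batchSize) {
        this.batchSize = batchSize;
    }

    /** Writes bytes to the current part file and rolls it to PENDING once the batch size is reached. */
    void write(long bytes) {
        // Open a new part file if there is none, or if the last one has already been rolled.
        if (files.isEmpty() || files.get(files.size() - 1) != State.IN_PROGRESS) {
            files.add(State.IN_PROGRESS);
            bytesInCurrentFile = 0;
        }
        bytesInCurrentFile += bytes;
        if (bytesInCurrentFile >= batchSize) {
            files.set(files.size() - 1, State.PENDING);
        }
    }

    /** Models a completed checkpoint: every PENDING file becomes FINISHED. */
    void checkpointCompleted() {
        for (int i = 0; i < files.size(); i++) {
            if (files.get(i) == State.PENDING) {
                files.set(i, State.FINISHED);
            }
        }
    }

    State stateOf(int index) {
        return files.get(index);
    }

    public static void main(String[] args) {
        PartFileModel sink = new PartFileModel(2L * 1024 * 1024); // 2 MB, as in the thread
        sink.write(2L * 1024 * 1024); // first file reaches the batch size and is rolled
        sink.write(1024);             // second file is still being written
        System.out.println(sink.stateOf(0) + " " + sink.stateOf(1));
        sink.checkpointCompleted();   // in this model, never called if checkpointing is off
        System.out.println(sink.stateOf(0) + " " + sink.stateOf(1));
    }
}
```

In this model nothing ever calls checkpointCompleted() when checkpointing is disabled, so rolled files stay PENDING indefinitely. In a real job that step is driven by Flink itself once env.enableCheckpointing(<duration>) has been set on the execution environment.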