Hi Kostas, good job! 2018-07-12 19:40 GMT+08:00 Kostas Kloudas <k.klou...@data-artisans.com>:
> Hi Lakshmi, > > Since Flink-1.5 you have the ability to set the part suffix. > As you said, you only want the .gzip to be the suffix of the final (or > “completed”) part files, which is exactly what is currently supported. > > If you want also intermediate files to have this suffix, then you can > always set all the suffixes (in-progress, pending and final) to “.gzip” > but then you have to also set the appropriate preffixes so that Flink can > distinguish completed from non-completed files (filenames > must not collide). > > Also I would recommend to use the most recent stable version 1.5.3 which > also includes this bug fix: > https://issues.apache.org/jira/browse/FLINK-9603 < > https://issues.apache.org/jira/browse/FLINK-9603> > > I hope this helps, > Kostas > > > > On Apr 5, 2018, at 6:23 PM, Lakshmi Gururaja Rao <l...@lyft.com> wrote: > > > > I can see two ways of achieving this: > > > > 1. Setting a suffix* **only*** for the completed part files. I don't > > necessarily think the suffix should be added for the intermediate files > (as > > intermediate files should not really be ready for consumption by a > > downstream process?) > > 2. Be able to override this partPath name creation - > > https://github.com/apache/flink/blob/release-1.4.0/ > flink-connectors/flink-connector-filesystem/src/main/ > java/org/apache/flink/streaming/connectors/fs/ > bucketing/BucketingSink.java#L523 > > . That way any user who needs to set a custom/dynamic part file name can > > still do so. > > > > Do you think either or one of these options is feasible? > > > > Thanks > > Lakshmi > > > > On Tue, Apr 3, 2018 at 12:57 AM, Aljoscha Krettek <aljos...@apache.org> > > wrote: > > > >> So you want to be able to set a "global" suffix that should be appended > to > >> all different kinds of files that the sink writes, including > intermediate > >> files? > >> > >> Aljoscha > >> > >>> On 29. Mar 2018, at 16:59, l...@lyft.com wrote: > >>> > >>> Sorry, I meant "I don't see a way of doing this apart from setting a > >> part file *suffix* with the required file extension. " > >>> > >>> > >>> On 2018/03/29 14:55:43, l...@lyft.com <l...@lyft.com> wrote: > >>>> Currently the BucketingSink allows addition of part prefix, pending > >> prefix/suffix and in-progress prefix/suffix via setter methods. Can we > also > >> support setting part suffixes? > >>>> An instance where this maybe useful: I am currently writing GZIP > >> compressed output to S3 using the BucketingSink and I would want the > >> uploaded files to have a ".gz" or ".zip" extensions (if the files does > not > >> have such an extensionelse they are written as garbled bytes and don't > get > >> rendered correctly for reading). I don't see a way of doing this apart > from > >> setting a part file prefix with the required file extension. > >>>> > >>>> Thanks > >>>> Lakshmi > >>>> > >> > >> > > > > > > -- > > *Lakshmi Gururaja Rao* > > SWE > > 217.778.7218 <+12177787218> > > [image: Lyft] <http://www.lyft.com/> > >