Re: Customize Part file naming (Flink 1.9.0)

Taher Koitawala Sat, 19 Oct 2019 01:21:38 -0700

Be careful when using BucketingSink: it does not guarantee exactly-once
semantics, and it also has issues with S3 consistency.




On Sat, Oct 19, 2019, 1:42 PM Ravi Bhushan Ratnakar <
[email protected]> wrote:

> Hi,
>
> As an alternative, you may use BucketingSink, which lets you customize the
> part file prefix and suffix.
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.9/api/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.html
>
> Regards,
> Ravi
>
> On Sat, Oct 19, 2019 at 3:54 AM amran dean <[email protected]> wrote:
>
>> Hello,
>> StreamingFileSink's part file naming convention is not adjustable. It has
>> the form: *part-<integer>-<integer>*.
>>
>> My use case for StreamingFileSink is a Kafka -> S3 pipeline, and the files
>> are then read and processed from S3 using Spark. In almost all cases, I want
>> to compress raw data before writing to S3 using the BulkFormat.
>>
>> Spark relies on filename extensions to infer compression, so with the
>> current naming scheme it reads the compressed files as gibberish. I see that
>> 1.10 currently provides the ability to customize the suffix/prefix, but I
>> really need an
>> alternative solution to this as soon as possible. Can this be backported to
>> 1.9, or are there alternatives?
>>
>>
>>
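
To make the naming problem in the original question concrete, here is a minimal, self-contained Java sketch. It is not Flink code: the class PartFileNames, its methods, and the list of extensions are all hypothetical and only illustrate the idea. It assumes part files are named prefix-integer-integer plus an optional suffix, and that the reader picks a decompression codec purely from the file extension.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Flink or Spark; it only illustrates the naming issue.
final class PartFileNames {

    // Assumed list of extensions that extension-based codec inference recognizes.
    private static final List<String> COMPRESSION_SUFFIXES =
            Arrays.asList(".gz", ".bz2", ".deflate", ".snappy", ".lz4");

    private PartFileNames() {}

    // Builds "<prefix>-<subtaskIndex>-<partCounter><suffix>".
    static String partFileName(String prefix, int subtaskIndex, long partCounter, String suffix) {
        return prefix + "-" + subtaskIndex + "-" + partCounter + suffix;
    }

    // Returns true if the file name ends with one of the assumed compression extensions.
    static boolean hasCompressionSuffix(String fileName) {
        for (String suffix : COMPRESSION_SUFFIXES) {
            if (fileName.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Fixed naming scheme: no extension, so the codec cannot be inferred.
        String fixed = partFileName("part", 0, 7, "");
        // Customized suffix: the extension reveals the codec.
        String custom = partFileName("part", 0, 7, ".gz");
        System.out.println(fixed + " -> " + hasCompressionSuffix(fixed));
        System.out.println(custom + " -> " + hasCompressionSuffix(custom));
    }
}
```

With a fixed name such as part-0-7 there is no extension to go on, while a configurable suffix such as .gz gives the reader what it needs, which is why a customizable prefix/suffix matters for this pipeline.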
