Be careful with BucketingSink: it does not provide exactly-once
semantics, and it also has issues with S3 consistency.



On Sat, Oct 19, 2019, 1:42 PM Ravi Bhushan Ratnakar <
ravibhushanratna...@gmail.com> wrote:

> Hi,
>
> As an alternative, you can use BucketingSink, which lets you customize
> the part file suffix/prefix.
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.9/api/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.html
>
> Regards,
> Ravi
>
> On Sat, Oct 19, 2019 at 3:54 AM amran dean <adfs54545...@gmail.com> wrote:
>
>> Hello,
>> StreamingFileSink's part file naming convention is not adjustable. It has
>> the form *part-<integer>-<integer>*.
>>
>> My use case for StreamingFileSink is a Kafka -> S3 pipeline, and files
>> are read and processed from S3 using Spark. In almost all cases, I want to
>> compress raw data before writing to S3 using the BulkFormat.
>>
>> Spark relies on filename extensions to do compression inference, so the
>> current naming scheme results in gibberish. I see that 1.10 currently
>> provides the ability to customize the suffix/prefix, but I really need an
>> alternative solution to this as soon as possible. Can this be backported to
>> 1.9, or are there alternatives?
>>
>>
>>
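To illustrate the extension problem described in the quoted message: Hadoop-style readers, which Spark uses for file input, choose a decompression codec from the file name suffix. The following is only a minimal sketch of that idea in plain Java, not Hadoop's or Spark's actual implementation. The class CodecByExtension, the method inferCodec, and the suffix table are hypothetical names chosen for this example, and the real set of recognized suffixes depends on the codecs configured in the reader.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Spark, Hadoop, or Flink.
final class CodecByExtension {
    // Illustrative suffix-to-codec table (an assumption for this sketch).
    private static final Map<String, String> CODECS = new LinkedHashMap<>();
    static {
        CODECS.put(".gz", "gzip");
        CODECS.put(".bz2", "bzip2");
        CODECS.put(".snappy", "snappy");
    }

    // Returns the codec implied by the file name suffix, or "none" if no suffix matches.
    static String inferCodec(String fileName) {
        for (Map.Entry<String, String> entry : CODECS.entrySet()) {
            if (fileName.endsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return "none";
    }

    public static void main(String[] args) {
        // A part file named the way StreamingFileSink names it in 1.9 has no suffix.
        System.out.println(inferCodec("part-0-0"));
        // The same file with a suffix appended would be matched to a codec.
        System.out.println(inferCodec("part-0-0.gz"));
    }
}
```

Under these assumptions, a file named part-0-0 matches no codec, so a reader would treat gzip-compressed bytes as raw data, while part-0-0.gz would be decompressed as expected. This is why a configurable suffix matters for the Kafka -> S3 -> Spark pipeline discussed in the thread.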
