Re: Flink StreamingFileSink part file behavior

2019-10-24 Thread Paul Lam
Hi,

StreamingFileSink can write to many buckets at the same time, and it uses a BucketAssigner to determine the bucket for each record. WRT your questions, the records will be written to the expected bucket even if they arrive out of order. You can refer to [1] for more information.

[1] https:
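(Not part of the original reply.) For illustration, a minimal sketch of wiring a StreamingFileSink with a BucketAssigner on Flink 1.8; the class name and base path are placeholders. The built-in DateTimeBucketAssigner shown here buckets by processing time with the given pattern, whereas a custom assigner can derive the bucket from the record or its event timestamp:

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.DateTimeBucketAssigner;

public class SinkSketch {

    // Builds a row-format StreamingFileSink; the BucketAssigner is consulted
    // for every record to pick the target bucket.
    public static StreamingFileSink<String> buildSink(String basePath) {
        return StreamingFileSink
                .forRowFormat(new Path(basePath), new SimpleStringEncoder<String>("UTF-8"))
                // Built-in assigner: buckets by wall-clock time using "yyyy-MM-dd".
                // A custom BucketAssigner can instead use the record's event
                // timestamp, so out-of-order records still land in the bucket
                // they belong to.
                .withBucketAssigner(new DateTimeBucketAssigner<>("yyyy-MM-dd"))
                .build();
    }
}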

Flink StreamingFileSink part file behavior

2019-10-23 Thread amran dean
Hello, I am using StreamingFileSink and KafkaConsumer010 as a Kafka -> S3 connector (Flink 1.8.1, Kafka 0.10.1). The setup is simple: data is bucketed first by datetime (granularity of 1 day), then by Kafka partition. I am using *event time* (Kafka timestamp, recorded at the time of creation
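(Not from the original message.) A minimal sketch of what the described bucketing could look like as a custom BucketAssigner: bucket first by event date, then by Kafka partition. The KafkaRecord type and its getPartition() accessor are assumptions for illustration; in practice the partition would be captured in the consumer's deserialization schema:

import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

import org.apache.flink.core.io.SimpleVersionedSerializer;
import org.apache.flink.streaming.api.functions.sink.filesystem.BucketAssigner;
import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.SimpleVersionedStringSerializer;

// Buckets records first by event date (UTC day), then by Kafka partition,
// producing bucket ids such as "2019-10-23/partition-3".
public class DateAndPartitionBucketAssigner implements BucketAssigner<KafkaRecord, String> {

    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

    @Override
    public String getBucketId(KafkaRecord element, BucketAssigner.Context context) {
        // context.timestamp() is the record's event timestamp, i.e. the Kafka
        // timestamp here, assuming the job runs with event time assigned.
        String day = DAY.format(Instant.ofEpochMilli(context.timestamp()));
        return day + "/partition-" + element.getPartition();
    }

    @Override
    public SimpleVersionedSerializer<String> getSerializer() {
        // Bucket ids are plain strings, so the built-in serializer is enough.
        return SimpleVersionedStringSerializer.INSTANCE;
    }
}

// Hypothetical record type; a real job would populate the partition field in
// the consumer's deserialization schema.
class KafkaRecord {
    private final int partition;
    private final String value;

    KafkaRecord(int partition, String value) {
        this.partition = partition;
        this.value = value;
    }

    int getPartition() {
        return partition;
    }

    String getValue() {
        return value;
    }
}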