Hi Amran,

If you want to know which Kafka partition each of your input records comes from, you can always have a separate bucket for each partition. As described in [1], you can extract the offset/partition/topic information for an incoming record and, based on that, decide the appropriate bucket to put the record in.
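To make this concrete, here is a minimal sketch against the Flink 1.9 APIs (the class names, topic, bootstrap servers, and S3 path are just placeholders). A KafkaDeserializationSchema gives you access to the full ConsumerRecord, so you can carry the partition along with the value, and a custom BucketAssigner can then route each record to a per-partition bucket:

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.core.fs.Path;
import org.apache.flink.core.io.SimpleVersionedSerializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.filesystem.BucketAssigner;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.SimpleVersionedStringSerializer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema;
import org.apache.kafka.clients.consumer.ConsumerRecord;

public class PartitionedS3Job {

    // Keeps the Kafka partition alongside the (String) value of each record.
    static class PartitionAwareSchema
            implements KafkaDeserializationSchema<Tuple2<Integer, String>> {

        @Override
        public boolean isEndOfStream(Tuple2<Integer, String> next) {
            return false;
        }

        @Override
        public Tuple2<Integer, String> deserialize(ConsumerRecord<byte[], byte[]> record) {
            // record.topic(), record.partition(), and record.offset() are all available here.
            return Tuple2.of(record.partition(), new String(record.value()));
        }

        @Override
        public TypeInformation<Tuple2<Integer, String>> getProducedType() {
            return Types.TUPLE(Types.INT, Types.STRING);
        }
    }

    // Routes every record into a bucket named after its Kafka partition.
    static class KafkaPartitionBucketAssigner
            implements BucketAssigner<Tuple2<Integer, String>, String> {

        @Override
        public String getBucketId(Tuple2<Integer, String> element, Context context) {
            return "kafka-partition-" + element.f0;
        }

        @Override
        public SimpleVersionedSerializer<String> getSerializer() {
            return SimpleVersionedStringSerializer.INSTANCE;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000); // StreamingFileSink finalizes files on checkpoints

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder

        StreamingFileSink<Tuple2<Integer, String>> sink = StreamingFileSink
                .forRowFormat(new Path("s3://my-bucket/output"), // placeholder path
                        new SimpleStringEncoder<Tuple2<Integer, String>>())
                .withBucketAssigner(new KafkaPartitionBucketAssigner())
                .build();

        env
            .addSource(new FlinkKafkaConsumer<>("my-topic", new PartitionAwareSchema(), props))
            .addSink(sink);

        env.execute("kafka-to-s3-per-partition");
    }
}

Note that the part-X-Y file names encode the sink's parallel subtask index and a counter, not the Kafka partition, which is why routing through the bucket id is the way to make the partition visible in the S3 layout.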
Cheers,
Kostas

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html

On Wed, Oct 16, 2019 at 4:00 AM amran dean <adfs54545...@gmail.com> wrote:
>
> I am evaluating StreamingFileSink (Kafka 0.10.11) as a production-ready
> alternative to a current Kafka -> S3 solution.
>
> Is there any way to verify the integrity of data written in S3? I'm confused
> how the file names (e.g. part-1-17) map to Kafka partitions, and further
> unsure how to ensure that no Kafka records are lost (I know Flink guarantees
> exactly-once, but this is more of a sanity check).