Re: Micro batching with flink

2024-11-07 Thread Anil Dasari
Just following up: please let me know if any of you have recommendations for implementing the mentioned use case. Thanks. On Tuesday, October 29, 2024 at 10:37:30 PM PDT, Anil Dasari wrote: Hi Venkat, Thanks for the reply. Microbatching is a data processing technique where small batche

Re: Micro batching with flink

2024-11-05 Thread Arvid Heise
Hey, could you please check if a bucket assigner is already enough? If not, what's missing? FileSink orcSink = FileSink.forBulkFormat(new Path("s3a://mybucket/flink_file_sink_orc_test"), factory).withBucketAssigner(new DateTimeBucketAssigner<>("'dt='MMdd/'hour='HH",
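To see what bucket names that pattern string would yield, here is a minimal sketch using only `java.time`, with no Flink dependency. It assumes the assigner formats a timestamp with a `DateTimeFormatter` built from the pattern and a time zone, which is my understanding of `DateTimeBucketAssigner` but is not confirmed in this thread. The class name `BucketPathSketch`, the method `bucketFor`, and the choice of UTC are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

class BucketPathSketch {
    // Same pattern string as in the quoted snippet. The quoted parts
    // ('dt=' and 'hour=') are literals; MM, dd and HH are date fields.
    static final String PATTERN = "'dt='MMdd/'hour='HH";

    // Hypothetical helper: formats a timestamp into a bucket path.
    static String bucketFor(long epochMillis, ZoneId zone) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(PATTERN).withZone(zone);
        return formatter.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        long ts = Instant.parse("2024-11-05T13:45:00Z").toEpochMilli();
        // UTC is an assumption made for this example only.
        System.out.println(bucketFor(ts, ZoneId.of("UTC"))); // prints dt=1105/hour=13
    }
}
```

With this pattern, all records written within the same hour land in the same directory, so each bucket behaves like an hourly batch on the storage side.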

Re: Micro batching with flink

2024-11-04 Thread Gabor Somogyi
I presume you're coming from Spark and looking for something like RDD.foreach. In Flink there is no such feature. I think you can use a batch job for processing and storing the data. All the rest can be done in custom code outside of Flink. The hard way is to implement a custom connector which is

Re: Micro batching with flink

2024-10-29 Thread Anil Dasari
Hi Venkat, Thanks for the reply. Microbatching is a data processing technique where small batches of data are collected and processed together at regular intervals. However, I'm aiming to avoid traditional micro-batch processing by tagging records within a time window as a batch, allowing for ne
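One way to picture "tagging records within a time window as a batch" is the plain-Java sketch below. It is not taken from the thread: the rule that a batch id equals the start of the fixed window containing the record's timestamp is an assumption, and `BatchTagSketch`, `batchIdFor`, and `groupByBatch` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class BatchTagSketch {
    // Assumed rule: the batch id is the start (in epoch millis) of the
    // fixed-size window that contains the record's timestamp.
    static long batchIdFor(long eventTimeMillis, long windowMillis) {
        return Math.floorDiv(eventTimeMillis, windowMillis) * windowMillis;
    }

    // Groups timestamps by their batch id, ordered by batch id.
    static Map<Long, List<Long>> groupByBatch(long[] eventTimesMillis, long windowMillis) {
        Map<Long, List<Long>> batches = new TreeMap<>();
        for (long ts : eventTimesMillis) {
            batches.computeIfAbsent(batchIdFor(ts, windowMillis), id -> new ArrayList<>()).add(ts);
        }
        return batches;
    }

    public static void main(String[] args) {
        long[] eventTimes = {1_000L, 59_999L, 60_000L, 125_000L};
        // One-minute windows: prints {0=[1000, 59999], 60000=[60000], 120000=[125000]}
        System.out.println(groupByBatch(eventTimes, 60_000L));
    }
}
```

Because the tag is computed per record, each record can still flow through the pipeline as it arrives; the batch id only marks which window it belongs to.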

Re: Micro batching with flink

2024-10-29 Thread Venkatakrishnan Sowrirajan
Can you share more details on what you mean by micro-batching? Could you explain it with an example so we can understand it better? Thanks, Venkat On Tue, Oct 29, 2024, 1:22 PM Anil Dasari wrote: > Hello team, > I apologize for reaching out on the dev mailing list. I'm working on > implementing micro-bat

Micro batching with flink

2024-10-29 Thread Anil Dasari
Hello team, I apologize for reaching out on the dev mailing list. I'm working on implementing micro-batching with near real-time processing. I've seen similar questions in the Flink Slack channel and user mailing list, but there hasn't been much discussion or feedback. Here are the options I've