Just following up. Please let me know if any of you have recommendations for
implementing the use case mentioned below.
Thanks.

On Tuesday, October 29, 2024 at 10:37:30 PM PDT, Anil Dasari wrote:
Hey,
could you please check if a bucket assigner is already enough? If not,
what's missing?
FileSink orcSink = FileSink
    .forBulkFormat(new Path("s3a://mybucket/flink_file_sink_orc_test"), factory)
    .withBucketAssigner(new DateTimeBucketAssigner<>("'dt='MMdd/'hour='HH"))
    .build();
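For reference, here is a standalone sketch of what bucket paths that pattern produces. It assumes the assigner interprets the string as a `java.time` pattern (quoted literals like `'dt='` pass through unchanged); the timestamp below is made up for illustration:

```java
import java.time.ZonedDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class BucketPathDemo {
    public static void main(String[] args) {
        // Same pattern string as in the snippet above.
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("'dt='MMdd/'hour='HH");
        // Arbitrary example timestamp (UTC).
        ZonedDateTime t = ZonedDateTime.of(2024, 10, 29, 22, 37, 0, 0, ZoneOffset.UTC);
        System.out.println(fmt.format(t)); // prints dt=1029/hour=22
    }
}
```

So each record would land under a directory like `dt=1029/hour=22` beneath the sink path, one bucket per hour.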
I presume you're coming from Spark and are looking for something like
RDD.foreach. Flink has no such feature. I think you can use a batch job for
processing and storing the data.
Everything else can be done in custom code outside of Flink.
The hard way is to implement a custom connector, which is
Hi Venkat,
Thanks for the reply.
Microbatching is a data processing technique where small batches of data are
collected and processed together at regular intervals. However, I'm aiming to
avoid traditional micro-batch processing by tagging records within a time
window as a batch, allowing for ne
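To illustrate the tagging idea with a hypothetical, Flink-free sketch (the window size, `batchId` helper, and sample event times are all invented for illustration): every record whose event time falls in the same window gets the same batch id, so a downstream step can treat all records sharing an id as one batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MicroBatchTagger {
    static final long WINDOW_MS = 60_000; // assumed 1-minute batches

    // Batch id = window start timestamp; every record whose event time
    // falls in [start, start + WINDOW_MS) gets the same id.
    static long batchId(long eventTimeMs) {
        return (eventTimeMs / WINDOW_MS) * WINDOW_MS;
    }

    public static void main(String[] args) {
        long[] eventTimes = {0, 30_000, 59_999, 60_000, 125_000};
        Map<Long, List<Long>> batches = new TreeMap<>();
        for (long t : eventTimes) {
            batches.computeIfAbsent(batchId(t), k -> new ArrayList<>()).add(t);
        }
        // Records 0..59_999 share batch 0; 60_000 starts the next batch.
        System.out.println(batches); // prints {0=[0, 30000, 59999], 60000=[60000], 120000=[125000]}
    }
}
```

In a streaming job the same grouping would come from an event-time window, with the batch id carried on each record rather than computed in a driver loop.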
Can you share more details on what you mean by micro-batching? Could you
explain with an example so we can understand it better?
Thanks
Venkat
On Tue, Oct 29, 2024, 1:22 PM Anil Dasari
wrote:
Hello team,
I apologize for reaching out on the dev mailing list. I'm working on
implementing micro-batching with near real-time processing.
I've seen similar questions in the Flink Slack channel and user mailing list,
but there hasn't been much discussion or feedback. Here are the options I've