Hi Guys

I've just recently started using Apache Flink to evaluate its suitability
for  a project I'm working on.

First impressions are that the project is great, well documented and has
lots of examples and guidance showcasing the multitude of things that it
can do. Challenging knowing where to start at times as there are many ways
to achieve the same result.

So my pipeline is similar to an ETL, I have a continuous DataStream source
of Java records modelled as POJOs which I then transform each POJO to a
single JSON record before writing to a streaming sink.  This all works as
expected.

However my question is in 2 parts and I hope you can help, apologies in
advance if this question highlights my lack of experience.

   - Can I get access to the infight records that are currently being
   processed within Flink ? By inflight I mean the records that are currently
   being processed but haven't been written to the sink.
   - Is the number of inflight records deterministic? How many records does
   Flink process per subtask/thread.  For example, it might be 1 record at a
   time per subtask?

Thanks
Declan

Reply via email to