OMG! Thank you! Thank you! I didn't think this could be a problem. When I
removed the validation, the time needed to ingest all events dropped to 10 min.
BR,
BB
On Thu, May 27, 2021 at 11:50 AM Arvid Heise wrote:
Hi,
The implementation looks good. I'd probably cache the
*ObjectValidator.of().getValidator()* in a field to be sure that it's not a
pricey construction.
Did you evaluate what happens when you skip the validation entirely in
terms of records/s?
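
A minimal sketch of the caching suggestion above, assuming the validator is
costly to construct. ExpensiveValidator, PerRecordDeserializer and
CachingDeserializer are hypothetical stand-ins, not the ObjectValidator or the
deserialization schema from this thread; the counter only makes the number of
constructions visible.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CachedValidatorSketch {

    /** Hypothetical stand-in for a validator that is costly to build. */
    static final class ExpensiveValidator {
        static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();

        ExpensiveValidator() {
            CONSTRUCTIONS.incrementAndGet(); // imagine heavy setup work here
        }

        boolean isValid(String record) {
            return record != null && !record.isEmpty();
        }
    }

    /** Builds a new validator for every record (the pattern to avoid). */
    static final class PerRecordDeserializer {
        boolean accept(String record) {
            return new ExpensiveValidator().isValid(record);
        }
    }

    /** Builds the validator once, on first use, and keeps it in a field. */
    static final class CachingDeserializer {
        // transient: the cached instance is not serialized with this object
        private transient ExpensiveValidator validator;

        boolean accept(String record) {
            if (validator == null) {
                validator = new ExpensiveValidator();
            }
            return validator.isValid(record);
        }
    }

    /** Returns how many validators were built while processing the records. */
    static int constructionsFor(boolean cached, int records) {
        ExpensiveValidator.CONSTRUCTIONS.set(0);
        PerRecordDeserializer perRecord = new PerRecordDeserializer();
        CachingDeserializer caching = new CachingDeserializer();
        for (int i = 0; i < records; i++) {
            String record = "event-" + i;
            if (cached) {
                caching.accept(record);
            } else {
                perRecord.accept(record);
            }
        }
        return ExpensiveValidator.CONSTRUCTIONS.get();
    }

    public static void main(String[] args) {
        System.out.println("per record: " + constructionsFor(false, 1000)); // prints "per record: 1000"
        System.out.println("cached: " + constructionsFor(true, 1000));      // prints "cached: 1"
    }
}
```

Whether this matters in a real job depends on how expensive the actual
construction is, which is why measuring records/s with and without validation
is still worth doing.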
On Thu, May 27, 2021 at 11:18 AM B.B. wrote:
I am having a problem with sending code, so here it is. I hope it looks ok
now.
This is my main job (some parts of the code are abbreviated; this is the
main part):
*public class MyJob {*
* private StreamExecutionEnvironment env;*
* private static final Integer NUM_OF_PARALLEL_OPERATORS = 1;*
Hi,
I forgot to mention that we are using Flink 1.12.0. The job has only the
minimum components: it reads from the source and prints the records.
Profiling was going to be my next step. Regarding memory, I didn't see any
bottlenecks.
I guess I will have to do some investigating into Flink's metrics.
BR
Hi,
I forgot to mention that we are running Flink 1.12.0.
This is the main function (some parts of the code are abbreviated; this is
the main part). As you can see, the job was simplified to the minimum: just
reading from the source and printing.
[image: Screenshot 2021-05-26 at 08.05.53.png]
And this
Could you share your KafkaDeserializationSchema? We might be able to spot
some optimization potential. You could also try out enableObjectReuse [1],
which avoids copying data between tasks (not sure if you have any
non-chained tasks).
If you are on 1.13, you could check out the flamegraph to see w
Hi,
That's a throughput of 700 records/second, which should be well below the
theoretical limits of any deserializer (from hundreds of thousands up to tens
of millions of records/second per operator), unless your records are
huge or very complex.
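
One way to check this claim for a particular deserializer is to time it in
isolation, outside of Flink. The following is only a rough sketch under stated
assumptions: the class name, the recordsPerSecond helper, and the
String-decoding lambda are hypothetical placeholders for the real
deserialization logic, and a single timing loop like this gives an order of
magnitude, not a rigorous benchmark.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;
import java.util.function.Function;

final class DeserializerThroughputSketch {

    // Results are written here so the JIT cannot discard the loop body.
    static volatile long sink;

    /** Applies the deserializer to the same payload repeatedly; returns records/second. */
    static double recordsPerSecond(Function<byte[], ?> deserializer, byte[] payload, int records) {
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < records; i++) {
            checksum += Objects.hashCode(deserializer.apply(payload));
        }
        long elapsedNanos = Math.max(1, System.nanoTime() - start); // avoid division by zero
        sink = checksum;
        return records * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) {
        // Placeholder payload and deserializer; substitute the real ones here.
        byte[] payload = "{\"id\":1,\"type\":\"event\"}".getBytes(StandardCharsets.UTF_8);
        Function<byte[], String> deserializer = bytes -> new String(bytes, StandardCharsets.UTF_8);

        recordsPerSecond(deserializer, payload, 100_000); // warm-up run, result ignored
        double rate = recordsPerSecond(deserializer, payload, 1_000_000);
        System.out.printf("%.0f records/s%n", rate);
    }
}
```

If the isolated rate is far above 700 records/second, the bottleneck is more
likely elsewhere in the pipeline than in the deserialization itself.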
Long story short, I don't know of a magic bullet to h
Hi,
I am in the process of optimizing my job, which we currently consider
too slow.
We are deploying the job in Kubernetes with 1 job manager with 1 GB RAM and
1 CPU, and 1 task manager with 4 GB RAM and 2 CPUs (i.e. 2 task slots and
a parallelism of 2).
The main problem is one kafka source t