Hi,

I am trying to benchmark a streaming application in Flink. I am using
a source function that reads events from the NYC Taxi Rides data set
(http://training.ververica.com/trainingData/nycTaxiRides.gz), and I
control the emission rate with a busy-wait loop based on
System.nanoTime(). I am not using Thread.sleep() because Java does not
guarantee exactly when the thread will be woken up.

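// Busy-wait (spin) for delayInNanoSeconds instead of sleeping.
public void busySleep() {
    final long startTime = System.nanoTime();
    while (System.nanoTime() - startTime < this.delayInNanoSeconds) {
        // spin until the delay has elapsed
    }
}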

With this, a delay of 10000 nanoseconds between records should give a
workload of 100K rec/sec, 2000 nanoseconds should give 500K rec/sec,
1000 nanoseconds should give 1M rec/sec, and 500 nanoseconds should
give 2M rec/sec.
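
The mapping between delay and rate is simply delay = 1,000,000,000 / rate.
A minimal sketch of that arithmetic follows; the class name RateToDelay
and the method delayNanos are hypothetical and only for illustration,
they are not part of my source function or of Flink.

```java
// Hypothetical helper: converts a target rate into the per-record delay.
final class RateToDelay {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    /** Delay between two records, in nanoseconds, for a rate in records per second. */
    static long delayNanos(long recordsPerSecond) {
        if (recordsPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        // Integer division: rates that do not divide 1e9 evenly are rounded down.
        return NANOS_PER_SECOND / recordsPerSecond;
    }

    public static void main(String[] args) {
        long[] rates = {100_000L, 500_000L, 1_000_000L, 2_000_000L};
        for (long rate : rates) {
            System.out.println(rate + " rec/sec -> " + delayNanos(rate) + " ns");
        }
    }
}
```

This is only the ideal rate. It assumes that reading and emitting a
record takes no time at all.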

The problem that I am facing is that when I set the workload to 1M
rec/sec, the source does not seem to generate events at that rate. I
guess this is because it spends additional time reading the TaxiRide
file or doing other IO operations, or maybe it is some Java limitation.
If I use a message broker instead, I add one more piece of middleware
with its own read/write IO operations, and I guess that would be worse.
What do you recommend for building a controllable benchmark for stream
processing?
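
One thing that may contribute, stated as an assumption and not something
I have measured: if busySleep() is called once per record, it always waits
the full delay on top of the time spent reading and emitting that record,
so the real period per record is delay + work and the rate ends up below
the target. A possible variation is to spin until an absolute deadline
that advances by a fixed period. The sketch below shows that idea; the
class DeadlinePacer and its members are hypothetical names, not a Flink
API, and the clock is passed in only so the behaviour can be checked
without real time.

```java
import java.util.function.LongSupplier;

// Hypothetical helper, not part of Flink or of my source function.
final class DeadlinePacer {
    private final LongSupplier clock;  // e.g. System::nanoTime
    private final long periodNanos;    // target time between two records
    private long nextDeadline;         // absolute time of the next emission

    DeadlinePacer(LongSupplier clock, long periodNanos) {
        this.clock = clock;
        this.periodNanos = periodNanos;
        this.nextDeadline = clock.getAsLong() + periodNanos;
    }

    /** Spins until the next absolute deadline, then schedules the following one. */
    void awaitNext() {
        while (clock.getAsLong() - nextDeadline < 0) {
            // spin; time already spent on the previous record counts toward the period
        }
        nextDeadline += periodNanos;
    }

    public static void main(String[] args) {
        DeadlinePacer pacer = new DeadlinePacer(System::nanoTime, 1_000L);
        final int records = 1_000_000;
        long start = System.nanoTime();
        for (int i = 0; i < records; i++) {
            pacer.awaitNext();
            // read and emit one record here
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("achieved rate: %.0f rec/sec%n", records / seconds);
    }
}
```

This only helps while the work per record is shorter than the period. If
reading and emitting a record takes longer than the period, the loop never
waits and the achievable rate is bounded by the work itself.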

Thanks,
Felipe
--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez
-- https://felipeogutierrez.blogspot.com
