Re: Flink/Kafka POC performance issue

2018-04-17 Thread TechnoMage
Also, I note that none of the operations show any back pressure issues, yet the records out from the Kafka connector slow to a crawl. Are there any known Kafka throughput issues that could be the cause, rather than Flink? I have a Java program that monitors the test that reads all the

Re: Flink/Kafka POC performance issue

2018-04-17 Thread TechnoMage
Also, I note some messages in the log about my Java class not being a valid POJO because it is missing accessors for a field. Would this impact performance significantly?
Michael
> On Apr 17, 2018, at 12:54 PM, TechnoMage wrote:
>
> No checkpoints are active.
> I will try that back end.
> Ye
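For context on the POJO message: Flink's documented POJO rules require, among other things, that every non-static, non-transient field is either public or reachable through a getter and a setter. Classes that fail these rules are handled as generic types and serialized with Kryo, which is usually slower. The sketch below is plain Java reflection, not Flink's own check, and it covers only the accessor rule. `PojoCheck`, `fieldsMissingAccessors`, `Good`, and `Bad` are hypothetical names made up for illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: approximates one of Flink's POJO rules with plain reflection.
final class PojoCheck {

    // Returns the names of declared fields that are neither public nor
    // covered by both a public getter and a public setter.
    static List<String> fieldsMissingAccessors(Class<?> type) {
        List<String> missing = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            int mods = field.getModifiers();
            // Static, transient, synthetic, and public fields need no accessors.
            if (field.isSynthetic() || Modifier.isStatic(mods)
                    || Modifier.isTransient(mods) || Modifier.isPublic(mods)) {
                continue;
            }
            String name = field.getName();
            String suffix = Character.toUpperCase(name.charAt(0)) + name.substring(1);
            // Assumption of this sketch: an "is" prefix also counts as a getter.
            boolean hasGetter = hasPublicMethod(type, "get" + suffix)
                    || hasPublicMethod(type, "is" + suffix);
            boolean hasSetter = hasPublicMethod(type, "set" + suffix, field.getType());
            if (!hasGetter || !hasSetter) {
                missing.add(name);
            }
        }
        return missing;
    }

    private static boolean hasPublicMethod(Class<?> type, String name, Class<?>... params) {
        try {
            type.getMethod(name, params); // finds public methods only
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Example type with a getter and a setter for its private field.
    static class Good {
        private int count;
        public int getCount() { return count; }
        public void setCount(int count) { this.count = count; }
    }

    // Example type with a getter but no setter, like the case in the log message.
    static class Bad {
        private int count;
        public int getCount() { return count; }
    }
}
```

Calling `PojoCheck.fieldsMissingAccessors(MyEvent.class)` on a job's own type (`MyEvent` is a placeholder) lists the fields to fix. Whether fixing them changes throughput in this job is a separate question that only a measurement can answer.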

Re: Flink/Kafka POC performance issue

2018-04-17 Thread TechnoMage
No checkpoints are active. I will try that back end. Yes, I am using a JSONObject subclass for most of the intermediate state, with JSON strings in and out of Kafka. I will look at the config page for how to enable that. Thank you,
Michael
> On Apr 17, 2018, at 12:51 PM, Stephan Ewen wrote:
>
> A f

Re: Flink/Kafka POC performance issue

2018-04-17 Thread Stephan Ewen
A few ideas for how to start debugging this:
- Try deactivating checkpoints. Without them, no work goes into persisting RocksDB data to the checkpoint store.
- Try swapping RocksDB for the FsStateBackend; that reduces the serialization cost of moving data between heap and off-heap (RocksDB).
- Do yo

Re: Flink/Kafka POC performance issue

2018-04-17 Thread TechnoMage
Memory use is steady throughout the job, but CPU utilization drops off a cliff. I assume this is because the job becomes I/O-bound while shuffling managed state. Are there any metrics on managed state that can help in evaluating what to do next?
Michael
> On Apr 17, 2018, at 7:11 AM, Michael Latta

Re: Flink/Kafka POC performance issue

2018-04-17 Thread Michael Latta
Thanks for the suggestion. The task manager is configured for 8 GB of heap and gets to about 8.3 GB total. Other Java processes (job manager and Kafka) add a few more. I will check it again, but the instances have 16 GB, the same as my laptop, which completes the test in under 90 minutes.
Michael

Re: Flink/Kafka POC performance issue

2018-04-16 Thread Niclas Hedhman
Have you checked memory usage? It could be as simple as a memory leak, or aggregating more than you think (it is not always obvious how much is kept in memory, or for how long). If possible, connect FlightRecorder or a similar tool and keep an eye on memory. Additiona