Re: IO benchmarking

2021-03-31 Thread Matthias Pohl
For 2. there are also efforts to expose the state and operator initialization through the logs (see FLINK-17012 [1]). For 3. the TypeSerializer [2] might be another point of interest. It is used to serialize specific types. Other than that, the state serialization depends heavily on the used state b

Re: IO benchmarking

2021-03-31 Thread deepthi Sridharan
Thanks, Matthias. This is very helpful. Regarding the checkpoint documentation, I was mostly looking for information on how states from various tasks get serialized into one (or more?) files on persistent storage. I'll check out the code pointers! On Wed, Mar 31, 2021 at 7:07 AM Matthias Pohl wr

Re: IO benchmarking

2021-03-31 Thread Matthias Pohl
Hi Deepthi, 1. Have you had a look at flink-benchmarks [1]? I haven't used it but it might be helpful. 2. Unfortunately, Flink doesn't provide metrics like that. But you might want to follow FLINK-21736 [2] for future developments. 3. Is there anything specific you are looking for? Unfortunately, I