Re: Exceptions when launching counts on a Flink DataSet concurrently

2019-05-02 Thread Juan Rodríguez Hortalá
Thanks for your answer Fabian. In my opinion this is not just a possible new feature for an optimization, but a bigger problem because the client program crashes with an exception when concurrent counts or collects are triggered on the same data set, and this also happens non deterministically dep

Re: Exceptions when launching counts on a Flink DataSet concurrently

2019-04-29 Thread Fabian Hueske
Hi Juan, count() and collect() trigger the execution of a job. Since Flink does not cache intermediate results (yet), all operations from the sink (count()/collect()) to the sources are executed. So in a sense a DataSet is immutable (given that the input of the sources do not change) but completel

Re: Exceptions when launching counts on a Flink DataSet concurrently

2019-04-26 Thread Juan Rodríguez Hortalá
Hi Timo, Thanks for your answer. I was surprised to have problems calling those methods concurrently, because I though data sets were immutable. Now I understand calling count or collect mutates the data set, not its contents but some kind of execution plan included in the data set. I suggest add

Re: Exceptions when launching counts on a Flink DataSet concurrently

2019-04-26 Thread Timo Walther
Hi Juan, as far as I know we do not provide any concurrency guarantees for count() or collect(). Those methods need to be used with caution anyways as the result size must not exceed a certain threshold. I will loop in Fabian who might know more about the internals of the execution? Regards,

Re: Exceptions when launching counts on a Flink DataSet concurrently

2019-04-25 Thread Juan Rodríguez Hortalá
Any thoughts on this? On Sun, Apr 7, 2019, 6:56 PM Juan Rodríguez Hortalá < juan.rodriguez.hort...@gmail.com> wrote: > Hi, > > I have a very simple program using the local execution environment, that > throws NPE and other exceptions related to concurrent access when launching > a count for a Dat

Exceptions when launching counts on a Flink DataSet concurrently

2019-04-07 Thread Juan Rodríguez Hortalá
Hi, I have a very simple program using the local execution environment, that throws NPE and other exceptions related to concurrent access when launching a count for a DataSet from different threads. The program is https://gist.github.com/juanrh/685a89039e866c1067a6efbfc22c753e which is basically t