Yes, if that's the case, I think you should go with option (2) and run with
the checksums.

On Thu, Oct 6, 2016 at 10:32 AM, Flavio Pompermaier
<pomperma...@okkam.it> wrote:
> The problem is that the data is very large, so the job usually cannot run
> on a single machine :(
>
> On Thu, Oct 6, 2016 at 10:11 AM, Ufuk Celebi <u...@apache.org> wrote:
>>
>> On Wed, Oct 5, 2016 at 7:08 PM, Tarandeep Singh <tarand...@gmail.com>
>> wrote:
>> > @Stephan my Flink cluster setup: 5 nodes, each running 1 TaskManager.
>> > Slots per TaskManager: 2-4 (I tried varying this to see if it has any
>> > impact).
>> > Network buffers: 5k - 20k (I tried different values for it).
>>
>> Could you run the job first on a single task manager to see if the
>> error occurs even if no network shuffle is involved? That should be
>> less overhead for you than running the custom build (which might be
>> buggy ;)).
>>
>> – Ufuk
>
