Re: Exception while running Flink jobs (1.0.0)

Ufuk Celebi Wed, 12 Oct 2016 02:28:16 -0700

No, sorry. I was waiting for Tarandeep's feedback before looking into
it further. I will do it over the next days in any case.


On Wed, Oct 12, 2016 at 10:49 AM, Flavio Pompermaier
<pomperma...@okkam.it> wrote:
> Hi Ufuk,
> any news on this?
>
> On Thu, Oct 6, 2016 at 1:30 PM, Ufuk Celebi <u...@apache.org> wrote:
>>
>> I guess that this is caused by a bug in the checksum calculation. Let
>> me check that.
>>
>> On Thu, Oct 6, 2016 at 1:24 PM, Flavio Pompermaier <pomperma...@okkam.it>
>> wrote:
>> > I've run the job once more (always using the checksum branch) and this
>> > time I got:
>> >
>> > Caused by: java.lang.ArrayIndexOutOfBoundsException: 1953786112
>> >     at org.apache.flink.api.common.typeutils.base.EnumSerializer.deserialize(EnumSerializer.java:83)
>> >     at org.apache.flink.api.common.typeutils.base.EnumSerializer.deserialize(EnumSerializer.java:32)
>> >     at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:431)
>> >     at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.deserialize(TupleSerializer.java:135)
>> >     at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.deserialize(TupleSerializer.java:30)
>> >     at org.apache.flink.runtime.io.disk.ChannelReaderInputViewIterator.next(ChannelReaderInputViewIterator.java:100)
>> >     at org.apache.flink.runtime.operators.sort.MergeIterator$HeadStream.nextHead(MergeIterator.java:161)
>> >     at org.apache.flink.runtime.operators.sort.MergeIterator.next(MergeIterator.java:113)
>> >     at org.apache.flink.runtime.operators.util.metrics.CountingMutableObjectIterator.next(CountingMutableObjectIterator.java:45)
>> >     at org.apache.flink.runtime.util.NonReusingKeyGroupedIterator.advanceToNext(NonReusingKeyGroupedIterator.java:130)
>> >     at org.apache.flink.runtime.util.NonReusingKeyGroupedIterator.access$300(NonReusingKeyGroupedIterator.java:32)
>> >     at org.apache.flink.runtime.util.NonReusingKeyGroupedIterator$ValuesIterator.next(NonReusingKeyGroupedIterator.java:192)
>> >     at org.okkam.entitons.mapping.flink.IndexMappingExecutor$TupleToEntitonJsonNode.reduce(IndexMappingExecutor.java:64)
>> >     at org.apache.flink.runtime.operators.GroupReduceDriver.run(GroupReduceDriver.java:131)
>> >     at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:486)
>> >     at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:351)
>> >     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:585)
>> >     at java.lang.Thread.run(Thread.java:745)
>> >
>> >
>> > On Thu, Oct 6, 2016 at 11:00 AM, Ufuk Celebi <u...@apache.org> wrote:
>> >>
>> >> Yes, if that's the case you should go with option (2) and run with the
>> >> checksums I think.
>> >>
>> >> On Thu, Oct 6, 2016 at 10:32 AM, Flavio Pompermaier
>> >> <pomperma...@okkam.it> wrote:
>> >> > The problem is that the data is very large and the job usually cannot
>> >> > run on a single machine :(
>> >> >
>> >> > On Thu, Oct 6, 2016 at 10:11 AM, Ufuk Celebi <u...@apache.org> wrote:
>> >> >>
>> >> >> On Wed, Oct 5, 2016 at 7:08 PM, Tarandeep Singh <tarand...@gmail.com>
>> >> >> wrote:
>> >> >> > @Stephan My Flink cluster setup: 5 nodes, each running 1 TaskManager.
>> >> >> > Slots per TaskManager: 2-4 (I tried varying this to see if it has
>> >> >> > any impact).
>> >> >> > Network buffers: 5k - 20k (I tried different values for it).
>> >> >>
>> >> >> Could you run the job first on a single task manager to see if the
>> >> >> error occurs even if no network shuffle is involved? That should be
>> >> >> less overhead for you than running the custom build (which might be
>> >> >> buggy ;)).
>> >> >>
>> >> >> – Ufuk
>> >> >
>> >> >
>> >> >
>> >> >
>> >
>> >
>> >
>
>
>
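
A side note on the index in the quoted trace: 1953786112 is 0x74746900, which
read big-endian is the byte sequence 't', 't', 'i', 0x00. That looks more like
a fragment of text than an enum ordinal, which would be consistent with the
deserializer reading from a corrupted or misaligned position in the spilled
data. This is only an observation, not a confirmed diagnosis. The sketch below
assumes, as the trace suggests, that the enum value is looked up by a 4-byte
integer index; the class CorruptOrdinalSketch, the Color enum and the helper
names are hypothetical and are not part of Flink.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CorruptOrdinalSketch {

    // Hypothetical enum standing in for the enum field of the job's POJO.
    enum Color { RED, GREEN, BLUE }

    // Renders the four big-endian bytes of an int as text;
    // non-printable bytes are shown as '.'.
    static String asAscii(int value) {
        byte[] bytes = ByteBuffer.allocate(4).putInt(value).array();
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(b >= 32 && b < 127 ? (char) b : '.');
        }
        return sb.toString();
    }

    // Simplified ordinal-based lookup (an assumption about the mechanism):
    // read a 4-byte int and use it as an index into the enum constants.
    static Color readEnum(ByteBuffer in) {
        return Color.values()[in.getInt()];
    }

    public static void main(String[] args) {
        int index = 1953786112; // index reported in the exception
        System.out.println(Integer.toHexString(index)); // prints 74746900
        System.out.println(asAscii(index));             // prints tti.

        // Four bytes of text followed by a valid ordinal (1 = GREEN).
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.put("tti".getBytes(StandardCharsets.US_ASCII)).put((byte) 0).putInt(1);
        buf.flip();

        // Reading at the wrong offset interprets text bytes as the ordinal.
        try {
            readEnum(buf.duplicate());
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("misaligned read failed: " + e.getMessage());
        }

        // Reading at the correct offset yields the intended constant.
        buf.position(4);
        System.out.println(readEnum(buf)); // prints GREEN
    }
}
```

If the bogus index in other failing runs also decodes to readable text, that
would point towards misplaced reads rather than random bit flips, but a single
sample is not enough to tell.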

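For readers wondering what "running with the checksums" buys: the general idea
is to store a checksum next to each block of data when it is written and to
recompute it when the block is read back, so that corruption is reported where
it happens instead of surfacing later as a confusing deserialization error.
The sketch below illustrates that idea with java.util.zip.CRC32. It is not the
code of the checksum branch discussed in this thread; the class
BlockChecksumSketch, the block layout and the method names are assumptions
made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class BlockChecksumSketch {

    // Frames a payload as [8-byte CRC32 value][payload bytes].
    // (Hypothetical layout, chosen only to keep the example short.)
    static byte[] writeBlock(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(8 + payload.length)
                .putLong(crc.getValue())
                .put(payload)
                .array();
    }

    // Recomputes the checksum over the payload and returns the payload,
    // or throws if the stored and recomputed values differ.
    static byte[] readBlock(byte[] block) {
        ByteBuffer in = ByteBuffer.wrap(block);
        long expected = in.getLong();
        byte[] payload = new byte[in.remaining()];
        in.get(payload);
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        if (crc.getValue() != expected) {
            throw new IllegalStateException("checksum mismatch: block is corrupt");
        }
        return payload;
    }

    public static void main(String[] args) {
        byte[] block = writeBlock("some spilled records".getBytes(StandardCharsets.UTF_8));

        // An intact block round-trips.
        System.out.println(new String(readBlock(block), StandardCharsets.UTF_8));

        // Flip one bit in the payload to simulate corruption on disk or in transit.
        block[10] ^= 0x01;
        try {
            readBlock(block);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check like this only tells you that a block changed between write and read.
It does not help if the data was already wrong before the checksum was
computed, and a bug in the checksum code itself can of course produce false
alarms.
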