RE: Buffer re-ordering problem

2015-06-09 Thread Kruse, Sebastian
iters. It is always possible that a bug in there causes the serialization to affect the buffer transport. Thanks for the pointer, we will definitely look into that. On Thu, Jun 4, 2015 at 1:07 PM, Kruse, Sebastian wrote: > Thanks for your feedback. I am neither running IPSec nor the

RE: Buffer re-ordering problem

2015-06-04 Thread Kruse, Sebastian
this as well. – Ufuk On Wed, Jun 3, 2015 at 11:07 AM, Kruse, Sebastian wrote: > I am currently using 0.9-SNAPSHOT. All the non-jar files are from an older build, but I recently manually updated the flink-dist.jar with commit d163a

RE: Buffer re-ordering problem

2015-06-03 Thread Kruse, Sebastian
.@apache.org] Sent: Wednesday, June 3, 2015 10:33 To: dev@flink.apache.org Subject: Re: Buffer re-ordering problem This is a critical bug. - Which version are you using? If snapshot, which commit? - What is your setup? Number of machines, dataset, etc.? - Is it reproducible? On Wednesday, June 3, 2015, Kru

Buffer re-ordering problem

2015-06-03 Thread Kruse, Sebastian
Hi everyone, I had some jobs running overnight, and in two of them the following exception occurred after about half an hour. Do you know why this happens? Thanks, Sebastian tenem16.hpi.uni-potsdam.de Error: java.lang.Exception: The data preparation for task 'CHAIN GroupReduce (GroupReduce

RE: Changed the behavior of "DataSet.print()"

2015-05-28 Thread Kruse, Sebastian
> Hi Sebastian, thank you for the feedback. I agree that both variants have a right to exist. I would vote for adding a

RE: Changed the behavior of "DataSet.print()"

2015-05-28 Thread Kruse, Sebastian
Hi everyone, I am a bit worried about the recent change to the print() method. I can understand the rationale that obtaining the stdout from all the taskmanagers is cumbersome (although the old print() was fine for local debugging). However, a major problem I see with the new print() is th

RE: NioEventLoop consumes most of the CPU

2015-05-06 Thread Kruse, Sebastian
; > thread? > The former is from Flink's data network thread, the latter from Akka. > - Is your job data heavy (data transfer is in progress most of the time), or is it compute heavy (network is not fully utilized)? > Than

NioEventLoop consumes most of the CPU

2015-05-05 Thread Kruse, Sebastian
Hi everyone, Every time I run jvisualvm on one of the machines in our cluster during a Flink job, I see that NioEventLoop.select() takes 50% to 70% of the CPU self-time. I wonder how severe this is. It might be busy-waiting time that cannot be filled otherwise, but I wanted to ask you i

RE: Gzip support

2015-05-04 Thread Kruse, Sebastian
format to have one input split per file (rather than block). On Thu, Apr 30, 2015 at 5:41 PM, Kruse, Sebastian wrote: > Hi everyone, I just recently came across a use-case where I needed to read gzip files and handle byte order marks tran

Gzip support

2015-04-30 Thread Kruse, Sebastian
Hi everyone, I just recently came across a use-case where I needed to read gzip files and handle byte order marks transparently. I know that gzip can be read with Hadoop input formats but that did not work for me since I wanted to reuse my existing custom Flink input formats. It turned out tha

RE: Gather a distributed dataset

2015-01-27 Thread Kruse, Sebastian
Hi everyone, I just wanted to point you to FLINK-1038: https://github.com/apache/flink/pull/94. This is an output format that can send DataSet contents via Java RMI to, e.g., the driver. I am currently using it a lot and it seems to scale pretty well. Cheers, Sebastian -Original Mess