Re: Streaming + Window not working when ran using direct runner

2019-09-20 Thread Reza Rokni
Hi, Are you are using a very small volume of PubSub messages for testing purposes, for example manually sending just one message to see it work its way through? Don't have the Jira number to hand but if I recall the the PubSubIO connector for the directrunner had a issue around moving the waterma

Re: Streaming + Window not working when ran using direct runner

2019-09-20 Thread Ankur Goenka
It seems that all the worker threads are blocked on reading data from PubSub. Can you try pulling a few pub sub messages from your command line using gcloud beta pubsub subscriptions pull --limit=1 Example thread stack "direct-runner-worker" #19 prio=5 os_prio=31 cpu=5955.37ms elapsed=1441.13s

Re: Streaming + Window not working when ran using direct runner

2019-09-20 Thread Daniel Amadei
Hi Ankur, I tried to do that but nothing spot to my eyes as the culprit. Stack trace attached in case you want to give a look. Thanks Daniel Em sex, 20 de set de 2019 às 14:44, Ankur Goenka escreveu: > On the surface, pipeline looks fine. > As this is a java pipeline running locally, I would r

Re: Streaming + Window not working when ran using direct runner

2019-09-20 Thread Ankur Goenka
On the surface, pipeline looks fine. As this is a java pipeline running locally, I would recommend checking the thread stack using jstack and see where is the process stuck. On Fri, Sep 20, 2019 at 10:39 AM Daniel Amadei wrote: > Hi all, > > I'm trying to run the following code using direct ru

Streaming + Window not working when ran using direct runner

2019-09-20 Thread Daniel Amadei
Hi all, I'm trying to run the following code using direct runner, consuming from a pub sub topic and sending aggregate data to bigquery. When ran locally using DirectRunner, the code gets stuck after the window. If ran using DataflowRunner, it runs successfully. Any ideas? Thanks Daniel Amadei

Re: # of Readers < MaxNumWorkers for UnboundedSource. Causing deadlocks

2019-09-20 Thread Jan Lukavský
Yes, that should be fine. But what do you mean by re-entrant in this context? All accesses to reader should be single-threaded. On 9/20/19 6:11 PM, Ken Barr wrote: Is the IO SDK re-entrant? Is it safe to call advance() from within start()? On 2019/09/19 14:57:09, Jan Lukavský wrote: Hi Ken,

Re: # of Readers < MaxNumWorkers for UnboundedSource. Causing deadlocks

2019-09-20 Thread Ken Barr
Is the IO SDK re-entrant? Is it safe to call advance() from within start()? On 2019/09/19 14:57:09, Jan Lukavský wrote: > Hi Ken, > > I have seen some deadlock behavior with custom sources earlier > (different runner, but that might not be important). Some lessons learned: > >  a) please ma

Re: Word-count example

2019-09-20 Thread Matthew Patterson
411: Although it does not seem correct, adding a symbolic link to `virtualenv` in ` /sdks/python` solves the issue of not finding `virtualenv`. From: Matthew Patterson Reply-To: "user@beam.apache.org" Date: Thursday, September 19, 2019 at 2:56 PM To: "user@beam.apache.org" Subject: Re: Word-c

Re: Submitting job to AWS EMR fails with error=2, No such file or directory

2019-09-20 Thread Yu Watanabe
Thank you for the reply. > Why not manually docker pull the image (remember to adjust the container location to omit to registry) locally first? Actually, when using docker, I have already pulled the image from remote repository (GCR). Is there a way to make Flink Runner to not call "docker p

Re: Submitting job to AWS EMR fails with error=2, No such file or directory

2019-09-20 Thread Benjamin Tan
Seems like some file is missing. Why not manually docker pull the image (remember to adjust the container location to omit to registry) locally first? At least you can eliminate another source of trouble. Also, depending on your pipeline, if you are running distributed, you must make sure your

Re: KinesisIO.write throws NPE during producer.flush();

2019-09-20 Thread Alexey Romanenko
Hi Ankit, I guess it can be a bug there. Let me check this out. > On 20 Sep 2019, at 00:12, Ankit Jhalaria wrote: > > Hey beam devs, > > I am using beam 2.15 and while doing KinesisIO.write() getting a NPE. > This is how I am using it: > KinesisIO.write() > .withStreamName(“streamName") >

Re: Submitting job to AWS EMR fails with error=2, No such file or directory

2019-09-20 Thread Yu Watanabe
Ankur Thank you for the advice. You're right. Looking at the task manager's log, looks like first "docker pull" fails from yarn user and then couple of errors comes after. As a result, "docker run" seems to fail. I have been working on whole week and still not manage through from yarn session to

Re: Submitting job to AWS EMR fails with error=2, No such file or directory

2019-09-20 Thread Yu Watanabe
Benjamin. boot command also failed when process is started from yarn (aws emr flink). Below is the log from TaskManager on worker node. ===-- 2019-09-19 06:40:01,244 INFO org.apache.flink.runtime.taskmanager.Task