Re: Samza container hang on exception

2016-09-06 Thread Yi Pan
Hi, Sining, I took a look at your log and stack traces and want to clarify two points: 1) Based on the log, it seems that your container actually exited instead of hanging, which is the expected behavior as of 0.10.1 (retry X times, then error out in the SamzaContainer RunLoop). 2) The Kafka producer c

Re: Samza container hang on exception

2016-09-02 Thread 李斯宁
Yes, upgraded to 0.10.1. jstack: https://drive.google.com/open?id=0B19olQZ1dUO8VjltQmtxLTJ4SVdFZWhYWHZ3Y2hMOVhCMWNn task log: https://drive.google.com/open?id=0B19olQZ1dUO8eVRLWmJCVl9nRlg2UUM4c21udUViWW8tSUVV On Fri, Sep 2, 2016 at 4:41 PM, Yi Pan wrote: > Hi, Sining, > > Your note is on a site t

Re: Samza container hang on exception

2016-09-02 Thread Yi Pan
Hi, Sining, Your note is on a site where I don't have an account or access, and it requires sign-up. Can you share it via Google Docs, since you have a Gmail account? And just to confirm, you have upgraded and are using 0.10.1 now, right? Thanks, and apologies for the delay. -Yi On Fri, Sep 2, 2016 at 1:03 AM,

Re: Samza container hang on exception

2016-09-02 Thread 李斯宁
Can anyone help with this? Thanks! On Thu, Sep 1, 2016 at 11:59 AM, 李斯宁 wrote: > If you cannot see the attachment, please try > http://note.youdao.com/noteshare?id=56b826c24af47a9fdb600490ce788710 > > On Thu, Sep 1, 2016 at 1:50 AM, Chinmay Soman > wrote: > >> Sorry, I don't see anything in the att

Re: Samza container hang on exception

2016-08-31 Thread 李斯宁
If you cannot see the attachment, please try http://note.youdao.com/noteshare?id=56b826c24af47a9fdb600490ce788710 On Thu, Sep 1, 2016 at 1:50 AM, Chinmay Soman wrote: > Sorry, I don't see anything in the attachment. Can you please re-attach and > re-send? > > On Wed, Aug 31, 2016 at 3:27 AM, 李斯宁 w

Re: Samza container hang on exception

2016-08-31 Thread Chinmay Soman
Sorry, I don't see anything in the attachment. Can you please re-attach and re-send? On Wed, Aug 31, 2016 at 3:27 AM, 李斯宁 wrote: > It seems upgrading did not solve the problem. All tasks hung during today's > "rush hour". > I attached the log and jstack output. > > SAMZA-911 is meant to fix this by stopping the proces

Re: Samza container hang on exception

2016-08-31 Thread 李斯宁
It seems upgrading did not solve the problem. All tasks hung during today's "rush hour". I attached the log and jstack output. SAMZA-911 is meant to fix this by stopping the process if it fails too many times, but the process is still there and hanging. On Mon, Aug 22, 2016 at 1:14 PM, 李斯宁 wrote: > Thanks so much,

Re: Samza container hang on exception

2016-08-21 Thread 李斯宁
Thanks so much, I'll try. On Mon, Aug 22, 2016 at 6:26 AM, Yi Pan wrote: > Hi, Sining, > > This is a known bug that is fixed in 0.10.1 (SAMZA-911). Please try to > upgrade to 0.10.1. > > Thanks! > > -Yi > > On Sun, Aug 21, 2016 at 5:55 AM, 李斯宁 wrote: > > > I have tried restarting every Kafka serve

Re: Samza container hang on exception

2016-08-21 Thread Yi Pan
Hi, Sining, This is a known bug that is fixed in 0.10.1 (SAMZA-911). Please try to upgrade to 0.10.1. Thanks! -Yi On Sun, Aug 21, 2016 at 5:55 AM, 李斯宁 wrote: > I have tried restarting every Kafka server. The container did not recover. > > The log contains the following: > > 2016-08-21 20:08:21 [WARN

Re: Samza container hang on exception

2016-08-21 Thread 李斯宁
I have tried restarting every Kafka server. The container did not recover. The log contains the following: 2016-08-21 20:08:21 [WARN ](o.a.s.s.k.KafkaSystemProducer :66 ) Retrying send messsage due to RetriableException - org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is

Re: Samza container hang on exception

2016-08-21 Thread 李斯宁
The end of the container's log shows the following: 2016-08-21 19:57:01 [WARN ](o.a.s.s.k.KafkaSystemProducer :66 ) Retrying send messsage due to RetriableException - org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.. Turn on

Samza container hang on exception

2016-08-21 Thread 李斯宁
Hi, guys, I'm using Samza for real-time processing. After running for about 10 hours, some containers paused and stopped processing. When I looked into the log, I found a lot of entries like this: 2016-08-21 10:03:07 [WARN ](o.a.k.c.p.i.Sender :257) Got error produce response with correlation id 490345 on top