You should probably return a valid SystemAdmin object, but returning null for SystemConsumer should be OK. Again, two questions:

1) Did the container hang during the shutdown, or did it just crash with an exception? Since stderr does not show anything, I was assuming that the container hangs.
2) If the container hangs, could you take a thread dump?
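A note on the thread dump requested above: the usual route is to run `jstack <pid>` against the container process, or to send it `kill -3 <pid>`, which makes the JVM print a dump to its stdout. Where neither is convenient, a dump can also be produced from inside the JVM. The following is a minimal sketch using only standard JDK calls; the class name `ThreadDumpSketch` and its method are hypothetical and not part of Samza.

```java
import java.util.Map;

// Hypothetical helper (not part of Samza): renders a stack trace for every live thread.
class ThreadDumpSketch {

    // Builds a plain-text dump: one header line per thread, followed by its stack frames.
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append('"').append(t.getName()).append('"')
               .append(" daemon=").append(t.isDaemon())
               .append(" state=").append(t.getState())
               .append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
    }
}
```

Calling something like this at the start of a stop() method, or from a diagnostic thread, would show which thread is blocked and where, which is the information a hang investigation needs.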
Thanks!

-Yi

On Tue, Jan 17, 2017 at 1:50 AM, 舒琦 <sh...@eefung.com> wrote:

> Hi,
>
> My SystemFactory implementation returns null for both 『getConsumer』 and 『getAdmin』. Is this the cause of the problem?
>
> Thanks.
>
> ————————
> 舒琦
> Address: 6F, Unit 1, Building A4, Lugu Enterprise Plaza, 27 Wenxuan Road, Yuelu District, Changsha
> Website: http://www.eefung.com
> Weibo: http://weibo.com/eefung
> Postal code: 410013
> Phone: 400-677-0986
> Fax: 0731-88519609
>
> > On Jan 17, 2017, at 17:18, Yi Pan <nickpa...@gmail.com> wrote:
> >
> > Hi, Qi,
> >
> > In your log, the output stops at "closing simple consumer...". That line is part of the shutdownConsumers() method in the shutdown sequence. Are you sure that the container process actually proceeds further in the shutdown sequence? If the container process does not proceed further (i.e. it is somehow stuck at a step before the shutdownProducers() method), your producer's stop() method will not be executed. I noticed that your log file does not even contain the line "Shutting down task instance stream tasks.", which means your program did not even execute shutdownTasks() in the shutdown sequence (right after shutdownConsumers()). Since no exception is reported in your stderr either, can you check your implementation of HStoreSystemConsumer to see whether the consumer hangs on shutdown? A thread dump would be super helpful here.
> >
> > On Sun, Jan 15, 2017 at 11:30 PM, 舒琦 <sh...@eefung.com> wrote:
> > Hi,
> >
> > Thanks for your help.
> >
> > Here are 2 questions:
> >
> > 1. I have defined my own HDFS producer, which implements SystemProducer and overrides the stop method (I log something in the first line of the stop method), but when I kill the app, the log line is not printed. The tricky thing is that the logic defined in the stop method is sometimes executed and sometimes not.
> >
> > Below is the stop method:
> >
> > @Override
> > public void stop() {
> >     try {
> >         LOGGER.info("Begin to close files");
> >         closeFiles();
> >     } catch (IOException e) {
> >         LOGGER.error("Error when close Files", e);
> >     }
> >
> >     if (fs != null) {
> >         try {
> >             fs.close();
> >         } catch (IOException e) {
> >             //do nothing
> >         }
> >     }
> > }
> >
> > Below is the log:
> >
> > 2017-01-16 15:13:35.273 [Thread-9] SamzaContainer [INFO] Shutting down, will wait up to 5000 ms
> > 2017-01-16 15:13:35.284 [main] SamzaContainer [INFO] Shutting down.
> > 2017-01-16 15:13:35.285 [main] SamzaContainer [INFO] Shutting down consumer multiplexer.
> > 2017-01-16 15:13:35.287 [main] BrokerProxy [INFO] Shutting down BrokerProxy for 172.19.105.22:9096
> > 2017-01-16 15:13:35.288 [main] BrokerProxy [INFO] closing simple consumer...
> > 2017-01-16 15:13:35.340 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at 172.19.105.22:9096 for client samza_consumer-canal_status_persistent_hstore-1] BrokerProxy [INFO] Got interrupt exception in broker proxy thread.
> > 2017-01-16 15:13:35.340 [main] BrokerProxy [INFO] Shutting down BrokerProxy for 172.19.105.21:9096
> > 2017-01-16 15:13:35.341 [main] BrokerProxy [INFO] closing simple consumer…
> >
> > You can see that the log line “Begin to close files” is not printed, so the logic in the stop method is not executed.
> >
> > 2. The Hadoop cluster I use is “HDP-2.5.0”. Log aggregation is enabled, but the container logs are not collected; only the AM log can be seen.
> >
> > ————————
> > ShuQi
> >
> >> On Jan 16, 2017, at 10:39, Liu Bo <diabl...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> *container log will be removed automatically,*
> >>
> >> You can turn on YARN log aggregation, so that the logs of terminated YARN jobs are dumped to HDFS.
> >>
> >> On 14 January 2017 at 07:44, Yi Pan <nickpa...@gmail.com> wrote:
> >>
> >>> Hi, Qi,
> >>>
> >>> Sorry for the late reply. I am curious about your comment that the close and stop methods are not called. When a user initiates a kill request, the graceful shutdown sequence is triggered by the shutdown hook added to SamzaContainer. The shutdown sequence in the code is the following:
> >>> {code}
> >>> info("Shutting down.")
> >>>
> >>> shutdownConsumers
> >>> shutdownTask
> >>> shutdownStores
> >>> shutdownDiskSpaceMonitor
> >>> shutdownHostStatisticsMonitor
> >>> shutdownProducers
> >>> shutdownLocalityManager
> >>> shutdownOffsetManager
> >>> shutdownMetrics
> >>> shutdownSecurityManger
> >>>
> >>> info("Shutdown complete.")
> >>> {code}
> >>>
> >>> in which MessageChooser.stop() is invoked in shutdownConsumers, and SystemProducer.close() is invoked in shutdownProducers.
> >>>
> >>> Could you explain why you are not able to shut down a Samza job gracefully?
> >>>
> >>> Thanks!
> >>>
> >>> -Yi
> >>>
> >>> On Mon, Dec 12, 2016 at 6:33 PM, 舒琦 <sh...@eefung.com> wrote:
> >>>
> >>>> Hi Guys,
> >>>>
> >>>> How can I stop a running Samza job gracefully, other than by killing it?
> >>>>
> >>>> When a Samza job is killed, the close and stop methods in BaseMessageChooser and SystemProducer are not called and the container log is removed automatically. How can I resolve this?
> >>>>
> >>>> Thanks.
> >>>>
> >>>> ————————
> >>>> ShuQi
> >>>
> >>
> >>
> >>
> >> --
> >> All the best
> >>
> >> Liu Bo
> >
> >
> >
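The thread describes an ordered shutdown sequence (shutdownConsumers first, shutdownProducers several steps later) and a log line saying the container "will wait up to 5000 ms". One way to picture why a producer's stop() may run on some kills and not on others is a sequence of steps raced against a wait limit: if an early step blocks past the limit, later steps never get their turn. The sketch below is a simplified model of that idea in plain Java. It is not Samza code; the class `ShutdownSequenceSketch`, its methods, and the timings are hypothetical, and whether SamzaContainer enforces its wait in exactly this way is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model (not Samza code): ordered shutdown steps raced against a wait limit.
class ShutdownSequenceSketch {

    // Runs the steps in insertion order on a worker thread, waits at most timeoutMs,
    // and returns the names of the steps that finished within that time.
    static List<String> runWithTimeout(Map<String, Runnable> steps, long timeoutMs) {
        List<String> completed = Collections.synchronizedList(new ArrayList<>());
        Thread worker = new Thread(() -> {
            for (Map.Entry<String, Runnable> step : steps.entrySet()) {
                step.getValue().run();
                completed.add(step.getKey());
            }
        });
        worker.setDaemon(true); // a stuck step must not keep the JVM alive
        worker.start();
        try {
            worker.join(timeoutMs); // stands in for "will wait up to 5000 ms"
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(completed);
    }

    // Sleeps without forcing callers to handle the checked exception.
    static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Map<String, Runnable> steps = new LinkedHashMap<>();
        // Simulates a consumer that is slow to close.
        steps.put("shutdownConsumers", () -> sleepQuietly(2000));
        steps.put("shutdownProducers", () -> System.out.println("producer stop() called"));
        System.out.println("completed: " + runWithTimeout(steps, 200));
    }
}
```

In this model the producer step only runs when every earlier step returns before the limit expires, which matches the advice in the thread: find out first whether the consumer shutdown hangs, since nothing after it can execute until it returns.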