Thanks Abhishek, I believe you are correct; removing the shutdown hook from inside the container was the problem. I took the shutdown hook code removed from SamzaContainer in your commit #83e152904ef5 and pulled it out to our App Runner, calling LocalApplicationRunner.kill() and then LocalApplicationRunner.waitForFinish(timeout) and I think that has restored all of the shutdown sequencing.
Where do you think this belongs (in the scope of SAMZA-2426 that you created)? Maybe in LocalApplicationRunner itself? You said you already took care of shutdown sequence in cluster mode, yes? I'm open to helping on this one, just let me know. thanks, Thunder On Tue, Jan 7, 2020 at 7:12 PM Abhishek S <abks...@gmail.com> wrote: > The rational behind moving the shutdown handler out of SamzaContainer was > to let standalone jobs (which are usually part of other online > applications) maintain their own shutdown hooks. > This prevents Samza hooks from running in parallel or causing a deadlock > with the shutdown hooks of parent application that uses Samza in standalone > mode (as a library). > > That being said, I agree that Containers should attempt graceful shutdown > and wait at-most "task.shutdown.ms" on SIGTERM. > I created SAMZA-2426 to investigate the issue further and track the work > required. > > Abhishek > > > > > > > On Tue, Jan 7, 2020 at 1:55 PM Brett Konold <bkon...@linkedin.com.invalid> > wrote: > > > Thunder, > > > > How were you able to determine that the shutdown hooks are not being > > called? > > > > If you're able to share any of your shutdown logs from before and after > > your 1.3 upgrade, that would be helpful while I try to reproduce this > issue > > myself. > > > > Brett > > ________________________________ > > From: Brett Konold <bkon...@linkedin.com> > > Sent: Tuesday, January 7, 2020 1:25 PM > > To: dev@samza.apache.org <dev@samza.apache.org> > > Subject: Re: Problem: upgrade 1.2 to 1.3 caused loss of clean shutdown on > > SIGTERM > > > > Hey Thunder, > > > > Thanks for reporting. Taking a look into this and will get back to you > > when I have something. > > > > Brett > > ________________________________ > > From: Thunder Stumpges <thunder.stump...@gmail.com> > > Sent: Monday, January 6, 2020 6:54 PM > > To: dev@samza.apache.org <dev@samza.apache.org> > > Subject: Problem: upgrade 1.2 to 1.3 caused loss of clean shutdown on > > SIGTERM > > > > We are attempting to upgrade from samza 1.2 to 1.3 in hopes of fixing > > > > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.apache.org%2Fjira%2Fbrowse%2FSAMZA-2198&data=02%7C01%7Cbkonold%40linkedin.com%7C3d5e0f0bf97e4cc633de08d7931d06ad%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637139625187723288&sdata=uwKMfKxQmS9uWS62Hew814P%2Fja7FpaViNTVzYrp03Ok%3D&reserved=0 > > where there was a deadlock > > in the shutdown code which prevented completing a clean shutdown. > > > > After the upgrade, it appears like NONE of the shutdown hooks / code are > > being called and it just immediately shuts down. > > > > We are running stand alone Low Level Tasks with LocalApplicationRunner / > > ZkJobCoordinator in Docker / K8s. > > > > When killing from docker for testing, we use docker kill -s SIGTERM > > <container> to send SIGTERM instead of SIGKILL. This was working in samza > > 1.2 (other than the deadlock from the above issue). > > > > Any ideas what changed? > > > > Thanks, > > Thunder > > >