Yes, I think your hunch is right. Each container queries the AM over HTTP to obtain the jobModel that it is supposed to run. The AM runs an HTTP server, usually on a dynamically allocated free port on the machine it's running on. So it's possible that a firewall rule is blocking the container when it tries to reach this port on the AM's machine.
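If it helps to narrow this down, one way to check is a plain TCP connect from the container's host to the AM's host and port. This is only a rough sketch, assuming Python is available on that machine; `port_reachable` is a name I made up, and the host and port in the comment are placeholders that you would replace with the values from your own AM.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    A refused connection and a timeout (for example, a firewall silently
    dropping packets) both come back as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, socket.timeout, unreachable hosts
        # and name-resolution failures.
        return False


# Hypothetical usage; substitute the AM's real host and HTTP port:
# print(port_reachable("am-host.example.com", 12345))
```

If this returns False from the container's machine but True when run on the AM's own machine, that would point toward a network rule rather than Samza itself.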
--
thanks
rayman

On Thu, May 30, 2019 at 5:30 PM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

> Thanks for the image, appreciate you taking the effort to do that! I'm
> still hitting this wall. The AM will launch the container, the container
> will go from "accepted" to "running", but there will be no output from the
> container (I'm piping all of the Samza, org.apache, org.kafka, and our own
> application's logging output to a Kafka topic). During these periods, the
> container will hang out at ~100MB/8GB memory usage and stall. There's no
> error output when this happens; it just kind of stops. My suspicion is that
> our Ops group has a firewall rule up that's interfering with this, or maybe
> just isn't white-listing a port correctly, and if I could identify where
> the application is stalling, it'd probably help to narrow down the
> possibilities.
>
> Cheers,
> Malcolm McFarland
> Cavulus
>
>
> This correspondence is from HealthPlanCRM, LLC, d/b/a Cavulus. Any
> unauthorized or improper disclosure, copying, distribution, or use of the
> contents of this message is prohibited. The information contained in this
> message is intended only for the personal and confidential use of the
> recipient(s) named above. If you have received this message in error,
> please notify the sender immediately and delete the original message.
>
>
> On Thu, May 30, 2019 at 1:39 PM rayman preet <rayman7...@gmail.com> wrote:
>
> > I uploaded the image here:
> > https://www.dropbox.com/s/rv57v165ysp12c5/samza%20flow.png?dl=0
> >
> > Are you still running into this issue?
> > Is there anything in the container's log that shows any exceptions/errors.
> >
> > On Wed, May 22, 2019 at 10:15 PM Malcolm McFarland <mmcfarl...@cavulus.com>
> > wrote:
> >
> > > Hey rayman,
> > >
> > > What it looks like is that the AM has started, the container has started,
> > > but, ie, here will be the last messages I see in the Samza logs:
> > >
> > > 2019-05-23T05:10:45.048Z INFO Making a request for ANY_HOST
> > > 2019-05-23T05:10:45.057Z INFO Starting the container allocator thread
> > > 2019-05-23T05:10:47.098Z INFO Received new token for : <valid_host>:8032
> > > 2019-05-23T05:10:47.102Z INFO Container allocated from RM on <same_valid_host>
> > > 2019-05-23T05:10:47.105Z INFO Container allocated from RM on <same_valid_host>
> > >
> > > At this point, it seems to stall, and no more output is produced.
> > >
> > > Also, I couldn't see your diagram (it's possible my company's email filters
> > > attachments); can I see that on the web anywhere?
> > >
> > > Cheers,
> > > Malcolm
> > >
> > > On Wed, May 22, 2019 at 4:30 PM rayman preet <rayman7...@gmail.com> wrote:
> > >
> > > > Hi Malcolm,
> > > >
> > > > This figure (attached) gives an overview of the flow. Is
> > > > this something you were looking for?
> > > >
> > > > Also, by "don't fully start up" do you mean that
> > > > applications are missing some containers (but the ApplicationMaster is
> > > > running)?
> > > > Or the application is missing entirely.
> > > >
> > > > --
> > > > thanks
> > > > rayman
> > > > [image: Samza Job Launch Sequence.png]
> > > >
> > > > On Tue, May 21, 2019 at 3:58 PM Malcolm McFarland <mmcfarl...@cavulus.com>
> > > > wrote:
> > > >
> > > >> Hey Folks,
> > > >>
> > > >> I'm still trying to pin down why these applications are sometimes not
> > > >> starting. Everything looks fine in the YARN web UI and in the
> > > >> immediately available logs, but the applications don't always fully
> > > >> start up. Does anybody have a rundown about how to trace the Samza
> > > >> startup process on a YARN cluster, from Accepted status, to
> > > >> localization, to the application master startup, to the actual
> > > >> application's startup?
> > > >>
> > > >> Cheers,
> > > >> Malcolm
> > > >>
> > > >> --
> > > >> Malcolm McFarland
> > > >> Cavulus
> > > >>
> > > >
> > > > --
> > > > thanks
> > > > rayman
> > >
> > > --
> > > Malcolm McFarland
> > > Cavulus
> > > 1-800-760-6915
> > > mmcfarl...@cavulus.com
> >
> > --
> > thanks
> > rayman
>
--
thanks
rayman