pdsh is available on the head node only, but when I tried to run *start-cluster* from the head node (note that the JobManager node is not the head node) it didn't work, which is why I modified the scripts.
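As a rough illustration of the script change (one SSH per node, with the slaves file giving the TM count per host in "IP N" format, as described in the quoted message further down), here is a minimal Python sketch. It is not the actual modified script. The `flink_bin` path, the sample addresses, and the function names are hypothetical, and it assumes `taskmanager.sh start` can be invoked repeatedly on a host to spawn additional TMs.

```python
import shlex


def parse_slaves(text):
    """Parse slaves-file lines of the form "IP N" into (host, count) pairs.

    A line with only a host is treated as one TaskManager (assumption).
    Blank lines and lines starting with '#' are skipped.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        host = parts[0]
        count = int(parts[1]) if len(parts) > 1 else 1
        entries.append((host, count))
    return entries


def ssh_command(host, count, flink_bin="/opt/flink/bin"):
    """Build a single ssh invocation that starts `count` TMs on `host`.

    flink_bin is a hypothetical install path; adjust to the real one.
    """
    remote = f"for i in $(seq 1 {count}); do {flink_bin}/taskmanager.sh start; done"
    return ["ssh", "-n", host, remote]


if __name__ == "__main__":
    sample = "10.0.0.1 24\n10.0.0.2 24\n"  # made-up hosts for illustration
    for host, count in parse_slaves(sample):
        # Print the command instead of running it, so the sketch is safe to try.
        print(shlex.join(ssh_command(host, count)))
```

With this shape, the number of SSH connections equals the number of hosts rather than the number of TMs.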
Yes, exactly, this is what I was trying to do. My research area has been these NUMA-related issues, and binding a process to a socket (CPU) and then its threads to individual cores has shown great advantage. I actually have Java code that automatically (and user-configurably) binds processes and threads. For Flink, I've done this manually using a shell script that scans the TMs on a node and pins them appropriately. This approach is OK, but it would be better if the support were integrated into Flink.

On Sun, Jul 10, 2016 at 8:33 PM, Greg Hogan <c...@greghogan.com> wrote:

> Hi Saliya,
>
> Would you happen to have pdsh (parallel distributed shell) installed? If
> so the TaskManager startup in start-cluster.sh will run in parallel.
>
> As to running 24 TaskManagers together, are these running across multiple
> NUMA nodes? I had filed FLINK-3163 (
> https://issues.apache.org/jira/browse/FLINK-3163) last year as I have
> seen that even with only two NUMA nodes performance is improved by binding
> TaskManagers, both memory and CPU. I think we can improve configuration of
> task slots as we do with memory, where the latter can be a fixed measure or
> a fraction relative to total memory.
>
> Greg
>
> On Sat, Jul 9, 2016 at 3:44 AM, Saliya Ekanayake <esal...@gmail.com>
> wrote:
>
>> Hi,
>>
>> The current start/stop scripts SSH worker nodes each time they appear in
>> the slaves file. When spawning multiple TMs (like 24 per node), this is
>> very inefficient.
>>
>> I've changed the scripts to do one SSH per node and spawn a given N
>> number of TMs afterwards. I can make a pull request if this seems usable to
>> others. For now, I assume slaves file will indicate the number of TMs per
>> slave in "IP N" format.
>>
>> Thank you,
>> Saliya
>>
>> --
>> Saliya Ekanayake
>> Ph.D. Candidate | Research Assistant
>> School of Informatics and Computing | Digital Science Center
>> Indiana University, Bloomington
>>
>>
>

--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
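
For readers who want a concrete picture of the TaskManager binding discussed in this thread, the following is a minimal sketch under stated assumptions. It is neither the shell script mentioned in the reply (which pins already running TMs) nor what FLINK-3163 implements. Instead it wraps each TM launch in `numactl`, binding both CPU and memory to one NUMA node and assigning nodes round-robin. The function name and the default `start_cmd` are hypothetical.

```python
def numa_commands(num_tms, num_numa_nodes, start_cmd=("bin/taskmanager.sh", "start")):
    """Return one numactl-wrapped launch command per TaskManager.

    TMs are assigned to NUMA nodes round-robin. start_cmd is a
    hypothetical launch command; replace it with the real one.
    """
    if num_numa_nodes < 1:
        raise ValueError("need at least one NUMA node")
    commands = []
    for tm in range(num_tms):
        node = tm % num_numa_nodes
        commands.append([
            "numactl",
            f"--cpunodebind={node}",  # run only on the CPUs of this NUMA node
            f"--membind={node}",      # allocate memory only from this NUMA node
            *start_cmd,
        ])
    return commands


if __name__ == "__main__":
    # Example: 4 TMs on a machine with 2 NUMA nodes.
    for cmd in numa_commands(4, 2):
        print(" ".join(cmd))
```

Binding threads to individual cores, as mentioned in the reply, would need a finer-grained mechanism than this per-process sketch.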