Re: [OMPI users] Gadget2 error 818 when using more than 1 process?

2022-02-07 Thread Diego Zuccato via users
h the application itself. Have you talked to the Gadget2 authors? -- Jeff Squyres jsquy...@cisco.com ____ From: users on behalf of Diego Zuccato via users Sent: Wednesday, January 26, 2022 2:06 AM To: users@lists.open-mpi.org Cc: Diego Zuccato Subject: Re:

Re: [OMPI users] RES: OpenMPI - Intel MPI

2022-01-27 Thread Diego Zuccato via users
Sorry for the noob question, but: what should I configure for OpenMPI "to perform on the host cluster"? Any link to a guide would be welcome! Slightly extended rationale for the question: I'm currently using "unconfigured" Debian packages and getting some strange behaviour... Maybe it's just s

Re: [OMPI users] Gadget2 error 818 when using more than 1 process?

2022-01-25 Thread Diego Zuccato via users
From: users on behalf of Diego Zuccato via users Sent: Tuesday, January 25, 2022 5:43 AM To: Open MPI Users Cc: Diego Zuccato Subject: [OMPI users] Gadget2 error 818 when using more than 1 process? Hello all. A user of our cluster is experiencing a weird problem that I can't pinpoint

[OMPI users] Gadget2 error 818 when using more than 1 process?

2022-01-25 Thread Diego Zuccato via users
Hello all. A user of our cluster is experiencing a weird problem that I can't pinpoint. He has a job script that worked well on every node. It's based on Gadget2. Lately, *sometimes*, the same executable with the same parameters file works, sometimes it fails. On the same node and submi

Re: [OMPI users] Debugging a crash

2021-01-31 Thread Diego Zuccato via users
On 29/01/21 15:58, Gilles Gouaillardet via users wrote: Hi Gilles. Thanks for the answer. > the mpirun command line starts 2 MPI tasks, but the error log mentions > rank 56, so unless there is a copy/paste error, this is highly > suspicious. Uhm... Going to re-check. Most probably it's just my

[OMPI users] Debugging a crash

2021-01-29 Thread Diego Zuccato via users
Hello all. I'm having a problem with a job: if it gets scheduled on a specific node of our cluster, it fails with: -8<-- -- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the

Re: [OMPI users] Code failing when requesting all "processors"

2020-10-21 Thread Diego Zuccato via users
On 14/10/20 14:32, Jeff Squyres (jsquyres) wrote: >> The version is 3.1.3, as packaged in Debian Buster. > The 3.1.x series is pretty old. If you want to stay in the 3.1.x > series, you might try upgrading to the latest -- 3.1.6. That has a > bunch of bug fixes compared to v3.1.3. I'm boun

Re: [OMPI users] Code failing when requesting all "processors"

2020-10-19 Thread Diego Zuccato via users
the problem. Too bad on this server gdb is already installed and apparently useless to debug the issue. >> On Oct 13, 2020, at 6:34 AM, Diego Zuccato via users >> wrote: >> >> Hello all. >> >> I have a problem on a server: launching a job with mpirun fails if I

[OMPI users] Code failing when requesting all "processors"

2020-10-13 Thread Diego Zuccato via users
Hello all. I have a problem on a server: launching a job with mpirun fails if I request all 32 CPUs (threads, since HT is enabled) but succeeds if I only request 30. The test code is really minimal: -8<-- #include "mpi.h" #include #include #define MASTER 0 int main (int argc, char *ar