"Bernstein, Noam CIV USN NRL (6393) Washington DC (USA)"
writes:
> On Sep 29, 2022, at 10:34 AM, Steffen Grunewald
> wrote:
>
> Hi Noam,
>
> I'm wondering why one would want to know that - given that there are
> approaches to multi-node operation beyond MPI (Charm++ comes to mind)?
>
> The
On Sep 29, 2022, at 10:34 AM, Steffen Grunewald
<steffen.grunew...@aei.mpg.de> wrote:
Hi Noam,
I'm wondering why one would want to know that - given that there are
approaches to multi-node operation beyond MPI (Charm++ comes to mind)?
The thread title requested a way of detecting non-MPI
On Thu, 2022-09-29 at 14:03:58 +, Bernstein, Noam CIV USN NRL (6393)
Washington DC (USA) wrote:
> Can you check slurm for a job that requests multiple nodes but doesn't have
> mpirun (or srun, or mpiexec) running on its head node?
Hi Noam,
I'm wondering why one would want to know that - giv
Hi Loris,
On 29/09/2022 09:26, Loris Bennett wrote:
I can see that this is potentially not easy, since an MPI job might
still have phases where only one core is actually being used.
Slurm will create the needed cgroups on all the nodes that are part of the job
when the job starts. So yo
Can you check slurm for a job that requests multiple nodes but doesn't have
mpirun (or srun, or mpiexec) running on its head node?
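
The check suggested above could be sketched roughly as follows. This is only a minimal illustration, not anyone's actual tooling: the function names are hypothetical, and it assumes you have already collected the job's node count and the names of the processes running on its head node (for example from `squeue` and `ps -o comm=`, which is an assumption about your setup).

```python
from typing import Iterable

# Launcher names taken from the suggestion above; extend as needed for your site.
MPI_LAUNCHERS = {"mpirun", "srun", "mpiexec"}


def has_mpi_launcher(process_names: Iterable[str]) -> bool:
    """Return True if any process name (with or without a path) is a known launcher."""
    return any(
        name.strip().rsplit("/", 1)[-1] in MPI_LAUNCHERS
        for name in process_names
    )


def is_suspect(num_nodes: int, head_node_processes: Iterable[str]) -> bool:
    """Flag a job that spans several nodes but shows no launcher on its head node."""
    return num_nodes > 1 and not has_mpi_launcher(head_node_processes)
```

A job flagged this way is only a candidate for a closer look: multi-node frameworks that do not use these launchers (Charm++ was mentioned in the thread) would be flagged as well.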
Hi Ole,
Ole Holm Nielsen writes:
> Hi Loris,
>
> On 9/29/22 09:26, Loris Bennett wrote:
>> Has anyone already come up with a good way to identify non-MPI jobs which
>> request multiple cores but don't restrict themselves to a single node,
>> leaving cores idle on all but the first node?
>> I can
Hi Davide,
That is an interesting idea. We already do some averaging, but over the
whole of the past month. For each user we use the output of seff to
generate two scatterplots: CPU-efficiency vs. CPU-hours and
memory-efficiency vs. GB-hours. See
https://www.fu-berlin.de/en/sites/high-perfo
Hi Loris,
On 9/29/22 09:26, Loris Bennett wrote:
Has anyone already come up with a good way to identify non-MPI jobs which
request multiple cores but don't restrict themselves to a single node,
leaving cores idle on all but the first node?
I can see that this is potentially not easy, since an M
At my previous job there were cron jobs running everywhere measuring
possibly idle cores which were eventually averaged out for the
duration of the job, and reported (the day after) via email to the
user support team.
I believe they stopped doing so when compute became (relatively) cheap
at the exp
Hi,
Has anyone already come up with a good way to identify non-MPI jobs which
request multiple cores but don't restrict themselves to a single node,
leaving cores idle on all but the first node?
I can see that this is potentially not easy, since an MPI job might
still have phases where only