In terms of dependencies, please think about timing. Currently one loop
takes ~70 minutes, and say there is a queue time T for any job. If you
split the slow part to run serially, one loop takes ~190 minutes + 2T. The
time for N iterations would be ~190N + 570*T versus 70N + T.
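
A quick way to put numbers on the comparison above, as a minimal Python sketch. It evaluates the two totals exactly as quoted (the 570 coefficient on T is taken from the message as-is and exposed as a parameter). The function names and the sample values are made up for illustration.

```python
def total_single_job(n_iter, queue_t, loop_min=70):
    """Total minutes for the current scheme: ~70 min per loop plus one queue wait T."""
    return loop_min * n_iter + queue_t


def total_split_serial(n_iter, queue_t, loop_min=190, queue_coeff=570):
    """Total minutes as quoted for the split/serial scheme: ~190N + 570*T."""
    return loop_min * n_iter + queue_coeff * queue_t


if __name__ == "__main__":
    # Hypothetical inputs: 10 iterations and a 30 minute queue time.
    n_iter, queue_t = 10, 30
    print(total_single_job(n_iter, queue_t), total_split_serial(n_iter, queue_t))
```

With these expressions the split scheme is slower for any positive N and T, since the difference is 120N + 569T.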
---
Professor Laurence
Dependencies are not an appropriate approach.
---
Professor Laurence Marks (Laurie)
www.numis.northwestern.edu
"Research is to see what everybody else has seen, and to think what nobody
else has thought" Albert Szent-Györgyi
On Wed, Dec 20, 2023, 14:40 Renfro, Michael wrote:
r ABC in XYZ" then I
may persuade them to look at specifics. They will need the coaching, alas.
On Wed, Dec 20, 2023 at 1:25 PM Gerhard Strangar wrote:
> Laurence Marks wrote:
>
> > After some (irreproducible) time, often one of the three slow tasks hangs.
> > A symptom
years). I wonder
if there are some timeouts or something similar which drop connectivity. I
also wonder whether repeated launching of srun subtasks might be doing
something beyond what is normally expected.
--
Emeritus Professor Laurence Marks (Laurie)
Northwestern University
Web
t the issue might be? The jwks
can be found at the following URL.
https://auth.cern.ch/auth/realms/cern/protocol/openid-connect/certs
Cheers,
Laurence
On 27/03/2023 11:07, Laurence Field wrote:
Hi Ümit,
Thanks for the reply. Yes, it looks like this is the issue. Although
from the master branch it suggests that the claim_field can also be used
but this is not in the version we have deployed.
Cheers,
Laurence
On 24.03.23 16:51, Ümit Seren wrote:
Looks like you are missing the
give me some hints, they would be most welcome.
Cheers,
Laurence
On 24.03.23 10:41, Laurence Field wrote:
Hi Ümit,
Thanks for your reply. We are using Keycloak and the JWKS does contain
this parameter. I will continue to debug but any suggestions would be
greatly appreciated.
Cheers,
Laurence
On 23.03.23 11:42, Ümit Seren wrote:
If you use AzureAD as your identity provider beware that their
ailed to verify jwt, rc=22
slurmctld: error: could not find matching kid or decode failed
Thanks,
Laurence
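
One way to narrow down a "could not find matching kid" error is to compare the `kid` in the token's header with the `kid` values in the JWKS by hand. The sketch below is only an illustration using the Python standard library. It does not verify any signature and does not reproduce what slurmctld does internally. The function names are hypothetical, and the JWKS is assumed to be already loaded as a dict (for example from the certs URL mentioned in this thread).

```python
import base64
import json


def jwt_header(token):
    """Decode the unverified header, i.e. the first dot-separated JWT segment."""
    header_b64 = token.split(".")[0]
    # base64url segments omit padding, so restore it before decoding
    header_b64 += "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(header_b64))


def matching_key(token, jwks):
    """Return the JWKS entry whose kid equals the token's kid, or None."""
    kid = jwt_header(token).get("kid")
    if kid is None:
        return None  # token carries no kid, so nothing can match
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    return None
```

If `matching_key` returns `None` for a token issued by the identity provider, the token was signed with a key that is not in the JWKS being used, or the header has no `kid` at all.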
result you observed suggests that MIG is a feature of the driver, i.e.
lspci shows one device but nvidia-smi shows 7 devices.
I haven't played around with this myself in Slurm but would be
interested to know the answers.
Laurence
On 15/11/2022 17:46, Groner, Rob wrote:
We have successfully