      Type            Name     ID
      node                      4
      billing                   5
      fs              disk      6
      vmem                      7
      pages                     8
      gres             gpu   1001
      gres         gpu:k20   1002
      gres     gpu:1080gtx   1003
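A listing like the one above is what `sacctmgr show tres` prints; a minimal sketch for anyone following along, assuming slurmdbd is up and GRES accounting is enabled:

    # List the trackable resources (TRES) the accounting database knows
    # about; gres rows such as gpu:k20 only show up after the types are
    # configured and the daemons restarted.
    sacctmgr show tres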
Can anyone point out what I am missing?
Thanks!
Lou
--
Lou Nicotra
IT Systems Engineer - SLT
…using my login name. The default account for all users is "slt".
Is this the cause of my problems?
root@panther02 slurm# getent passwd lnicotra
lnicotra:*:1498:1152:Lou Nicotra:/home/lnicotra:/bin/bash
If so, how is this resolved, as we use multiple servers and there are no
local accounts for…
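Slurm itself does not need local accounts, only a resolvable name/uid (which the getent lookup above confirms) plus an association in the accounting database. A hedged sketch of how one might check and, if missing, create that association with standard sacctmgr commands (the account name "slt" is taken from the message above):

    # Does the user already have an association in slurmdbd?
    sacctmgr show assoc where user=lnicotra format=User,Account

    # If not, add the user under the cluster-wide default account "slt".
    sacctmgr add user lnicotra account=slt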
> …fig` from the machine having trouble.
> On Mon, Dec 3, 2018 at 12:10 PM Lou Nicotra wrote:
> >
> > I'm running slurmd version 18.08.0...
> >
> > It seems that the system recognizes the GPUs after a slurmd restart. I
> > tuned debug to 5, restarted, and the…
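The "debug to 5" step quoted above corresponds to SlurmdDebug=debug5 in slurm.conf; a quicker sketch for watching the GRES discovery messages, assuming the node's slurmd can be stopped briefly:

    # Run slurmd in the foreground (-D) with extra verbosity so the
    # GPU/GRES discovery lines print straight to the terminal at startup.
    slurmd -D -vvvvv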
…wrote:
> Is that a lowercase k in k20 specified in the batch script and NodeName,
> and an uppercase K specified in gres.conf?
>
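Since GRES type strings are matched literally, one way to audit the case question above is to grep both config files and compare against the job script; the /etc/slurm paths are an assumption about the install layout:

    # Show every spelling of the type in the two configs (case-insensitive
    # search so k20 and K20 both turn up).
    grep -in 'k20' /etc/slurm/gres.conf /etc/slurm/slurm.conf

    # The batch script must then request the exact same string, e.g.:
    #SBATCH --gres=gpu:k20:1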
> On 12/03/2018 09:13 AM, Lou Nicotra wrote:
>
> Hi All, I have recently set up a slurm cluster with my servers and I'm
> running into an issue whi…
> …I don't see anything specifically wrong. The one thing I might try
> is backing the software down to a 17.x release series. I recently
> tried 18.x and had some issues. I can't say whether it'll be any
> different, but you might be exposing an undiagnosed bug in the 18.x
> branch.
> On Mon, …, 2018 at 9:31 AM Lou Nicotra wrote:
> Thanks Michael. I will try 17.x as I also could not see anything wrong
> with my settings... Will report back afterwards...
>
> Lou
>
> On Tue, Dec 4, 2018 at 9:11 AM Michael Di Domenico
> wrote:
>
>> unfortunately, someone sma…
> NodeName=tiger[02-04,06-09,11-14,16-19,21-22] Name=gpu Type=k80
> File=/dev/nvidia0 Cores=0
> NodeName=tiger[02-04,06-09,11-14,16-19,21-22] Name=gpu Type=k80
> File=/dev/nvidia1 Cores=1
> NodeName=tiger[01,05,10,15,20] Name=gpu Type=1080gtx File=/dev/nvidia0
> Cores=0
> NodeName=tiger[01,05,10,15,20] Name=gpu Type=1080gtx File=/dev/nvidia1
> Cores=1
>
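For completeness: a gres.conf like the one quoted above only takes effect if slurm.conf advertises matching GRES on the same nodes. A sketch, with all non-GRES node parameters deliberately omitted as unknowns:

    # slurm.conf (fragment; CPU/memory/state settings elided)
    GresTypes=gpu
    NodeName=tiger[02-04,06-09,11-14,16-19,21-22] Gres=gpu:k80:2
    NodeName=tiger[01,05,10,15,20] Gres=gpu:1080gtx:2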
> which ca…
> > The most recent release is 18.08.3. NEWS packed in the
> > tarballs gives the fixes in the versions. I don't see any that would
> > fit your case.
> >
> >
> > On 12/04/2018 02:11 PM, Lou Nicotra wrote:
> >> Brian, I used a single gres.conf file and distributed…
> …ned a node that has two different nvidia
> cards, so what was on what port became important, not because the
> 'range' configuration caused problems.
>
> This wasn't a fresh install of 18.x - it was a 17.x installation that I
> upgraded to 18.x. Not sure if that matters.
…build...
My LD_LIBRARY_PATH is
/usr/lib64:/usr/lib:/usr/local/lib64:/usr/local/lib:/var/local/miniconda2/lib/:
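One likely wrinkle: rpmbuild does not necessarily inherit a user's LD_LIBRARY_PATH. A hedged sketch of the usual workaround, registering the directory with the system linker instead (the file name miniconda2.conf is arbitrary; run as root):

    # Make the Miniconda lib directory visible to the dynamic linker
    # system-wide, then rebuild the linker cache.
    echo /var/local/miniconda2/lib > /etc/ld.so.conf.d/miniconda2.conf
    ldconfig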
Can anyone provide suggestions on working out this issue?
Thanks.
--
LOU NICOTRA
IT Systems Engineer - SLT
Interactions LLC
o: 908-673-1833
m: 908-451-6983
…1.el7.centos ################################# [100%]
Oh, well...
Lou
On Mon, Aug 12, 2019 at 1:32 AM Barbara Krašovec wrote:
> What if you try to run ldconfig manually before building the rpm?
>
> Cheers,
>
> Barbara
> On 8/8/19 5:57 PM, Lou Nicotra wrote:
>
> I am running into…
> Are the nvidia libraries installed by RPM or a 'make install' on the box
> you compiled it on?
>
> Brian Andrus
> On 8/15/2019 7:53 AM, Lou Nicotra wrote:
>
> I have tried running ldconfig manually as suggested with
> slurm-19.05.1-2 and it fails the same way...
> error…
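A quick sanity check at this point, as a sketch: confirm whether ldconfig actually registered the NVIDIA libraries at all:

    # List the linker cache and filter for the NVIDIA entries.
    ldconfig -p | grep -i nvidia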
> I ran into that trying to install tensorflow.
>
> If you can, downgrade to 10.0, which does a better job of installing
> itself.
>
> Brian
> On 8/16/2019 5:47 AM, Lou Nicotra wrote:
>
> Brian, the package is being built and installed on the master server. I
> am testing b…