On 11/01/2017 03:39 PM, hskarlupka wrote:
> Hello,
>

Hi Heath!

> I'm running a 4 node Kubernetes bare metal cluster using Centos Atomic
> as the base OS. My team is interested in installing a Nvidia GPU in one
> of the nodes. One concern I have is on the installation of the Nvidia
> drivers. The first thing that comes to mind is that it won't like the
> /usr RO filesystem. The only guides I've seen are for CoreOS Nvidia
> installs that use a set of Docker containers to install drivers and use
> the device:
>
> https://github.com/src-d/coreos-nvidia
>
> https://github.com/ryanolson/CoreOS-GPU
>
> Has anyone run into this before for CentOS Atomic?
>

Unfortunately I don't have good answers for you on this one. I was going
to suggest package layering via `rpm-ostree install foo.rpm`, but there
are some compatibility issues between the nvidia rpms and rpm-ostree, so
that doesn't quite work [1] [2].

Really you have two options:

1. unlock your ostree and install the rpms
2. build the kernel module for the kernel you are targeting and
   deliver it via a system container like in
   https://github.com/giuseppe/hellomod

The `unlock your ostree` approach basically converts the atomic host
back into a system that is completely writable:

- ostree admin unlock --hotfix
- alias yum='/usr/share/yum-cli/yummain.py'
- yum install -y epel-release kernel-{devel,headers}-$(uname -r) https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-9.0.176-1.x86_64.rpm
- yum install -y nvidia-kmod

That should do it. Sorry I don't have better answers on this right now.

Dusty

[1] https://github.com/projectatomic/rpm-ostree/issues/1091
[2] https://github.com/projectatomic/rpm-ostree/issues/233#issuecomment-342643731
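
For anyone who wants to try the `unlock your ostree` steps from the reply in one go, here is a rough sketch of how they might be wrapped in a script. It is untested on a real Atomic host. The `DRY_RUN` variable and the `run` and `install_nvidia_kmod` helpers are hypothetical names added for illustration; only the commands, the yum path, and the repo URL come from the message. The alias is replaced with a variable because bash does not expand aliases in non-interactive scripts by default, and the `kernel-{devel,headers}` brace expansion is spelled out so the script does not depend on bash. By default it only prints what it would run.

```shell
#!/bin/sh
# Sketch only: the unlock-and-install steps from the reply, behind a dry-run switch.
# DRY_RUN, run() and install_nvidia_kmod() are hypothetical, not from the original post.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = actually execute them
YUM="/usr/share/yum-cli/yummain.py"   # path used by the alias in the reply
CUDA_REPO_RPM="https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-9.0.176-1.x86_64.rpm"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

install_nvidia_kmod() {
    kver="$(uname -r)"   # kernel-devel/headers must match the running kernel

    # Make /usr writable; --hotfix keeps the changes across reboots.
    run ostree admin unlock --hotfix

    # EPEL, matching kernel packages, and the CUDA repo definition.
    run "$YUM" install -y epel-release \
        "kernel-devel-${kver}" "kernel-headers-${kver}" \
        "$CUDA_REPO_RPM"

    # The driver kernel module package itself.
    run "$YUM" install -y nvidia-kmod
}

install_nvidia_kmod
```

Run it as-is to review the three commands, then rerun as root with `DRY_RUN=0` once they look right for your host.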