Written by: Sergey Karatkevich (@kevit)
Edited by: Eric Sigler (@esigler)
Why did we start this project?
Our company, Servers.com, exists to provide quality hosting services along with any additional tools you may need. One great example is our work with Prisma, a mobile app.
We have been Prisma’s hosting partner since the day the app launched. Despite the app’s explosive growth in popularity and hefty download numbers, we were able to keep up with their needs for provisioning new servers and balancing load. Later, once the app’s code had been optimized and part of the hardware could be reused, we decided to create a new product: Prisma Cloud, a dedicated GPU hosting infrastructure.
Prisma processed their pictures on Dell servers with NVIDIA Titan X and NVIDIA 1080 GPUs, so that was our starting point.
What were the major problems?
Each video card exposes two devices in lspci:
42:00.0 VGA compatible controller: NVIDIA Corporation Device 1b80 (rev a1)
42:00.1 Audio device: NVIDIA Corporation Device 10f0 (rev a1)
You can easily remove the audio device through /sys:
echo -n "1" > /sys/bus/pci/devices/0000\:42\:00.1/remove
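If you have several cards, typing the escaped sysfs path by hand gets tedious. The helper below is only a sketch: the function name pci_remove_path is hypothetical, and it assumes PCI domain 0000 whenever the address is given in the short form that lspci prints.

```shell
#!/bin/sh
# Build the sysfs "remove" path for a PCI function.
# Accepts a short address (42:00.1) or a full one (0000:42:00.1).
# Assumption: PCI domain 0000 is used when the domain is omitted.
pci_remove_path() {
    addr="$1"
    case "$addr" in
        ????:??:??.?) ;;                    # already domain-qualified
        ??:??.?)      addr="0000:$addr" ;;  # prepend the assumed domain
        *) echo "unrecognised PCI address: $addr" >&2; return 1 ;;
    esac
    printf '/sys/bus/pci/devices/%s/remove\n' "$addr"
}

# Usage (as root), hot-removing the GPU's audio function:
#   echo 1 > "$(pci_remove_path 42:00.1)"
```

Quoting the command substitution means the colons in the path no longer need backslash escapes.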
Officially, the NVIDIA GeForce GTX 1080 is supported on Linux by NVIDIA's proprietary driver as of version 367.18 Beta. At that time the driver was quite new and had not yet been packaged, not even in experimental:
364.19-1 1
1 http://mirror.yandex.ru/debian experimental/non-free amd64 Packages
361.45.18-2 500
500 http://mirror.yandex.ru/debian sid/non-free amd64 Packages
So we installed a newer driver from the NVIDIA website, followed by the CUDA toolkit:
chmod +x NVIDIA-Linux-x86_64-367.35.run
./NVIDIA-Linux-x86_64-367.35.run -a --dkms -Z -s
update-initramfs -u
modprobe nvidia-uvm
./cuda_8.0.27_linux.run --override --silent --toolkit --samples --verbose
and then applied the CUDA patch:
./cuda_8.0.27.1_linux.run --silent --accept-eula
NVIDIA tries to limit the use of its driver inside KVM guests, so hiding the hypervisor (kvm=off) is your friend. This requires QEMU 2.1 or later. (Later we ran into another limitation, this time with ffmpeg: only two concurrent streams per 1080 card.) In libvirt, the hidden state is set like this:
<kvm>
<hidden state='on'/>
</kvm>
What does it look like from the host?
Your host should provide SR-IOV and DMAR (DMA remapping). Both can be switched on in the BIOS/EFI; to check that they are active:
dmesg|grep -e DMAR -e IOMMU
IOMMU (input/output memory management unit) should be turned on in kernel options:
intel_iommu=on
The snd_hda_intel and nouveau drivers should be blacklisted:
modprobe.blacklist=snd_hda_intel,nouveau
And the VFIO driver should be loaded:
modprobe vfio
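The host-side kernel options above are easy to get wrong, so it can help to check the running kernel's command line after a reboot. The following is a minimal sketch, not a complete validation: the function names has_intel_iommu and is_blacklisted are hypothetical, and it assumes the Intel option is spelled intel_iommu=on, as in the kernel documentation.

```shell
#!/bin/sh
# Return 0 if the kernel command line in $1 enables the Intel IOMMU.
has_intel_iommu() {
    case " $1 " in
        *" intel_iommu=on "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Return 0 if module $2 is listed in a modprobe.blacklist= option
# on the kernel command line in $1.
is_blacklisted() {
    for word in $1; do
        case "$word" in
            modprobe.blacklist=*)
                list=",${word#modprobe.blacklist=},"
                case "$list" in *",$2,"*) return 0 ;; esac
                ;;
        esac
    done
    return 1
}

# Usage on a live host:
#   cmdline=$(cat /proc/cmdline)
#   has_intel_iommu "$cmdline" || echo "intel_iommu=on is missing"
#   for m in snd_hda_intel nouveau; do
#       is_blacklisted "$cmdline" "$m" || echo "$m is not blacklisted"
#   done
```

Both functions take the command line as an argument rather than reading /proc/cmdline themselves, so they can be tried against a sample string before touching a real host.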
What does it look like from the OpenStack side?
You should define a PCIe device:
[DEFAULT]
pci_passthrough_whitelist = { "vendor_id": "10de", "product_id": "1b80" }
pci_alias = { "vendor_id":"10de", "product_id":"1b80", "name":"nvidia" }
apply proper filters:
scheduler_default_filters=AggregateInstanceExtraSpecsFilter,AvailabilityZoneFilter,RamFilter,ComputeFilter,AggregateImagePropertiesStrictIsolation,AggregateCoreFilter,DiskFilter,PciPassthroughFilter
and set proper flavour settings:
meta: pci_passthrough:alias = nvidia:1 (nvidia comes from the pci_alias directive in nova.conf)
nova flavor-key GPU.SSD.30 set "pci_passthrough:alias"="nvidia:1" (1 - number of cards)
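The vendor_id and product_id values in nova.conf are the card's PCI IDs, which lspci -nn prints in square brackets at the end of the device name. As a rough sketch, the two helpers below extract those IDs and print the matching nova.conf lines; the function names pci_ids and nova_pci_conf are hypothetical, and the sample line in the usage comment is the kind of output expected for the GTX 1080 described earlier.

```shell
#!/bin/sh
# Read `lspci -nn` lines on stdin and print "vendor product" for each
# line that contains a [vvvv:dddd] ID pair (the last pair on the line is used).
pci_ids() {
    sed -n 's/.*\[\([0-9a-f]\{4\}\):\([0-9a-f]\{4\}\)\].*/\1 \2/p'
}

# Print the two nova.conf lines for a vendor ID ($1), product ID ($2)
# and alias name ($3), in the same layout used in this post.
nova_pci_conf() {
    printf 'pci_passthrough_whitelist = { "vendor_id": "%s", "product_id": "%s" }\n' "$1" "$2"
    printf 'pci_alias = { "vendor_id":"%s", "product_id":"%s", "name":"%s" }\n' "$1" "$2" "$3"
}

# Usage on the compute host:
#   lspci -nn -s 42:00.0 | pci_ids
# For a line such as
#   42:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1b80] (rev a1)
# this prints "10de 1b80", which can then be passed on:
#   nova_pci_conf 10de 1b80 nvidia
```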
Migrate your instance in OpenStack
Automated migration is still in development, but for now you can migrate an instance manually.
Symptoms:
libvirtError: Requested operation is not valid: PCI device 0000:84:00.0 is in use by driver QEMU
And a simple migration process:
nova migrate uuid
nova reset-state uuid --active
nova stop uuid
nova start uuid
Finally, remove the resize flag on the source node: rm -r /var/lib/instances/uuid_resize
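The nova steps above can be wrapped in a small function so they are always run in the same order. This is only a sketch under stated assumptions: the names run and manual_migrate and the DRY_RUN variable are hypothetical, nothing waits for one step to finish before the next starts, and the cleanup on the source node is left out because it has to be run on a different host.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run the manual migration sequence from this post for one instance UUID.
# No waiting or error handling is done between the steps.
manual_migrate() {
    uuid="$1"
    [ -n "$uuid" ] || { echo "usage: manual_migrate <instance-uuid>" >&2; return 1; }
    run nova migrate "$uuid"
    run nova reset-state "$uuid" --active
    run nova stop "$uuid"
    run nova start "$uuid"
}

# Usage: preview the commands first, then run them for real.
#   DRY_RUN=1 manual_migrate <instance-uuid>
#   manual_migrate <instance-uuid>
```

In practice you would probably want to check the instance state between steps before moving on; the dry-run mode is there so the sequence can be reviewed safely.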