GPU Passthrough for KVM

To use GPU hardware with OpenStack, KVM, and SCM, you must make some manual changes to the default configurations. More details are available in the OpenStack document PCI passthrough. The examples in this document use the NVIDIA GRID K2 card.

nova.conf

Edit /etc/nova/nova.conf. You’ll need the product_id and vendor_id of your GPU. You can get these from the output of the lspci command (a sketch follows the snippet below), or from a site like The PCI ID Repository or PCIDatabase.com. An example using the NVIDIA GRID K2 card:

pci_alias = { 'name': 'K2_Grid', 'vendor_id': '10de', 'product_id': '11bf' }
pci_passthrough_whitelist = [ { 'vendor_id': '10de', 'product_id': '11bf' } ]
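
For reference, lspci -nn prints the [vendor_id:product_id] pair for each device. A sketch of what this looks like for the K2 (the bus address and exact description will vary by system):

# lspci -nn | grep -i nvidia
05:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GL [GRID K2] [10de:11bf] (rev a1)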

Also create a section for spice and enable it:

[spice]
enabled = True

nova-compute.conf

Add the same spice section to /etc/nova/nova-compute.conf:

[spice]
enabled = True

Flavor changes

Specify the maximum amount of video RAM, in megabytes, and enable device passthrough with these flavor extra_specs:

hw_video:ram_max_mb=64
"pci_passthrough:alias"="K2_Grid:1"
gpu=true

Do this by updating an existing flavor with nova flavor-key. Example:

# nova flavor-key my_existing_flavor set \
    hw_video:ram_max_mb=64 \
    "pci_passthrough:alias"="K2_Grid:1" \
    gpu=true

In the second item above, the first value on the right side (K2_Grid) must match the value of name in the pci_alias entry in nova.conf above. The second value (1) is the number of devices to pass through.
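
To confirm the keys took effect, you can inspect the flavor; the extra_specs field should list all three keys (the exact output format varies with the nova client version):

# nova flavor-show my_existing_flavor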

Image changes

For the GPU, set these properties on your image:

hw_video_model=qxl
hw_video_ram=8
configure_x=true

configure_x is only used by Linux GPU images; it directs cloud-init to generate an xorg.conf file containing the device’s PCI address.
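
As an illustration, the generated file typically contains a Device section whose BusID points at the passed-through GPU. A minimal sketch, assuming the GPU appears at guest PCI address 00:05.0 (the actual file is written by cloud-init and will differ):

Section "Device"
    Identifier  "Device0"
    Driver      "nvidia"
    BusID       "PCI:0:5:0"
EndSection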

Do this when you create the image with glance image-create. Example:

# glance image-create \
    --name my_new_image  \
    --is-public true \
    --file /root/images/my_raw_image.raw \
    --disk-format raw \
    --container-format bare \
    --min-disk 15 \
    --min-ram 512 \
    --progress \
    --property image_type=loginnode \
    --property service_type=loginNode \
    --property hw_video_model=qxl \
    --property hw_video_ram=8 \
    --property configure_x=true \
    --property service_url='https://{}/'

You can also set these properties when you update an existing image with glance image-update:

# glance image-update \
    --property hw_video_model=qxl \
    --property hw_video_ram=8 \
    --property configure_x=true \
    my_existing_image
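
In either case, glance image-show lists the image’s properties, which is a quick way to verify the change:

# glance image-show my_existing_image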

The image_type and service_type properties are used by SCM to allow the image to be displayed in the Scyld Cloud Portal.

SCM needs to know when an image has booted into an active VM. By default, it attempts to ssh to the VM until it responds or the attempt times out. Use the service_url property if the image does not support ssh (such as Windows) or does not start an ssh server at boot time, but does start a web server at boot time. The value is the URL of a page served by that web server, typically the home page. The {} characters are replaced with the IP address of the VM as it boots, and SCM attempts to access that web page instead.
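
As an illustration of the check (not SCM’s actual code), assume the VM comes up at the hypothetical address 10.1.0.42 and service_url='https://{}/'. SCM then polls https://10.1.0.42/ until it responds, roughly like:

# until curl -ksf https://10.1.0.42/ >/dev/null; do sleep 5; done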

Aggregates

In OpenStack Nova, a host aggregate is a group of hosts with similar metadata (key-value pairs). A host may belong to more than one aggregate. We’ll use this to help manage GPU and non-GPU hosts.

Edit /etc/nova/nova.conf again and add the AggregateInstanceExtraSpecsFilter to scheduler_default_filters:

scheduler_default_filters=AggregateInstanceExtraSpecsFilter,RetryFilter,AvailabilityZoneFilter,RamFilter,ComputeFilter

Create three aggregates, the first being the availability zone:

# nova aggregate-create aggregate_Denver Denver
# nova aggregate-create GPU Denver
# nova aggregate-create nonGPU Denver
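
You can list the aggregates and their availability zones to check the result:

# nova aggregate-list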

Next, set metadata on the aggregates:

# nova aggregate-set-metadata GPU gpu=true
# nova aggregate-set-metadata nonGPU gpu=false

Then add the hosts to the aggregates:

# nova aggregate-add-host nonGPU vmhost1
# nova aggregate-add-host nonGPU vmhost2
# nova aggregate-add-host GPU vmhost3
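
To verify host membership and metadata for an aggregate (the equivalent in the newer openstack client is openstack aggregate show):

# nova aggregate-details GPU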

Next, set the matching metadata on the nova flavors:

# nova flavor-key 1 set gpu=false
# nova flavor-key 2 set gpu=false
# nova flavor-key 3 set gpu=false
# nova flavor-key 4 set gpu=true
# nova flavor-key 5 set gpu=true

Instead of omitting the gpu metadata for the non-GPU flavors, we explicitly set it to false. This lets the AggregateInstanceExtraSpecsFilter steer non-GPU instances away from the GPU hosts, keeping those hosts free for GPU flavors.

Cloud Controller

The Scyld Cloud Controller can check for GPU usage before starting an SCM VM.

In the OpenStack Galera database, accessible through Docker on one of the OpenStack controller systems, add a user with SELECT permission on the nova database’s pci_devices table:

# ssh cc1
# docker exec -it mariadb /bin/bash
# mysql -u root -pOPENSTACK_ROOT_DB_PASSWORD
mysql> CREATE USER 'cc'@'%' IDENTIFIED BY 'SECURE_PASSWORD';
mysql> GRANT SELECT ON nova.pci_devices TO 'cc'@'%';
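
To confirm the new account works, log in as it and run a simple query. A sketch, assuming the standard nova schema for the pci_devices table:

# mysql -u cc -pSECURE_PASSWORD -h openstack_host
mysql> SELECT address, product_id, status FROM nova.pci_devices;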

On the Scyld Cloud Controller server, edit the cloudcontroller.ini file and add the openstack.gpu_query setting: a URL string for connecting to the OpenStack database as the user just created:

openstack.gpu_query = mysql://cc:SECURE_PASSWORD@openstack_host

Next, set openstack.gpu_pci_ids to the product_ids of the GPUs being used, as a space-separated list:

openstack.gpu_pci_ids = 0ff2 11bf 13f2

Touch the Scyld Cloud Controller’s wsgi file to make the change active:

# touch /var/www/wsgi/cloudcontroller.wsgi

SCM VMs that require GPUs will now show an error message when all available GPUs are in use.