Building GPU-Accelerated Docker Containers
This guide walks you through how to properly create and use a GPU-accelerated Docker container with NVIDIA CUDA.
The idea of building dedicated components to accelerate repetitive computing tasks predates even Bletchley Park. It makes no sense for a computer’s CPU to continuously handle work like redrawing your monitor every 1/120th of a second, or for your computer to freeze because it got bogged down by expensive operations such as matrix multiplication, modular arithmetic on large numbers, or other common but costly tasks. With the performance we witness in our machines today, it’s clear that specialized sections of our hardware have been engineered specifically to target these commonplace workloads. Components like GPUs, multimedia and cryptographic encoders and decoders, and many other specialized units are just some of the innovations jammed into a space sometimes smaller than the eye can see. Okay, maybe I’m exaggerating a little.
Since this wave of specialized hardware, new classes of software have come to demand complex and expensive computation, with machine learning being a notable one. Built on multi-dimensional matrix manipulation and extreme repetition, it is an obvious candidate for the GPU, which can accelerate that work and offload the burden from the CPU. Yet even though machine learning frameworks like PyTorch, TensorFlow, and many others provide GPU acceleration right out of the box, using that power within a containerized environment like Docker is still a confusing and nontrivial task.
This guide will walk you through how to properly create and use a GPU-accelerated Docker container.
Using CUDA and GPU Acceleration in Docker
*Please note: GPU acceleration in Docker is currently limited to NVIDIA graphics cards with CUDA support, because no other GPU vendor’s drivers are as widely supported and integrated within the container and machine-learning ecosystem. If you are not sure whether your machine has an NVIDIA graphics card, you can search System Information on your Windows machine, look at About This Mac on macOS, or run the command
ubuntu-drivers devices on Ubuntu.
Setting up the Container and Configuration
To truly utilize GPU acceleration within a containerized environment, a couple of things must be completed both on your local machine and within your containerized environment.
NVIDIA and CUDA Drivers
To get started, you need to install NVIDIA’s graphics drivers, the CUDA drivers, or the CUDA toolkit. If you are unsure whether you already have them installed, you can follow the steps below for your specific operating system, which include instructions on how to install the drivers. A quick Google or Stack Overflow search can also turn up similar results to what I have here.
To check if you have the drivers properly installed on Ubuntu, you can run
nvcc -V or
nvidia-smi within the terminal. You should see version information for the CUDA compiler, or a table listing your GPU, driver version, and CUDA version.
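If you prefer a single check, the two commands above can be combined into a small script. This is just a sketch, and check_nvidia is a helper name I made up, not an NVIDIA tool:

```shell
# Sketch of a combined driver check. check_nvidia is a hypothetical
# helper name, not an NVIDIA utility.
check_nvidia() {
  # nvidia-smi ships with the graphics driver.
  if command -v nvidia-smi >/dev/null 2>&1; then
    echo "driver found: $(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null || echo unknown)"
  else
    echo "nvidia-smi not found: the graphics driver is missing or not on PATH"
  fi
  # nvcc ships with the CUDA toolkit.
  if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA toolkit found: $(nvcc -V | tail -n 1)"
  else
    echo "nvcc not found: the CUDA toolkit is missing or not on PATH"
  fi
}
check_nvidia
```

Either way, a "not found" line tells you exactly which half of the install (driver versus toolkit) is missing.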
If you do not see either of these, however, you can install the drivers via the software installation tool (on later Ubuntu versions) or run the command
sudo ubuntu-drivers autoinstall
In case you wish to select a specific driver version to install instead, you can run the command
sudo add-apt-repository ppa:graphics-drivers/ppa and then
sudo apt install nvidia-driver-XXX , with XXX being your desired version from those listed in your terminal.
Then, restart your computer for the changes to take effect. You should also run
sudo apt install nvidia-cuda-toolkit to install the CUDA toolkit as well.
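For convenience, here are the Ubuntu steps above gathered into one sequence. The driver version 535 is only a placeholder; substitute whatever ubuntu-drivers devices actually lists for your card:

```shell
# Sketch of the Ubuntu driver install walked through above.
# nvidia-driver-535 is a placeholder version: pick one that
# `ubuntu-drivers devices` actually lists for your hardware.
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install -y nvidia-driver-535    # or: sudo ubuntu-drivers autoinstall
sudo apt install -y nvidia-cuda-toolkit  # provides nvcc and the CUDA libraries
sudo reboot                              # drivers load on the next boot
```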
There are known cases, however, where this fails on certain versions. In that case, follow the steps in the official NVIDIA Driver Installation quick start guide instead. Please be careful, however, when purging your system of nvidia packages, as a careless purge can completely nuke your Ubuntu installation.
On Windows, to check if you have the drivers properly installed, open Device Manager from your search bar and look under Display adapters. Or, you can try right-clicking on your desktop and seeing whether an NVIDIA Control Panel option appears in the dropdown menu.
If you do not have the drivers installed, you can simply download and run the installer from the NVIDIA website.
On macOS, to check if you have the drivers properly installed, look under About This Mac → System Report → Extensions (located under the Software section).
If you do not find any NVIDIA driver there, you can install one from NVIDIA’s website. Apple silicon (M1) Macs, you guys are kind of at a loss here: NVIDIA GPUs and CUDA are not supported on them.
GPU Compatible Containers
NVIDIA’s CUDA libraries must also be installed within the container, not only on your local machine. To do so, I highly recommend using a prebuilt image from NVIDIA’s Docker Hub as your base image. This can easily be done by replacing a base image such as
FROM ubuntu:18.04 with something like
FROM nvidia/cuda:10.0-devel-ubuntu18.04 (tags differ by CUDA version and by variant: base, runtime, or devel, which includes debugging and development tools). I also recommend the cudnn variants (if you are using Ubuntu ≤ 20.04 images) as they are especially optimized for machine learning use cases.
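Put together, a minimal Dockerfile might look like the sketch below. The tag shown is one example from NVIDIA’s Docker Hub; pick a CUDA version compatible with your driver, and note that the python3 install is just a stand-in for your own dependencies:

```dockerfile
# Sketch: NVIDIA's CUDA image as the base instead of plain Ubuntu.
FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04

# Install your own dependencies on top of the CUDA base image
# (python3 here is a placeholder for whatever your app needs).
RUN apt-get update && \
    apt-get install -y --no-install-recommends python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*

# Default command: print the GPU table to confirm passthrough works.
CMD ["nvidia-smi"]
```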
However, in the off chance that this isn’t possible, your second-best bet is NVIDIA’s official GitHub pages with the images’ source code, which you can modify to fit your needs. Be warned that many of the images displayed on GitHub do not work out of the box, with many users hitting expired-key problems as well as a variety of other issues. From my personal experience, it is much easier to start from the NVIDIA base image and manually install your other dependencies on top of it than to start from another base image and try to add CUDA to it; taking this approach can save you hours, if not days, of development time. I also recommend cross-referencing questions on the NVIDIA forums, as there do not currently exist many Stack Overflow posts on this topic.
If you are ever unsure whether your container has the NVIDIA tools successfully installed, you can briefly switch your base image to the
devel variant instead of the runtime or base one. Then, simply exec into your container and run the
nvidia-smi command. Do not worry if you do not see your GPU in the output at this point; that is what we tackle in the next step. If, however, you see an error that the command is not found, CUDA was probably not installed properly within the container.
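Concretely, that check might look like the following, where my-cuda-image is a placeholder tag for an image built from a devel base:

```shell
# my-cuda-image is a hypothetical tag for your devel-based image.
docker build -t my-cuda-image .
# Run nvidia-smi inside the container, per the check described above:
# a "command not found" error means CUDA never made it into the image.
docker run --rm my-cuda-image nvidia-smi
```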
GPU and Docker Interconnect
Finally, once the drivers are properly installed, you must also have Docker and the NVIDIA Container Toolkit (nvidia-docker2) installed. You can run the following commands on a Linux machine, or follow the official installation guide.
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt update
sudo apt-get install -y nvidia-docker2
Then, once all of this is done and you have restarted either your computer or the Docker daemon (which can be done with
sudo systemctl restart docker), simply run your docker run command as you usually would, adding the flag
--gpus all to pass your GPU through to the container.
A good command to test if everything is working is
docker run -it --gpus all nvidia/cuda:10.0-devel-ubuntu18.04 nvidia-smi
Alternatively, if you are running Python, you can run the following commands inside the container once it is up, provided you have PyTorch installed:
$ python3
>>> import torch
>>> torch.cuda.is_available()
If everything is set up correctly, this returns True.
Before we end it off, you might be asking, why go through all this effort just to containerize our code if it could run just fine on our local machines?
And the simplest reason is compatibility.
Installing many different dependencies (for example, in systems like ROS) causes a lot of issues: programs may start crashing for no apparent reason, and things that run on one machine simply might not run on another. By building a container or image, you remove many of the issues that result from external factors (operating systems, other packages, etc.) and easily allow your program to run anywhere. Containerization also makes deployment to cloud services much simpler and smoother. Docker containers are becoming more and more prevalent as they solve the classic problem of “it works on my computer.” In addition, Docker containers are becoming a de facto way of deploying code to robotic, edge, and embedded systems, and a way to further remote development and work.
Why use GPU?
Using the GPU to offload a variety of tasks helps speed up performance. A lot. Just take a look at this object detection program we have here, YOLOv5. Just by passing
--gpus all, we see nearly a 10x increase in performance, with the CPU-only build processing roughly one frame every 0.442 seconds while the GPU-accelerated container processes a frame in 0.046 seconds.
So there you have it! In this article we went through how to set up a GPU-accelerated Docker container. As always, feel free to shoot us an email at firstname.lastname@example.org if you have any problems. Also, if you’re interested in robot developer tooling, including things such as robot feature flagging, A/B testing, or robot infrastructure, we’d love to talk!