In this howto we will get CUDA working in Docker and, as a bonus, add TensorFlow on top! Please note that you'll need the following prerequisites:
GNU/Linux x86_64 with kernel version > 3.10
Docker >= 1.9 (official docker-engine, docker-ce or docker-ee only)
NVIDIA GPU with Architecture > Fermi (2.1)
NVIDIA drivers >= 340.29 with binary nvidia-modprobe
We will install the NVIDIA drivers in this tutorial, so you only need the right kernel and Docker version installed beforehand. We're using an Ubuntu 15.05 x64 machine here. For CUDA, you'll need a card with compute capability 2.1 (Fermi) or better; for TensorFlow, a card with compute capability 3.0 or higher.
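The version requirements listed above can be checked with a few lines of Python. This is only a minimal sketch and not part of the original setup: the helper names parse_version and meets_minimum are hypothetical, and it assumes that version strings start with dot-separated numbers (for example "3.13.0-24-generic" or "375.66"). Missing components are treated as zero, so "3.10.0" counts as equal to "3.10".

```python
import platform
import re


def parse_version(text):
    """Turn the leading dotted numbers of a version string into a tuple of ints.

    Example: "3.13.0-24-generic" becomes (3, 13, 0).
    """
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError("no version number found in %r" % (text,))
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(found, required, strict=False):
    """Return True if `found` is at least `required` (strictly newer if strict)."""
    found_v, required_v = parse_version(found), parse_version(required)
    # Pad the shorter tuple with zeros so that "3.10" and "3.10.0" compare equal.
    width = max(len(found_v), len(required_v))
    found_v += (0,) * (width - len(found_v))
    required_v += (0,) * (width - len(required_v))
    return found_v > required_v if strict else found_v >= required_v


if __name__ == "__main__":
    # The kernel must be newer than 3.10 according to the list above.
    kernel = platform.release()
    print("kernel", kernel, "ok:", meets_minimum(kernel, "3.10", strict=True))
```

The same helper can be reused for the other entries, e.g. meets_minimum("375.66", "340.29") for the driver version.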
Which graphics card model do I own?
lspci | grep VGA
sudo lshw -C video
Example output:
product: GF108 [GeForce GT 430]
vendor: NVIDIA Corporation
You should look up whether your card works with CUDA (compute capability 2.1 / Fermi or better), e.g. on https://developer.nvidia.com/cuda-gpus
GeForce GT 430 - Compute: 2.1
Ok, that one works!
I got additional information from: https://www.geforce.com/hardware/desktop-gpus/geforce-gt-430/specifications
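Once you know the compute capability of your card, the decision comes down to two comparisons. The sketch below is an illustration only: the constant and function names are made up for this example, and it assumes that 2.1 itself is sufficient for CUDA in Docker (as it is for the GT 430 used here) and that TensorFlow needs 3.0 or higher.

```python
# Minimum compute capabilities quoted in this howto (assumed to be inclusive).
MIN_CUDA_DOCKER = (2, 1)  # Fermi 2.1, for CUDA inside Docker
MIN_TENSORFLOW = (3, 0)   # for the TensorFlow GPU image


def parse_capability(text):
    """Convert a string such as "2.1" into the tuple (2, 1)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def supported_features(capability):
    """Report which parts of this howto a card with this capability can run."""
    cap = parse_capability(capability)
    return {
        "cuda_in_docker": cap >= MIN_CUDA_DOCKER,
        "tensorflow_gpu": cap >= MIN_TENSORFLOW,
    }


# The GeForce GT 430 from the example above has compute capability 2.1.
print(supported_features("2.1"))
```

For the GT 430 this reports that CUDA in Docker is possible while TensorFlow on the GPU is not.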
CUDA and Docker?
You can find out more about that topic on https://github.com/NVIDIA/nvidia-docker
Getting it to work will be the next step:
Download the right CUDA / NVIDIA driver
from http://www.nvidia.com/object/unix.html
I chose Linux x86_64/AMD64/EM64T, Latest Long Lived Branch version: 375.66, but please check in the description of the file whether your graphics card is supported!
After the download, install the driver:
chmod +x NVIDIA-Linux-x86_64-375.66.run
sudo ./NVIDIA-Linux-x86_64-375.66.run
It will ask for permission; accept it. If it reports that the nouveau driver needs to be disabled, accept that as well. In the next step, it will generate a blacklist file and exit the setup. Afterwards, run
sudo update-initramfs -u
and reboot your server. Then, rerun the setup with
sudo ./NVIDIA-Linux-x86_64-375.66.run
You can check the installation with
nvidia-smi
and get an output similar to this one:
Mon Jul 24 09:03:47 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66 Driver Version: 375.66 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GT 430 Off | 0000:01:00.0 N/A | N/A |
| N/A 40C P0 N/A / N/A | 0MiB / 963MiB | N/A Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 Not Supported |
+-----------------------------------------------------------------------------+
which means that it worked!
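If you want to check the driver version from a script instead of reading the table by eye, the banner line can be parsed with a regular expression. This is a rough sketch that assumes the banner has the same "Driver Version: ..." format as the output shown above; the function name driver_version is hypothetical.

```python
import re


def driver_version(smi_output):
    """Return the driver version string from nvidia-smi's banner, or None."""
    match = re.search(r"Driver Version:\s*([0-9.]+)", smi_output)
    return match.group(1) if match else None


# Banner line copied from the output above.
sample = "| NVIDIA-SMI 375.66 Driver Version: 375.66 |"
print(driver_version(sample))
```

On a machine with the driver installed, the text to parse could come from something like subprocess.check_output(["nvidia-smi"]).decode().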
Install nvidia-docker and nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
sudo dpkg -i /tmp/nvidia-docker*.deb && rm /tmp/nvidia-docker*.deb
Test nvidia-smi from Docker
nvidia-docker run --rm nvidia/cuda nvidia-smi
should output:
Using default tag: latest
latest: Pulling from nvidia/cuda
e0a742c2abfd: Pull complete
486cb8339a27: Pull complete
dc6f0d824617: Pull complete
4f7a5649a30e: Pull complete
672363445ad2: Pull complete
ba1240a1e18b: Pull complete
e875cd2ab63c: Pull complete
e87b2e3b4b38: Pull complete
17f7df84dc83: Pull complete
6c05bfef6324: Pull complete
Digest: sha256:c8c492ec656ecd4472891cd01d61ed3628d195459d967f833d83ffc3770a9d80
Status: Downloaded newer image for nvidia/cuda:latest
Mon Jul 24 07:07:12 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66 Driver Version: 375.66 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GT 430 Off | 0000:01:00.0 N/A | N/A |
| N/A 40C P8 N/A / N/A | 0MiB / 963MiB | N/A Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 Not Supported |
+-----------------------------------------------------------------------------+
Yep, you got it working in Docker!
Running an interactive CUDA session isolating the first GPU
NV_GPU=0 nvidia-docker run -ti --rm nvidia/cuda
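The same invocation can be assembled from Python, which is handy when more than one GPU should be selected. This is a sketch under assumptions: the helper nvidia_docker_command is hypothetical, and it assumes that NV_GPU accepts a comma-separated list of GPU indices, as described in the nvidia-docker documentation linked above.

```python
import os


def nvidia_docker_command(gpu_ids, image="nvidia/cuda", extra_args=("-ti", "--rm")):
    """Build the environment and argument list for an isolated nvidia-docker run."""
    env = dict(os.environ)
    # Assumption: NV_GPU takes a comma-separated list of GPU indices.
    env["NV_GPU"] = ",".join(str(i) for i in gpu_ids)
    argv = ["nvidia-docker", "run"] + list(extra_args) + [image]
    return env, argv


env, argv = nvidia_docker_command([0])
print("NV_GPU=" + env["NV_GPU"], " ".join(argv))
```

The result can then be handed to subprocess.call(argv, env=env) to actually start the container.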
Enter our first Hello World program
echo '#include <stdio.h>

// Kernel definition with __global__: empty function at this point
__global__ void kernel(void) {
    // printf("Hello, Cuda!\n");
}

int main(void) {
    // Kernel launch with <<<1,1>>>: one block with one thread
    kernel<<<1,1>>>();
    printf("Hello, World!\n");
    return 0;
}' > helloWorld.cu
Compile it within the Docker container
nvcc helloWorld.cu -o helloWorld
Execute it...
./helloWorld
and you get:
Hello, World!
Congrats, you got it working!
Encore: TensorFlow
Getting TensorFlow to work is straightforward:
nvidia-docker run -it -p 8888:8888 tensorflow/tensorflow:latest-gpu
It will output something like:
Copy/paste this URL into your browser when you connect for the first time, to login with a token:
http://localhost:8888/?token=d747247b33023883c1a929bc97d9a115e8b2dd0db9437620
You should do that 🙂
Then enter the 1_hello_tensorflow notebook and run the first sample:
from __future__ import print_function
import tensorflow as tf

with tf.Session():
    input1 = tf.constant([1.0, 1.0, 1.0, 1.0])
    input2 = tf.constant([2.0, 2.0, 2.0, 2.0])
    output = tf.add(input1, input2)
    result = output.eval()
    print("result: ", result)
by selecting it and clicking on the >| (run cell, select below) button.
This worked for me:
result: [ 3. 3. 3. 3.]
However, sadly it was not the GPU that calculated the results, as shown by the Docker CLI output:
Kernel started: 2bc4c3b0-61f3-4ec8-b95b-88ed06379d85
[I 07:31:45.544 NotebookApp] Adapting to protocol v5.1 for kernel 2bc4c3b0-61f3-4ec8-b95b-88ed06379d85
2017-07-24 07:32:17.780122: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-24 07:32:17.837112: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-07-24 07:32:17.837440: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: GeForce GT 430
major: 2 minor: 1 memoryClockRate (GHz) 1.4
pciBusID 0000:01:00.0
Total memory: 963.19MiB
Free memory: 954.56MiB
2017-07-24 07:32:17.837498: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-07-24 07:32:17.837522: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2017-07-24 07:32:17.837549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] Ignoring visible gpu device (device: 0, name: GeForce GT 430, pci bus id: 0000:01:00.0) with Cuda compute capability 2.1. The minimum required Cuda capability is 3.0.
So TensorFlow only supports devices with CUDA compute capability >= 3.0 🙁 - but it still works, since it falls back to the CPU (just not as fast as it could be :/)
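To spot this situation without reading the whole log, you can search it for the "Ignoring visible gpu device" message. The following is a minimal sketch that assumes the message has exactly the wording shown in the log above (it may differ in other TensorFlow versions); the names IGNORED_GPU and ignored_gpus are hypothetical.

```python
import re

# Pattern modelled on the log line shown above; the wording is an assumption
# for any TensorFlow version other than the one used in this howto.
IGNORED_GPU = re.compile(
    r"Ignoring visible gpu device \(device: (?P<index>\d+), name: (?P<name>[^,]+),"
    r".*?Cuda compute capability (?P<have>\d+\.\d+)\."
    r" The minimum required Cuda capability is (?P<need>\d+\.\d+)\."
)


def ignored_gpus(log_text):
    """Return one dict per GPU that TensorFlow reported as ignored."""
    return [m.groupdict() for m in IGNORED_GPU.finditer(log_text)]


log_line = (
    "Ignoring visible gpu device (device: 0, name: GeForce GT 430, "
    "pci bus id: 0000:01:00.0) with Cuda compute capability 2.1. "
    "The minimum required Cuda capability is 3.0."
)
for gpu in ignored_gpus(log_line):
    print("GPU %(index)s (%(name)s): has %(have)s, needs %(need)s" % gpu)
```

An empty result means no GPU was reported as ignored in the given text.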
Information taken from:
https://github.com/NVIDIA/nvidia-docker
https://developer.nvidia.com/cuda-gpus
https://hub.docker.com/r/tensorflow/tensorflow/