For some reason, the AWS Deep Learning AMI ships with an old version of TensorFlow, even though the latest image was created in April 2017. Unfortunately, a simple ‘pip install’ upgrade of the TensorFlow library is not enough to fix that, as we also need to upgrade the TensorFlow GPU binary to the matching version. That, in turn, requires CUDA 8.0 instead of the 7.5 version available by default, which makes things a bit more complicated.
To find out whether TensorFlow is using a GPU device for its computations, execute the following two lines inside a Python console:
import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
If you see something similar to “Device mapping: no known devices.” it means you are not utilizing your GPU and TensorFlow is doing all of its work on the CPU (which, of course, is many times slower).
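If you want to script this check, the device-mapping text that TensorFlow logs can be inspected programmatically. The helper below is a hypothetical sketch (not part of the TensorFlow API); it simply scans the log output that log_device_placement=True produces for a GPU device entry:

```python
def uses_gpu(device_mapping_log):
    """Return True if a TensorFlow device-mapping log mentions a GPU device.

    Hypothetical helper: scans the text printed when a Session is created
    with log_device_placement=True.
    """
    return "/gpu:" in device_mapping_log.lower()

# A CPU-only setup logs this:
print(uses_gpu("Device mapping: no known devices."))  # False
# A GPU-backed setup logs lines like this:
print(uses_gpu("/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GRID K520"))  # True
```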
Installing the NVIDIA stack (Driver, CUDA, cuDNN):
Check your CUDA version:
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17
If you don’t see version number 8.0 or higher for your Cuda compilation tools, you will need to install CUDA 8 and appropriate NVIDIA drivers:
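If you are automating the setup, the release number can be pulled out of the ‘nvcc -V’ output instead of eyeballing it. This is a hypothetical helper, not an NVIDIA tool; it assumes the “release X.Y” phrasing shown above:

```python
import re

def cuda_release(nvcc_output):
    """Parse the CUDA release as a (major, minor) tuple from `nvcc -V` output.

    Hypothetical helper: relies on the 'release X.Y' phrase nvcc prints.
    """
    m = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return (int(m.group(1)), int(m.group(2))) if m else None

sample = "Cuda compilation tools, release 7.5, V7.5.17"
print(cuda_release(sample))            # (7, 5)
print(cuda_release(sample) >= (8, 0))  # False -> upgrade needed
```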
sudo apt update -y && sudo apt upgrade -y
sudo apt install build-essential linux-image-extra-`uname -r` -y
Download the driver: http://www.nvidia.com/Download/driverResults.aspx/114708/en-us
(look for NVIDIA-Linux-x86_64-375.66.run file)
Download appropriate CUDA version: https://developer.nvidia.com/cuda-zone
(look for cuda_8.0.61_375.26_linux-run)
Download appropriate cuDNN version: https://developer.nvidia.com/cudnn - to download the cuDNN you will need to be logged in to a NVIDIA developer account
(look for cudnn-8.0-linux-x64-v5.1.tgz file)
Install the driver:
chmod +x NVIDIA-Linux-x86_64-375.66.run
sudo ./NVIDIA-Linux-x86_64-375.66.run
(choose the ‘yes’/’ok’ options when asked, and you should be fine)
Install CUDA:
chmod +x cuda_8.0.61_375.26_linux-run
./cuda_8.0.61_375.26_linux-run --extract=`pwd`/extracts
sudo ./extracts/cuda-linux64-rel-8.0.61-21551265.run
Please make sure that:
- PATH includes /usr/local/cuda-8.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root
For proper PATH and LD_LIBRARY_PATH settings you can execute:
echo -e "export CUDA_HOME=/usr/local/cuda\nexport PATH=\$PATH:\$CUDA_HOME/bin\nexport LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:\$CUDA_HOME/lib64" >> ~/.bashrc
and run ‘source ~/.bashrc’ to refresh the environment.
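To double-check that the exports took effect, you can inspect the relevant variables from Python. The helper below is a hypothetical sketch for verifying a colon-separated path variable contains a CUDA directory:

```python
import os

def path_has_cuda(var_value, needle="/usr/local/cuda"):
    """Check whether a colon-separated path variable contains a CUDA directory.

    Hypothetical helper: pass in os.environ.get("PATH", "") or
    os.environ.get("LD_LIBRARY_PATH", "") after sourcing ~/.bashrc.
    """
    return any(entry.startswith(needle) for entry in var_value.split(":"))

# Example values as they should look after sourcing ~/.bashrc:
print(path_has_cuda("/usr/bin:/usr/local/cuda/bin"))  # True
print(path_has_cuda("/usr/lib:/lib"))                 # False
```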
Install cuDNN:
tar -xf cudnn-8.0-linux-x64-v5.1.tgz
cd cuda
sudo cp lib64/* /usr/local/cuda/lib64/
sudo cp include/cudnn.h /usr/local/cuda/include/
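If you later need to confirm which cuDNN version ended up on the machine, the version macros in cudnn.h can be parsed. A hypothetical sketch (the CUDNN_MAJOR/MINOR/PATCHLEVEL macros do exist in cudnn.h, but the sample values below are illustrative):

```python
import re

def cudnn_version(header_text):
    """Extract (major, minor, patch) from the contents of cudnn.h.

    Hypothetical helper: reads the #define CUDNN_* version macros,
    e.g. from open('/usr/local/cuda/include/cudnn.h').read().
    """
    values = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        m = re.search(r"#define\s+%s\s+(\d+)" % name, header_text)
        values.append(int(m.group(1)) if m else None)
    return tuple(values)

# Illustrative header fragment for a cuDNN 5.1.x release:
sample = """
#define CUDNN_MAJOR 5
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 10
"""
print(cudnn_version(sample))  # (5, 1, 10)
```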
Once the NVIDIA driver, CUDA, and cuDNN are properly installed we can move on to upgrading our TensorFlow and TensorFlow GPU installations:
sudo pip install 'tensorflow==1.1.0' --force-reinstall
sudo pip install 'tensorflow-gpu==1.1.0' --force-reinstall
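To confirm the reinstall took effect without starting a TensorFlow session, you can query the installed package versions from Python. A hypothetical sketch using the standard library (importlib.metadata requires Python 3.8+; after the commands above, both packages should report 1.1.0):

```python
from importlib import metadata

def installed_version(package_name):
    """Return the installed version string of a package, or None if absent.

    Hypothetical helper for verifying a pip reinstall, e.g.
    installed_version("tensorflow") or installed_version("tensorflow-gpu").
    """
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None

print(installed_version("definitely-not-an-installed-package"))  # None
```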
Run the TensorFlow check again; you should see something similar to the output below:
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GRID K520, pci bus id: 0000:00:03.0
2017-06-02 17:30:43.742925: I tensorflow/core/common_runtime/direct_session.cc:257] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GRID K520, pci bus id: 0000:00:03.0
Upgrading the NVIDIA stack and TensorFlow itself becomes easy once you know how to match the versions of the driver, CUDA, cuDNN, and TensorFlow. Using GPU acceleration for training or testing deep neural networks speeds things up considerably: in our object detection tests (https://softwaremill.com/counting-objects-with-faster-rcnn/), the time per image dropped from almost 9 s to around 1 s.