Prerequisites
Before starting, ensure your system meets these requirements:
- Linux operating system (Ubuntu 20.04 or later recommended)
- Appropriate NVIDIA GPU
- Appropriate NVIDIA driver
- At least 10GB of free disk space
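The requirements above can be checked with a short script before going further. This is a minimal sketch, not part of the original guide: the function name `check_prerequisites` is hypothetical, and it only assumes that a working driver puts `nvidia-smi` on the PATH. It does not confirm that the GPU or driver version is suitable.

```python
import platform
import shutil

MIN_FREE_GB = 10  # threshold taken from the prerequisites list


def check_prerequisites(path="/", min_free_gb=MIN_FREE_GB):
    """Return a dict mapping each prerequisite to True or False."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return {
        # The guide targets Linux (Ubuntu 20.04 or later recommended).
        "linux": platform.system() == "Linux",
        # Assumption: an installed driver exposes nvidia-smi on the PATH.
        "nvidia_driver": shutil.which("nvidia-smi") is not None,
        "disk_space": free_gb >= min_free_gb,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

A `MISSING` entry for `nvidia_driver` simply means the driver section below still needs to be completed.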
Table of Contents
- Install NVIDIA Driver
- Install Docker
- Install NVIDIA Container Toolkit
- Install TensorRT Container
- Verification
- Troubleshooting
Install NVIDIA Driver
- Check if NVIDIA driver is installed:
nvidia-smi
- Install NVIDIA drivers:
sudo apt install nvidia-driver-###  # Use latest recommended version
sudo reboot
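After the reboot, the installed driver version can also be read programmatically. The sketch below relies on the `--query-gpu` and `--format` options of `nvidia-smi`; the helper names `parse_gpu_query` and `query_gpus` are hypothetical and not part of the original guide.

```python
import shutil
import subprocess


def parse_gpu_query(output):
    """Parse 'name, driver_version' CSV lines into (name, version) tuples."""
    gpus = []
    for line in output.strip().splitlines():
        # Split on the last comma so GPU names containing commas stay intact.
        name, _, version = line.rpartition(",")
        gpus.append((name.strip(), version.strip()))
    return gpus


def query_gpus():
    """Return [] when nvidia-smi is absent or fails (no usable driver)."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    return parse_gpu_query(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    for name, version in query_gpus():
        print(f"{name}: driver {version}")
```

An empty result after a reboot suggests the driver did not load, which matches the "NVIDIA driver not found" case in Troubleshooting.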
Install Docker
- Check whether Docker is installed and remove old versions (if any):
dpkg -l | grep -i docker
- Install Docker Engine:
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin
- Verify Docker installation:
sudo docker run hello-world
Install NVIDIA Container Toolkit
- Install NVIDIA Container Toolkit (if necessary):
Install TensorRT Container
- Pull the TensorRT container with TensorFlow support:
sudo docker pull nvcr.io/nvidia/tensorflow:##.##-tf#-py3
- Run the container:
sudo docker run --gpus all -it --rm nvcr.io/nvidia/tensorflow:##.##-tf#-py3
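Because the pull and run commands must use exactly the same tag, it can help to build the image reference in one place. The sketch below is illustrative only: `image_ref` and `docker_run_command` are hypothetical helpers, and the `##.##` and `#` placeholders are left as in the guide, to be replaced with the release and TensorFlow major version you actually want.

```python
import shlex

REGISTRY_IMAGE = "nvcr.io/nvidia/tensorflow"


def image_ref(release, tf_major):
    """Build a tag following the guide's pattern: <release>-tf<major>-py3."""
    return f"{REGISTRY_IMAGE}:{release}-tf{tf_major}-py3"


def docker_run_command(image):
    """Return the interactive GPU-enabled run command as an argument list."""
    return ["sudo", "docker", "run", "--gpus", "all", "-it", "--rm", image]


if __name__ == "__main__":
    # Placeholders only; substitute real values before running the command.
    image = image_ref("##.##", "#")
    print(shlex.join(docker_run_command(image)))
```

Passing the list from `docker_run_command` to `subprocess.run` avoids shell-quoting problems if the tag ever contains unusual characters.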
Verification
- Inside the container, verify TensorFlow can see the GPU:
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
- Verify TensorRT:
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt
print(tf.__version__)
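The two checks above can be combined into one script that reports a result instead of raising when something is missing. This is a sketch under stated assumptions: `verify_stack` is a hypothetical name, and a successful import of `trt_convert` only shows that the TF-TRT module is present, not that a model conversion will succeed.

```python
def verify_stack():
    """Report TensorFlow version, visible GPU count, and TF-TRT importability."""
    report = {"tensorflow": None, "gpus": 0, "tf_trt_importable": False}
    try:
        import tensorflow as tf
    except ImportError:
        # Outside the container TensorFlow may not be installed at all.
        return report

    report["tensorflow"] = tf.__version__
    report["gpus"] = len(tf.config.list_physical_devices("GPU"))
    try:
        from tensorflow.python.compiler.tensorrt import trt_convert  # noqa: F401
        report["tf_trt_importable"] = True
    except ImportError:
        pass
    return report


if __name__ == "__main__":
    for key, value in verify_stack().items():
        print(f"{key}: {value}")
```

Inside a correctly started container, the expectation is a non-zero `gpus` count; a count of zero usually points back to the `--gpus all` flag or the container toolkit setup.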
Troubleshooting
Common Issues and Solutions
- Docker permission denied
- NVIDIA driver not found
sudo reboot
- Container fails to start with GPU error
# Check if NVIDIA runtime is properly configured
sudo docker info | grep -i runtime
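The runtime check can also be done without grepping free-form text by asking Docker for its runtimes as JSON through the `--format` option of `docker info`. This is a hedged sketch: the helper names are hypothetical, and it assumes the toolkit registers its runtime under the name `nvidia`. A missing entry is a hint rather than proof of a broken setup.

```python
import json
import shutil
import subprocess


def has_nvidia_runtime(runtimes_json):
    """Return True if the JSON runtimes object contains an 'nvidia' entry."""
    runtimes = json.loads(runtimes_json) or {}
    return "nvidia" in runtimes


def docker_runtimes_json():
    """Return the runtimes JSON from 'docker info', or None if unavailable."""
    if shutil.which("docker") is None:
        return None
    result = subprocess.run(
        ["docker", "info", "--format", "{{json .Runtimes}}"],
        capture_output=True,
        text=True,
    )
    output = result.stdout.strip()
    # A non-zero exit often means the daemon is down or permission was denied.
    return output if result.returncode == 0 and output else None


if __name__ == "__main__":
    raw = docker_runtimes_json()
    if raw is None:
        print("Could not query Docker (not installed, not running, or permission denied)")
    else:
        print("nvidia runtime registered:", has_nvidia_runtime(raw))
```

If the script cannot query Docker at all, run it with the same privileges used for the other `docker` commands in this guide.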
Additional Resources
Note: Version numbers in this guide may need to be updated based on your system requirements and the latest available versions.