Below is a complete, step‑by‑step guide for sending any program you run on a Linux box with two RTX‑2080 Ti cards (driver 580) to a specific GPU – whether the workload is CUDA, OpenCL, Vulkan, OpenGL, or any generic Linux‑GPU‑aware application.
1. Quick “cheat‑sheet” for the most common cases
| Workload type | How to pick the GPU | Example command |
|---|---|---|
| CUDA / cuBLAS / TensorRT / PyTorch / TensorFlow | `CUDA_VISIBLE_DEVICES` (or a framework-specific device flag) | `CUDA_VISIBLE_DEVICES=1 ./my_cuda_app` |
| OpenCL | `CUDA_VISIBLE_DEVICES` works for the NVIDIA OpenCL ICD; AMD-style ICDs use `GPU_DEVICE_ORDINAL` | `CUDA_VISIBLE_DEVICES=1 ./my_opencl_app` |
| Vulkan | `VK_ICD_FILENAMES` + `VK_DEVICE_INDEX` (where honored) | `VK_DEVICE_INDEX=1 ./my_vulkan_app` |
| OpenGL (X11) | PRIME render offload (`__NV_PRIME_RENDER_OFFLOAD=1`, `__GLX_VENDOR_LIBRARY_NAME=nvidia`, `__NV_PRIME_RENDER_OFFLOAD_PROVIDER=<UUID>`) or a headless EGL context (no X needed) | `__NV_PRIME_RENDER_OFFLOAD=1 ./my_opengl_app` |
| OpenGL (Wayland / headless) | EGL + `EGL_EXT_device_base`; set `EGL_PLATFORM_DEVICE_EXT` | `EGL_PLATFORM_DEVICE_EXT=1 ./my_egl_app` |
| General GPU-aware apps (e.g., Blender, DaVinci Resolve) | Most respect `CUDA_VISIBLE_DEVICES`; some have a command-line flag (`--gpu 1`) | `CUDA_VISIBLE_DEVICES=1 blender` |
TL;DR – For any CUDA-based work, simply prepend `CUDA_VISIBLE_DEVICES=<index>`, where `<index>` is the zero-based order shown by `nvidia-smi`.
For OpenGL/Vulkan you need a few extra environment variables, described in detail below.
2. Identify the GPUs and their indices
$ nvidia-smi -L
GPU 0: NVIDIA GeForce RTX 2080 Ti (UUID: GPU-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee)
GPU 1: NVIDIA GeForce RTX 2080 Ti (UUID: GPU-ffffffff-1111-2222-3333-444444444444)
- GPU 0 – the one driving the monitor (connected to the display).
- GPU 1 – the headless card (no display attached).
The index (0 or 1) is what you will use with all the environment variables below.
If you prefer to address them by UUID (more robust against order changes) you can also use the NVML/CUDA APIs, but for shell scripts the numeric index is sufficient.
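If you do want UUID-based addressing from a shell-free script, the output of `nvidia-smi -L` is easy to parse. A minimal Python sketch, using a hard-coded sample listing (in a real script you would capture the command's stdout with `subprocess`):

```python
import re

# Sample `nvidia-smi -L` output (UUIDs are placeholders); in practice capture it with
# subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True).stdout
SAMPLE = """\
GPU 0: NVIDIA GeForce RTX 2080 Ti (UUID: GPU-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee)
GPU 1: NVIDIA GeForce RTX 2080 Ti (UUID: GPU-ffffffff-1111-2222-3333-444444444444)
"""

def index_to_uuid(listing: str) -> dict[int, str]:
    """Map each GPU index to its UUID from `nvidia-smi -L` output."""
    pattern = re.compile(r"GPU (\d+):.*\(UUID: (GPU-[0-9a-f-]+)\)")
    return {int(m.group(1)): m.group(2) for m in pattern.finditer(listing)}

print(index_to_uuid(SAMPLE)[1])  # UUID of the headless GPU
```

The same mapping is what the Bash wrapper in section 9 produces with `grep`/`awk`.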
3. CUDA‑only workloads
3.1. One‑liner using CUDA_VISIBLE_DEVICES
CUDA_VISIBLE_DEVICES=1 ./my_cuda_app # runs on GPU‑1 (headless)
- The environment variable remaps the visible devices so that the process sees only the listed GPU(s).
- Inside the program, `cudaGetDeviceCount()` will return 1, and `cudaSetDevice(0)` refers to the headless GPU.
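The renumbering trips people up at first. The following pure-Python sketch (no CUDA required) models how the variable remaps physical indices to the indices the process sees; it is a simplification in that the real driver stops enumerating at the first invalid entry, while this model simply drops out-of-range ones:

```python
def visible_devices(cuda_visible_devices: str, physical_count: int) -> list[int]:
    """Model of CUDA_VISIBLE_DEVICES remapping: the position in the returned
    list is the in-process device index; the value is the physical index."""
    requested = [int(tok) for tok in cuda_visible_devices.split(",") if tok.strip()]
    return [idx for idx in requested if 0 <= idx < physical_count]

# With CUDA_VISIBLE_DEVICES=1 on a two-GPU box, the process sees one device,
# and its in-process "device 0" is physical GPU 1:
mapping = visible_devices("1", physical_count=2)
print(len(mapping), mapping[0])  # 1 1
```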
3.2. Persisting the setting for a user or session
Add the line to your ~/.bashrc (or to a script that launches the whole workload):
export CUDA_VISIBLE_DEVICES=1 # or export CUDA_VISIBLE_DEVICES=0 for the display GPU
3.3. Framework‑specific ways
| Framework | Explicit device selection |
|---|---|
| PyTorch | `torch.cuda.set_device(1)` or `model.to('cuda:1')` |
| TensorFlow | `gpus = tf.config.list_physical_devices('GPU'); tf.config.set_visible_devices(gpus[1], 'GPU')` |
| JAX | Honors `CUDA_VISIBLE_DEVICES`; per array, `jax.device_put(x, jax.devices()[1])` |
| Numba | `from numba import cuda; cuda.select_device(1)` |
4. OpenCL (NVIDIA’s OpenCL implementation)
The NVIDIA OpenCL ICD honors CUDA_VISIBLE_DEVICES exactly like CUDA, so the same one‑liner works:
CUDA_VISIBLE_DEVICES=1 ./my_opencl_app
If you are using a vendor‑agnostic OpenCL loader (e.g., ocl-icd), you can also set:
export OCL_ICD_VENDORS=/etc/OpenCL/vendors/nvidia.icd
but the variable above is still the easiest.
5. Vulkan
Vulkan picks a physical device based on enumeration order, which normally matches `nvidia-smi` order. Some loaders and applications honor a `VK_DEVICE_INDEX` environment variable that forces the selection; if yours does not, select the device in code via `vkEnumeratePhysicalDevices()`:
export VK_DEVICE_INDEX=1 # 0 = display GPU, 1 = headless GPU
export VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/nvidia_icd.json
./my_vulkan_app
If you use a launch script, combine the two lines:
VK_DEVICE_INDEX=1 VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/nvidia_icd.json ./my_vulkan_app
Hybrid-GPU laptops additionally use the `VK_LAYER_NV_optimus` layer (driven by `__VK_LAYER_NV_optimus=NVIDIA_only`); you can safely ignore it on a desktop.
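Because an environment override is not guaranteed, applications that own their Vulkan initialization usually implement the selection themselves: enumerate the physical devices, then index into the list with a value read from the environment. A sketch of that logic in Python (the device names are hypothetical stand-ins for what `vkEnumeratePhysicalDevices` would return):

```python
def pick_device(devices: list[str], env: dict[str, str]) -> str:
    """Pick a physical device by a VK_DEVICE_INDEX-style variable, defaulting to 0."""
    idx = int(env.get("VK_DEVICE_INDEX", "0"))
    if not 0 <= idx < len(devices):
        raise ValueError(f"device index {idx} out of range ({len(devices)} devices found)")
    return devices[idx]

devices = ["RTX 2080 Ti (display)", "RTX 2080 Ti (headless)"]
print(pick_device(devices, {"VK_DEVICE_INDEX": "1"}))  # RTX 2080 Ti (headless)
```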
6. OpenGL (X11) – the “Prime Render Offload” method
When you have one GPU attached to the display and a second headless GPU, the NVIDIA driver can offload OpenGL rendering to the headless GPU while still using the X server that runs on the display GPU.
6.1. Prerequisites
- You must be running an Xorg server (Wayland works only with EGL‑based headless contexts).
- The X server must be started on GPU‑0 (the display GPU). This is the default when you plug a monitor into it.
6.2. Environment variables for a single process
export __NV_PRIME_RENDER_OFFLOAD=1 # enable offload mode
export __GLX_VENDOR_LIBRARY_NAME=nvidia # force use of NVIDIA GLX lib
export __NV_PRIME_RENDER_OFFLOAD_PROVIDER=GPU-ffffffff-1111-2222-3333-444444444444  # full UUID of the headless GPU
./my_opengl_app
- `GPU-ffffffff-1111-2222-3333-444444444444` is the UUID you got from `nvidia-smi -L`. Using the UUID is safer than the numeric index because the order can change after a driver reinstall.
- When you run the command, the X server still draws the window on the monitor (GPU 0), but all GL commands are executed on GPU 1. The result is composited back to the screen automatically.
6.3. Persistent configuration (system‑wide)
Add the following to /etc/profile.d/nvidia-offload.sh (or to the user’s ~/.profile):
#!/bin/sh
# Enable Prime Render Offload for GPU‑1 (headless)
export __NV_PRIME_RENDER_OFFLOAD=1
export __GLX_VENDOR_LIBRARY_NAME=nvidia
export __NV_PRIME_RENDER_OFFLOAD_PROVIDER=GPU-ffffffff-1111-2222-3333-444444444444  # replace with your UUID
Now any OpenGL program you launch from a terminal will automatically run on the headless GPU.
6.4. Verifying that the offload worked
$ glxinfo | grep "OpenGL renderer"
OpenGL renderer string: NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2
If the renderer string names the NVIDIA card, the offload succeeded. To confirm it is GPU 1 specifically, watch `nvidia-smi` while the app runs – you'll see the headless GPU's utilization climb.
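You can script the check as well. A small sketch that pulls the renderer line out of `glxinfo` output (the sample text is hard-coded here; in practice pipe in the real output, e.g. via `subprocess`):

```python
import re

# Sample glxinfo output; capture the real thing with
# subprocess.run(["glxinfo"], capture_output=True, text=True).stdout
SAMPLE = """\
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2
"""

def renderer(glxinfo_output: str) -> str:
    """Extract the value of the 'OpenGL renderer string' line."""
    m = re.search(r"^OpenGL renderer string: (.+)$", glxinfo_output, re.MULTILINE)
    if m is None:
        raise RuntimeError("no renderer line found - is glxinfo installed?")
    return m.group(1)

print("NVIDIA" in renderer(SAMPLE))  # True when the NVIDIA GLX path is active
```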
7. OpenGL (headless) – EGL + EGL_EXT_device_base
If you don’t need any X window (e.g., for a compute‑oriented renderer or a server‑side image generator) you can skip X entirely and use EGL device extensions that talk directly to a GPU.
# Load the EGL device extension and ask for GPU‑1
export EGL_PLATFORM_DEVICE_EXT=1 # tells EGL to use a device, not a display
export EGL_DEVICE_INDEX=1 # same ordering as nvidia-smi
./my_egl_app
Your EGL program must call eglQueryDevicesEXT() and then eglGetPlatformDisplayEXT(EGL_PLATFORM_DEVICE_EXT, device, NULL). Most modern EGL-based libraries (e.g., GLFW, SDL2, Qt) already implement this device-platform path.
8. Managing the GPUs at the driver level
8.1. Persistence mode
Keep the driver loaded and avoid the “first‑use latency”:
sudo nvidia-smi -pm ENABLED
8.2. Compute exclusive mode (optional)
If you want to guarantee that only one process can use the headless GPU at a time:
sudo nvidia-smi -i 1 -c EXCLUSIVE_PROCESS
Replace 1 with the index of the headless GPU.
The older EXCLUSIVE_THREAD mode has been deprecated and removed from modern drivers; use EXCLUSIVE_PROCESS.
8.3. Querying current usage
watch -n 1 nvidia-smi
or programmatically with NVML (nvidia-ml-py3 for Python) to log usage.
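For lightweight logging, `nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv,noheader,nounits` is easier to parse than the default table. A sketch with the sample output hard-coded (the numbers are made up; capture the real output with `subprocess`):

```python
import csv
import io

# Sample output of:
#   nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv,noheader,nounits
SAMPLE = "0, 3, 412\n1, 97, 9874\n"

def parse_usage(text: str) -> dict[int, dict[str, int]]:
    """Parse per-GPU utilization (%) and memory used (MiB) from the CSV output."""
    stats = {}
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        idx, util, mem = (int(v) for v in row)
        stats[idx] = {"util_pct": util, "mem_mib": mem}
    return stats

print(parse_usage(SAMPLE)[1]["util_pct"])  # 97 -> the headless GPU is busy
```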
9. Putting it all together – a reusable launch script
Below is a generic Bash wrapper you can copy to /usr/local/bin/run_on_gpu1 (or any name you like). It auto‑detects the UUID of GPU‑1 and sets the correct variables for CUDA, OpenCL, Vulkan, OpenGL (X11) and EGL.
#!/usr/bin/env bash
# ------------------------------------------------------------
# run_on_gpu1 – launch a program on the *headless* RTX‑2080 Ti
# ------------------------------------------------------------
# 1. Identify the headless GPU (the one NOT attached to a monitor)
# We assume GPU 0 is the display GPU. Change if needed.
HEADLESS_IDX=1
HEADLESS_UUID=$(nvidia-smi -L | grep "GPU $HEADLESS_IDX:" | awk -F'UUID: ' '{print $2}' | tr -d ')')
# 2. Set the generic environment variables
export CUDA_VISIBLE_DEVICES=$HEADLESS_IDX
export __NV_PRIME_RENDER_OFFLOAD=1
export __GLX_VENDOR_LIBRARY_NAME=nvidia
export __NV_PRIME_RENDER_OFFLOAD_PROVIDER=$HEADLESS_UUID
export EGL_PLATFORM_DEVICE_EXT=1
export EGL_DEVICE_INDEX=$HEADLESS_IDX
export VK_DEVICE_INDEX=$HEADLESS_IDX
export VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/nvidia_icd.json
# 3. Echo what we are doing (nice for debugging)
echo "=== launching on headless GPU $HEADLESS_IDX (UUID=$HEADLESS_UUID) ==="
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
echo "__NV_PRIME_RENDER_OFFLOAD=$__NV_PRIME_RENDER_OFFLOAD"
echo "__NV_PRIME_RENDER_OFFLOAD_PROVIDER=$__NV_PRIME_RENDER_OFFLOAD_PROVIDER"
echo "EGL_PLATFORM_DEVICE_EXT=$EGL_PLATFORM_DEVICE_EXT"
echo "VK_DEVICE_INDEX=$VK_DEVICE_INDEX"
# 4. Execute the command
exec "$@"
Make it executable:
sudo chmod +x /usr/local/bin/run_on_gpu1
Usage
run_on_gpu1 ./my_cuda_app # CUDA / OpenCL
run_on_gpu1 ./my_opengl_app # OpenGL (X11) offload
run_on_gpu1 ./my_egl_app # EGL headless
run_on_gpu1 ./my_vulkan_app # Vulkan
run_on_gpu1 python train.py # PyTorch / TensorFlow
All the required variables are set automatically; you can still override any of them on the command line if you need a different configuration.
10. Common pitfalls & how to debug them
| Symptom | Likely cause | Quick test / fix |
|---|---|---|
| CUDA reports "device 0 is busy" even though I set `CUDA_VISIBLE_DEVICES=1` | The app spawns child processes before the variable is exported (e.g., via a systemd unit that sets its own environment). | Ensure the variable is set in the same process tree; use the wrapper script above or a systemd `Environment=` directive. |
| `glxinfo` still shows GPU 0 as the renderer | `__NV_PRIME_RENDER_OFFLOAD_PROVIDER` not set to the UUID (or the UUID is mistyped). | Run `nvidia-smi -L` again, copy the exact UUID, and re-export. |
| Vulkan crashes with "failed to enumerate physical devices" | `VK_ICD_FILENAMES` points to a non-existent file (common on custom installations). | `ls /usr/share/vulkan/icd.d/` → confirm `nvidia_icd.json` exists. |
| EGL fails with "No EGL devices found" | The driver's EGL vendor config is missing, or you're running on a very old X server. | Verify the driver packages are installed, e.g. `apt list --installed \| grep nvidia`. |
| GPU utilization stays at 0 % while the app is running | The app is actually using the display GPU because it created its own X display (`DISPLAY=:0`). | Unset `DISPLAY` for headless EGL (`unset DISPLAY`) or export the offload variables before launching. |
| `nvidia-smi` shows both GPUs in "Graphics" mode, but compute kernels run on GPU 0 | Compute mode is `DEFAULT` and another process has already taken GPU 1. | Run `nvidia-smi -i 1` to see which processes own GPU 1; kill stray processes or set `EXCLUSIVE_PROCESS`. |
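Several of these pitfalls boil down to the process not inheriting the variables you think it did. A small hypothetical helper that reports which offload-related variables are missing from an environment (pass `dict(os.environ)` in a real session):

```python
EXPECTED = {
    "CUDA_VISIBLE_DEVICES",
    "__NV_PRIME_RENDER_OFFLOAD",
    "__GLX_VENDOR_LIBRARY_NAME",
    "__NV_PRIME_RENDER_OFFLOAD_PROVIDER",
}

def missing_offload_vars(environ: dict[str, str]) -> list[str]:
    """Return the offload-related variables that are not set in `environ`."""
    return sorted(v for v in EXPECTED if v not in environ)

# Example: only CUDA_VISIBLE_DEVICES was exported before launch
env = {"CUDA_VISIBLE_DEVICES": "1", "DISPLAY": ":0"}
print(missing_offload_vars(env))
```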
11. “What about Windows?” (Brief)
If you ever need the same setup on Windows 10/11, the principle is the same but the tooling differs:
| Platform | Variable / API | Example |
|---|---|---|
| CUDA | `CUDA_VISIBLE_DEVICES` (supported in recent CUDA Toolkits) | `set CUDA_VISIBLE_DEVICES=1` |
| OpenGL / Vulkan | NVIDIA Control Panel → Manage 3D settings → Preferred graphics processor (set per application). | Add your `.exe` and choose "High-performance NVIDIA processor". |
| DirectX 12 / Vulkan | Windows Graphics settings page (Settings → System → Display → Graphics) to assign the "High-performance" GPU per app. | Add the executable and set "High performance". |
On Windows you cannot offload OpenGL to a headless GPU without a virtual display (e.g., using NVIDIA vGPU / VirtualGL), so most production pipelines stay on Linux for headless GPUs.
12. TL;DR – The one‑liner you need right now
Run any CUDA‑based or OpenCL‑based program on the headless GPU (GPU‑1):
CUDA_VISIBLE_DEVICES=1 ./your_app
Run an OpenGL program on the headless GPU while still displaying on the monitor:
export __NV_PRIME_RENDER_OFFLOAD=1
export __GLX_VENDOR_LIBRARY_NAME=nvidia
export __NV_PRIME_RENDER_OFFLOAD_PROVIDER=$(nvidia-smi -L | grep "GPU 1:" | awk -F'UUID: ' '{print $2}' | tr -d ')')
./your_opengl_app
Run a Vulkan program on the headless GPU:
export VK_DEVICE_INDEX=1
./your_vulkan_app
Run an EGL‑only (no window) program on the headless GPU:
export EGL_PLATFORM_DEVICE_EXT=1
export EGL_DEVICE_INDEX=1
./your_egl_app
You’re all set!
With the variables above (or the reusable wrapper script), you can direct any graphics or compute workload to the specific RTX‑2080 Ti you want, keeping the display GPU free for the UI and the headless GPU dedicated to heavy rendering or AI inference. Happy offloading!