1. Why NVIDIA Drivers Interfere with Suspend/Resume
When your Linux machine refuses to suspend, wakes up to a black screen, or drains battery during sleep, the GPU driver is often at fault. This happens because the proprietary NVIDIA driver:
- Doesn’t always power down the GPU correctly.
- Leaves the display controller in a half-active state.
- May clash with the Linux kernel’s power-management system.
Common Symptoms → Root Causes
| Symptom | Why It Happens |
|---|---|
| Freeze on suspend or no wake-up | GPU power state isn’t properly saved/restored. |
| Black screen after resume | DisplayPort/HDMI link not re-initialized; Xorg or Wayland may have crashed. |
| High battery drain while “asleep” | GPU stuck in P0 (full power) because the driver didn’t receive runtime PM signals. |
| Kernel oops / GPU lock-ups | Mismatched driver module vs. kernel or leftover nouveau modules. |
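For the battery-drain symptom, the kernel exposes two per-device sysfs attributes that show whether runtime PM is in effect: `power/control` (`auto` or `on`) and `power/runtime_status`. The following is a minimal sketch; the function name `gpu_runtime_pm` and its optional directory argument are illustrative assumptions, not part of any NVIDIA tool.

```shell
#!/bin/sh
# gpu_runtime_pm [SYSFS_PCI_DIR]
# Prints "<pci-address> control=<value> status=<value>" for every NVIDIA
# display device (PCI vendor ID 0x10de, class 0x03xxxx).
# SYSFS_PCI_DIR defaults to /sys/bus/pci/devices; the argument exists only
# so the function can be pointed at a test directory (an assumption of this sketch).
gpu_runtime_pm() {
    root="${1:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        [ -r "$dev/vendor" ] || continue
        [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
        case "$(cat "$dev/class" 2>/dev/null)" in
            0x03*) ;;          # display controller: keep
            *) continue ;;     # e.g. the GPU's HDMI audio function: skip
        esac
        control=$(cat "$dev/power/control" 2>/dev/null || echo unknown)
        status=$(cat "$dev/power/runtime_status" 2>/dev/null || echo unknown)
        echo "$(basename "$dev") control=$control status=$status"
    done
}
```

Calling `gpu_runtime_pm` with no arguments inspects the live system. A value of `control=on` means runtime PM is disabled for that device, and `status=active` on an idle discrete GPU suggests it is not powering down.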
Key Concepts for Suspend/Resume
| Concept | Why It Matters |
|---|---|
| Runtime PM | Kernel asks driver to shut down GPU when idle; NVIDIA sometimes disables this. |
| Modesetting & DRM | The kernel’s Direct Rendering Manager (DRM) subsystem must hand off the display pipeline properly. |
| Hybrid Graphics (Optimus/Prime) | Misconfigured Intel/AMD + NVIDIA systems keep the dGPU powered or lose output. |
| Secure Boot | Unsigned modules get blocked, falling back to nouveau or partial drivers. |
| Kernel Module ABI | Driver built for an older kernel may load but mishandle power callbacks. |
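The Secure Boot row can be checked before touching any drivers. The sketch below combines the output of `mokutil --sb-state` with the `signer` field that `modinfo` reports for a module. The function name `secure_boot_blocks` is an assumption made for illustration. It only detects a missing signature: a module signed with a key that is not enrolled is still rejected.

```shell
#!/bin/sh
# secure_boot_blocks SB_STATE SIGNER
# SB_STATE: text printed by `mokutil --sb-state`
# SIGNER:   text printed by `modinfo -F signer <module>` (empty when unsigned)
# Succeeds (exit 0) when Secure Boot is enabled and the module carries no
# signature, i.e. the kernel would be expected to refuse to load it.
secure_boot_blocks() {
    case "$1" in
        *"SecureBoot enabled"*) [ -z "$2" ] ;;
        *) return 1 ;;
    esac
}
```

Typical use: `secure_boot_blocks "$(mokutil --sb-state 2>/dev/null)" "$(modinfo -F signer nvidia 2>/dev/null)" && echo "nvidia module is unsigned and Secure Boot is on"`.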
2. Preparation & Safety Checklist
Before changing drivers, ensure you have:
| Item | Purpose | How to Do It |
|---|---|---|
| Backup | Protect your data in case of a failed install. | Use rsync, Timeshift, or disk images. |
| TTY/SSH access | Rescue your system if the GUI fails. | Ctrl+Alt+F3 or enable SSH (sudo systemctl enable ssh). |
| Package manager knowledge | Commands vary per distro. | Check /etc/os-release to identify yours. |
| Know your GPU model | Determines driver version. | lspci -nn |
| Check installed driver | So you know what you’re removing. | nvidia-smi or modinfo nvidia |
| Disable Secure Boot temporarily | Unsigned modules won’t load otherwise. | In BIOS/UEFI disable “Secure Boot.” |
| Fallback kernel entry | Boot a rescue kernel if the new driver fails. | In GRUB add systemd.unit=rescue.target to an entry. |
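Because the commands in the following sections differ per distribution, it can help to resolve the distro family from /etc/os-release first. This is a minimal sketch: the function name `distro_family` and the family labels it prints are assumptions chosen for this guide, and the mapping relies on the standard `ID` and `ID_LIKE` fields.

```shell
#!/bin/sh
# distro_family [OS_RELEASE_FILE]
# Prints one of: debian, fedora, suse, arch, unknown.
# os-release is a list of shell-compatible KEY=value assignments, so it is
# sourced in a subshell to keep its variables out of the caller's environment.
distro_family() {
    file="${1:-/etc/os-release}"
    ids=$( unset ID ID_LIKE; . "$file" 2>/dev/null; echo "${ID:-} ${ID_LIKE:-}" )
    case " $ids " in
        *" ubuntu "*|*" debian "*)            echo "debian" ;;
        *" fedora "*|*" rhel "*|*" centos "*) echo "fedora" ;;
        *" suse "*|*" opensuse "*)            echo "suse" ;;
        *" arch "*)                           echo "arch" ;;
        *)                                    echo "unknown" ;;
    esac
}
```

Run `distro_family` with no arguments on the target machine, then follow the matching subsection (3.1 for debian, 3.2 for fedora, 3.3 for suse, 3.4 for arch).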
3. Clean Removal of NVIDIA Drivers
The goal is to completely remove all NVIDIA software—kernel modules, configuration files, and blacklists—so you’re starting fresh.
3.1 Ubuntu/Debian
# Stop the display manager (GDM, LightDM, SDDM, etc.)
sudo systemctl stop gdm # replace gdm with your DM
# Purge all NVIDIA packages
sudo apt-get purge '^nvidia-.*' \
'^libnvidia-.*' \
'nvidia-driver' \
'nvidia-settings' \
'nvidia-prime' \
'nvidia-modprobe' \
'xserver-xorg-video-nvidia' \
'cuda-*' \
'libcuda*' \
'nvidia-container*' \
'libnvidia*' \
'nvidia-dkms*' \
'nvidia-opencl*'
# Remove leftover DKMS modules
# nvidia-smi is gone after the purge, so ask DKMS which nvidia builds remain
dkms status | awk -F'[,:/ ]+' '/^nvidia/ {print $1"/"$2}' | sort -u | xargs -r -I{} sudo dkms remove {} --all
# Delete configuration files
sudo rm -f /etc/X11/xorg.conf
sudo rm -rf /etc/X11/xorg.conf.d/10-nvidia.conf
sudo rm -f /etc/modprobe.d/blacklist-nouveau.conf
sudo rm -f /etc/modprobe.d/nvidia-installer-disable-nouveau.conf
# Re‑install the default open‑source driver (optional)
sudo apt-get install --reinstall xserver-xorg-video-nouveau
# Regenerate initramfs (important for nouveau blacklist removal)
sudo update-initramfs -u
# Reboot
sudo reboot
3.2 Fedora/RHEL/CentOS
# Stop graphical.target
sudo systemctl isolate multi-user.target
# Remove RPM Fusion NVIDIA packages
sudo dnf remove '*nvidia*' '*cuda*' '*opencl*'
# Remove the akmod package and any DKMS build, if present
sudo dnf remove akmod-nvidia
sudo dkms remove nvidia/$(modinfo -F version nvidia) --all || true
# Clean up Xorg config & modprobe files
sudo rm -f /etc/X11/xorg.conf
sudo rm -rf /etc/X11/xorg.conf.d/10-nvidia.conf
sudo rm -f /etc/modprobe.d/blacklist-nouveau.conf
# Rebuild initramfs
sudo dracut -f
# Reboot
sudo reboot
3.3 openSUSE
sudo systemctl isolate multi-user.target
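# The package name is the second |-separated column of zypper's table
sudo zypper rm -u $(zypper se -i nvidia | awk -F'|' '/^i/ {gsub(/ /, "", $2); print $2}')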
sudo rm -f /etc/X11/xorg.conf
sudo rm -rf /etc/X11/xorg.conf.d/10-nvidia.conf
sudo rm -f /etc/modprobe.d/blacklist-nouveau.conf
sudo mkinitrd # if mkinitrd is unavailable on your release, use: sudo dracut -f
sudo reboot
3.4 Arch Linux
# Stop display manager
sudo systemctl stop gdm # or sddm, lightdm, etc.
# Remove all nvidia packages
# pacman -Rns aborts if any named package is missing, so remove only what is installed
sudo pacman -Rns $(pacman -Qq | grep -E '^(lib32-)?nvidia|^cuda$')
# Remove leftover config
sudo rm -f /etc/X11/xorg.conf
sudo rm -rf /etc/X11/xorg.conf.d/10-nvidia.conf
sudo rm -f /etc/modprobe.d/blacklist-nouveau.conf
# Re‑generate initramfs
sudo mkinitcpio -P
# Reboot
sudo reboot
3.5 Windows (Quick Note)
If you dual-boot, a corrupt Windows driver can leave the GPU in an inconsistent state. To reset it from the Windows side:
- In Device Manager → Display Adapters → NVIDIA → Uninstall Device (check Delete driver software).
- Run Display Driver Uninstaller (DDU) in Safe Mode.
- Reboot and install the latest driver.
4. Safe Driver Reinstallation
4.1 Choosing the Right Driver Version
Match your GPU to a driver branch that NVIDIA lists as supporting it (e.g. RTX 40 series: 525.105.17+). Newer branches drop older GPUs, so confirm that your model appears in the driver’s “Supported Products” list on NVIDIA’s site before installing.
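Checking an installed driver against a minimum version is a field-by-field comparison, which GNU `sort -V` handles. This is a minimal sketch, and the helper name `version_ge` is an assumption; the 525.105.17 figure is simply the example from the text.

```shell
#!/bin/sh
# version_ge INSTALLED MINIMUM
# Succeeds (exit 0) when INSTALLED >= MINIMUM. Both versions are sorted with
# GNU sort -V; if MINIMUM comes out first (or they are equal), the check passes.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

Typical use: `version_ge "$(modinfo -F version nvidia)" 525.105.17 || echo "driver older than 525.105.17"`. A plain string comparison would get this wrong, because 525.60.11 sorts after 525.105.17 alphabetically.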
4.2 Installation Methods
| Method | Pros | Cons | When to use |
|---|---|---|---|
| Distribution packages (e.g., nvidia-driver from Ubuntu repos) | Integrated with package manager, DKMS builds automatically, updates via normal updates. | May lag behind upstream; sometimes missing the newest CUDA. | Most users; especially on LTS releases. |
| Official .run installer (downloaded from NVIDIA) | Latest driver, explicit control over install options. | Manual updates, risk of breaking DKMS, conflicts with Secure Boot. | When you need a driver version not yet in repos. |
| Third‑party repos (e.g., “Graphics Drivers PPA” for Ubuntu, “RPM Fusion” for Fedora) | Usually newer than distro default, well‑maintained. | Still a step away from upstream; may have occasional regressions. | When you need a newer driver but want automatic updates. |
| Containerised GPU stacks (e.g., NVIDIA Container Toolkit) | Isolates driver version per‑container, great for CUDA workloads. | Still requires a host driver; not a solution for X/Wayland. | For AI/ML workloads; not for fixing suspend issues. |
4.3 Installing on Ubuntu/Debian
# Add the “Graphics Drivers” PPA (optional but gives newer drivers)
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
# Install the recommended driver (replace 525 with your version)
sudo apt install nvidia-driver-525
# Verify DKMS built correctly
dkms status | grep nvidia
# Reboot
sudo reboot
4.4 Fedora (RPM Fusion)
# Enable RPM Fusion free + nonfree
sudo dnf install https://mirrors.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm
sudo dnf install https://mirrors.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm
# Install driver
sudo dnf install akmod-nvidia # pulls the latest driver; akmods (not DKMS) builds the kernel module
# Or a legacy branch, where RPM Fusion packages one (e.g. 470xx):
# sudo dnf install xorg-x11-drv-nvidia-470xx akmod-nvidia-470xx
# Rebuild initramfs (normally done automatically)
sudo dracut --force
# Reboot
sudo reboot
4.5 Arch Linux
# Install the driver that matches your kernel (use `uname -r` to see kernel version)
sudo pacman -S nvidia nvidia-utils nvidia-settings
# For LTS kernel:
# sudo pacman -S nvidia-lts nvidia-utils # nvidia-utils is shared; it has no -lts variant
# Rebuild initramfs (if you have custom hooks)
sudo mkinitcpio -P
# Reboot
sudo reboot
4.6 Manual .run Installer (Generic Linux)
- Download the .run file from NVIDIA.
- Switch to a TTY and stop the GUI: sudo systemctl isolate multi-user.target.
- Blacklist nouveau and rebuild the initramfs.
- Reboot.
- Run the installer: sudo sh NVIDIA-Linux-x86_64-525.105.17.run --no-cc-version-check --dkms
- Secure Boot: either disable it or sign the module with MOK (see 5.10).
# Download .run file from NVIDIA.com (e.g., NVIDIA-Linux-x86_64-525.105.17.run)
# Switch to a TTY, stop X/Wayland:
sudo systemctl isolate multi-user.target
# Remove any previously installed driver (see Section 3)
# Blacklist nouveau (if not already):
echo "blacklist nouveau" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
echo "options nouveau modeset=0" | sudo tee -a /etc/modprobe.d/blacklist-nouveau.conf
sudo update-initramfs -u # or mkinitcpio -P / dracut -f
# Reboot here so nouveau is no longer loaded, then return to the TTY
# Run installer (add `--dkms` to build a DKMS module):
sudo sh NVIDIA-Linux-x86_64-525.105.17.run --no-cc-version-check --dkms
# Follow on‑screen prompts (accept the OpenGL libs, disable Nouveau if asked).
# Reboot
sudo reboot
5. Targeted Fixes & Tweaks
Apply one fix at a time and test suspend/resume after each.
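One way to judge each test objectively is to count the kernel's own suspend markers. The kernel logs "PM: suspend entry" when it starts suspending and "PM: suspend exit" once resume completes. The sketch below counts both from log text on stdin; the function name `suspend_summary` is an assumption for illustration.

```shell
#!/bin/sh
# suspend_summary
# Reads kernel log text on stdin and prints "entries=<n> exits=<m>", counting
# the kernel's "PM: suspend entry" and "PM: suspend exit" messages.
suspend_summary() {
    awk '
        /PM: suspend entry/ { entries++ }
        /PM: suspend exit/  { exits++ }
        END { printf "entries=%d exits=%d\n", entries, exits }
    '
}
```

Typical use: `journalctl -k -b --no-pager | suspend_summary`. If a resume hung and the machine had to be reset, running the same pipeline with `-b -1` should show one more entry than exits.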
5.1 Kernel Parameters
Edit /etc/default/grub:
- mem_sleep_default=deep → force deep sleep (S3).
- nvidia-drm.modeset=1 → enable DRM-KMS.
- pci=noaer → suppress PCIe error spam.
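Update GRUB and reboot (update-grub is the Debian/Ubuntu wrapper; on Fedora run grub2-mkconfig -o /boot/grub2/grub.cfg, on Arch grub-mkconfig -o /boot/grub/grub.cfg):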
sudo update-grub && sudo reboot
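After the reboot, it is worth confirming that the parameters actually reached the kernel. The sketch below checks /proc/cmdline for an exact parameter and reads the active sleep mode from /sys/power/mem_sleep, which lists the supported modes with the active one in brackets. Both function names and the optional file arguments are assumptions made for illustration.

```shell
#!/bin/sh
# has_kernel_param PARAM [CMDLINE_FILE]
# Succeeds when PARAM (e.g. nvidia-drm.modeset=1) appears as a complete,
# space-separated word on the kernel command line.
has_kernel_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -Fxq -- "$1"
}

# active_mem_sleep [MEM_SLEEP_FILE]
# Prints the bracketed (active) mode from a line such as "s2idle [deep]".
active_mem_sleep() {
    sed -n 's/.*\[\([a-z0-9]*\)\].*/\1/p' "${1:-/sys/power/mem_sleep}"
}
```

Typical use: `has_kernel_param mem_sleep_default=deep && [ "$(active_mem_sleep)" = "deep" ] && echo "S3 is active"`. The match is exact, so a parameter spelled `nvidia_drm.modeset=1` on the command line needs to be queried with that spelling.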
5.2 Blacklisting Nouveau
# Create a permanent blacklist file
sudo tee /etc/modprobe.d/blacklist-nouveau.conf <<EOF
blacklist nouveau
options nouveau modeset=0
EOF
# Regenerate initramfs
sudo update-initramfs -u # Ubuntu/Debian
# or
sudo mkinitcpio -P # Arch
# or
sudo dracut -f # Fedora
5.3 Enable nvidia-drm early
sudo tee /etc/modprobe.d/nvidia-drm.conf <<EOF
options nvidia-drm modeset=1
EOF
5.4 Add nvidia-uvm to early-load (helps CUDA + suspend)
echo "nvidia-uvm" | sudo tee -a /etc/modules-load.d/nvidia.conf
5.5 Pre-Sleep Script to Unbind GPU
Create /lib/systemd/system-sleep/nvidia-suspend.sh:
#!/bin/sh
# systemd-sleep hook: unbinds NVIDIA GPUs from the nvidia driver before suspend
# and re-binds them after resume. systemd passes the phase (pre/post) as $1.
# Unbinding blocks or fails while the GPU is in use (e.g. by Xorg), so this
# only suits a GPU that is idle at suspend time.
DRV=/sys/bus/pci/drivers/nvidia
for dev in /sys/bus/pci/devices/*; do
    # 0x10de is NVIDIA's PCI vendor ID; class 0x03xxxx is a display controller
    [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
    case "$(cat "$dev/class")" in 0x03*) ;; *) continue ;; esac
    addr=$(basename "$dev")
    case "$1" in
        pre)  [ -e "$DRV/$addr" ] && echo "$addr" > "$DRV/unbind" ;;
        post) [ -e "$DRV/$addr" ] || echo "$addr" > "$DRV/bind" ;;
    esac
done
exit 0
Make it executable:
sudo chmod +x /lib/systemd/system-sleep/nvidia-suspend.sh
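After a resume, it is worth checking that the GPU is bound to the nvidia driver again. Each PCI device in sysfs has a `driver` symlink while a driver is attached, so the check can be sketched as below. The function name `nvidia_bindings` and its optional directory argument are assumptions for illustration.

```shell
#!/bin/sh
# nvidia_bindings [SYSFS_PCI_DIR]
# For each NVIDIA PCI function (vendor ID 0x10de) prints "<address> <driver>",
# where <driver> is the basename of the device's driver symlink, or "none"
# when no driver is currently bound.
nvidia_bindings() {
    root="${1:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        [ -r "$dev/vendor" ] || continue
        [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
        if [ -L "$dev/driver" ]; then
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv=none
        fi
        echo "$(basename "$dev") $drv"
    done
}
```

Run `nvidia_bindings` before suspending and again after resume; a display function that reports `none` (or `nouveau`) afterwards suggests the re-bind step did not take effect.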
5.6 Hybrid Graphics Tools
If you have Intel/AMD integrated + NVIDIA discrete, you often need to tell the system which GPU should handle rendering and which should stay idle during suspend.
- Ubuntu/Debian:
# Install the “prime-select” utility
sudo apt install nvidia-prime
# Switch to NVIDIA mode (for better performance & proper suspend)
sudo prime-select nvidia
# Reboot
sudo reboot
Result: the NVIDIA GPU becomes the primary renderer, which tends to reduce the chance of a “GPU busy” error at suspend. The integrated GPU is not switched off; on most laptops it still drives the internal panel.
- Arch:
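# optimus-manager is packaged in the AUR, not the official repos; install it with an AUR helper such as yay
yay -S optimus-manager optimus-manager-qt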
# Enable the systemd service
sudo systemctl enable optimus-manager.service
# Start it
sudo systemctl start optimus-manager.service
Then: use optimus-manager --switch nvidia or optimus-manager --switch integrated.
Why: optimus-manager does a clean hand-off, powering down the unused GPU before suspend.
5.7 NVIDIA DRM KMS & Wayland
If you run Wayland (e.g., GNOME 46 on Ubuntu 24.04), the NVIDIA driver must have nvidia-drm.modeset=1 set and the nvidia-drm module loaded early.
# Enable KMS in the driver
sudo tee /etc/modprobe.d/nvidia-drm.conf <<EOF
options nvidia-drm modeset=1
EOF
# Regenerate initramfs
sudo update-initramfs -u # Ubuntu/Debian
# or mkinitcpio -P / dracut -f
# Ensure the module loads early:
sudo tee /etc/modules-load.d/nvidia-drm.conf <<EOF
nvidia-drm
EOF
# Reboot
sudo reboot
Now test suspend. If the issue persists, proceed to the next fix.
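To confirm after the reboot that kernel modesetting is really on, the loaded module's parameter can be read back from sysfs, where the kernel reports boolean module parameters as `Y` or `N`. This is a minimal sketch; the function name `drm_modeset_enabled` and its optional file argument are assumptions for illustration.

```shell
#!/bin/sh
# drm_modeset_enabled [PARAM_FILE]
# Succeeds (exit 0) when the nvidia_drm module reports modeset=Y.
# Fails when the value is N or when the module is not loaded (file missing).
drm_modeset_enabled() {
    [ "$(cat "${1:-/sys/module/nvidia_drm/parameters/modeset}" 2>/dev/null)" = "Y" ]
}
```

Typical use: `drm_modeset_enabled && echo "KMS on" || echo "KMS off or nvidia_drm not loaded"`. If this reports off even though the modprobe file exists, the initramfs was probably not regenerated.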
5.8 Fixing GPU Busy Errors
Symptoms: systemd[1]: Failed to suspend: Device or resource busy and NVIDIA: GPU is busy in journalctl -b.
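To confirm these symptoms without scrolling through the whole journal, the log can be filtered for the two messages quoted above. This is a minimal sketch, and the function name `suspend_failures` is an assumption for illustration.

```shell
#!/bin/sh
# suspend_failures
# Reads log text on stdin and prints only the lines that contain one of the
# failure messages quoted in this section. Prints nothing (and still exits 0)
# when no such line is present.
suspend_failures() {
    grep -E 'Failed to suspend|GPU is busy' || true
}
```

Typical use: `journalctl -b --no-pager | suspend_failures`. If lines show up, `sudo fuser -v /dev/nvidia*` lists the processes that currently hold the GPU device nodes, which is usually the quickest way to see what is keeping the GPU busy.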
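# Create a unit that runs around sleep ([Service] is only read from unit files; the unit name is arbitrary)
sudo tee /etc/systemd/system/nvidia-pm-sleep.service <<'EOF'
[Unit]
Before=sleep.target
StopWhenUnneeded=yes
[Service]
Type=oneshot
RemainAfterExit=yes
# Enable persistence mode before sleep, disable it on wake
ExecStart=/usr/bin/nvidia-smi -pm 1
ExecStop=/usr/bin/nvidia-smi -pm 0
[Install]
WantedBy=sleep.target
EOF
sudo systemctl enable nvidia-pm-sleep.service
Alternatively, you can manually stop services like PipeWire, KWin, or Xorg before suspending: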
systemctl --user stop pipewire pipewire-pulse
systemctl suspend
systemctl --user start pipewire pipewire-pulse
Use nvidia-suspend / nvidia-resume scripts
NVIDIA provides systemd units that handle power‑state transitions. Ensure they’re enabled:
# On Ubuntu/Debian:
sudo systemctl enable nvidia-suspend.service
sudo systemctl enable nvidia-hibernate.service
sudo systemctl enable nvidia-resume.service
# Verify:
systemctl status nvidia-suspend.service
5.9 Fixing PCIe AER Spam
A spamming AER log can cause the kernel to abort suspend. If you see lines like:
AER: Corrected error received: id=00e0
AER: [ 123.456789] PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Add the pci=noaer kernel parameter (see 5.1) or disable AER for the NVIDIA device:
# Identify the PCI address (e.g., 0000:01:00.0)
lspci -nn | grep -i nvidia
# Create a module parameter file (first confirm your driver release exposes this option: modinfo -p nvidia | grep -i aer)
sudo tee /etc/modprobe.d/nvidia-aer.conf <<EOF
options nvidia NVreg_AEREnabled=0
EOF
# Regenerate initramfs
sudo update-initramfs -u # Ubuntu/Debian
5.10 MOK (Machine Owner Key) for Secure Boot
If you must keep Secure Boot enabled (e.g., corporate laptops), you can sign the NVIDIA kernel module:
- Generate a key pair
# Create a key (if not existing)
openssl req -new -x509 -newkey rsa:2048 -keyout MOK.priv -outform DER -out MOK.der -nodes -days 36500 -subj "/CN=NVIDIA MOK/"
# Queue the key for enrollment; mokutil asks you to set a one-time password
sudo mokutil --import MOK.der
- Enroll the key: reboot and you’ll see a blue MOK enrollment screen. Choose Enroll MOK, then Continue, then Yes, then enter the password you set with mokutil --import.
- Sign the module
# After driver installation, locate the .ko file
sudo /usr/src/linux-headers-$(uname -r)/scripts/sign-file sha256 ./MOK.priv ./MOK.der $(modinfo -n nvidia)
# Verify
sudo modinfo nvidia | grep signer
Alternative: use mokutil --disable-validation to turn off signature validation in shim, but that defeats Secure Boot’s purpose.
5.11 Power Management Daemons
| Daemon | Use | When to enable |
|---|---|---|
| tlp | Aggressive power-saving for laptops (CPU scaling, USB autosuspend). | If you have a laptop and notice high power draw after resume. |
| laptop-mode-tools | Provides a “laptop mode” that tweaks disk and USB power on suspend. | If you need more granular control than TLP. |
| powertop | Diagnostic tool; can suggest wake-up sources. | Run sudo powertop --auto-tune after each driver reinstall. |
Example (TLP on Ubuntu):
sudo apt install tlp tlp-rdw
sudo systemctl enable tlp
sudo systemctl start tlp
After enabling, run sudo tlp-stat -s to confirm it’s active.
5.12 BIOS / Firmware Updates
Many suspend/resume bugs are firmware‑level. Check the manufacturer’s website for a BIOS update that mentions “Improved sleep/hibernate” or “GPU power management”.
| Laptop Model | Known Fix |
|---|---|
| Dell XPS 13 9370 | BIOS 1.13.0 (2024) + acpi_osi= resolves S3 wake-up. |
| Lenovo ThinkPad X1 Carbon Gen 9 | BIOS 1.45 + nvidia-drm.modeset=1 + prime-select nvidia. |
| HP Spectre x360 14 | BIOS 2.5.0 (2025) + disabling AER (pci=noaer). |
If you cannot update BIOS (e.g., corporate lock‑down), you may need to request an exception from IT.
6. Testing Your Setup
After each change, reboot and test suspend:
systemctl suspend
# Wait a few seconds, then wake the machine (press power button or open lid)
journalctl -b # Current boot; suspend and resume messages both land here
journalctl -b -1 # Previous boot; useful if a failed resume forced a hard reset
7. Rollback & Alternatives
- Nouveau: install the open-source driver (xserver-xorg-video-nouveau).
- Version pinning: lock a known-good NVIDIA version.
- Containerised stacks: isolate CUDA inside containers (NVIDIA Container Toolkit).
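For the version-pinning option on Debian/Ubuntu, `apt-mark hold` keeps a package at its installed version. The sketch below picks the NVIDIA packages out of a `dpkg-query` listing so they can be held together. The function name `nvidia_hold_list` and the name prefixes it matches are assumptions about typical package naming, so review the list before holding anything.

```shell
#!/bin/sh
# nvidia_hold_list
# Reads lines of "<status-abbrev> <package>" on stdin, as produced by
#   dpkg-query -W -f='${db:Status-Abbrev} ${Package}\n'
# and prints the fully installed ("ii") packages whose names start with
# nvidia- or libnvidia-, sorted and de-duplicated.
nvidia_hold_list() {
    awk '$1 == "ii" && $2 ~ /^(lib)?nvidia-/ { print $2 }' | sort -u
}
```

Typical use: `dpkg-query -W -f='${db:Status-Abbrev} ${Package}\n' | nvidia_hold_list | xargs -r sudo apt-mark hold`. The hold can be reversed later with `apt-mark unhold` on the same package names.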
Quick “One‑Liner” Checklist (Ubuntu/Debian)
sudo apt purge '^nvidia-.*' && sudo apt install nvidia-driver-560 nvidia-prime tlp
sudo prime-select nvidia
echo "options nvidia-drm modeset=1" | sudo tee /etc/modprobe.d/nvidia-drm.conf
echo "nvidia-drm" | sudo tee /etc/modules-load.d/nvidia-drm.conf
echo "nvidia-uvm" | sudo tee -a /etc/modules-load.d/nvidia.conf
sudo systemctl enable nvidia-suspend.service nvidia-resume.service
sudo systemctl enable tlp
sudo update-initramfs -u
sudo reboot
8. Appendix
- Logs: /var/log/syslog, journalctl -xe, dmesg.
- DKMS status: dkms status.
- Power states: cat /proc/driver/nvidia/gpus/0000:01:00.0/power.
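The `dkms status` check can be automated so that a kernel update which left the module unbuilt is caught before rebooting. The sketch below accepts both output layouts that common dkms releases print (comma-separated and slash-separated module/version); treating these two layouts as the only ones is an assumption, as is the function name `dkms_nvidia_ok`.

```shell
#!/bin/sh
# dkms_nvidia_ok KERNEL
# Reads `dkms status` output on stdin and succeeds (exit 0) when an nvidia
# module is reported as installed for KERNEL. Recognised line layouts:
#   nvidia, 525.105.17, 6.2.0-26-generic, x86_64: installed
#   nvidia/525.105.17, 6.2.0-26-generic, x86_64: installed
dkms_nvidia_ok() {
    grep -E '^nvidia[,/]' | grep -F ", $1," | grep -q ': installed'
}
```

Typical use: `dkms status | dkms_nvidia_ok "$(uname -r)" || echo "nvidia module not installed for the running kernel"`.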