How to check the health of an NVMe SSD on Ubuntu

A practical, step-by-step guide to checking NVMe SSD health on Ubuntu using nvme-cli, smartctl, and GNOME Disks. Learn how to read SMART data, spot early warning signs, run self-tests, update firmware, and keep your data safe.


What this guide covers (quick map)

  1. Why check NVMe health and what to expect
  2. Tools you’ll use (nvme-cli, smartctl/smartmontools, GNOME Disks) — which to pick and why
  3. Step-by-step: find the device, read SMART (nvme-cli), interpret the key fields
  4. Alternative: use smartctl and the Disks GUI (how and why)
  5. Self-tests, logs, and continuous monitoring (smartd/systemd)
  6. Firmware updates, manufacturer utilities, and safe maintenance tips
  7. Troubleshooting common issues and a final checklist

1. Why check NVMe health (short answer)

NVMe SSDs are fast and reliable, but like all storage they wear over time. SMART (Self-Monitoring, Analysis, and Reporting Technology) exposed by NVMe controllers reports controller warnings, temperature, the percentage of life used, media errors, and read/write counters. Reading these regularly helps you spot early wear or errors so you can back up data and plan replacements. The standard low-level tool for NVMe on Linux is nvme-cli; smartctl from smartmontools is a longer-standing SMART tool that also supports NVMe. Ubuntu’s Disks app gives a friendly GUI view for quick checks. (NVM Express)

2. Tools & installation

nvme-cli communicates with NVMe controllers and exposes NVMe-specific log pages (like the SMART log) in a vendor-neutral way. Install it and you’ll be able to run nvme list, nvme smart-log, nvme id-ctrl, and more.

Install:

sudo apt update
sudo apt install nvme-cli

Alternative/additional: smartmontools / smartctl

smartctl is useful because it presents SMART output in a familiar format (smartctl -a) and supports scheduled tests via smartd. NVMe support exists and is widely used, although historically it was added later than SATA support — use it as a complementary tool. (Smartmontools)

Install:

sudo apt install smartmontools

GUI: GNOME Disks ("Disks" / gnome-disk-utility)

Good for quick checks and running self-tests when you prefer not to use the terminal. If you don’t have it:

sudo apt install gnome-disk-utility

Open “Disks” from the Activities menu to view SMART data and run self-tests. (Ubuntu Help)

3. Identify your NVMe device

Before you run checks, find the device path.

Using nvme-cli:

sudo nvme list

Example output (abbreviated):

Node             SN                Model                        Namespace Usage
/dev/nvme0n1     E200M00...        KINGSTON...                  1         2,000.2  GB / 2,000.2 GB

nvme-cli commands typically take the controller node, /dev/nvme0, while /dev/nvme0n1 is the namespace (the block device you partition and mount). nvme smart-log is normally run against the controller path (e.g. /dev/nvme0). The nvme list output shows the exact nodes on your system.

You can also check with:

lsblk -o NAME,MODEL,SIZE
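
If you script these checks, it helps to derive the controller node from whatever block device path lsblk reports. The following is a minimal sketch, assuming the usual naming where /dev/nvme0n1p2 belongs to controller /dev/nvme0 (this may not hold on multipath setups). The function name nvme_controller_of is made up for this example.

```shell
# Strip the namespace (nN) and optional partition (pN) suffix from an NVMe
# block device path, leaving the controller path. Anything that does not
# look like an NVMe path is printed unchanged.
nvme_controller_of() {
    printf '%s\n' "$1" | sed -E 's#^(/dev/nvme[0-9]+)(n[0-9]+(p[0-9]+)?)?$#\1#'
}
```

For example, nvme_controller_of /dev/nvme0n1p2 prints /dev/nvme0, which you can pass straight to nvme smart-log.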

4. Method A: read SMART data with nvme-cli

A.1 Read the SMART log (human readable)

nvme smart-log accepts either the controller device (/dev/nvme0) or a namespace device (/dev/nvme0n1); this guide uses the controller path:

sudo nvme smart-log /dev/nvme0

Typical output (annotated):

critical_warning                    : 0
temperature                         : 36 C
available_spare                     : 100
available_spare_threshold           : 10
percentage_used                     : 1
data_units_read                     : 12345
data_units_written                  : 67890
host_read_commands                  : 98765
host_write_commands                 : 54321
controller_busy_time                : 0
power_cycles                        : 12
power_on_hours                      : 234
unsafe_shutdowns                    : 0
media_errors                        : 0
num_err_log_entries                 : 0

A.2 How to interpret the key fields

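  • critical_warning: 0 = OK. Any non-zero value is a bit field flagging a controller-level problem, such as low spare capacity, temperature outside its thresholds, degraded reliability, or the drive having switched to read-only mode. If it is non-zero, stop and investigate immediately.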
  • percentage_used: Wear indicator (manufacturer dependent). 0 means essentially brand new; 100 usually indicates the rated endurance has been reached. This is a primary wear metric.
  • temperature: Reported in °C. Keep sustained temps below recommended limits (generally below ~70°C for consumer NVMe, but check your SSD's datasheet).
  • available_spare / available_spare_threshold: Spare pool remaining. If available_spare <= threshold, replacement is recommended.
  • media_errors and num_err_log_entries: media_errors counts unrecovered data-integrity errors; num_err_log_entries counts entries in the controller’s error log. Non-zero values are red flags that deserve investigation.
  • data_units_written/read: Per the NVMe specification, each unit is 1,000 blocks of 512 bytes, i.e. 512,000 bytes (not 512 KiB). Multiply the counter by 512,000 to get bytes, then compare the result with the TBW rating in your drive’s datasheet.
  • power_on_hours / power_cycles / unsafe_shutdowns: Helps calculate usage patterns and correlate with errors.

If any of these show alarming values (critical_warning != 0, percentage_used high, media_errors > 0), back up immediately and consider replacement. The NVMe SMART log format and semantics come from the NVMe specification, and nvme-cli presents them in readable form. (Debian Manpages)
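
To make the data-unit arithmetic concrete, here is a small sketch that turns a data_units_written counter into gigabytes and into a share of a rated TBW figure. It assumes the 512,000-byte unit described above; the function names are invented for this example, and the TBW rating has to come from your own drive’s datasheet.

```shell
# Convert an NVMe data-unit counter to gigabytes (1 unit = 1000 x 512 bytes).
data_units_to_gb() {
    awk -v units="$1" 'BEGIN { printf "%.2f\n", units * 512000 / 1e9 }'
}

# Express the bytes written as a percentage of a rated endurance.
# $1 = data_units_written, $2 = rated TBW in terabytes (from the datasheet).
tbw_percent() {
    awk -v units="$1" -v tbw="$2" \
        'BEGIN { printf "%.1f\n", units * 512000 / (tbw * 1e12) * 100 }'
}
```

With the sample output above, data_units_to_gb 67890 prints 34.76, so that drive has written roughly 35 GB in total.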

A.3 Save the raw SMART log (for support)

sudo nvme smart-log /dev/nvme0 --output-format=json > nvme0_smart.json

Or raw binary:

sudo nvme smart-log /dev/nvme0 --raw-binary > nvme0_smart.raw
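
When you read the saved JSON back, keep in mind that the key names and units can differ from the human-readable output. The sketch below assumes the JSON contains a numeric "temperature" key expressed in Kelvin, which is how the nvme-cli versions I have seen report it; verify against your own file before relying on it. The helper names json_int and kelvin_to_celsius are hypothetical.

```shell
# Print the integer value of a key from nvme-cli JSON read on stdin.
# This is a plain-text match, not a full JSON parser; it only handles
# simple "key": <integer> pairs.
json_int() {
    sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p" | head -n 1
}

# Convert a Kelvin reading (the assumed JSON unit) to whole degrees Celsius.
kelvin_to_celsius() {
    echo $(( $1 - 273 ))
}
```

A typical use would be kelvin_to_celsius "$(json_int temperature < nvme0_smart.json)".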

5. Method B: use smartctl from smartmontools (complementary)

smartctl can produce a full report (-a) and run self-tests. For NVMe, point it at the controller device, e.g. /dev/nvme0.

B.1 Quick health check:

sudo smartctl -H /dev/nvme0

This returns a high-level health assessment.

B.2 Full report:

sudo smartctl -a /dev/nvme0

This shows many fields similar to nvme smart-log and some additional interpretation that smartctl provides.

B.3 Run a self-test (if supported)

sudo smartctl -t short /dev/nvme0
# or
sudo smartctl -t long /dev/nvme0

Check results after the test completes:

sudo smartctl -l selftest /dev/nvme0

Notes: NVMe support reached smartmontools later than SATA support. It is stable on modern distributions, but self-tests need a reasonably recent smartmontools release and a drive that implements the NVMe device self-test command, and vendor-specific differences still turn up occasionally. Use nvme-cli for NVMe-native features and smartctl for compatibility and familiar output.

6. Method C: quick GUI check with GNOME Disks

  1. Open Activities → Disks (or run gnome-disks).
  2. Select your Kingston NVMe drive from the left column.
  3. Click the three-dot menu (top right) → SMART Data & Self-Tests.
  4. Review: overall assessment, attributes, and run self-tests.

This is ideal for quick checks or when showing an end-user a result. The Disks frontend simply reads the same SMART info and offers self-tests.


7. Self-tests, logs, and ongoing monitoring

Run scheduled self-tests and monitoring with smartd (part of smartmontools)

smartd can watch devices and run tests on a schedule, emailing or logging problems. Configure /etc/smartd.conf and enable the smartd service:

sudo systemctl enable --now smartd.service

Example smartd.conf line (simple):

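/dev/nvme0 -H -l error -m you@yourdomain.com

(Here -H monitors the overall health status, -l error reports new error-log entries, and -m sets the mail recipient. ATA-only directives such as -f do not apply to NVMe. Adjust options for your environment and read man smartd.conf carefully.) smartd is powerful but needs careful configuration for NVMe, both to catch real problems and to avoid spamming logs.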

Use a light systemd timer script (alternative)

If you prefer simple periodic checks, you can create a small script that runs nvme smart-log and stores the JSON to /var/log/nvme-health/ once a day. Use a systemd timer to schedule it.

Example script (conceptual):

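#!/bin/bash
# Save the current NVMe SMART log as timestamped JSON.
# Run as root (e.g. from a systemd service); sudo inside the script would not
# cover the output redirection or the mkdir.
set -euo pipefail
outdir=/var/log/nvme-health
mkdir -p "$outdir"
timestamp=$(date -Iseconds)
nvme smart-log /dev/nvme0 --output-format=json > "$outdir/nvme0_$timestamp.json"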

Then add a systemd service and timer to run it daily. This gives you a historical record (very useful to spot trends).
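
The service and timer themselves can look like the sketch below, which writes two unit files into a directory of your choice. The unit names (nvme-health-log.service and .timer) and the script location /usr/local/bin/nvme-health-log.sh are assumptions for this example, not something Ubuntu ships; adjust them to wherever you saved the script.

```shell
# Write a oneshot service and a daily timer for the logging script.
# $1 = target directory (default /etc/systemd/system, needs root)
# $2 = path of the logging script (hypothetical default)
write_nvme_health_units() {
    unit_dir=${1:-/etc/systemd/system}
    script_path=${2:-/usr/local/bin/nvme-health-log.sh}

    cat > "$unit_dir/nvme-health-log.service" <<EOF
[Unit]
Description=Save NVMe SMART log as JSON

[Service]
Type=oneshot
ExecStart=$script_path
EOF

    cat > "$unit_dir/nvme-health-log.timer" <<EOF
[Unit]
Description=Run nvme-health-log.service once a day

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
EOF
}
```

Once the files are in /etc/systemd/system, run sudo systemctl daemon-reload followed by sudo systemctl enable --now nvme-health-log.timer. Persistent=true makes systemd catch up on a run that was missed while the machine was off.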

8. Firmware updates and Kingston utilities

Manufacturers sometimes release firmware updates that fix bugs or improve reliability. Kingston publishes firmware & tools (Kingston SSD Manager) for their drives; for consumer NVMe drives you can check Kingston’s support page for model-specific utilities and firmware updates. If a firmware update is available, follow Kingston’s instructions—read them carefully. Back up before any firmware flash. (Kingston Technology Company)

Important: Many vendor firmware updaters run only on Windows. If that’s the case, either boot a Windows PE USB stick to run the vendor tool, use the vendor’s bootable ISO updater, or check whether the vendor provides a Linux utility or supports a vendor-neutral tool (rare). Always back up first.

9. Safe maintenance tips & what to do when you see red flags

Backup first, always. If you see any non-zero critical_warning, non-zero media_errors, high percentage_used (close to 100), or repeated unsafe shutdowns, immediately back up any important data.

What to watch for (priority):

  1. critical_warning != 0 → immediate investigation and backup.
  2. media_errors > 0 or num_err_log_entries > 0 → serious concern.
  3. percentage_used climbing fast or near manufacturer rating.
  4. High/unstable temperature readings (sustained >70°C is bad for longevity).
  5. Unexpected increase in unsafe_shutdowns (may indicate power or cabling issues).
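
The priority list above can be turned into an automatic check. The sketch below reads the human-readable nvme smart-log output on stdin and prints one line per red flag. It assumes the field names shown earlier in this guide; the 90% wear limit and 70°C temperature limit are illustrative defaults, and check_nvme_smart is a made-up name. Tracking unsafe_shutdowns needs a history of readings, so it is not covered here.

```shell
# Read `nvme smart-log` text on stdin and print one line per red flag.
# Exit status is 1 if anything was flagged, 0 otherwise.
# $1 = wear limit in percent (default 90), $2 = temperature limit in C (default 70)
check_nvme_smart() {
    awk -F':' -v pct_limit="${1:-90}" -v temp_limit="${2:-70}" '
        {
            name = $1; gsub(/[ \t]/, "", name)
            raw = $2;  gsub(/^[ \t]+|[ \t]+$/, "", raw)
            text[name] = raw
            gsub(/,/, "", raw)      # drop thousands separators
            num[name] = raw + 0     # leading number, so "36 C" becomes 36
        }
        END {
            bad = 0
            cw = text["critical_warning"]   # may be printed in hex, e.g. 0x4
            if (cw != "" && cw != "0" && cw != "0x0") { print "critical_warning = " cw; bad = 1 }
            if (num["media_errors"] > 0)        { print "media_errors = " num["media_errors"]; bad = 1 }
            if (num["num_err_log_entries"] > 0) { print "num_err_log_entries = " num["num_err_log_entries"]; bad = 1 }
            if (num["percentage_used"] >= pct_limit) { print "percentage_used = " num["percentage_used"]; bad = 1 }
            if (num["temperature"] >= temp_limit)    { print "temperature = " num["temperature"] " C"; bad = 1 }
            if (("available_spare" in text) && num["available_spare"] <= num["available_spare_threshold"]) {
                print "available_spare at or below threshold"; bad = 1 }
            exit bad
        }'
}
```

A typical call is sudo nvme smart-log /dev/nvme0 | check_nvme_smart, optionally followed by your own limits, e.g. check_nvme_smart 80 65. The non-zero exit status makes it easy to hook into a cron job or systemd unit that sends a notification.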

If you find a problem:

  • Stop heavy writes and back up data.
  • Try to reproduce (a single transient error may be benign; repeated errors are not).
  • Update firmware if the manufacturer recommends a fix.
  • Replace the drive if errors persist; SSD replacement is inexpensive compared to lost data.

10. Example troubleshooting scenarios & fixes

Scenario A — nvme smart-log shows percentage_used: 0 but media_errors > 0:
A few media errors can be caused by transient issues (power glitch, cable or connector, controller hiccup). Back up immediately, run nvme error-log /dev/nvme0, and run nvme smart-log again after a reboot. Because the counter is cumulative, what matters is whether it keeps rising; if it does, consider an RMA.

Scenario B — laptop shows weird I/O errors after resume from sleep:
Check dmesg for NVMe/driver errors. Try a firmware update, check your kernel version (NVMe fixes sometimes land in newer kernels), and consider replacing the M.2 drive if the vendor advises it.

Scenario C — Disks shows disabled SMART for NVMe:
Make sure smartmontools and nvme-cli are installed. Older releases of Disks (and the udisks service behind it) could not read SMART data from NVMe drives, so the option may be greyed out; use the command line to be sure.
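
For Scenario B, the kernel log is usually noisy, so a filter helps. This sketch keeps only lines that mention nvme together with a keyword that often signals trouble. The keyword list is illustrative rather than exhaustive, and nvme_kernel_errors is a name invented for this example.

```shell
# Keep only kernel log lines that mention an NVMe device together with a
# keyword that usually signals trouble. Reads the log on stdin.
nvme_kernel_errors() {
    grep -iE 'nvme.*(timeout|reset|abort|i/o error|failed)'
}
```

Use it as sudo dmesg | nvme_kernel_errors, or feed it the previous boot’s kernel log with journalctl -k -b -1 | nvme_kernel_errors to see what happened before the last restart.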

11. Practical commands cheat-sheet (copyable)

Identify device:

sudo nvme list
lsblk -o NAME,MODEL,SIZE

NVMe SMART (controller):

sudo nvme smart-log /dev/nvme0
sudo nvme id-ctrl /dev/nvme0

Save JSON:

sudo nvme smart-log /dev/nvme0 --output-format=json > nvme0_smart.json

smartctl:

sudo smartctl -H /dev/nvme0
sudo smartctl -a /dev/nvme0
sudo smartctl -t long /dev/nvme0
sudo smartctl -l selftest /dev/nvme0

View kernel messages (if I/O issues):

sudo dmesg | grep -i nvme

Check temperature quickly:

sudo nvme smart-log /dev/nvme0 | grep -i temperature

12. Final checklist before you sleep easy

  • Back up important data (always first).
  • Run sudo nvme smart-log /dev/nvme0 and save the JSON.
  • Check critical_warning, percentage_used, media_errors, temperature.
  • Install and configure smartd or a small systemd timer script to collect logs daily.
  • Check Kingston support for firmware updates and tools — only apply vendor firmware per instructions. (Kingston Technology Company)
  • If anything is alarming, reduce writes and prepare to replace the drive.
