
Perfect — that’s the ideal case for this 👍 On current Proxmox (8.x) with an NVMe controller directly attached to the CPU, you’re in the “this actually works really well” zone.

Short answer again: yes, you can pass it through and boot the VM from it, and it’s a supported and common setup.

Here’s the practical reality.


What works (and what doesn’t) in Proxmox

PCIe passthrough of the NVMe controller

This is the method you want.

  • Proxmox uses VFIO
  • The VM sees the NVMe as if it were physically installed
  • You can:

    • Install the OS directly onto it
    • Boot from it
    • Get near-bare-metal performance

Because it’s CPU-direct (no PCIe switch), you avoid:

  • ACS grouping nightmares
  • Reset bugs common with some chipset-attached devices

Requirements checklist (important)

1️⃣ IOMMU enabled

In BIOS:

  • Intel: VT-d
  • AMD: SVM + IOMMU

Kernel params (usually already set on Proxmox 8):

intel_iommu=on
# or
amd_iommu=on
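
If the flag isn’t already on the command line, this is where it goes. A minimal sketch for the two boot loaders Proxmox uses (iommu=pt is optional and only changes how host-owned devices are mapped):

# GRUB installs: edit GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, e.g.
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
update-grub

# systemd-boot installs (e.g. ZFS root): append the same flags to /etc/kernel/cmdline, then
proxmox-boot-tool refresh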

Check:

dmesg | grep -e DMAR -e IOMMU -e AMD-Vi

2️⃣ NVMe controller in its own IOMMU group

Verify:

find /sys/kernel/iommu_groups/ -type l
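
The raw output is just a pile of symlinks; a small helper loop (nothing Proxmox-specific, only lspci plus the sysfs paths above) prints each group with readable device names:

for g in /sys/kernel/iommu_groups/*; do
  echo "IOMMU group ${g##*/}:"
  for d in "$g"/devices/*; do
    # -nns prints the device description plus its [vendor:device] ID
    lspci -nns "${d##*/}"
  done
done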

You should see the NVMe controller alone in its group, or only grouped with harmless functions (sharing is rare for CPU-direct slots).

If it’s clean → green light.


3️⃣ Bind NVMe to vfio-pci

Get device ID:

lspci -nn | grep -i nvme

Bind it:

echo "options vfio-pci ids=XXXX:YYYY" > /etc/modprobe.d/vfio.conf

Update initramfs and reboot:

update-initramfs -u -k all
reboot

After reboot:

lspci -nnk

The NVMe controller’s entry should show Kernel driver in use: vfio-pci.

⚠️ This removes the NVMe from the host entirely. Proxmox cannot use it for storage anymore.


VM configuration (the key part)

VM firmware

Use:

  • OVMF (UEFI) ✔️
  • Machine type: q35

Legacy SeaBIOS can work, but UEFI is more reliable for NVMe booting.

Add PCI device

  • Hardware → Add → PCI Device
  • Select the NVMe controller
  • Check:

    • ✔️ All functions
    • ✔️ PCI-Express
    • (ROM-Bar usually not needed)
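
The same settings can be applied from the CLI; the sketch below assumes VM ID 100, a storage called local-lvm, and PCI address 0000:01:00, all of which are placeholders for your system:

qm set 100 --bios ovmf --machine q35          # UEFI firmware on the q35 machine type
qm set 100 --efidisk0 local-lvm:1,efitype=4m  # small EFI vars disk for OVMF
# leaving off the .0 function suffix passes all functions, like the GUI checkbox
qm set 100 --hostpci0 0000:01:00,pcie=1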

OS installation & boot

Inside the VM:

  • Installer sees the NVMe normally (/dev/nvme0n1)
  • Install bootloader to it
  • VM boots directly from the passed-through device

No tricks required.
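
If you want to double-check from inside the guest that it really is the physical controller, this is enough (lsblk ships with util-linux; nvme list needs the nvme-cli package):

lsblk -d -o NAME,MODEL,SIZE /dev/nvme0n1
nvme list    # reports the real model, serial and firmware straight from the drive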


Performance & stability

You’ll get:

  • Native NVMe latency
  • Full SMART access
  • No virtio overhead
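
The SMART point is easy to verify from inside the guest (assuming smartmontools and nvme-cli are installed there):

smartctl -a /dev/nvme0        # full health report, same as on bare metal
nvme smart-log /dev/nvme0     # NVMe-native health/wear counters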

Stability is excellent as long as:

  • The VM is always shut down cleanly
  • You don’t try to live-migrate it (this setup is not migratable)

Important limitations (don’t skip this)

❌ No live migration

PCI passthrough = pinned hardware.

❌ No snapshots

Proxmox can’t snapshot a raw PCI device.

❌ Host crash = VM crash

Same as bare metal.


Optional but smart tweaks

  • CPU type: host
  • NUMA: enable if multi-socket
  • Disable ballooning
  • Pin VM to the same NUMA node as the NVMe (advanced but nice)
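
Most of these are one-liners with qm; VM ID 100 and the core range below are placeholders, and --affinity needs a reasonably recent Proxmox:

qm set 100 --cpu host       # expose the host CPU model to the guest
qm set 100 --balloon 0      # disable memory ballooning
qm set 100 --numa 1         # only useful on multi-socket hosts
# optional: pin vCPUs to host cores on the NVMe's NUMA node
# qm set 100 --affinity 0-7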

TL;DR

✔️ Proxmox 8 + CPU-direct NVMe = yes, absolutely
✔️ Booting from it works
✔️ Near bare-metal performance
❌ No migration/snapshots

If you want, tell me:

  • Intel or AMD?
  • What OS in the VM?
  • Single VM owning the disk or dual-boot-style usage?

I can give you a bulletproof step-by-step config for your exact setup.