Proxmox Virtual Environment (PVE) offers powerful virtualization capabilities, but tinkering with PCIe devices can lead to unexpected challenges. If, like me, you didn’t realize that messing around with PCIe devices in your Proxmox VE host would break your VMs, keep reading.
In my case, I added a new NVMe SSD, but this can also happen when removing PCIe devices.
Access the Proxmox VE host directly with a keyboard and monitor (its network connection may be down, as described below) and log in as a privileged user.
Gathering info: PCIe bus ids
It’s EXTREMELY likely that the device addresses change upon adding or removing any PCIe devices:
[For each restart] the PCI BIOS must walk the base PCI bus (starting at bus 0), subsequent bridges, and bridged devices to search and identify other PCI buses as if it were the first time.
Each time the PCI BIOS discovers another PCI bus after a physical configuration change is made, it increments the bus number and continues to walk the bus until all other buses are discovered.
As it discovers each bus and/or bridge, the PCI BIOS:
- Records each unique bus number
- Associates the bus number to a bus or bridged PCI device
― Compaq, PCI Bus Numbering in a Microsoft Windows NT Environment
To better understand how the PCIe bus ids have changed, let’s run lspci to get a list of the installed PCIe devices and their respective bus ids.
The NVMe drive I just installed is at 01:00.0. That bus id used to belong to the first Intel ethernet controller. So in this case, the bus id of every NIC has increased by one.
- 01:00.0 is now 02:00.0
- 02:00.0 is now 03:00.0
- …and so on
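The renumbering above is plain hexadecimal arithmetic on the bus part of the address. Here is a minimal sketch, assuming the short bus:device.function form that lspci prints. The helper name shift_bus is my own invention, and the offset of one matches my case rather than being a general rule.

```shell
#!/bin/sh
# shift_bus is a hypothetical helper, not part of Proxmox or pciutils.
# It takes a PCI address in bus:device.function form (no domain prefix)
# and an optional offset, and prints the address with the bus shifted.
shift_bus() {
    addr=$1
    offset=${2:-1}
    bus=${addr%%:*}      # hex bus number, e.g. "01"
    rest=${addr#*:}      # device.function, e.g. "00.0"
    # Bus numbers are hexadecimal, so 09 + 1 is 0a, not 10.
    printf '%02x:%s\n' "$((0x$bus + $offset))" "$rest"
}

shift_bus 01:00.0    # prints 02:00.0
shift_bus 09:00.0    # prints 0a:00.0
```

The second call is the case worth remembering: once you have more than nine buses, adding one in your head in decimal gives the wrong address.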
Potential issues
Issue 1: Network connectivity is gone
If your NICs are PCIe (which is very likely!), they would be affected by this:
The Linux kernel assigns names to network interfaces by combining a fixed prefix and a number that increases as the kernel initializes the network devices. […] If you add another network interface card to the system, the assignment of the kernel device names is no longer fixed. Consequently, after a reboot, the kernel can name the device differently.
When the consistent network device name feature is enabled, the udev device manager creates the names of devices based on different criteria.
- en for Ethernet
- [P<domain_number>]p<bus>s<slot>[f<function>][d<device_id>]
― Red Hat, RHEL 8 documentation. Chapter 1. Consistent network interface device naming
Run ip link (with the ethernet cable connected) in order to get the interface name for the Proxmox Linux Bridge.
Leaving the loopback (lo) and Linux Bridge (vmbr0) interfaces aside, the new name of the physical interface used by the Linux Bridge is enp5s0.
Comparing against the naming criteria, we can understand that enp5s0:
- Is an ethernet connection.
- Its bus is 5.
- Its slot within the bus is 0.
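The breakdown above can be sketched as a small parser. This is an illustration under assumptions: parse_ifname is a made-up helper, and it only understands names of the form enp<bus>s<slot> with an optional suffix. As far as I know, udev writes these numbers in decimal while lspci prints them in hexadecimal, so the helper also prints the matching lspci prefix.

```shell
#!/bin/sh
# parse_ifname is a hypothetical helper for names such as enp5s0.
# Other schemes (eno1, ens3, a P<domain_number> prefix, ...) are rejected.
parse_ifname() {
    name=$1
    case $name in
        enp[0-9]*s[0-9]*) ;;
        *) echo "unsupported interface name: $name" >&2; return 1 ;;
    esac
    rest=${name#enp}          # "5s0" for enp5s0
    bus=${rest%%s*}           # text before the first "s"
    slot=${rest#*s}           # text after the first "s"
    slot=${slot%%[!0-9]*}     # drop a trailing f<function> or d<device_id>
    case $bus in
        *[!0-9]*) echo "unsupported interface name: $name" >&2; return 1 ;;
    esac
    # Assumption: decimal in the interface name, hexadecimal in lspci.
    printf 'bus=%d slot=%d lspci=%02x:%02x\n' "$bus" "$slot" "$bus" "$slot"
}

parse_ifname enp5s0    # prints bus=5 slot=0 lspci=05:00
```

The lspci prefix makes it easy to cross-check the interface name against the device list gathered earlier.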
You can now edit /etc/network/interfaces to reflect the change, and restart the networking service.
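If you would rather script the edit than do it by hand, something along these lines should work. It is a sketch under assumptions: rename_iface is a hypothetical helper, the old name enp4s0 is my guess (one bus lower than the new name), and the sample file is made up for the demo.

```shell
#!/bin/sh
# rename_iface is a hypothetical helper. It replaces an interface name in an
# interfaces(5)-style file, matching whole whitespace-delimited words only,
# and keeps the original next to it as "<file>.bak".
rename_iface() {
    old=$1
    new=$2
    file=$3
    sed -i.bak -E "s/(^|[[:space:]])$old([[:space:]]|\$)/\\1$new\\2/g" "$file"
}

# Demo on a throwaway file; the stanzas below are made up for illustration.
demo=$(mktemp)
cat > "$demo" <<'EOF'
auto lo
iface lo inet loopback

iface enp4s0 inet manual

auto vmbr0
iface vmbr0 inet static
        bridge-ports enp4s0
EOF
rename_iface enp4s0 enp5s0 "$demo"
cat "$demo"

# On the host this would become (old name is an assumption, check yours):
#   rename_iface enp4s0 enp5s0 /etc/network/interfaces
#   systemctl restart networking
```

Keeping the .bak copy matters here, because a typo in this file can leave the host unreachable over the network again.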
At this point, network connectivity should be restored. Check by pinging Google: ping -c3 google.com.
Issue 2: Broken Passthrough PCI devices
Now, one of Proxmox VE’s features is PCI passthrough, which allows a guest VM to access the host’s physical resources. As you have probably guessed by now, if any of your VMs use PCI passthrough, they will also be affected by the PCIe bus changes.
The steps below must be repeated for every VM that uses PCI passthrough.
Retrieve the VM id of the relevant machine, for example with qm list.
Then export the VM id as an environment variable, to avoid having to remember it every time: export VM_ID=100.
Ensure the VM is (gracefully) stopped, by running qm shutdown $VM_ID. Once stopped, check the VM config, with qm config $VM_ID.
hostpci0, hostpci1 and hostpci2 are all NICs passed through to the VM. As mentioned earlier, the bus ids for the NICs have increased by one, so in my case I just need to update those entries in the VM config file located at /etc/pve/qemu-server/${VM_ID}.conf.
Remember to shut down the machine first if you haven’t yet: qm shutdown $VM_ID
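With several passthrough devices, the config change can be scripted. This is a rough sketch under assumptions: bump_hostpci is a made-up helper, the sample lines are invented, and it only understands entries whose value starts with a plain address such as 0000:01:00.0 or 01:00.0. Anything else (for example host= or mapping= values) is passed through untouched.

```shell
#!/bin/sh
# bump_hostpci is a hypothetical helper. It reads a VM config on stdin and
# prints it with the bus number of each hostpciN address shifted by an
# offset (default 1). All other lines are printed unchanged.
bump_hostpci() {
    offset=${1:-1}
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            hostpci[0-9]*:\ *:*)
                key=${line%%:*}            # e.g. "hostpci0"
                value=${line#*: }          # e.g. "0000:01:00.0,pcie=1"
                addr=${value%%,*}          # address without options
                opts=${value#"$addr"}      # ",pcie=1" or empty
                devfn=${addr##*:}          # "00.0"
                prefix=${addr%:*}          # "0000:01" or "01"
                bus=${prefix##*:}          # "01"
                domain=${prefix%"$bus"}    # "0000:" or empty
                case $bus in
                    ''|*[!0-9a-fA-F]*) ;;  # not a plain address, leave as is
                    *)
                        newbus=$(printf '%02x' "$((0x$bus + $offset))")
                        line="$key: $domain$newbus:$devfn$opts"
                        ;;
                esac
                ;;
        esac
        printf '%s\n' "$line"
    done
}

# Demo with made-up config lines:
printf '%s\n' 'cores: 2' 'hostpci0: 0000:01:00.0,pcie=1' 'hostpci1: 02:00.0' | bump_hostpci
```

On the host I would write the result to a scratch file first, for example bump_hostpci < /etc/pve/qemu-server/${VM_ID}.conf > /tmp/${VM_ID}.conf.new, and compare the two with diff before copying anything back.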
Now you should be able to start the VM, either from the Proxmox Web UI (remember, we just restored connectivity to it!) or by running the following command: qm start $VM_ID.
