Anyone who follows this website will notice that the tutorials are rather long. In these long tutorials I usually give reference specs, explain basic terms or processes, and expand on the how and why.
I wish I could write short, easy, step-by-step tutorials titled “GPU passthrough made easy” or “Quick guide to VFIO bliss”. In fact, there are plenty of those out there on the Internet. Some of the most popular ones are on YouTube, showing you how to get your Windows gaming VM up and running in no time.
Often enough those short tutorials work fine. And I’m glad to see that many hardware vendors have paid attention and actively support and promote virtualization technology.
What if it’s not working?
There is nothing more frustrating than following a tutorial step by step and failing. Creating a virtual machine with GPU passthrough, for example, is a complicated procedure with dozens of potential failure points.
Take this example: For some time I promoted the “driver override method” as a robust way to bind a GPU to the vfio-pci driver, until some users (and I myself) discovered that it no longer works on Ubuntu 20.04-based systems. I edited my tutorial and suggested another, grub-based solution.
But I also described why my “new” method cannot always replace my original one: if you have two identical GPUs in your PC, the grub method won’t work.
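To make the identical-GPU problem concrete, here is a minimal Python sketch. It assumes the grub-based method selects devices by their PCI vendor:device ID (as the `vfio-pci.ids` kernel parameter does), so two identical cards cannot be told apart. The function name `find_shared_ids` and the sample `lspci -nn` output are made up for illustration; paste your own output instead.

```python
import re
from collections import defaultdict

# Captures the PCI address at the start of an `lspci -nn` line and the
# last [vendor:device] ID on that line, e.g. "01:00.0 ... [10de:1b81]".
LINE_RE = re.compile(r"^(\S+)\s.*\[([0-9a-f]{4}:[0-9a-f]{4})\]", re.IGNORECASE)


def find_shared_ids(lspci_output):
    """Return {vendor:device ID: [PCI addresses]} for IDs used by 2+ devices."""
    by_id = defaultdict(list)
    for line in lspci_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            address, pci_id = match.groups()
            by_id[pci_id.lower()].append(address)
    return {pci_id: addrs for pci_id, addrs in by_id.items() if len(addrs) > 1}


# Illustrative sample only (not taken from a real machine).
SAMPLE = """\
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1070] [10de:1b81] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1070] [10de:1b81] (rev a1)
02:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
"""

if __name__ == "__main__":
    for pci_id, addresses in find_shared_ids(SAMPLE).items():
        print(f"{pci_id} is shared by: {', '.join(addresses)}")
```

If an ID you intend to pass through shows up at more than one address, an ID-based selection would most likely grab every matching device, including the one you want to keep for the host.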
It’s in the details
With the “driver override method” above, for example, one Linux distro may work fine whereas another will fail. Even so, I strongly recommend that you stick with your preferred distro. Usually there is a way to make it work.
Here is a list of details to pay attention to:
- Motherboard vendor and model
- Motherboard BIOS release – can influence IOMMU grouping and other things
- CPU model – different CPUs have different features, some are simply unsuitable
- GPU vendor and model
- Kernel release – remember, newer is NOT always better
- Kernel configuration – are the VFIO drivers loaded as modules (older kernels) or built in?
- Linux distribution and version
- QEMU release
- Network/Internet access via wire (Ethernet) or WiFi (wireless)
- Linux security settings such as AppArmor
- USB ports and their association with host/VM
- Screen, USB hub, keyboard, mouse, etc. and how and where they are connected to the PC
I could add more to the list above, but I think you get my point.
When running into problems, try to gather as many details as possible (see list above). See if you find some clues in the log files / journal.
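Collecting some of these details can be scripted. The following Python sketch is only a starting point: it assumes that `/etc/os-release` exists, that the QEMU binary is called `qemu-system-x86_64`, and that `lspci` is installed. None of that is guaranteed on your system, and the helper names (`run_or_note`, `read_distro`, `build_report`, `format_report`) are my own.

```python
import platform
import shutil
import subprocess


def run_or_note(command):
    """Return the command's stdout, or a short note if it cannot be run."""
    if shutil.which(command[0]) is None:
        return f"({command[0]} not found)"
    try:
        result = subprocess.run(command, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.SubprocessError) as exc:
        return f"({command[0]} failed: {exc})"
    return result.stdout.strip() or f"({command[0]} produced no output)"


def read_distro(path="/etc/os-release"):
    """Read the human-readable distro name, if the file is available."""
    try:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                if line.startswith("PRETTY_NAME="):
                    return line.split("=", 1)[1].strip().strip('"')
    except OSError:
        pass
    return "(unknown distribution)"


def build_report():
    """Gather a few of the details from the list above into a dict."""
    return {
        "Distribution": read_distro(),
        "Kernel release": platform.release(),
        # Assumed binary name; adjust if your distro packages QEMU differently.
        "QEMU release": run_or_note(["qemu-system-x86_64", "--version"]),
        "PCI devices": run_or_note(["lspci", "-nn"]),
    }


def format_report(report):
    """Turn the dict into plain text that can be pasted into a forum post."""
    return "\n".join(f"== {key} ==\n{value}\n" for key, value in report.items())


if __name__ == "__main__":
    print(format_report(build_report()))
```

The script only reads information and changes nothing. Motherboard, BIOS release, and how your peripherals are connected still have to be written down by hand.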
The worst thing you can do is panic and start changing settings at random without knowing what you are changing or why. If you do change a setting, document it so you can revert it.
My tutorials usually offer suggestions for common issues. You will also find links to other tutorials and forums that can help.
When turning to forums or user groups, provide at least the XML configuration, the Linux distro, kernel release, QEMU release, and some details on your hardware, together with a link to the tutorial you are following.
This allows others to go over your configuration and perhaps find the problem.
Remember: details matter.