NVMe I/O timeouts

I had the same issue as long as I powered my VF2 from a common 3A USB-C power supply, and also with the 4A one for the Pi 4. Since I switched to a 65W max multi-port USB-C power supply, I’m rid of those messages.
I assume that there are short spikes in power consumption exceeding the 3A/4A limit of the simpler power supplies, and that these disturb the PCIe protocol on the bus, causing retries.

Hello, everyone! Is there already a solution to this problem? I have an OrangePi RV (JH7110) with a WD SN740 NVMe. I’ve tried different power supplies, but nothing helps. I’m currently using vanilla kernel 6.19-rc1 (because the DTS for my board appeared there).

You could try with a different SSD. Have a look at the Compatibility List.
I am running a Toshiba KXG50ZNV256G on the VF2 and a “Samsung Electronics Co Ltd NVMe SSD Controller 980 (DRAM-less)” on the VF lite (I can’t tell too much about the compatibility of the latter, as it is still running from the eMMC; I haven’t installed anything to the NVMe yet).
A Union Memory AM620 256GB was also detected on the VF lite.
If you don’t want to hunt for one of the devices on the list, get something at the lower end of the performance spectrum; those tend to use less power. Smaller-capacity M.2 drives with 80mm length should be easily available both new and used, even with the current AI-induced shortage.

Thanks a lot for your answer. By the way, people here are experimenting with various kernel startup parameters. For me, the situation is slightly better when the parameters “pcie_port_pm=off pcie_aspm.policy=performance” are used, but timeout messages are still present.
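
In case it helps anyone comparing setups, here is a minimal Python sketch for checking whether a kernel command line really contains these two parameters. The helper names (parse_cmdline, check_params) are made up for illustration, and the parsing is deliberately simple: it does not handle quoted values with spaces.

```python
# The two parameters discussed in this thread and the values tried.
WANTED = {"pcie_port_pm": "off", "pcie_aspm.policy": "performance"}


def parse_cmdline(text):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' map to None. Quoted values containing spaces are
    not handled; this is only a sketch.
    """
    params = {}
    for token in text.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def check_params(text, wanted=WANTED):
    """Return {name: bool} telling whether each wanted value is present."""
    params = parse_cmdline(text)
    return {name: params.get(name) == value for name, value in wanted.items()}
```

On a running Linux system the text to check can be read from /proc/cmdline, for example check_params(open("/proc/cmdline").read()).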

You could try with a different SSD.

I found and ordered one of the NVMe drives from the list of supported devices. When it arrives, I will post the results here.

“pcie_port_pm=off pcie_aspm.policy=performance”

Without these options, the system doesn’t boot at all and endless timeout errors appear.

What are you using as a power supply?

I did a quick search on the Western Digital SN740 NVMe SSD and it has a maximum power draw listed that varies by capacity, generally ranging from about 4.6 watts to 6.0 watts under maximum load (active use).

The 8 GB VisionFive 2 under maximum load (CPU/GPU cores all stressed) will probably consume about 5 W to 6 W in isolation, with no peripheral hardware attached.

The maximum input power to the VF2 is specified as 30 watts via the USB-C cable.

Even if a cheaper power supply is actually rated to supply enough power (VF2 + NVMe + USB peripherals), it might not be able to ramp up from low power to maximum power (CPU + SSD) fast enough to match the change in demand. Usually this delay is worked around by having large enough decoupling capacitors in modules or other peripherals, but to save money manufacturers can skimp here by using physically smaller, lower-capacitance capacitors that cannot provide enough current for long enough to bridge the delay while the switched-mode power supply ramps up the current. (The assumption being made is that the SMPS will be high-end and handle the problem with more expensive, higher-capacitance, lower-ESR capacitors.) That could be why the timeouts are happening. The easy solution is to use an older NVMe drive that does not have the same extremely rapid changes in power demand and/or to use a higher-quality power supply.
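
To put rough numbers on this, here is a small Python sketch that adds up the worst-case figures quoted above and compares them with what a 3A or 4A supply can deliver. It assumes the supply provides plain 5 V without any USB PD negotiation (my assumption, not something measured), and it leaves out USB peripherals. The function names are only for illustration.

```python
# Figures quoted in this thread, in watts, as (low, high) ranges.
SSD_MAX_W = (4.6, 6.0)      # WD SN740 maximum draw, depending on capacity
BOARD_MAX_W = (5.0, 6.0)    # 8 GB VF2 fully stressed, no peripherals
BOARD_INPUT_LIMIT_W = 30.0  # specified maximum input via USB-C

# ASSUMPTION: plain 5 V output, no USB PD negotiation to a higher voltage.
SUPPLY_VOLTAGE_V = 5.0


def supply_watts(amps, volts=SUPPLY_VOLTAGE_V):
    """Nominal power a supply can deliver at the given current rating."""
    return amps * volts


def steady_state_range(*loads):
    """Sum several (low, high) load ranges into one (low, high) range."""
    return (sum(load[0] for load in loads), sum(load[1] for load in loads))


def headroom(amps, loads):
    """Nominal supply power minus the worst-case steady-state load."""
    return supply_watts(amps) - steady_state_range(*loads)[1]


for amps in (3, 4):
    spare = headroom(amps, [SSD_MAX_W, BOARD_MAX_W])
    print(f"{amps} A supply: {supply_watts(amps):.1f} W nominal, "
          f"{spare:.1f} W spare at worst-case steady load")
```

Under these assumptions the steady-state worst case (about 12 W) fits within even a 3A supply (15 W), which would be consistent with the problem being short transients rather than the rated total.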

What are you using as a power supply?

I tried a Zenwire YMS-608 100W and something called “Raspberry Pi 5 5.1V 5A Power Supply PD 27W USB Type C” from AliExpress. No difference at all, same behavior. I also found reports that some ARM-based SBCs behave similarly; the recommendation there is also to install an older NVMe drive, which consumes less power and, apparently, solves the problem. I also have an OrangePi RV2 (Spacemit K1), and there are no problems at all with either the power supplies or the NVMe.

Some NVMe SSDs actually can’t be used on the VF2; the XPG SX8200 Pro is one of them. It works fine on my mini PC (N6000), by the way.

Today I received a Kingston NV1 NVMe M.2 2280 SSD 250GB and copied the root partition to it. After booting, I ran find /gnu -type f | xargs md5sum (I use Guix System) and immediately got a timeout message. After I removed pcie_port_pm=off pcie_aspm.policy=performance from the kernel boot options and rebooted, the timeout errors stopped occurring. To be more precise, they appear once or twice during kernel boot and do not appear again, even when actively working with the disk. This holds for every power adapter I own.
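
For anyone who wants to compare runs (with and without the boot options, or with different power adapters), a rough Python sketch for counting NVMe timeout lines in kernel log text could look like this. The pattern is an assumption: it simply looks for "nvme" followed by "timeout" on the same line. The sample lines are made up for illustration and are not output from my board.

```python
import re

# ASSUMPTION: timeout messages mention "nvme" and, later on the same line,
# the word "timeout". Adjust the pattern to match your actual log lines.
NVME_TIMEOUT = re.compile(r"\bnvme\w*\b.*\btimeout\b", re.IGNORECASE)


def count_nvme_timeouts(log_text):
    """Count log lines that look like NVMe timeout messages."""
    return sum(1 for line in log_text.splitlines() if NVME_TIMEOUT.search(line))


# Hypothetical sample lines, only to show how the function behaves.
SAMPLE = """\
[    3.101000] nvme nvme0: pci function 0001:01:00.0
[   34.520000] nvme nvme0: I/O 12 QID 3 timeout, completion polled
[   35.000000] usb 1-1: new high-speed USB device number 2
[   65.240000] nvme nvme0: I/O 13 QID 3 timeout, completion polled
"""

print(count_nvme_timeouts(SAMPLE))
```

Feeding it the text of dmesg after a boot, and again after a heavy read like the md5sum run above, gives a simple number to compare between configurations.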
