Zram vs Optane SSD swap for large compile

WillR · August 3, 2023, 1:31am

After watching Rust fail to link repeatedly last week, I resorted to creating a massive (110GB) swap partition on an Optane NVMe drive to force the job past the OOM errors. On the recommendation of @jiangtao9999, I tried it again with very aggressive zram settings: I gave zram 7GB of RAM and enabled zstd as the compression algorithm. From watching the memory usage of the linking phase with the Optane swap, I knew zstd only needed to achieve 2:1 compression to get through the compile.
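For readers who want to try something similar, here is a minimal sketch of a zram swap setup with zstd. The post does not give the exact commands, so the device name `zram0`, the 7G `disksize`, and the swap priority are assumptions. In particular, `disksize` is the uncompressed capacity of the device, and it is not stated whether the 7GB mentioned above refers to that or to a memory limit. The script only prints the steps unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch: set up one zram device as swap.
# Hypothetical defaults: device zram0, 7G disksize, swap priority 100.
# By default this only prints the steps; run with DRY_RUN=0 as root to apply.

ZRAM_DEV="${ZRAM_DEV:-zram0}"
ZRAM_ALGO="${ZRAM_ALGO:-zstd}"
ZRAM_DISKSIZE="${ZRAM_DISKSIZE:-7G}"
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Write a value to a sysfs attribute, or print the equivalent echo.
write_attr() {
    local value="$1" path="$2"
    if [ "$DRY_RUN" = "1" ]; then
        printf 'echo %s > %s\n' "$value" "$path"
    else
        printf '%s\n' "$value" > "$path"
    fi
}

setup_zram() {
    run modprobe zram
    # The compression algorithm has to be chosen before the size is set.
    write_attr "$ZRAM_ALGO" "/sys/block/$ZRAM_DEV/comp_algorithm"
    write_attr "$ZRAM_DISKSIZE" "/sys/block/$ZRAM_DEV/disksize"
    run mkswap "/dev/$ZRAM_DEV"
    # A higher priority than any disk swap makes the kernel fill zram first.
    run swapon -p 100 "/dev/$ZRAM_DEV"
}

setup_zram
```

Running it as-is prints the five steps so they can be reviewed before anything is changed on the system.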

My board is a rev 1.3B 8GB model booting in SDIO.

The boot drive is an old micro SD card I pulled out of an underwater camera, a Lexar 32GB (it says 300x and I think Class 10). It’s running the cwt14 Archlinux image.

Its main job is to host a Gentoo chroot environment I’ve been messing with on a Samsung Pro Endurance 256GB microSD attached to the VF2 via a USB3 card reader. This is the drive used for the compile and for any other data files.

The last drive of note is an Intel P1600X 118GB Optane NVME SSD.

I wrote a small bash script to log the CPU, RAM, and Swap usage every 1 minute over the 9-10 hour compile which was handled by emerge rust. I unfortunately forgot to log the CPU temperature, but I don’t think excessive temps influenced anything. My board currently resides on a 200mm fan that is the top exhaust of my desktop PC, so it gets pretty good airflow. CPU usage is loadavg * 1000 to make it visible on the graph.
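The logging script itself was not posted. A rough sketch of what such a logger could look like follows; the CSV layout, the function names, and the use of `MemAvailable` to estimate used RAM are all assumptions, not the original script.

```shell
#!/usr/bin/env bash
# Sketch of a once-a-minute usage logger (hypothetical; not the original script).

# Print one CSV row: epoch,load1_x1000,mem_used_kB,swap_used_kB
# The two file arguments default to the live /proc files.
sample() {
    local loadavg_file="${1:-/proc/loadavg}" meminfo_file="${2:-/proc/meminfo}"
    local load1
    # First field of /proc/loadavg is the 1-minute load average.
    read -r load1 _ < "$loadavg_file"
    awk -v now="$(date +%s)" -v load1="$load1" '
        /^MemTotal:/     { mem_total = $2 }
        /^MemAvailable:/ { mem_avail = $2 }
        /^SwapTotal:/    { swap_total = $2 }
        /^SwapFree:/     { swap_free = $2 }
        END {
            # Load is scaled by 1000 so it is visible next to kB values.
            printf "%d,%.0f,%d,%d\n", now, load1 * 1000,
                   mem_total - mem_avail, swap_total - swap_free
        }' "$meminfo_file"
}

# Append one row per minute to the given file until interrupted.
log_forever() {
    local out_file="${1:-usage.csv}"
    while true; do
        sample >> "$out_file"
        sleep 60
    done
}
```

Starting `log_forever usage.csv &` before `emerge rust` would produce a file that can be loaded straight into a spreadsheet.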

This graph is a little busy, but I think it displays how well the VF2 performs with zram. The zram configuration completed the source unpacking, configure, compile, link, merge, and install in 543 minutes, while having to resort to an SSD for swap extended the same workload to 618 minutes, which is 75 minutes, or about 14%, longer.
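The comparison between the two runs is simple arithmetic on the totals quoted above, shown here in shell for anyone who wants to plug in their own timings:

```shell
#!/usr/bin/env bash
# Totals from the post, in minutes.
zram_min=543
ssd_min=618

# Print the absolute and relative slowdown of the second run versus the first.
slowdown() {
    local base="$1" other="$2"
    local diff=$(( other - base ))
    # Integer percentage, rounded to nearest.
    local pct=$(( (diff * 100 + base / 2) / base ))
    echo "${diff} min longer (+${pct}%)"
}

slowdown "$zram_min" "$ssd_min"   # prints "75 min longer (+14%)"
```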

The differences between the 2 configurations are minor until the linking stage needs over 6GB of swap; from there the SSD struggles for the next ~2GB of swap, or roughly an hour. I don’t know why.

jiangtao9999 · August 3, 2023, 10:35am

The VF2’s M.2 port is PCIe 2.0 x1, so bandwidth is only 500MB/s.
Also, as I remember, USB and PCIe share one AXI bus to connect to the main bus.
I think you have a “bus limited” issue…
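The 500MB/s figure follows from the PCIe 2.0 signalling rate: 5 GT/s per lane with 8b/10b encoding, so 8 of every 10 bits on the wire are payload. A small sketch of that calculation (the function name is made up for illustration, and it only applies to PCIe 1.x/2.0, which use 8b/10b):

```shell
#!/usr/bin/env bash
# Theoretical one-direction bandwidth in MB/s for an 8b/10b PCIe link.
# Arguments: transfer rate in GT/s (integer), number of lanes.
pcie_8b10b_mb_per_s() {
    local gt_per_s="$1" lanes="$2"
    # GT/s -> Mbit/s on the wire, keep 8 of 10 bits, then bits -> bytes.
    echo $(( gt_per_s * 1000 * 8 / 10 / 8 * lanes ))
}

pcie_8b10b_mb_per_s 5 1   # PCIe 2.0 x1, as on the VF2: prints 500
```

This is a ceiling before protocol overhead, so real transfers will be somewhat lower.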

WillR · August 3, 2023, 11:51pm

My guess is that the work changes from filling the swap to mostly reading back from it at the point where those 2 lines (green and orange for the different swaps) diverge, about 7.5-8 hours into the job. Zram is able to keep 2-3 processes fed for those 10-15 minutes before the RAM usage falls off a cliff, much faster than the SSD, which drags along for nearly an hour and a half reading data back (11 minutes vs 85 minutes if I’m looking at the right lines in the spreadsheet, so zram is 7-8x faster when, I assume, it needs to read data back). There really isn’t anything to “fix” here. Both work, and work well imo. I just thought others would find the comparison as interesting as I did.

Your suggestion to give it a try is very much appreciated. Not only is it a solution that doesn’t require additional hardware, it was even faster than adding a hardware solution. I don’t know what the pitfalls are of giving as much of the system memory to zram as I did, so I’m not recommending it. But it worked extremely well for this use case. Thanks again!

jiangtao9999 · August 4, 2023, 10:20am

One pitfall is that zram will not recycle zero pages.
I suggest doing the following after each compile job is finished.
As root:

swapoff /dev/zramX # disable the zram swap
echo 1 > /sys/block/zramX/reset # zram will be empty
echo xG > /sys/block/zramX/disksize # set the size again
mkswap /dev/zramX && swapon /dev/zramX # recreate and re-enable the swap
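Before resetting, it can be useful to check how much memory the zram device is actually holding. A minimal sketch, assuming the device is `zram0` and relying on the first three fields of the kernel's `mm_stat` file (original data size, compressed data size, and total memory used, all in bytes):

```shell
#!/usr/bin/env bash
# Summarise a zram device from its mm_stat file.
# The default path assumes zram0; pass another file to override.
zram_summary() {
    local mm_stat_file="${1:-/sys/block/zram0/mm_stat}"
    awk '{
        orig = $1; compr = $2; used = $3
        if (compr > 0)
            printf "orig=%d MiB used=%d MiB ratio=%.2f\n",
                   orig / 1048576, used / 1048576, orig / compr
        else
            print "device is empty"
    }' "$mm_stat_file"
}
```

If `used` stays high after the compile has finished and swap usage has dropped, that would be the situation in which the reset procedure above pays off.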

coltree · August 8, 2023, 10:49pm

Put your old uSD back in your GoPro and keep it for surfing, run off a decent sized nvme ssd.
The Optane is for Intel m-b/chipset/cpu, just try a standard ssd.
Put a heatsink and fan on your cpu.
log this
echo "temp $(( $(cat /sys/class/thermal/thermal_zone0/temp) / 1000 ))C"
as you compile, or even make a pretty graph.
You didn’t mention your power supply, minimum 45W TypeC psu.
As regular setup, make swap partition (on ssd) a bit bigger than ram, 8-16GB.
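The temperature logging suggested above could be wrapped in a loop along these lines. This is only a sketch: the output file name and the one-minute interval are assumptions, and `thermal_zone0` is taken from the command in the post.

```shell
#!/usr/bin/env bash
# Read a thermal zone (millidegrees C) and print whole degrees C.
read_temp_c() {
    local zone_file="${1:-/sys/class/thermal/thermal_zone0/temp}"
    local milli
    read -r milli < "$zone_file"
    echo $(( milli / 1000 ))
}

# Append "epoch,temp_C" once a minute until interrupted.
log_temp() {
    local out_file="${1:-temp.csv}"
    while true; do
        printf '%s,%s\n' "$(date +%s)" "$(read_temp_c)" >> "$out_file"
        sleep 60
    done
}
```

Running `log_temp temp.csv &` alongside the compile gives a second column to add to the graph.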

WillR · August 9, 2023, 12:32pm

The microSD was actually pulled from a WaterWolf or maybe a GoFish Cam, so it’s probably been close to 100 meters under water in the Gulf of Mexico before. But it works fine for trying things out. It only gets 11MB/s through the onboard card reader while my fastest card gets around 20MB/s. Those same cards will do 70MB/s and 160MB/s on a USB3 card reader attached to the VF2.

I have a Solidigm P44 and a Patriot P300, both 1TB, and I’ll keep one of them on the board eventually. For now, the P1600X is perfect for testing stuff like using the NVMe as swap. It works just like a normal PCIe 3.0 x4 NVMe drive that is low latency and has extreme endurance.

I already have a heatsink on the CPU. I was changing the boot pins and NVME out often enough that shortly after putting the VF2 inside a case, I ended up removing it, and the fan from the kit I bought attaches to the case. So I just plopped it down on top of my desktop’s exhaust which is always on anyway. The board is nicely out of the way, has the TTL console connected, and air moving over it all the time. It’s a good solution for me. IIRC, temps stayed around 51C during the compiles whenever I looked at them.

I’m using the power brick that came with the kit. I think it’s only 15 watts, but I haven’t had any problems with the board that make me think it needs a bigger supply. I have a 65w GaN brick with USB C I can hook it to, but until I feel a need I likely won’t bother. 45 watts seems very overkill. When I had it on the Kill A Watt I rarely saw the board pull over 10 watts. I guess I can try again with the bigger adapter to see if the board ever wants more from it than the small power supply can provide.

coltree · August 12, 2023, 6:11am

Good value purchase, extra ~$17 for 16G emmc, 20W psu & wifi, won’t say how much I spent on a 45W psu )-:
how did it go with the 65W ?

WillR · August 16, 2023, 3:51am

Plus a clear case. I just recently bought the 65W GaN for $20.

I finally powered off the board tonight for the first time in 2 weeks and ran it through the Kill-A-Watt. In short, there was no difference between the 20W plug from the kit and this 65W GaN.

The board idled at 7 or 8 watts. I attached SD cards to all USB ports. Then I ran 8 simultaneous calls to hdparm to access the eMMC, onboard SD, the Optane NVME, and 5 additional micro SD cards in readers. That could push the board as high as 10 watts. Then I fired up compiling gcc for some CPU load. With that hitting all 4 cores, the board would only draw 9 or 10 watts. Adding the 8 threads of hdparm on top of that was hitting 12 or 13 watts max. I could see if one of my other NVME drives is more power hungry, but I doubt any would break 20 watts, especially if I removed the unreasonable amount of flash drives that won’t be staying connected.

All the power draws were within 1 watt or less between the two power supplies. I know the Kill-A-Watt isn’t the absolute best for measuring these low-power situations, but it’s good enough for me.

Now, that said, I can get the board to draw 23 Volt-Amps, 19 with just the CPU load. That is something I did not think to check with the 20W PSU. So I may stick with the 65 for a while. The 20 was quite warm when I switched.
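The gap between the watt and volt-amp readings corresponds to the power factor (real power divided by apparent power) seen at the wall. A quick sketch using the figures above; pairing the 10 W reading with 19 VA and the 13 W reading with 23 VA is an assumption, since the post does not say the readings were taken at the same moment.

```shell
#!/usr/bin/env bash
# Power factor = real power (W) / apparent power (VA).
power_factor() {
    awk -v w="$1" -v va="$2" 'BEGIN { printf "%.2f\n", w / va }'
}

power_factor 10 19   # CPU load only
power_factor 13 23   # CPU load plus the hdparm readers
```

Both cases come out at a little over 0.5.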

mzs · August 16, 2023, 8:28am

The thing about an NVMe drive is that some can have a peak current of around 2.8 amps @ 3.3 volts (9.24 watts) for about 10 microseconds. On a Kill-A-Watt you will never see that spike, because it is so short and because the average power will be much lower even under full load.

But because it is so short in duration the decoupling capacitors (local energy storage) should help to reduce that peak current draw somewhat. There are 6 decoupling capacitors in total shown as being somewhere along the VCC3V3_PCIE trace in the VF2 1.3B board schematic ( C2353 22uF; C2354 100nF; C2355 100nF and C2364 22uF/10V; C2363 22uF/10V; C2362 100nF/10V ), in two groups of three.
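To get a feel for how much the capacitors can contribute, here is a back-of-the-envelope estimate using the numbers above. It assumes, as a deliberate simplification, that the six capacitors alone supply the whole 10 microsecond spike and ignores the regulator, trace inductance, and capacitor ESR, so it is an upper bound on the droop rather than a prediction.

```shell
#!/usr/bin/env bash
# Rough estimate: charge drawn by one spike and the resulting voltage droop
# if only the listed decoupling capacitors supplied it (simplifying assumption).
spike_estimate() {
    awk 'BEGIN {
        i_peak  = 2.8      # amps, from the post
        v_rail  = 3.3      # volts
        t_spike = 10e-6    # seconds
        c_total = 3 * 22e-6 + 3 * 100e-9   # farads: three 22uF + three 100nF
        printf "peak power: %.2f W\n", i_peak * v_rail
        printf "charge per spike: %.1f uC\n", i_peak * t_spike * 1e6
        # dV = I * t / C
        printf "droop on capacitors alone: %.2f V\n", i_peak * t_spike / c_total
    }'
}

spike_estimate
```

A droop of that order on a 3.3 V rail is significant, which fits the point that the capacitors reduce the peak "somewhat" and the supply still has to respond quickly.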

WillR · August 17, 2023, 12:02am

Seeing only 9 or 10 watts under full load while compiling, my first thought was that there would be plenty of headroom even for another 9 watts of load, but after looking at the VA and power factor, I’m starting to reconsider.

I’m going to keep the board on the 65 watt PSU for a while to see if the NVMe timeouts go away. They’re the only thing really “wrong” with the board that isn’t a lack of software. The system recovers gracefully from them, so it hasn’t been a big concern. They’re also fairly infrequent, and I haven’t yet seen a pattern in when they occur.