Waveshare way of booting the board? - Discussion

Hi

I just found the waveshare wiki for the VF2:

What's interesting to me is this:

They seem to use the UART adapter just to boot. This doesn't make much sense to me, because at the moment I'm under the impression that you cannot boot the newest image until the firmware has been updated, while the old image should just boot, I think.

And after you upgraded the firmware with the help of UART it should just boot up.

So what is going on here?

Is this a middle way to boot while the old firmware is still in play?

Thanks!

Maybe Waveshare will deliver boards with new firmware. What Waveshare offers does not have to be comparable with our Earlybirds.

Waveshare state “VisionFive2 is compatible with Raspberry Pi series boards…”
This is misrepresentation, as this board cannot even boot from an NVME SSD at the moment.
I am beginning to regret having ordered this board, nor do I like playing laboratory rat for these people.
Not impressed.

Aubrey

You do understand that early bird electronics typically means that the hardware is ready (or at least very close to the final product), but that not all the software required to access all the functionality is in place yet?

Even only 20 years ago, a product would only ship out to any customer once everything advertised was fully working and given at least one coat of polish. There was no such thing as early bird or super early bird hardware. Basically the board you have now is a small test run of the hardware that is also sent out to developers who are getting everything to work and adding the first coat of polish before the final product ships. But booting from NVMe was not actually an advertised feature, nothing I read prior to buying the board mentioned booting from anything other than QSPI NOR/NAND Flash, SD card, eMMC or UART.

The way I see it you have 2 options:

  1. Wait for the software feature you want/need to be added (I’m sure NVMe is a major selling point); the current hardware should be able to provide that boot function.
  2. Sign NDAs and gain access to the required technical reference documentation to create your own bootloader.

One thing to keep in mind: at the end of the day it is only a single-lane (x1) PCIe Gen2 interface (maximum theoretical throughput 477 MiB/sec).

For comparison:

UART: 
   11520 bytes/sec at 115200 baud 8-N-1 (one start bit, 8 data bits, no parity bit, and one stop bit, so 10 bits on the wire per byte)
QSPI FLASH:
   12.5 MB/sec (100 Mbit/sec)
SD 3.0:
   12.5 MB/sec (Default or UHSI_SDR12)
   25 MB/sec (High Speed or UHSI_SDR25)
   50 MB/sec (UHSI_SDR50 or UHSI_DDR50)
   104 MB/sec (SDR104)
TFTP:
   117 MB/sec after packet overheads at 1Gbit/sec
eMMC 5.0:
   250 MB/sec Sequential Read, 90 MB/sec Sequential Write, 7000 Random Read (IO/s), 13000 Random Write (IO/s)
USB 3.0:
   400 MB/sec (you could claim 500 MB/sec after 8b/10b encoding of the raw signalling at 5 Gbit/sec, but in the real world that is never going to happen)
NVMe:
   500 MB/sec with 1x lane PCIe Gen2 
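
The theoretical figures in the list above follow from the raw line rate and the encoding overhead. A minimal Python sketch, assuming 10 bits on the wire per byte for an 8-N-1 UART (start + 8 data + stop) and 8b/10b encoding for PCIe Gen2 and USB 3.0; the function name is made up for illustration:

```python
def payload_bytes_per_sec(line_rate_bits, payload_bits, total_bits):
    """Payload throughput in bytes/sec for a link that puts total_bits
    on the wire for every payload_bits of actual data."""
    return line_rate_bits * payload_bits / total_bits / 8

# 115200 baud, 8-N-1: 10 bits on the wire per 8-bit byte
uart = payload_bytes_per_sec(115_200, 8, 10)

# PCIe Gen2 x1 and USB 3.0 both signal at 5 Gbit/sec with 8b/10b encoding
pcie_gen2_x1 = payload_bytes_per_sec(5e9, 8, 10)
usb3 = payload_bytes_per_sec(5e9, 8, 10)

print(f"UART 115200 8-N-1: {uart:.0f} bytes/sec")
print(f"PCIe Gen2 x1: {pcie_gen2_x1 / 1e6:.0f} MB/sec "
      f"({pcie_gen2_x1 / 2**20:.1f} MiB/sec)")
print(f"USB 3.0 (before protocol overhead): {usb3 / 1e6:.0f} MB/sec")
```

These are upper bounds only; protocol overhead (packet headers, flow control) lowers the real numbers further.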

OK, I think I figured it out. In the wiki they boot from UART; the jumpers are in that position. But booting from SD is also possible.

My point is that I do not want to play the guinea pig for this board’s software/firmware teething problems.
Using UART to diagnose booting problems is abhorrent to me.
If they want to sell the device, very well, but at least provide a decent path to a workable bootable setup. I do not want to reinvent the wheel again.
“maximum theoretical throughput 477MiB/sec”
I get the same rate on an RPi4 with UAS/USB to an NVMe device.
I am finished commenting on all things at this stage.
I do not like moaning, but this board is not for me.

Aubrey

@mzs 500MB/s for the NVMe subsystem ain’t bad at all for this board. Several CPU performance tests I’ve done so far clearly show that this SBC is slower than the RPi4. So this peak (500MB/s) would be a nice compromise for the VisionFive2.

It's about the level of a Raspi3, which is OK for me.

I would disagree, because it depends on what you are comparing. CPU: RPi4 wins gold, VF2 wins silver, and RPi3 gets the Bronze. But for the GPU (once any 4K software driver/firmware issues are resolved): VF2 gold, RPi4 silver, RPi3 bronze.
GPU

  • VF2
    glmark2 Score: 182 (1920*1080p)
    glmark2 Score: 449 (default is 800x600)
  • RPi4B
    glmark2 Score: 40 (1920*1080p)
    glmark2 Score: 172 (default is 800x600)
  • RPi3B
    glmark2 Score: 29 (1920*1080p)
    glmark2 Score: 84 (default is 800x600)

CPU (I should probably add that DMIPS is only integer maths, and CoreMark is a more valid but still totally synthetic benchmark)

  • RPi4B
    4 cores clocked at 1.8 GHz
    39672 CoreMark (5.51 x 4 x 1800)
    45360 DMIPS (6.3 x 4 x 1800 )

  • VisionFive2
    4 cores clocked at 1.5 GHz
    30720 CoreMark (5.12 x 4 x 1500)
    16860 DMIPS (2.81 x 4 x 1500 )

  • RPi3B+
    4 cores clocked at 1.4 GHz
    17920 CoreMark (3.2 x 4 x 1400).
    12544 DMIPS (2.24 x 4 x 1400 )
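
The aggregate figures above are just (score per MHz) x (cores) x (clock in MHz). A small Python sketch that reproduces them from the per-MHz numbers quoted in this post; it assumes perfect scaling across cores and clock speed, which real workloads will not achieve, and the names are only illustrative:

```python
# (name, cores, clock in MHz, CoreMark/MHz, DMIPS/MHz) as quoted in the post
boards = [
    ("RPi4B", 4, 1800, 5.51, 6.3),
    ("VisionFive2", 4, 1500, 5.12, 2.81),
    ("RPi3B+", 4, 1400, 3.2, 2.24),
]

def aggregate(per_mhz, cores, mhz):
    """Naive upper bound: per-MHz score scaled by core count and clock."""
    return per_mhz * cores * mhz

# Map each board to its (CoreMark, DMIPS) totals, rounded to whole numbers
results = {
    name: (round(aggregate(cm, cores, mhz)), round(aggregate(dm, cores, mhz)))
    for name, cores, mhz, cm, dm in boards
}

for name, (coremark, dmips) in results.items():
    print(f"{name:12s} {coremark:6d} CoreMark  {dmips:6d} DMIPS")
```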

RPi4 - BCM2711 - 4x Cortex-A72 @ 1500 MHz (https://www.eembc.org/coremark/scores.php)
5.51 CoreMark/MHz (single core)
6.3-7.3 DMIPS/MHz
Caches: L1: 32 KiB data + 48 KiB instruction L1 cache per core. 1MiB L2 cache.
Memory: 1 GiB, 2 GiB, 4 GiB, or 8 GiB of LPDDR4-3200 SDRAM

VisionFive2 - 4x u74-mc @ 1.5 GHz Benchmark Scores (https://www.sifive.com/cores/u74-mc)
5.12 CoreMark/MHz (single core)
4.27/2.81 DMIPS/MHz (Best Effort/Legal)
Caches: 32kB data + 32kB instruction L1 cache per core. 2MB L2 cache.
Memory: 2 GiB, 4 GiB, or 8 GiB LPDDR4 SDRAM up to 2,800 Mbps

RPi3 - BCM2837 - 4x Cortex-A53 @ 1.2 GHz (https://www.eembc.org/coremark/scores.php)
3.2 CoreMark/MHz (single core)
2.24 DMIPS/MHz
Caches: L1: 16 KiB L1P (instruction) + 16KiB L1D (data) cache per core. 512 KiB L2 cache.
Memory: 1GiB LPDDR2 900MHz

Both the JH7110 and BCM2711 are manufactured with a TSMC 28 nm process (BCM2837 is a 40 nm process), and both have a 32 KiB data cache. The JH7110 has a 2 MiB L2 cache, whereas the BCM2711 has a 1 MiB L2 cache.

The RPi4 wins on CPU performance because its cores are currently clocked at 1.8 GHz (20% faster), its L1 instruction cache is 50% larger, and its LPDDR4 is potentially about 15% faster. But where the VF2 wins hands down is the GPU. The RPi4B has a 500 MHz Broadcom VideoCore VI, which to be fair has changed a lot since the VideoCore IV in the RPi1B (introduced in 2012): the clock has doubled from 250 MHz and the performance has at least doubled. But the Imagination BXE-4-32 GPU in the VF2 is in a totally different league by comparison.

  • VF2
    glmark2 Score: 449 rpi4-glmark2-es2-results (default is 800x600)
    glmark2 Score: 182 rpi4-glmark2-es2-results-fullscreen (1920*1080p)
    glmark2 Score: 390 rpi4-glmark2-es2-results-offscreen (default is 800x600)

From https://gist.github.com/janisozaur/baf7b07c5e5128826cdb7108f1a4dd54:

  • RPi4B
    glmark2 Score: 172 rpi4-glmark2-es2-results (default is 800x600)
    glmark2 Score: 40 rpi4-glmark2-es2-results-fullscreen (1920*1080p)
    glmark2 Score: 361 rpi4-glmark2-es2-results-offscreen (default is 800x600)

  • RPi3B
    glmark2 Score: 84 rpi3b-glmark2-es2-results (default is 800x600)
    glmark2 Score: 29 rpi3b-glmark2-es2-results-fullscreen (1920*1080p)
    glmark2 Score: 218 rpi3b-glmark2-es2-results-offscreen (default is 800x600)

Some 4K results might be nice. Does anyone know if something like the following commands will work? (I have no VF2 hardware here yet, and I do not own an RPi4.)

$ glmark2-es2-drm --off-screen --size 3840x2160 --visual-config 'red=8:green=8:blue=8:alpha=8:buffer=0'
$ glmark2-es2-drm --size 3840x2160 --visual-config 'red=8:green=8:blue=8:alpha=8:buffer=0'

Here is the output from my tests: https://paste.ee/p/YVBOv
I ran the tests twice: once with the power management at the default “ondemand” governor, and a second time with the “performance” governor.

Here is a summary of the results (see the above link for the details):

VisionFive 2 glmark2 Scores:
  >>> ondemand governor<<<  >>> performance governor<<<
$ glmark2-es2-wayland 
   >> glmark2 Score: 451 <<  >> glmark2 Score: 528 <<
$ glmark2-es2-wayland --off-screen --size 800x600 --visual-config -red=8:green=8:blue=8:alpha=8:buffer=0
   >> glmark2 Score: 463 <<  >> glmark2 Score: 516 <<
$ glmark2-es2-wayland --size 1920x1080 --visual-config -red=8:green=8:blue=8:alpha=8:buffer=0
   >> glmark2 Score: 227 <<  >> glmark2 Score: 229 <<
$ glmark2-es2-wayland --off-screen --size 1920x1080 --visual-config -red=8:green=8:blue=8:alpha=8:buffer=0
   >> glmark2 Score: 210 <<  >> glmark2 Score: 223 <<
>>> I do not own a 4k monitor so I can only run the off-screen test <<<
   >> glmark2 Score: ??? <<  >> glmark2 Score: ??? <<
$ glmark2-es2-wayland --off-screen --size 3840x2160 --visual-config -red=8:green=8:blue=8:alpha=8:buffer=0
   >> glmark2 Score: 71 <<   >> glmark2 Score: 74 <<

I noticed here that the RPi tests used the “--fullscreen” option instead of a window, so I repeated only the 1920x1080 onscreen tests with just that argument, but I varied the CPU governor to see what effect that would have on the GPU performance.

The output can be found here https://paste.ee/p/MwZTt#RLlFc4yuxikOrL0adiB6UwascwIL7xMN

But the summary (for 1920x1080p) is:

cpufreq governor      "glmark2-es2-wayland  --fullscreen"   temperature at end of test
--------------------- ------------------------------------  --------------------------
powersave              glmark2 Score: 211                   63.987°C (~147°F) 
schedutil              glmark2 Score: 230                   68.106°C (~155°F) 
ondemand               glmark2 Score: 250                   70.542°C (~159°F) 
performance            glmark2 Score: 287                   83.827°C (~183°F) 
conservative           glmark2 Score: 291                   70.774°C (~159°F) 

I suspect that while using the performance governor, the GPU was being throttled back due to high temperature. And that if I added cooling it could go higher.
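
For anyone repeating this, here is a rough Python sketch for logging the governor and temperature around a benchmark run. It assumes the standard Linux sysfs paths (`cpufreq/scaling_governor`, and `thermal_zone*/temp` reporting millidegrees Celsius); which thermal zone corresponds to the SoC on a given board is an assumption, and the function names are made up for illustration:

```python
from pathlib import Path

def read_governor(cpu=0, sysfs=Path("/sys")):
    """Return the active cpufreq governor for one CPU, e.g. 'ondemand'."""
    path = sysfs / "devices/system/cpu" / f"cpu{cpu}" / "cpufreq/scaling_governor"
    return path.read_text().strip()

def read_temp_celsius(zone=0, sysfs=Path("/sys")):
    """Return a thermal zone reading in degrees C (sysfs reports millidegrees)."""
    path = sysfs / "class/thermal" / f"thermal_zone{zone}" / "temp"
    return int(path.read_text().strip()) / 1000.0

def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32
```

Calling `read_governor()` and `read_temp_celsius()` before and after each glmark2 run would show whether the temperature climbs towards a throttling point. The `sysfs` parameter only exists so the functions can be pointed at a test directory.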

I think that is a clear win for the VF2 when compared to RPi hardware.
VF2 : glmark2 Score: 291 glmark2-es2-wayland-results-fullscreen (1920x1080p)
RPi4: glmark2 Score: 40 rpi4-glmark2-es2-results-fullscreen (1920x1080p)
RPi3: glmark2 Score: 29 rpi3b-glmark2-es2-results-fullscreen (1920x1080p)

The Imagination Technologies PowerVR B-Series BXE-4-32 MC1 GPU (OpenGL ES 3.2 build 1.19@6345021) inside the VF2 (default frequency is 400 MHz, but it can go up to 600 MHz) has about an order of magnitude more performance than the VideoCore IV @ 250 MHz in the RPi3, and about seven times the performance of the VideoCore VI @ 500 MHz in the RPi4, at least in terms of the glmark2 benchmark (version 2021.12) at an onscreen resolution of 1920x1080p.
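
The "order of magnitude" and "seven times" figures are simply ratios of the fullscreen 1920x1080 scores quoted in this thread. A quick Python check; note that it takes the best VF2 result (conservative governor) and that the RPi numbers come from someone else's runs, so the test conditions are not identical:

```python
# 1920x1080 fullscreen glmark2 scores quoted in this thread
scores = {"VF2": 291, "RPi4": 40, "RPi3": 29}

def speedup(board, baseline):
    """How many times higher board's score is than baseline's."""
    return scores[board] / scores[baseline]

print(f"VF2 vs RPi4: {speedup('VF2', 'RPi4'):.1f}x")
print(f"VF2 vs RPi3: {speedup('VF2', 'RPi3'):.1f}x")
```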

If anyone has an RPi4 and a 4K monitor, I would be interested in the onscreen “--fullscreen” results for the VF2 and RPi4 at 4K.
( VF2: sudo apt install glmark2-es2-wayland -y ; glmark2-es2-wayland --fullscreen )


Oh wow, that’s interesting, I would have expected the Pi4 to have a slightly better GPU than that.

The GPU in the VF2 is fairly low-end (BXE are “Entry” level GPUs for Imagination), and the 4-32 is for
4 = 4 pixel per clock
32 = 32 FP32 FLOPs/Clock

The TH1520 has a BXM-4-64, so we should expect roughly double the GPU performance of the VF2, as both use the Rogue architecture.
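
As a back-of-the-envelope illustration of that naming scheme in Python: peak FP32 throughput is FLOPs per clock times the clock. The 400 MHz and 600 MHz VF2 clocks are the ones quoted earlier in this thread; the TH1520 GPU clock is not stated here, so only the per-clock ratio is shown, and it assumes the "64" in BXM-4-64 is read the same way as the "32" in BXE-4-32:

```python
def peak_gflops(flops_per_clock, clock_mhz):
    """Peak FP32 GFLOPS = FLOPs per clock x clock in MHz / 1000."""
    return flops_per_clock * clock_mhz / 1000

bxe_4_32_default = peak_gflops(32, 400)  # VF2 at its default GPU clock
bxe_4_32_max = peak_gflops(32, 600)      # VF2 at its maximum GPU clock

# Per-clock ratio only; real results also depend on clocks, memory and drivers
per_clock_ratio = 64 / 32

print(f"BXE-4-32 @ 400 MHz: {bxe_4_32_default:.1f} GFLOPS peak")
print(f"BXE-4-32 @ 600 MHz: {bxe_4_32_max:.1f} GFLOPS peak")
print(f"BXM-4-64 vs BXE-4-32 per clock: {per_clock_ratio:.0f}x")
```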

Do you have a Sipeed LPi4A or a BeagleV where you could run these tests?


I do not, but it is easy enough to do. For wayland see the last line of my post.


I ran a subset of these tests on a Lichee Pi 4A, and it looks like this: at the 800x600 and 1920x1080 settings its GPU is about 1.5 times faster than the VF2. However, on the 4K test it’s only about 20% faster. My guess is that it may be throttling. In the one week that I have had this board I have not seen it go over 66°C, no matter what. And this board runs hot, idling at 50°C, so it does not take a lot to get to that temperature. But it’s just a guess.
