Help for broken VF2? "Error inserting "<>" variable, errno=1"

Curiouser and curiouser … The printenv output on my system has just two entries - one for an ipaddr and one for netmask. I cannot update any of the ones that it’s complaining about (I get the same error), but other network-related values, e.g. serverip and gatewayip, CAN be set, and they persist across restarts after using saveenv. I wonder what’s making it accept some values and not others … I feel there’s a solution to my board’s problems here somewhere.

StarFive # printenv
ipaddr=192.168.0.91
netmask=255.255.255.0
serverip=192.168.0.38

Environment size: 65/65532 bytes
StarFive # setenv ipaddr 192.168.0.99
StarFive # printenv
ipaddr=192.168.0.99
netmask=255.255.255.0
serverip=192.168.0.38

Environment size: 65/65532 bytes
StarFive # setenv gatewayip 192.168.0.1
StarFive # printenv
gatewayip=192.168.0.1
ipaddr=192.168.0.99
netmask=255.255.255.0
serverip=192.168.0.38

Environment size: 87/65532 bytes
StarFive # setenv bootmode flash
## Error: flags type check failure for "bootmode" <= "flash" (type: i)
## Error inserting "bootmode" variable, errno=1
StarFive # 

Once today, when I was playing with settings and restarting, it came up with all the correct values set. I’m not aware of anything I did that would have made it work this one time. Unfortunately I didn’t have an SD card in at the time, so I’ve no idea if it would have booted at that point, but I’ll paste the values in here so that I have a record of them:

bootmode=flash
chip_vision=A
devnum=1
eth0addr=6c:cf:39:00:24:d7
eth1addr=6c:cf:39:00:24:d8
ethaddr=6c:cf:39:00:24:d7
fdtcontroladdr=fffc6aa0
ipaddr=192.168.0.91
memory_addr=40000000
memory_size=200000000
netmask=255.255.255.0
serial#=VF7110A1-2250-D008E000-00000966
serverip=192.168.0.38
stderr=serial@10000000
stdin=serial@10000000
stdout=serial@10000000
uboot_fdt_addr=0xfffc6aa0
ver=U-Boot 2021.10 (Jan 19 2023 - 04:09:41 +0800)

Environment size: 434/65532 bytes

Just had a quick look at the U-Boot sources and how it validates the data. Something is clearly wrong on your board, as the “i” flag is supposed to be for a variable that can only accept an IP address.

All files relate to this repository/commit: https://github.com/starfive-tech/u-boot/blob/f1d959f0b02e16842181a4c1723ba3ea30d2e04a
(new user so cannot put more than two links, thanks discourse)

For example, in output like this:

## Error: flags type check failure for "stdin" <= "serial@10000000" (type: i)

See uboot sources where the flags are validated: /env/flags.c#L525

This message is more concerning though:

StarFive # setenv bootmode flash
## Error: flags type check failure for "bootmode" <= "flash" (type: i)
## Error inserting "bootmode" variable, errno=1

The flag should not be set to ‘i’ when running the command by hand.
Looking at U-Boot again, some variable names can be linked to a pre-determined type, but if you look in /include/env_flags.h, bootmode for example does not have one.

So according to the function at /env/flags.c#L331,
it should default to string and not to IP.
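
To make that reasoning concrete, here is a minimal Python sketch of the kind of check involved. It is a simplified model and not U-Boot's actual code: `STATIC_TYPES`, `looks_like_ipv4`, `type_for` and `validate` are hypothetical names, and the table only lists the network variables seen in this thread as an assumed subset.

```python
# Simplified model of a per-variable type check (NOT U-Boot's real code).
# Assumed subset of variables with a fixed type; "i" means IP address.
STATIC_TYPES = {
    "ipaddr": "i",
    "netmask": "i",
    "serverip": "i",
    "gatewayip": "i",
}


def looks_like_ipv4(value: str) -> bool:
    """Accept four dot-separated groups of one to three ASCII digits."""
    parts = value.split(".")
    if len(parts) != 4:
        return False
    return all(p.isascii() and p.isdigit() and len(p) <= 3 for p in parts)


def type_for(name: str) -> str:
    """Variables without a fixed type fall back to plain string ("s")."""
    return STATIC_TYPES.get(name, "s")


def validate(name: str, value: str, forced_type=None) -> bool:
    """Return True if the value passes the check for the variable's type."""
    var_type = forced_type or type_for(name)
    if var_type == "i":
        return looks_like_ipv4(value)
    return True  # strings accept anything in this model


print(validate("ipaddr", "192.168.0.99"))
print(validate("bootmode", "flash"))
print(validate("bootmode", "flash", forced_type="i"))
print(validate("stdin", "serial@10000000", forced_type="i"))
```

In this model bootmode is accepted when it gets the default string type and is only rejected when the type is forced to "i", which is why the errors in the log suggest the type lookup is going wrong rather than the value being bad.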

Something is clearly fishy in your setup, and my gut feeling is that it is potentially a problem with the DRAM. I don’t have my VF2 at hand, but does the compiled U-Boot come with the mem test tool?
If yes, I would run it to make sure that the DRAM is working properly.

Hmmm that’s definitely a possibility. It has a random command and that seems to give me this when I run it (seemingly regardless of start value)

Unhandled exception: Store/AMO access fault
EPC: 00000000fff4fe80 RA: 00000000fff4fe80 TVAL: 0000000000000000
EPC: 0000000040208e80 RA: 0000000040208e80 reloc adjusted

SP:  00000000ff736a50 GP:  00000000ff736e00 TP:  0000000000000001
T0:  00000000ff736b40 T1:  0000000000000039 T2:  0000000000000000
S0:  0000000000000000 S1:  0000000000000100 A0:  ffffffffa5461cea
A1:  0000000000000000 A2:  0000000000000010 A3:  0000000000000000
A4:  00000000fffd9c28 A5:  ffffffffa5461cea A6:  0000000000000008
A7:  00000000fffa9a50 S2:  0000000000000000 S3:  0000000000000040
S4:  0000000000000004 S5:  00000000ffff0af4 S6:  0000000000000000
S7:  00000000ff7581e0 S8:  0000000000000000 S9:  0000000000000000
S10: 0000000000000000 S11: 0000000000000000 T3:  0000000000000010
T4:  0000000000000000 T5:  000000000001869f T6:  00000000ff736b20

Code: 9381 c533 0127 b76d 0a13 0044 20ef 4bf5 (c008)

It does occasionally come up with that when I try to boot it normally - maybe 1/20 times.

As a note to the previous points, I’ve just reflashed with the older 2.4.4 and it doesn’t give me the flags type check failure messages, but I guess the checks might have been added in a later version. It still doesn’t start from the SD card though, and leaves me at the StarFive # prompt again.

Yes, the flag check is not enabled by default, and as far as I can see it is not enabled in the default VF2 config, at least in the defconfig file for the visionfive.

Not sure why it was enabled in this build and why it is not in the default config.

Interestingly I’ve just tried a flash of 2.5.0 (I’m not sure which my board was initially shipped with) and it does start trying to boot from the SD card but gets stuck when attempting to decompress the kernel:

Model: StarFive VisionFive V2
Net:   eth0: ethernet@16030000, eth1: ethernet@16040000
switch to partitions #0, OK
mmc1 is current device
found device 1
bootmode flash device 1
Can't set block device
371 bytes read in 3 ms (120.1 KiB/s)
Importing environment from mmc1 ...
Hit any key to stop autoboot:  0 
switch to partitions #0, OK
mmc1 is current device
Scanning mmc 1:2...
libfdt fdt_check_header(): FDT_ERR_BADMAGIC
Card did not respond to voltage select! : -110
Scanning mmc 1:3...
Found /boot/extlinux/extlinux.conf
Retrieving file: /boot/extlinux/extlinux.conf
875 bytes read in 4 ms (212.9 KiB/s)
U-Boot menu
1:      Debian GNU/Linux bookworm/sid 5.15.0-starfive
2:      Debian GNU/Linux bookworm/sid 5.15.0-starfive (rescue target)
Enter choice: 2
2:      Debian GNU/Linux bookworm/sid 5.15.0-starfive (rescue target)
Retrieving file: /boot/initrd.img-5.15.0-starfive
9684368 bytes read in 408 ms (22.6 MiB/s)
Retrieving file: /boot/vmlinuz-5.15.0-starfive
7893260 bytes read in 333 ms (22.6 MiB/s)
append: root=/dev/mmcblk1p3 rw console=tty0 console=ttyS0,115200 earlycon rootwait stmmaceth=chain_mode:1 selinux=0 single
Retrieving file: /usr/lib/linux-image-5.15.0-starfive/starfive/jh7110-visionfive-v2.dtb
46706 bytes read in 10 ms (4.5 MiB/s)
   Uncompressing Kernel Image

I did some research that may help you check your DDR memory.
I found this:
https://doc-en.rvspace.org/JH7110/TRM/JH7110_TRM/system_memory_map.html

Start Address   End Address     Size    Attribute  Device/Description
0x00_2A00_0000  0x00_2A00_81FF  32.5KB  RX         Boot ROM
0x00_4000_0000  0x02_3FFF_FFFF  8GB     RWX C A N  DDR for memory port
0x04_4000_0000  0x06_3FFF_FFFF  8GB     RWX N      DDR for system port

And this:
https://doc-en.rvspace.org/JH7110/TRM/JH7110_TRM/u74_memory_map.html

Start Address   End Address     Size  Attribute  Device
0x00_0800_0000  0x00_081F_FFFF  2MiB  RWX A      L2 LIM
0x00_4000_0000  0x04_3FFF_FFFF  16GB             Memory port
0x04_4000_0000  0x08_3FFF_FFFF  16GB  RW A       System port

From: http://doc-en.rvspace.org/VisionFive2/Developer_Guide/JH7110_Boot_UG.pdf
0x08000000 SRAM where u-boot-spl.bin.normal.out (SPL) is loaded
0x40000000 DDR where visionfive2_fw_payload.img is loaded (U-Boot + OpenSBI)
0x40200000 OpenSBI - Jump to 0x4020_0000 (located in DDR) to execute U-Boot

The Das U-Boot mtest command takes the following optional arguments:
“mtest [start [end [pattern [iterations]]]]”
Which might mean that the command you end up needing should look something like:
8 GiB VF2:
mtest 0x0040000000 0x023FFFFFFF
4 GiB VF2: (The default with no arguments, the comment says 8GiB but the size is 4GiB)
mtest 0x0040000000 0x013FFFFFFF

Press “CTRL+C” to stop it; otherwise it will automatically repeat.
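
The end addresses above follow from the DDR base in the memory map plus the RAM size. A small Python sketch of that arithmetic, where `DDR_BASE`, `ddr_end` and `mtest_command` are names made up for illustration:

```python
DDR_BASE = 0x40000000  # start of DDR, taken from the memory map quoted above
GIB = 1 << 30


def ddr_end(size_gib: int) -> int:
    """Last valid DDR address for a board with size_gib GiB of RAM."""
    return DDR_BASE + size_gib * GIB - 1


def mtest_command(size_gib: int) -> str:
    """Format an mtest invocation covering the whole DDR range."""
    return f"mtest 0x{DDR_BASE:010X} 0x{ddr_end(size_gib):010X}"


for size in (4, 8):
    print(f"{size} GiB VF2:", mtest_command(size))
```

This only reproduces the address arithmetic; it says nothing about whether testing the whole range is safe while U-Boot itself is running from DDR.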

There’s 5v and 3.3v in the GPIO headers.

Could you measure them, as an extra precaution? If you had an oscilloscope, I’d use that to see if they drop at any point.

Power issue or bad RAM are what I am leaning towards with what we now know.

StarFive # mtest
Unknown command 'mtest' - try 'help'

Our U-Boot is unfortunately built without it.

As we’ve got some 16MB of SPI (or 15MB after the first 1MB, which is where we’re currently putting the 2nd stage bootloader, or opensbi+u-boot), we can probably enable everything, or almost everything, in U-Boot.

The u-boot used is also relatively old. Maybe it’d be worth doing a build with a current stable version with the under-review patches to support this SoC/board applied.

The default memory locations for mtest are configured (for a 4GiB board).

It could probably be enabled by adding “CONFIG_CMD_MEMTEST=y” to u-boot/configs/starfive_visionfive2_defconfig and recompiling.

But if it is not enabled, there is probably a good reason.

There’s 5v and 3.3v in the GPIO headers. Could you measure them, as an extra precaution?

4.97V/3.29V with the two “good” power supplies I have (Don’t have access to an oscilloscope here so that’s just with my multimeter)

(As a point of reference, if I try using a standard low quality USB lead to power it I get 4.55V although the symptoms don’t change)

I am intrigued at the 2.5.0 firmware seemingly getting further though.

I get 4.94v and 3.30v on mine at the gpio 5v/3v3, while drawing just 5w from wall, for reference.

Your values definitely aren’t bad. I fear it might be the RAM :crying_cat_face:

@mzs While I can’t use mtest I can use random and mw to write stuff to those ranges - it just fails with this unfortunately:

StarFive # mw 0x0040000000 0x00 0x013FFFFFFF
Unhandled exception: Store/AMO access fault
EPC: 00000000fff5027c RA: 00000000fff50256 TVAL: 0000000040000000
EPC: 000000004020927c RA: 0000000040209256 reloc adjusted

I am a little surprised that I’m not getting similar messages when it’s decompressing the kernel on the Debian 55 image though; it seems to just hang instead.

0x40000000 is the start of the DDR memory and where the visionfive2_fw_payload.img (filesize: 2797189 bytes ; 0x2AAE85) is loaded (from QSPI NOR FLASH) and parts of it are decompressed (either close to the start or at the end of memory or both, at a guess, I’ve not checked). My guess was that a well written mtest function would either move the critical binary program code around memory to protect it from being overwritten, or just prevent writing any pattern to that small section of RAM (probably ~0.04%) where it resides (Even if only testing 99.96% of the DDR memory, that might still highlight a problem).

But overwriting the running program that you are using to overwrite itself will generally generate some kind of Unhandled exception.

Although a better way to test all DDR RAM might be to execute binary code from inside the 2MiB L2 LIM (Loosely Integrated Memory, normally used for the CPU cache).

Maybe skip a chunk at the start (32MiB) and end (32MiB) of DDR RAM.
So testing from maybe 32 MiB to 4064 MiB into the DDR of a 4 GiB board (or to 8160 MiB on an 8 GiB board) might avoid overwriting its own data and code (I’m guessing, I have not checked).

0x0042000000 to 
0x013DFFFFFF
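
For reference, the two addresses above can be derived as follows. This is a sketch in Python with hypothetical names (`guarded_range`, `margin_mib`), and the 32 MiB margin is the guess from this post, not a verified safe value:

```python
DDR_BASE = 0x40000000  # start of DDR from the memory map
MIB = 1 << 20
GIB = 1 << 30


def guarded_range(size_gib: int, margin_mib: int = 32):
    """Return (first, last) addresses, skipping margin_mib at each end of DDR."""
    margin = margin_mib * MIB
    first = DDR_BASE + margin
    last = DDR_BASE + size_gib * GIB - margin - 1
    return first, last


for size in (4, 8):
    first, last = guarded_range(size)
    print(f"{size} GiB: 0x{first:010X} to 0x{last:010X}")
```

For a 4 GiB board this gives the range quoted above; for an 8 GiB board only the end address changes.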

Starting from that address is a bit better, although it does fail when it starts to get up to 0x0100000000, but is ok up to 0x000F000000. The last line below hangs similar to what happens when decompressing the kernel, but anything between 0x0042000000 and 0x000F000000 seems to be writable without problems.

StarFive # random 0x00F0000000 0x000F000000
251658240 bytes filled with random data
StarFive # random 0x00F0000000 0x000FF00000
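
One thing that may be worth checking is which addresses those two commands actually touch. The Python sketch below assumes that random takes a start address and a length (which matches the 251658240 bytes reported, i.e. 0x0F000000), and that the SP and EPC values from the exception dump earlier in the thread are representative of where the running U-Boot keeps its stack and code. Both are assumptions, and `fill_range` and `touches` are hypothetical helper names.

```python
def fill_range(start: int, length: int):
    """Inclusive (first, last) addresses touched by a fill of `length` bytes."""
    return start, start + length - 1


def touches(rng, address: int) -> bool:
    """True if `address` lies inside the inclusive range `rng`."""
    first, last = rng
    return first <= address <= last


# Values copied from the register dump earlier in the thread.
SP = 0xFF736A50   # stack pointer at the time of the exception
EPC = 0xFFF4FE80  # faulting program counter (relocated U-Boot code)

ok_fill = fill_range(0xF0000000, 0x0F000000)    # this command completed
hung_fill = fill_range(0xF0000000, 0x0FF00000)  # this command hung

for label, rng in (("completed", ok_fill), ("hung", hung_fill)):
    print(
        label,
        hex(rng[0]),
        hex(rng[1]),
        "covers SP:", touches(rng, SP),
        "covers EPC:", touches(rng, EPC),
    )
```

Under those assumptions, only the command that hung reaches the address the stack pointer was at, so the hang could come from overwriting U-Boot's own stack and would not by itself prove a RAM fault.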

There are very few patches that have been pushed upstream for this board to mainline Das U-Boot (yet). So while you would gain new features in U-Boot you would effectively need to repeat any work that starfive are currently doing to push patches upstream. And that is not simple (for most people).

As for U-Boot being relatively old, any needed core regression fixes would have been applied. If you think about it, the vast majority of commits to the mainline Das U-Boot are going to be for other boards that are not using a JH7110 SoC, almost all of which are not even RISC-V. So the other way to look at it is that for JH7110 this fork is the very latest bleeding edge U-Boot release (for this board). Anything critical that is needed for the JH7110 would be back ported. The reason to (temporarily) stick with one version of U-Boot would be exactly the same as for the kernel. It is the most efficient method in terms of time spent coding to create working patches which will eventually be accepted upstream.

Not “repeat”, just apply these same patches.

U-Boot isn’t Linux; they aren’t changing APIs and modifying things all over the place. There’s a good chance the patches apply cleanly, and an excellent chance the code can be easily amended if not.

Did you read about reflashing over the UART with the recovery image? It’s in the QSG, Appendix 4. You can use xmodem and set the switches to reflash.

My original post says I have flashed with xmodem, and my earlier comments have the logs from the session, so yes I know about doing the reflashing that way.

So from my understanding, with the somewhat working U-Boot you have at the moment you can read from the SD card?

Then there could be a way to load a memtest app from the SD card into some memory like the LIM mzs is talking about, and just jump to that memory so the memtest app would then start to run. The catch is that I am not aware of such an app existing.

Oh wait, I just did a quick Google search and it gave me this:

GitHub - atrosinenko/memtest86-plus-riscv: Port of original MemTest86+ v5.1 to other architectures (RISC-V for now)

I haven’t tried it and it may not work, but it seems to build as a U-Boot “executable”, so it could be worth a look!
I also don’t know if it uses the SBI for simple low-level stuff or is somewhat hard-coded to a board, but it’s worth a look, I think!