Recovery Bootloader Bugs?

Hello, I’ve been trying to use the recovery instructions in the wiki to copy and run a recovery image for the JH7100 core. I’ve run into a few problems, and while I’ve got a path forward worked out, I figured I’d share my findings in case anyone else finds themselves with a similar problem.

So the initial loader, when connecting to the debug pins and holding boot, provides a small list of commands. The load command is supposed to open up for an XMODEM connection to receive binary data and load it at a given address. In the recovery instructions, this is used to load the recovery image to 0x18000000. Afterwards, a do command is used to enter this address. I’ve not been able to get the XMODEM connection to work for this particular item thus far, although it has worked in the past for the later stage boot loader restorations. I simply get some sort of ACK error; I’ll have to copy the text verbatim next time I try.

Another task the loader seems to have trouble with is reading 4-byte longs that occupy the second half of an 8-byte quad-word. The commands are r(b|w|l) and w(b|w|l) to read and write a byte, word, or long respectively. The byte command can read any address. The word command appears to need 16-bit (2-byte) alignment and passes back an error if this isn’t the case. The long command, however, only succeeds on 8-byte boundaries: on the odd 4-byte boundaries it neither passes back an error nor returns.
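
To make the alignment cases concrete, here is a minimal C sketch. The helper name is_aligned and the sample addresses are mine for illustration; none of this is taken from the loader’s source. It only shows which addresses sit on which boundaries.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when addr is a multiple of width; width must be a power of two. */
static bool is_aligned(uint64_t addr, uint64_t width)
{
    return (addr & (width - 1)) == 0;
}
```

By this check 0x18000004 is 4-byte aligned but not 8-byte aligned, so it is exactly the kind of address where I’d expect rl to work and where it instead hangs.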

Finally, there is a show command that appears to be somewhat similar to “od” on Unix in that it is used to dump a range of memory in some sort of intelligible fashion. However, I’ve found that this function only prints one line as individual bytes and then hangs until an interrupt is sent. I haven’t tried in a few days, so I will update this with the exact behavior, but I’m pretty sure that if I left the address and length off in subsequent calls, it would print back another line of bytes but likewise hang after printing, rather than printing the full length or exiting.

I can verify that the do command seems to enter the program text correctly, as I used the wb command to write in an arbitrary binary and enter it and observed what I expected to happen.

This component passes back a copyright notice of SiFive, so the component itself may not be in direct control of StarFive.

The XModem implementation in the on-chip ROM apparently has a bug that prevents it from working with some XModem implementations. You basically have to use this tool as linked to in the recovery guide:

See the comment in the source code:

If you’re interested in the recovery shell, so to speak, I found that it resembles the one implemented in this file:

I could not verify whether or not this is exactly the same version, but it could be a good place to start looking.

Oooh, I didn’t think I’d see code as a response to this. That’s great, that’ll definitely help me along in what I’m working on. Good to see it’s on Github too, that way if I find any fixes for these bugs I can stage PRs and hopefully help others out.

Glad to know they explicitly mention lrzsz tools, that’s precisely what I was using and had failing. I was a little iffy on how much the specific instructions needed to be followed on the wiki. They’re using specific tools for interacting with the serial port (TeraTerm and Minicom on WinNT/Unix respectively).

While helpful, the bare minimum you need to interact with the device, at least on Unix, is stty and cat: the former to set the baud rate and other terminal characteristics, the latter to open I/O to the port. I’ve found that with the right stty settings, something to the effect of “cat /dev/ttyUSB0 & cat > /dev/ttyUSB0” will set up a rudimentary communication channel. It’s not as easy as using, say, screen, or one of the listed tools, but it gets the job done, and it is especially helpful if you want to do each in a separate terminal and keep a clean stdout log uninterrupted by local input data in your terminal. The only snag I really hit there is that cat > /dev/ttyUSB0 takes exclusive ownership of the input of the TTY, so if you pop open another terminal and try to sx > /dev/ttyUSB0, either it won’t let the XMODEM attempt through, or sx takes control of /dev/ttyUSB0 so you can’t issue the device-side listen for connection. I’ll continue to take jabs at this, as one of my goals is to have this process defined with as few external dependencies as possible, hence trying to get base-level communication to work well with just cat and stty, as every Unix under the sun should have those.

I’ll report back with any other relevant findings.

I have tried bootloader recovery by following VisionFive’s Quick Start Guide, but I have not managed to get past the stage where it says: “Waiting for XMODEM request[C]…”

The program gets stuck at that point.

I added some printfs to the code in order to see what’s happening and it seems that it doesn’t receive ‘C’ but instead it receives values 10, 10, 35, 32

As ASCII these are newline, newline, #, space

I have also changed baud rate, tried most common values, but that didn’t help.

(I use Debian at the host in case this has any significance)

Any ideas how to get it working?

Are you using the xmodem app above or a normal xmodem uploader? (See above for why this one is required.)

I don’t know if it applies to the VisionFive board, but BeagleV had an interesting issue where the 232 voltage levels for the bootloader flash were more restrictive than those passing through the normal CH340-like UART. (This WAS something that we agreed should have been fixed in later chips/boards, but who knows?) We had cases where brand X of USB/Serial adapter would work but brand Y wouldn’t because the thresholds of what constituted a high and a low didn’t quite match. Unfortunately, the primary documentation for this seems to be on the Beagleboard forum that was lost. AHA! There may be clues at “suggestion to improve debug header signal levels”, Issue #11 in beagleboard/beaglev-starlight on GitHub.

If that conversation makes your head hurt, just try different USB/Serial adapters with different chips…

Which serial port are you connected to? The two serial ports are indicated by the arrows in the figure below: port 1’s baud rate is 115200 and port 2’s baud rate is 9600.

Does your host have another OS, such as Windows 10 or Ubuntu 20.04? You can try it under either of these.

Thanks for the answers.

I followed RiscVFan’s advice and bought me another USB-TTL adapter. This solved the problem.

The adapter which works with the debug port is ADA954 and the one which doesn’t work is HW-597.

Now I’m struggling to get to the upgrade menu. I managed to get there once but after that hitting “any key” after switching the board on doesn’t have any effect.

So here’s how this would work in an ideal world:

cu -l /dev/ttyUSB0 -s 9600

Then while in a cu session, to begin an XMODEM-CRC transfer, you must type

~$sx -b jh7100_recovery_boot.bin

Although given the XMODEM-CRC bug, you can’t use lrzsz’s sx tool, so when performing this transfer, you’ll have to do it the way kprasadvnsi/JH71xx-tools on GitHub (the bootloader recovery and updater tool for StarFive JH71x0 SoCs) does it.

First you must open the /dev/tty as a file descriptor. The TTY attributes are as follows:

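	cfsetospeed(&tty, B9600);
	cfsetispeed(&tty, B9600);

	/* No XON/XOFF software flow control. */
	tty.c_iflag &= ~(IXON | IXOFF | IXANY);

	/* No output post-processing. */
	tty.c_oflag = 0;

	/* Clear first, then set: CSIZE is the mask that contains CS8, so
	   clearing it after setting CS8 would undo the 8-bit setting. */
	tty.c_cflag &= ~(PARENB | PARODD | CSIZE | CSTOPB | CRTSCTS);
	tty.c_cflag |= (CLOCAL | CREAD | CS8);

	tty.c_lflag |= ICANON;

	/* Note: VMIN and VTIME are only honoured when ICANON is clear. */
	tty.c_cc[VMIN]  = 1;
	tty.c_cc[VTIME] = 10;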

Note CRTSCTS is not POSIX so mileage may vary depending on OS.
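
For context, here’s roughly how I’d wrap those attributes into something that opens the port. Treat it as a sketch under assumptions rather than what the tool actually does: the function names configure_tty and open_serial are made up, and I clear ICANON here (a departure from the attribute list above) on the assumption that a binary transfer wants byte-at-a-time reads, which is also the only mode where VMIN and VTIME apply.

```c
#define _DEFAULT_SOURCE /* glibc: expose CRTSCTS alongside the POSIX names */
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Apply 9600 8N1 with no flow control to an already-open descriptor.
 * Returns 0 on success, -1 if fd is not a usable terminal. */
static int configure_tty(int fd)
{
    struct termios tty;

    if (tcgetattr(fd, &tty) != 0)
        return -1;

    cfsetospeed(&tty, B9600);
    cfsetispeed(&tty, B9600);

    tty.c_iflag &= ~(IXON | IXOFF | IXANY); /* no XON/XOFF */
    tty.c_oflag = 0;                        /* no output post-processing */

    tty.c_cflag &= ~(PARENB | PARODD | CSIZE | CSTOPB);
#ifdef CRTSCTS
    tty.c_cflag &= ~CRTSCTS;                /* not POSIX, hence the guard */
#endif
    tty.c_cflag |= (CLOCAL | CREAD | CS8);  /* set CS8 after clearing CSIZE */

    /* Assumption: non-canonical, no echo, no signal characters. */
    tty.c_lflag &= ~(ICANON | ECHO | ISIG);
    tty.c_cc[VMIN]  = 1;  /* block until at least one byte arrives */
    tty.c_cc[VTIME] = 10; /* then a 1 s inter-byte timer (tenths of a second) */

    return tcsetattr(fd, TCSANOW, &tty) == 0 ? 0 : -1;
}

/* Open the port read/write without making it the controlling terminal. */
static int open_serial(const char *path)
{
    int fd = open(path, O_RDWR | O_NOCTTY);

    if (fd < 0)
        return -1;
    if (configure_tty(fd) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Something like open_serial("/dev/ttyUSB0") would then hand back a descriptor ready for the transfer, or -1 if the path isn’t a terminal.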

With an open TTY connection and the device issuing ‘C’ (i.e. you’ve sent “load 0x18000000”), you then need to start issuing packets in the format:

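struct xmodem_crc_packet {
    uint8_t soh;          /* SOH, 0x01 */
    uint8_t sectno;
    uint8_t sectno_inv;
    uint8_t sector[128];
    uint16_t checksum;    /* CRC-16, sent high byte first on the wire */
} __attribute__((packed)); /* GCC/Clang extension; without it the struct is typically padded to 134 bytes */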

This data structure must be tightly packed as a 133-byte packet (i.e. must be in order and no padding). For all packets, this.soh = SOH character.
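
If you’d rather not depend on compiler-specific packing, one alternative is to declare the packet with byte-sized members only. This is my own variant, not the layout from the tool; the struct name xmodem_crc_frame and the crc_hi/crc_lo split are hypothetical. The static assertions just confirm the 133-byte size at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XMODEM_SOH 0x01u /* ASCII start-of-header */

/* Every member is a single byte, so no alignment padding is needed and the
 * on-wire order of the two CRC bytes is spelled out by the field names. */
struct xmodem_crc_frame {
    uint8_t soh;
    uint8_t sectno;
    uint8_t sectno_inv;
    uint8_t sector[128];
    uint8_t crc_hi; /* XMODEM-CRC sends the high byte first */
    uint8_t crc_lo;
};

static_assert(sizeof(struct xmodem_crc_frame) == 133,
              "frame must be exactly 133 bytes");
static_assert(offsetof(struct xmodem_crc_frame, crc_hi) == 131,
              "CRC must directly follow the 128-byte sector");
```

The trade-off is that the CRC has to be split into two bytes by hand, but that also removes any dependence on host endianness.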

Each subsequent packet must have a “sectno” value equal to the last + 1. The count may wrap around to 0 and continue counting up. sectno_inv must be the one’s complement of sectno (so for every packet, this.sectno_inv = ~this.sectno;).
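
A small sketch of the numbering rules, with helper names of my own choosing. It assumes the usual XMODEM convention that the first packet is number 1, and it treats the second field as the bitwise NOT of the first, which is what ~ computes.

```c
#include <stdint.h>

/* Next sector number; uint8_t arithmetic wraps 255 -> 0. */
static uint8_t next_sectno(uint8_t sectno)
{
    return (uint8_t)(sectno + 1u);
}

/* Bitwise NOT of the sector number, equivalently 255 - sectno. */
static uint8_t sectno_inverse(uint8_t sectno)
{
    return (uint8_t)~sectno;
}

/* Receiver-style sanity check: a valid pair always sums to 0xFF. */
static int sectno_pair_ok(uint8_t sectno, uint8_t sectno_inv)
{
    return (uint8_t)(sectno + sectno_inv) == 0xFF;
}
```

For sector 1 the second byte is 0xFE. Negating instead (0xFF for sector 1) would fail the pair check.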

The sector field holds the next 128 bytes of the binary data, padded out with 0xff on the last packet. Finally, the checksum is a 16-bit CRC. We know this is what is needed because the system issues the character ‘C’ as the poll for the XMODEM connection; different characters are sent for different XMODEM variations. A NAK poll, for instance, would instead mean conventional XMODEM, which uses an 8-bit checksum.
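
For reference, here is a sketch of the two pieces described above: the CRC and the padding. The CRC is the standard CRC-16/XMODEM (polynomial 0x1021, initial value 0), computed over the 128-byte sector after padding. The function names and the XMODEM_SECTOR_SIZE constant are mine, and I haven’t confirmed that the boot ROM doesn’t deviate from the standard somewhere.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XMODEM_SECTOR_SIZE 128

/* CRC-16/XMODEM: polynomial 0x1021, initial value 0, MSB first, no final XOR. */
static uint16_t crc16_xmodem(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Copy up to 128 bytes of data into sector and pad the rest with 0xff.
 * Returns the number of payload bytes consumed. */
static size_t fill_sector(uint8_t sector[XMODEM_SECTOR_SIZE],
                          const uint8_t *data, size_t remaining)
{
    size_t n = remaining < XMODEM_SECTOR_SIZE ? remaining : XMODEM_SECTOR_SIZE;

    memset(sector, 0xff, XMODEM_SECTOR_SIZE);
    memcpy(sector, data, n);
    return n;
}
```

A quick way to check a CRC implementation is the standard test string “123456789”, which should give 0x31C3 for this variant.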

So for each packet, you write the packet on the serial line, then should receive either ACK or NAK, ACK meaning the packet was processed and NAK meaning some sort of issue. After each successful ACK, the next packet can be transmitted. Finally, you close with an EOT transmission.
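
Put together, the send loop might look something like the sketch below. It assumes the device has already polled with ‘C’, resends a packet on anything other than ACK up to a retry limit I picked arbitrarily, and ignores CAN handling. All names here (xmodem_send, send_until_acked, the XM_ constants) are hypothetical. It takes separate input and output descriptors, so for a serial port you would pass the same fd twice.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

enum { XM_SOH = 0x01, XM_EOT = 0x04, XM_ACK = 0x06, XM_MAX_RETRIES = 10 };

/* CRC-16/XMODEM: polynomial 0x1021, initial value 0. */
static uint16_t crc16_xmodem(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)buf[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Write the whole buffer, coping with short writes. */
static int write_all(int fd, const uint8_t *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send buf, then wait for a one-byte reply; resend on anything but ACK. */
static int send_until_acked(int in_fd, int out_fd, const uint8_t *buf, size_t len)
{
    for (int attempt = 0; attempt < XM_MAX_RETRIES; attempt++) {
        uint8_t reply;

        if (write_all(out_fd, buf, len) != 0)
            return -1;
        if (read(in_fd, &reply, 1) == 1 && reply == XM_ACK)
            return 0;
    }
    return -1;
}

/* Returns 0 when every packet and the final EOT were ACKed, -1 otherwise. */
static int xmodem_send(int in_fd, int out_fd, const uint8_t *data, size_t len)
{
    const uint8_t eot = XM_EOT;
    uint8_t pkt[133];
    uint8_t sectno = 1; /* assumption: numbering starts at 1 */

    while (len > 0) {
        size_t n = len < 128 ? len : 128;
        uint16_t crc;

        pkt[0] = XM_SOH;
        pkt[1] = sectno;
        pkt[2] = (uint8_t)~sectno;
        memset(&pkt[3], 0xff, 128);      /* pad a short final sector */
        memcpy(&pkt[3], data, n);
        crc = crc16_xmodem(&pkt[3], 128);
        pkt[131] = (uint8_t)(crc >> 8);  /* high byte first */
        pkt[132] = (uint8_t)(crc & 0xff);

        if (send_until_acked(in_fd, out_fd, pkt, sizeof pkt) != 0)
            return -1;
        data += n;
        len -= n;
        sectno++;                        /* wraps 255 -> 0 */
    }
    return send_until_acked(in_fd, out_fd, &eot, 1);
}
```

Building the packet as a plain byte array sidesteps both struct padding and host endianness, at the cost of a few magic offsets.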

At this point, the XMODEM-CRC transfer should be complete and you can issue a “do 0x18000000” to jump into the binary payload.

Granted, this is just my digestion of the code; I haven’t exercised this with absolute success yet. I have managed to get the jh7100-recover utility to somewhat work in the past, in that I can get it (with some slight modifications) to send over the recovery bin, which then does successfully load and goes on to load the other two files… but then when I cycle the device and connect to TTL on the GPIO header instead, I just get the ddrinit message but none of the rest of the boot process, and I just haven’t had the time or motivation to figure out what is going on there. My current plan is to just build a bare-metal RISC-V executable with some hard-coded UART stuff to verify arbitrary code is making it over XMODEM-CRC and staging effectively, and then I might start on some troubleshooting as to why I can’t get this recovery process to fully work myself.

Anywho, hope that info is helpful. I’ve been experimenting with the debug serial interface sporadically with hopes of eventually recovering my VisionFive and getting Linux running on it again, but I’ve been having the worst luck at actually getting the existing recovery binaries to work correctly or at least give feedback as to why they don’t.

As stated in my previous messages, I managed to recover the bootloader and ddr init via the debug port by following the official instructions (though using Debian) and buying another USB-TTL adapter.

After that, booting gave just a few lines of text and the counter, because U-Boot was still corrupted. I had difficulties entering the upgrade menu, but I succeeded by using Screen instead of minicom or cu.

With Screen it’s possible to send files by XMODEM using sx this way:
press Ctrl-A, then :
then type the following command:
exec !! sx <file path/name>

I’ve unfortunately had little luck with either the cu & sx route or using jh7100-recovery. The former doesn’t seem to be able to navigate whatever the XMODEM bug is. The latter does install the components if I use the xypron fork (+ redefine c as int so it gets EOF). However, upon rebooting I don’t receive the ddr init memory messages, just the line:

bootloader version:211102-0b86f96

When I find time I intend to build a riscv toolchain that can build secondBoot and ddr init so I can try and figure out where it’s getting stuck if nothing else. This process does appear to flash something, as I do get that message, but that’s it. If I discover anything before someone else does I’ll post here.