Given my general lack of success in getting ddrinit to restore and recover correctly, I’ve turned to some bare-metal programming, sending binaries over the debug UART since I can’t really seem to get anywhere otherwise. I’m taking it as a challenge rather than a defeat. I spent yesterday evening preparing an environment, then developing and running a simple exercise in text transmission. While typing up some notes, I realized the details and experience would probably be valuable to anyone trying to learn similar things.
First, my environment. My primary workstation these days is a Raspberry Pi 400 running a stripped-down version of the stock RPi Linux distro (I run TWM for instance…don’t ask…). I’ve installed the distro-standard riscv64-linux-gnu binutils and GCC, although my experiments haven’t featured C yet, except for prototyping and using -S to learn my RISC-V better. I’m connecting to the device over a trivial USB<->TTY adapter. I’m currently using the jh7100-recover tool and just sending my own stuff instead of the recovery loader. Seems to work just fine. I’m working on an sx-ish tool at present, something that can be called from a TTY dialer but also mitigates whatever XMODEM-CRC bug is present in the SiFive recovery shell. With those powers and some source code combined, anything soon becomes possible. I want to write a more traditional tool to integrate this process with utilities such as cu and screen, which as I recall did work with ddrinit, etc. over UART0. Details, details. On to the hardware.
Tools assembled, I set out to find the references needed to write text out on the TX line of the debug header (UART3), as this is the only consistent entry point into the device I have right now. I first took a look at how the bootloader recovery tool does it, and it all boils down to a series of character writes to a transmission register on the UART. Additionally, there is a status register with a bit to check, just to verify we can actually write to the UART at the moment. While I haven’t worked with a UART before, this is much like the data transmission register on the TI TMS9918 and its descendants, better known as the VDPs of early Sega hardware. You just write a word, wait for clear (or hope you’re slow enough), and write the next. The device scoops it up and sends it along, hopefully faster than you can supply another; if not, it usually gives you a way to know when to wait, provided you’re careful enough to check. Granted, that IC also had DMA. Can you even DMA a UART…
The similarity is likely no coincidence: this UART on the JH7100 is a TI PC16550D. There is a reference somewhere in the bootloader recovery tool to the manual, SNLS378C. The manual explains all of the necessary programming interfaces of the UART, although all we’re really interested in are the transmission and line status registers. Page 17 indicates that the transmitter holding register, THR, is at index 0, and the line status register, LSR, is at index 5.
Unfortunately, knowing what registers on the device to use, even with the manual, only shows an IC in isolation. The next step was to find the UART’s location in the JH7100 memory map. This information is in the JH7100 datasheet: STARFIVE-KH-SOC-PD-JH7100-2-01-V01.01.04-EN. Page 30 shows the base address for UART3 as 0x12440000. This, coupled with the specifics of the UART above, provides a transmission register at 0x12440000 and line status register at 0x12440005.
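To make the address arithmetic above concrete, here is a minimal C sketch of it. The helper name uart_reg_addr and its shift parameter are hypothetical, not from the datasheet or the recovery tool. A shift of 0 reproduces the byte-packed layout assumed above. A shift of 2 shows where the registers would land if they were spaced 4 bytes apart, which is how the Linux device tree for this SoC describes its UARTs (reg-shift = <2>), so it is worth keeping in mind if the byte-packed addresses misbehave.

```c
#include <stdint.h>

/* Values taken from the post: UART3 base from the JH7100 datasheet,
 * register indices from the PC16550D manual. */
#define UART3_BASE      0x12440000UL
#define UART_THR_INDEX  0u   /* transmitter holding register */
#define UART_LSR_INDEX  5u   /* line status register */

/* Hypothetical helper: address of register `index` when consecutive
 * registers sit (1 << shift) bytes apart. shift = 0 is the byte-packed
 * layout assumed in the post; shift = 2 would be 4-byte spacing. */
static inline uintptr_t uart_reg_addr(uintptr_t base, unsigned index, unsigned shift)
{
    return base + ((uintptr_t)index << shift);
}
```

With shift 0 this gives 0x12440000 for THR and 0x12440005 for LSR, matching the figures above; with shift 2, LSR would be at 0x12440014.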
Next came the job of writing code to do this in an efficient but cautious way. I opted for assembly to understand the primitives involved, although I imagine the process described here could be done purely in C. The code is simple. It only emits one character, but it does so in a predictable manner. That’s all one needs to build string functions, which in turn open up feedback mechanisms.
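As a rough illustration of that last point, here is a C sketch of a string function layered on a single-character primitive. Every name in it (putc_fn, put_string, capture_putc) is hypothetical. The capture buffer is only a host-side stand-in for the UART, so the function can be tried on a workstation before going anywhere near the board.

```c
#include <stddef.h>

/* Hypothetical signature for the single-character primitive. */
typedef void (*putc_fn)(char c);

/* Send a NUL-terminated string one character at a time and return
 * the number of characters handed to `emit`. */
static size_t put_string(putc_fn emit, const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        emit(s[n]);
        n++;
    }
    return n;
}

/* Host-side stand-in for the UART: collects characters in a buffer
 * and silently drops anything past the end. */
static char capture[64];
static size_t capture_len;

static void capture_putc(char c)
{
    if (capture_len < sizeof capture) {
        capture[capture_len++] = c;
    }
}
```

On the target, the callback would be whatever routine writes one byte to the UART; passing it in as a function pointer keeps the string code independent of the hardware.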
First, simply, we have an entry point. The name doesn’t truly matter to the hardware, but GNU ld looks for _start by default and will warn if it can’t find it. The SiFive loader neither sees nor cares what the name is.
.text
.globl _start
_start:
    addi sp, sp, -8 /* grab a stack slot */
    sd ra, 0(sp) /* stash the return address */
Gotta save the return address; RISC-V overwrites ra on every function call. Convention is to grab a stack slot and stash it there. A micro-optimization might be to drop this in leaf functions, which make no calls of their own. Does your favorite compiler do this? Who knows…
Next, descent into the actual UART write of a character:
    li a0, 'A'
    call _putchar
That does the real magic, explained in a moment.
After that’s done, that return address comes in handy:
    ld ra, 0(sp)
    addi sp, sp, 8
    ret
Easy stuff, just take back the return address and give back the stack. This little application should result in the emission of one single ‘A’ on the UART TX line then a return to the SiFive loader. Now for the write itself:
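.equ UART3_THR, 0x12440000 /* transmitter holding register */
.equ UART3_LSR, 0x12440005 /* line status register */
.equ PC16550D_LSR_THRE, 0b00100000 /* THR empty bit in LSR */
.text
.globl _putchar
_putchar:
    addi sp, sp, -8
    sd ra, 0(sp)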
You do the hokey pokey, you put your return address in and shake it all about.
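    li t0, UART3_LSR /* 0x12440005 */
1:
    lbu t1, (t0)
    andi t1, t1, PC16550D_LSR_THRE /* 0b00100000, set when THR is empty */
    beq t1, zero, 1b /* not empty yet, poll again */
This loads up the address of the line status register, then enters a tight loop where I grab the latest status, check the THRE (transmitter holding register empty) bit, and repeat until it is set. Per the PC16550D manual, THRE reads 1 once the UART is ready to accept another byte, so the loop has to spin while the bit is 0. This is “optional” if you want to assume the UART TX is always open. Perhaps it recovers gracefully from an attempt to write when it isn’t available, perhaps it doesn’t. When in doubt, most details are in a register somewhere. One caveat: the loop is only as good as the address it polls. The Linux device tree for the JH7100 declares these UARTs with reg-shift = <2>, meaning registers spaced 4 bytes apart, which would put LSR at 0x12440014 rather than 0x12440005. If the loop never exits, that is the first thing to check.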
Now the moment we’ve all been waiting for! Let’s write that byte:
    li t0, UART3_THR /* 0x12440000 */
    sb a0, (t0)
That’s it, all that work for two lines of code. I could probably put this right after _start along with loading an immediate character to a0 and get something on the UART, but a little structure helps. Finally, this was predictable, but I didn’t actually have to do any of this here:
    ld ra, 0(sp)
    addi sp, sp, 8
    ret
I never altered that return address, but if I ever want to adjust how putchar works, having the boilerplate already in place makes that much easier.
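For comparison, here is roughly what the same routine might look like in C. This is a sketch under assumptions rather than something I have run on the board: uart_putc is a hypothetical name, the registers are assumed to be byte-packed as in the assembly, and "ready" is taken to mean THRE set, which is how the 16550 manual defines that bit. The register block is passed in as a pointer so the function can be exercised on a workstation against a plain array.

```c
#include <stdint.h>

#define UART_THR       0u      /* transmitter holding register index */
#define UART_LSR       5u      /* line status register index */
#define UART_LSR_THRE  0x20u   /* LSR bit 5: holding register empty */

/* Hypothetical C counterpart of _putchar. `regs` points at the first
 * UART register; on the board that would be (volatile uint8_t *)0x12440000,
 * on a host it can be an ordinary array standing in for the hardware. */
static void uart_putc(volatile uint8_t *regs, char c)
{
    /* Spin until the UART reports the holding register empty. */
    while ((regs[UART_LSR] & UART_LSR_THRE) == 0) {
        /* busy-wait */
    }
    regs[UART_THR] = (uint8_t)c;
}
```

The volatile qualifier matters here: without it the compiler is free to read the status register once and spin on the cached value forever.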
So there you have it! That’s the code to print an arbitrary character over the debug UART on the JH7100 and return to the loader. However, assembly source is not a running binary image on the system. There’s still the matter of building the binary to send with jh7100-recover. This entails a makefile and a linker script. First, the linker script. The only purpose of this script is to place the code at the entry address, but access to the various mapping features of the linker may prove valuable when using complex memory layouts.
MEMORY {
    intRAM0 (rx) : org = 0x18000000, len = 128k
}
A very simple memory layout. I’m only running PIC code and have no data, so one executable segment suffices. I haven’t experimented with where all the SiFive loader will load from, but 0x18000000 (the base of intRAM0 per page 28 of the JH7100 datasheet) is where the recovery loader is sent to and run from, so why not.
SECTIONS {
    .text : {
        *(.text)
    } > intRAM0
    .data : {
        *(.data)
    } > intRAM0
    .bss : {
        *(.bss)
    } > intRAM0
}
About as generic as it gets. It just puts the sections sequentially in intRAM0, starting at the address we’re going to enter. One unfortunate quirk of the RISC-V implementation of GNU ld is that it inexplicably doesn’t support “OUTPUT_FORMAT(binary)” or the equivalent command-line option, complaining that it can’t be done. As will be shown in the makefile, another binutils tool does this easily. Is this the UNIX philosophy at work? Different tools for different jobs, but the authors themselves can’t even make them talk to each other.
On to the makefile. This makefile is pretty generic, like the above linker script. It’ll be brief:
main.bin: main.elf
	objcopy -O binary main.elf main.bin
main.elf: src/main.o src/putchar.o
	ld -T link.ld -o main.elf src/main.o src/putchar.o
This, of course, would only work as written if you’re running on a riscv64 machine. Otherwise, prefixes will be necessary on the utility names (riscv64-linux-gnu-ld and riscv64-linux-gnu-objcopy with the toolchain described above). In any case, an effective makefile uses variables to represent most repeated text, but this is just to demonstrate the bare essentials. The assumption here is that the two functions described above are in their own assembly files, src/main.s and src/putchar.s. I’m using GNU make, so mileage may vary when relying on built-in rules elsewhere; POSIX doesn’t require .s.o, so I’m cheating a little bit.
As mentioned above, I handle the lack of OUTPUT_FORMAT(binary) support by using objcopy to map the typical ELF generated by ld into a binary address space and flatten it out. It’s a shame this is necessary; the same is required when producing flat aarch64 binaries for similar purposes. This directive has worked just fine for m68k and sh targets in the past.
So with all of that, I get main.bin, which I then send via
jh7100-recover -D /dev/ttyUSB0 -r main.bin
and I get:
A
Don’t take my word for it. Recreate this and try for yourself.
I’m currently working up a little text library for this programming scenario as well as a better XMODEM-CRC tool to allow better interactive use of the debug port via cu. Whenever there’s meaningful progress on either I’ll be sure to share those either in this thread or a new one depending on the passage of time.