Is the source code for the boot rom for the VisionFive 2 available?

The valid SHA256 hashes for the bootrom are:
cbd05621d738d5a913512b63846a9d45b408ab88403306e1740de05f0304ed63 for the full 32768-byte dump
and 237a3303ac6c70c084a8c51aa58b89becc3f951970a02fd92d15b8f15f2ab4cf for the first 28624 bytes (truncated at the last meaningful byte).

The easiest way to dump it is to install the latest U-Boot, insert an SD card whose contents you can afford to lose, and type
mmc write 2a000000 0 40 (NOTE: 40, not 8000. The mmc command takes a hex count of 512-byte blocks, not bytes!) to write the entire bootrom to the SD card. Then eject the card, insert it into any Linux box, and read the first 32 KiB (64 blocks of 512 bytes) from its corresponding block device.
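
The block arithmetic and the hash check can be sketched in Python. This is a minimal sketch, assuming the card shows up as a readable block device or image file on the Linux box; the function names are made up for illustration, and the expected hash is the 32768-byte one quoted above.

```python
import hashlib

BLOCK_SIZE = 512                      # mmc commands count 512-byte blocks
BLOCK_COUNT = 0x40                    # the "40" in `mmc write 2a000000 0 40` is hexadecimal
ROM_SIZE = BLOCK_SIZE * BLOCK_COUNT   # 64 * 512 = 32768 bytes

# Hash of the full 32768-byte dump, as quoted in this thread.
EXPECTED_SHA256 = "cbd05621d738d5a913512b63846a9d45b408ab88403306e1740de05f0304ed63"


def read_rom(path, size=ROM_SIZE):
    """Read the first `size` bytes from a block device or image file."""
    with open(path, "rb") as f:
        data = f.read(size)
    if len(data) != size:
        raise ValueError(f"short read: got {len(data)} bytes, wanted {size}")
    return data


def sha256_hex(data):
    """Hex SHA256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def matches_expected(path):
    """True if the first 32 KiB at `path` hash to the value quoted in the thread."""
    return sha256_hex(read_rom(path)) == EXPECTED_SHA256
```

Calling `matches_expected()` on the card's block device (which usually needs root) should return True for a good dump; which device node the card gets depends on the machine, so check it before reading.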

I have a full decompilation of it in progress as part of my RISC-V learning process (8147 insns, 148+ observable functions), but it is currently stalled due to IRL problems.

@Michael.Zhu is there any inside info on whether the bootrom source code will ever be released? For example, the Milk-V project and the SG2042 manufacturer have already released their ZSBL source.

EDIT: corrected the last observable meaningful byte location to 0x6fd0 (28624), thanks to @mzs


Your SHA256 hash perfectly matches mine (see below). It is very good to have confirmation, thank you.

I ended up running “md.l 2A000000 1000”

Then I copied the output from 2A000000 to 2a007ff0 into a text file:

$ cat > rom_dump_hex.l.txt
(pasted content from scrollback)
^D
$  wc -l rom_dump_hex.l.txt
2048 rom_dump_hex.l.txt
$ cat rom_dump_hex.l.txt | head -2
2a000000: 00000297 12628293 30529073 30005073  ......b.s.R0sP.0
2a000010: 30405073 41014081 42014181 43014281  sP@0.@.A.A.B.B.C
$ sudo apt install xxd -y
$ cat rom_dump_hex.l.txt | awk -F' ' '{print $2,$3,$4,$5}' | xxd -r -p > rom_bad_endian.bin
$ /usr/bin/riscv64-linux-gnu-objcopy -I binary -O binary --reverse-bytes=4 rom_bad_endian.bin rom_good_endian.bin
$ xxd rom_good_endian.bin | head -2
00000000: 9702 0000 9382 6212 7390 5230 7350 0030  ......b.s.R0sP.0
00000010: 7350 4030 8140 0141 8141 0142 8142 0143  sP@0.@.A.A.B.B.C
$ ls -l rom_good_endian.bin rom_dump_hex.l.txt
-rw-r--r-- 1 user user 131072 Jun 23 23:27 rom_dump_hex.l.txt
-rw-r--r-- 1 user user  32768 Jun 23 23:52 rom_good_endian.bin
$ strings -n 4 rom_good_endian.bin
R0sP
0sP@0
Es%@
s&@4!
sP@4
"#8!##41##0A##<Q!
@ ..
%P#$
E3#$
0D}U
P0<*
P M*
'}Wc
dBiEa
kBl%a
dBiEa
G     G4
\AL
`>}Y
p/}4
t8    J
jBkaa
`>}\
$3tT
$3tT
`O}=
lBmea
Gc]w
yAc[0
W&@:
`%}4m
!}4u
\alU>
}9cK
&H#&
X}Wc
e7'
'}7}
e7'
'}7}
iLQ"
)3wy
)AtX>
eO,Y
F#8!G#41G#0AG#<QE#8aE#4qE#0
'}8c
F}X1
GqS)
;AcI
bv;T
By7:
A"iJ
;     +A
;     +A
K     bw#
W #"
EMMC
SDIO
Main section boot fail,use backup section
All section boot fail,please check your Image
CRC Error, Try again
 Error
(C)StarFive
error 0x%x, try again
xmodem erro, try again
Main section boot fail,use backup section,error 0x%x
BOOT fail,Error is 0x%x
common_flash
mmc_bread
MMC: block number 0x,  exceeds max(0x  )
%s: Failed to set blocklen
mmc fail to send stop cmd
%s: Failed to read blocks
MMC: SET_DSR failed
Status Error: 0x%08X
Timeout waiting card ready
dwmci_data_transfer
dwmci_send_cmd
dwmci_setup_bus
dwmci_init
%s[%d] Fail-reset!!
%s: Timeout on data busy
%s: Timeout.
%s: Response Timeout.
%s: Response Error.mask:0x%x
%s: stat:0x%x
%s: DATA ERROR!
%s: Timeout waiting for data!
%s: Didn't get source clock value.
%s: Timeout!
%s: Timeout!
Caution! Your devices Erase group  ,
mmc erase failed
MMC: block number 0x%d,exceeds max(0x%d)
write blkcnt = %d
data.blocksize = %d
mmc write failed
mmc fail to send stop cmd
B c0
s2R"
b$C4
S6r&
>2.Q
P%@Fpg`
RwbVr
ftGd$T
Wfvv
F4VL
DXeH
uJTZ7j
ld\EL
n6~UNt^
invalid address 0x%x (need < 0x%x && align to 4)
data: null pointer
exceed otp size

$ ls -l rom_good_endian.bin rom_dump_hex.txt

-rw-r--r-- 1 user user  32768 Jun 23 23:08 rom_good_endian.bin
-rw-r--r-- 1 user user 126976 Jun 23 22:54 rom_dump_hex.txt

$ md5sum rom_good_endian.bin
4968f93d03a463b0d3bdfd5b7bb6f5fa rom_good_endian.bin
$ sha256sum rom_good_endian.bin
cbd05621d738d5a913512b63846a9d45b408ab88403306e1740de05f0304ed63 rom_good_endian.bin
$ b2sum rom_good_endian.bin
029f8d5673de52027317a45cba02dccd9a25e488f1232cacd0b133cfdf9b3accd545bf707601b80f257f368cacab94d11097ec49d7209a0f672d71622054977f rom_good_endian.bin
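
The awk, xxd and objcopy steps above can also be done in one pass. The following is a rough Python sketch, assuming the capture contains standard `md.l` lines (address, colon, four 32-bit hex words, ASCII column); the function name `md_l_to_bytes` is made up for illustration.

```python
import re
import struct

# One md.l line: 8-digit hex address, a colon, then up to four 8-digit hex words.
LINE_RE = re.compile(r"^\s*([0-9a-fA-F]{8}):((?:\s[0-9a-fA-F]{8}){1,4})")


def md_l_to_bytes(text):
    """Convert captured U-Boot `md.l` output into raw little-endian bytes.

    The ASCII column and any lines that do not look like dump lines
    (prompts, blank lines) are ignored.
    """
    out = bytearray()
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        for word in m.group(2).split():
            # md.l prints each 32-bit word as a number; the ROM stores it
            # least significant byte first.
            out += struct.pack("<I", int(word, 16))
    return bytes(out)
```

Fed the two sample lines shown earlier, this produces the same leading bytes as the `xxd rom_good_endian.bin` output (97 02 00 00 93 82 62 12 ...).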


It’s quite a painful method, dumping binaries as ASCII over serial. I have used it in some situations, mainly because the host U-Boot lacked the commands needed to save to more convenient media. But I highly recommend using proper binary transfer commands that guarantee bit-for-bit safety.


Your way is far simpler, with far fewer steps.

My next task is to generate some assembly:

$ /usr/bin/riscv64-unknown-elf-objdump -D -b binary -m riscv --adjust-vma=0x2a000000  --disassembler-options="numeric,no-aliases" rom_good_endian.bin | head -105

rom_good_endian.bin:     file format binary


Disassembly of section .data:

000000002a000000 <.data>:
    2a000000:   00000297                auipc   x5,0x0
    2a000004:   12628293                addi    x5,x5,294 # 0x2a000126
    2a000008:   30529073                csrrw   x0,mtvec,x5
    2a00000c:   30005073                csrrwi  x0,mstatus,0
    2a000010:   30405073                csrrwi  x0,mie,0
    2a000014:   4081                    c.li    x1,0
    2a000016:   4101                    c.li    x2,0
    2a000018:   4181                    c.li    x3,0
    2a00001a:   4201                    c.li    x4,0
    2a00001c:   4281                    c.li    x5,0
    2a00001e:   4301                    c.li    x6,0
    2a000020:   4381                    c.li    x7,0
    2a000022:   4401                    c.li    x8,0
    2a000024:   4481                    c.li    x9,0
    2a000026:   4501                    c.li    x10,0
    2a000028:   4581                    c.li    x11,0
    2a00002a:   4601                    c.li    x12,0
    2a00002c:   4681                    c.li    x13,0
    2a00002e:   4701                    c.li    x14,0
    2a000030:   4781                    c.li    x15,0
    2a000032:   4801                    c.li    x16,0
    2a000034:   4881                    c.li    x17,0
    2a000036:   4901                    c.li    x18,0
    2a000038:   4981                    c.li    x19,0
    2a00003a:   4a01                    c.li    x20,0
    2a00003c:   4a81                    c.li    x21,0
    2a00003e:   4b01                    c.li    x22,0
    2a000040:   4b81                    c.li    x23,0
    2a000042:   4c01                    c.li    x24,0
    2a000044:   4c81                    c.li    x25,0
    2a000046:   4d01                    c.li    x26,0
    2a000048:   4d81                    c.li    x27,0
    2a00004a:   4e01                    c.li    x28,0
    2a00004c:   4e81                    c.li    x29,0
    2a00004e:   4f01                    c.li    x30,0
    2a000050:   4f81                    c.li    x31,0
    2a000052:   d7100197                auipc   x3,0xd7100
    2a000056:   0ae18193                addi    x3,x3,174 # 0x1100100
    2a00005a:   00000297                auipc   x5,0x0
    2a00005e:   05a28293                addi    x5,x5,90 # 0x2a0000b4
    2a000062:   30529073                csrrw   x0,mtvec,x5
    2a000066:   00100293                addi    x5,x0,1
    2a00006a:   04028563                beq     x5,x0,0x2a0000b4
    2a00006e:   7c145073                csrrwi  x0,0x7c1,8
    2a000072:   f14022f3                csrrs   x5,mhartid,x0
    2a000076:   02b2                    c.slli  x5,0xc
    2a000078:   d7102117                auipc   x2,0xd7102
    2a00007c:   b8810113                addi    x2,x2,-1144 # 0x1101c00
    2a000080:   40510133                sub     x2,x2,x5
    2a000084:   4581                    c.li    x11,0
    2a000086:   f1402573                csrrs   x10,mhartid,x0
    2a00008a:   04b51d63                bne     x10,x11,0x2a0000e4
    2a00008e:   01100337                lui     x6,0x1100
    2a000092:   011023b7                lui     x7,0x1102
    2a000096:   00033023                .4byte  0x33023
    2a00009a:   0321                    c.addi  x6,8 # 0x1100008
    2a00009c:   fe734de3                blt     x6,x7,0x2a000096
    2a0000a0:   f0018293                addi    x5,x3,-256
    2a0000a4:   f0018313                addi    x6,x3,-256
    2a0000a8:   00628e63                beq     x5,x6,0x2a0000c4
    2a0000ac:   f2018393                addi    x7,x3,-224
    2a0000b0:   00737a63                bgeu    x6,x7,0x2a0000c4
    2a0000b4:   0002be03                .4byte  0x2be03
    2a0000b8:   01c33023                .4byte  0x1c33023
    2a0000bc:   02a1                    c.addi  x5,8
    2a0000be:   0321                    c.addi  x6,8
    2a0000c0:   fe736ae3                bltu    x6,x7,0x2a0000b4
    2a0000c4:   f2018313                addi    x6,x3,-224
    2a0000c8:   d7101397                auipc   x7,0xd7101
    2a0000cc:   a9838393                addi    x7,x7,-1384 # 0x1100b60
    2a0000d0:   00737763                bgeu    x6,x7,0x2a0000de
    2a0000d4:   00033023                .4byte  0x33023
    2a0000d8:   0321                    c.addi  x6,8
    2a0000da:   fe734de3                blt     x6,x7,0x2a0000d4
    2a0000de:   61d000ef                jal     x1,0x2a000efa
    2a0000e2:   a81d                    c.j     0x2a000118
    2a0000e4:   4621                    c.li    x12,8
    2a0000e6:   30461073                csrrw   x0,mie,x12
    2a0000ea:   10500073                wfi
    2a0000ee:   34402673                csrrs   x12,mip,x0
    2a0000f2:   8a21                    c.andi  x12,8
    2a0000f4:   da7d                    c.beqz  x12,0x2a0000ea
    2a0000f6:   020004b7                lui     x9,0x2000
    2a0000fa:   f1402573                csrrs   x10,mhartid,x0
    2a0000fe:   00251913                slli    x18,x10,0x2
    2a000102:   9926                    c.add   x18,x9
    2a000104:   00092023                sw      x0,0(x18)
    2a000108:   0ff0000f                fence   iorw,iorw
    2a00010c:   34405073                csrrwi  x0,mip,0
    2a000110:   4615                    c.li    x12,5
    2a000112:   00c56363                bltu    x10,x12,0x2a000118
    2a000116:   bfd1                    c.j     0x2a0000ea
    2a000118:   080002b7                lui     x5,0x8000
    2a00011c:   f1402573                csrrs   x10,mhartid,x0
    2a000120:   0001                    c.addi  x0,0
    2a000122:   4581                    c.li    x11,0
    2a000124:   9282                    c.jalr  x5
    2a000126:   a001                    c.j     0x2a000126
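
The three `.4byte` entries in the listing all have funct3 = 011 on the LOAD/STORE opcodes, which is the RV64 `ld`/`sd` encoding, so it looks as if objdump was decoding in RV32 mode here. Selecting a 64-bit machine (for example `-m riscv:rv64`) might make it decode them, though that is untested in this thread. A small Python sketch of the decoding, with made-up function names:

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    mask = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ mask) - mask


def decode_ld_sd(word):
    """Decode a 32-bit word as RV64 `ld` or `sd`; return None for anything else."""
    opcode = word & 0x7F
    funct3 = (word >> 12) & 0x7
    rs1 = (word >> 15) & 0x1F
    if funct3 != 0b011:               # 011 selects the 64-bit (doubleword) width
        return None
    if opcode == 0x03:                # LOAD, I-type: imm[11:0] in bits 31..20
        rd = (word >> 7) & 0x1F
        imm = sign_extend(word >> 20, 12)
        return f"ld x{rd},{imm}(x{rs1})"
    if opcode == 0x23:                # STORE, S-type: imm split across two fields
        rs2 = (word >> 20) & 0x1F
        imm = sign_extend(((word >> 25) << 5) | ((word >> 7) & 0x1F), 12)
        return f"sd x{rs2},{imm}(x{rs1})"
    return None


for w in (0x00033023, 0x0002BE03, 0x01C33023):
    print(f"{w:08x}: {decode_ld_sd(w)}")
```

Read this way, the loop at 0x2a000096 stores zero doublewords from 0x1100000 up to 0x1102000, and the loop at 0x2a0000b4 copies doublewords from x5 to x6.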

But then I need to translate that, with reference to the hardware memory addresses, into more generic algorithms, which will definitely force me to learn some really low-level RISC-V assembly.
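
As a starting point for that translation, the `auipc`/`addi` pairs in the listing can be resolved to absolute addresses by hand. Here is a minimal Python sketch of the arithmetic, assuming RV64 (the 20-bit immediate is shifted left by 12, sign-extended and added to the pc); the function names are made up, and the stack formula is read off the `c.slli`/`sub` sequence at 0x2a000076.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    mask = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ mask) - mask


def auipc_addi(pc, upper_imm20, addi_imm):
    """Address produced by `auipc rd,upper_imm20` at `pc` followed by `addi rd,rd,addi_imm`."""
    offset = sign_extend((upper_imm20 & 0xFFFFF) << 12, 32)
    return (pc + offset + addi_imm) & MASK64


def boot_stack_pointer(hartid):
    """x2 after `sub x2,x2,x5` in the listing: 4 KiB lower per hart id."""
    top = auipc_addi(0x2A000078, 0xD7102, -1144)   # 0x01101c00
    return (top - (hartid << 12)) & MASK64


print(hex(auipc_addi(0x2A000000, 0x0, 294)))       # first trap vector
print(hex(auipc_addi(0x2A000052, 0xD7100, 174)))   # value loaded into x3
print(hex(boot_stack_pointer(0)))
```

The results agree with the addresses objdump prints in its `#` comments, which is a useful cross-check when the same pattern shows up in code that a decompiler got wrong.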


I did a “rgrep -i” for some of the strings found in the dumped bootrom against the source code for https://github.com/sifive/freedom/tree/master/bootrom and https://github.com/sifive/freedom-u540-c000-bootloader, but I am not finding many of the matches I would expect (nothing for emmc/xmodem/…, though I did find some generic hits like flash/crc). So the code could well be written from scratch and not covered by any open source license. That would make me very hesitant to backward engineer the code, or, if I did choose to do so, to make my backward engineered code publicly accessible, because I would be breaking copyright. And I do not want any legal problems.
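
The same kind of search can be scripted so that every string from the dump is checked against a source tree in one go. This is a rough Python sketch, roughly equivalent to `strings -n 4` followed by a case-insensitive recursive grep; the function names are made up, and the needles are assumed to be plain ASCII.

```python
import os
import re


def extract_strings(data, min_len=4):
    """Return runs of printable ASCII at least `min_len` bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def find_in_tree(needles, root):
    """Map each needle to the files under `root` that contain it (case-insensitive)."""
    hits = {n: [] for n in needles}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    blob = f.read().lower()
            except OSError:
                continue              # unreadable file, skip it
            for n in hits:
                if n.lower().encode("ascii") in blob:
                    hits[n].append(path)
    return hits
```

Running `find_in_tree(extract_strings(rom), "path/to/source")` over a few candidate trees shows at a glance which strings have a match and in which files. Format strings with `\r\n` endings will only match up to the line ending, so shorter needles tend to work better.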

But even if I could not release my backward engineered code, I could audit the code. But it is not likely that a ROM can be patched, without a new spin of silicon.


“So the code could well be written from scratch and not covered by any open source license”

You’re not right: clearly, there are strings indicating that at least drivers/mmc/mmc.c from JH7110_u-boot was used:

MMC: block number 0x,  exceeds max(0x
%s: Failed to set blocklen
%s: Failed to read blocks
MMC: SET_DSR failed
Status Error: 0x%08X
Timeout waiting card ready

, and drivers/mmc/dw_mmc.c:

dwmci_data_transfer
dwmci_send_cmd
dwmci_setup_bus
dwmci_init
%s: Timeout on data busy
%s: Timeout.
%s: Response Timeout.
%s: Response Error.mask:0x%x
%s: stat:0x%x
%s: DATA ERROR!
%s: Timeout waiting for data!
%s: Didn't get source clock value.
%s: Timeout!
%s: Timeout!

Guess what the license for these two files is:

// SPDX-License-Identifier: GPL-2.0+
/*
 * Copyright 2008, Freescale Semiconductor, Inc
 * Copyright 2020 NXP
 * Andy Fleming
 *
 * Based vaguely on the Linux code
 */
// SPDX-License-Identifier: GPL-2.0+
/*
 * (C) Copyright 2012 SAMSUNG Electronics
 * Jaehoon Chung <jh80.chung@samsung.com>
 * Rajeshawari Shinde <rajeshwari.s@samsung.com>
 */

Given the “viral” nature of GPL-2.0+, I now wonder whether it is legal for SF itself to ask you never to tap into this code.


You might try using radare2 with the r2dec plugin, at least to partition this binary into functions and data sections. r2dec is actually a decompiler that generates C-like listings. It’s a nice thing to get started with RE.

I have plans to heavily modify r2dec for RV, because the code it generates now is buggy, uncompilable trash from the very beginning, and I have developed a C header system around it to actually make these listings compilable. But its inner JavaScript core is the only thing stopping me from modding it lol. That, and a lot of missing functionality, like proper guessing of the number of arguments passed to a function.


I was thinking that https://github.com/NationalSecurityAgency/ghidra might be another option, since they have added support for RISC-V!


Interesting. I recall it generated trashy code even for x86, but I also used it as a radare2 plugin, so maybe that’s why. But my current work is heavily bound to r2dec anyway, so I’ll check it out for the future projects I plan.


28608 bytes=0x6FC0

2A006FA0: 0700 0000 0800 0000 0804 0804 0100 F8BF  ................
2A006FB0: 0002 0000 0000 0040 0002 0000 0000 FFFF  .......@........
2A006FC0: FFFF 0000 0000 7F7F 7F7F 7F7F 7F7F 0000  ................
2A006FD0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
2A006FE0: 0000 0000 0000 0000 0000 0000 0000 0000  ................

To me it looks like you have truncated some bytes. I probably would have called the end 0x2A006FCF which would be a file size of 28624 bytes.
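
The end offset can be checked mechanically instead of by eye. A minimal Python sketch, assuming the padding after the last meaningful byte is all 0x00 (as in the hex dump above) and rounding up to the 16-byte rows that the dump uses; the function names are made up.

```python
def last_nonzero_offset(data):
    """Offset of the last byte that is not 0x00, or -1 if every byte is zero."""
    for i in range(len(data) - 1, -1, -1):
        if data[i] != 0:
            return i
    return -1


def trimmed_size(data, align=16):
    """Size after dropping trailing zero padding, rounded up to `align` bytes."""
    end = last_nonzero_offset(data) + 1
    return (end + align - 1) // align * align


# Rebuild the tail shown in the hex dump above inside an otherwise empty 32 KiB image.
rom = bytearray(0x8000)
rom[0x6FC0:0x6FD0] = bytes.fromhex("FFFF000000007F7F7F7F7F7F7F7F0000")

print(hex(last_nonzero_offset(rom)))   # last 0x7F byte
print(trimmed_size(rom))               # size of the truncated image
```

With the row at 0x2A006FC0 included, the last non-zero byte sits at offset 0x6FCD, and rounding up to the end of that row gives 0x6FD0 = 28624 bytes.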


Do you notice any strong similarity between the following snippets of code?
From u-boot/drivers/mmc/mmc.c:

if (mmc_read_blocks(mmc, dst, start, cur) != cur) {
  pr_debug("%s: Failed to read blocks\n", __func__);
  return 0;
 }
 blocks_todo -= cur;
 start += cur;
 dst += cur * mmc->read_bl_len;

C code generated by Ghidra from the dumped BOOTROM’s machine code:

 if (uVar7 != uVar1) {
   FUN_2a0012a6((byte *)"%s: Failed to read blocks\r\n",
         (ulong *)"mmc_bread",plVar4,lVar5,param_4,uVar7,param_5,param_6);
   return 0;
  }
  uVar8 -= uVar1;
  param_1 += uVar1;
  param_3 += uVar1 * _DAT_01100904;

They are not exactly the same, but they are extremely similar, more similar than random chance alone would explain. Enough to make me believe that there definitely is unreleased GPL source code used inside the BOOTROM.

I have to say that I’m shocked at the quality of code generated by Ghidra (even though it is not perfect). I achieved in minutes what would have taken me days or weeks if I was using pencil and paper. But the downside of using such a powerful tool is that you actually learn far less low level knowledge (Most of which I have gained thanks to backward engineering 6502 and 8086 machine code many many years ago, in the dark days long before the Internet was accessible by many).


Indeed some interesting finds! I had previously looked into the ROM but hadn’t really bothered looking at the SPI/eMMC/etc. drivers, as I assumed they did what I expected them to.

The part that really interested me, however, was that it seemed to read some of the one-time-programmable bits, which could enable a ‘secure-boot’ mode. The cryptography was implemented with the ‘security engine’, which I identified by comparing the addresses and commands with those found in the StarFive-provided Linux drivers. I was reasonably sure that in secure-boot mode there would be extra data in the various otherwise unused bytes of the SPL header, like a hash and a signature or something, and that it would be decrypted/verified with AES-256-CBC and ECDSA-SHA256.

Given how StarFive seemed to be guarding the secrets of their secure boot mechanism (see: source code, document behavior · Issue #1 · starfive-tech/Tools · GitHub) I think StarFive would be guarding the source code of their ROM as well. Which is unfortunate.

In the non-secure-boot case though, as far as I can tell there really isn’t much going on. It is pretty much just the bare minimum to read from the boot device and jump to what was loaded. And the driver code is probably all somewhere in u-boot. So if you’re not looking into permanently modifying your chip by setting the OTP to secure-boot mode, I don’t think we’re missing much.


I can see one reason why: https://www.starfivetech.com/en/site/new_details/970:

"In terms of the smart gas pipeline system and network security, StarFive and WINICSSEC, based on JH7110, will jointly develop industrial Internet security products with fully independent IPR. At present, this product is in field testing and will be promoted to cover urban gas. The product empowers Towngas Smart Energy to strengthen cyber security protection of the gas network, and consolidate data security for critical facilities.

Dan Xi, VP for Engineering Quality and Construction of Towngas, said that Towngas Smart Energy operates pipeline gas projects in more than 300 cities and towns across Mainland China, with more than 40 million customers. Applying JH7110 to the company’s industrial-grade cyber security system can ensure information security for its critical facility - the operation platform. The application strengthens the security level to realize a safe, reliable, and stable gas supply to ensure the implementation of smart gas projects."

For security systems the algorithms should all be public so that they can be audited (security through obscurity has always failed +++); the only thing that should be protected is the keys.

+++ EDIT: A recent example would be TETRA:BURST. In public, TErrestrial Trunked RAdio has been described as using an 80-bit encryption key length (~1.2 x 10^24 keys) for nearly the past 30 years. In reality, after the secret TEA1 (European commercial use), TEA2 (European public safety organisations: police, emergency services), TEA3 (public safety organisations outside the EU) and TEA4 (commercial organisations outside the EU) encryption algorithms were all finally dumped and backward engineered using zero-day exploits, it was revealed that TEA1 has an effective key length of only 32 bits (4,294,967,296 keys), which can be brute forced in less than a minute, after capturing only four packets, using a laptop with a medium to high end GPU via OpenCL. (Ultimately the secret encryption was deliberately broken by design, to give government agencies a backdoor into “confidential” communications.) :person_facepalming: :woman_facepalming: :man_facepalming:
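
The two keyspace figures quoted above are easy to check. A tiny Python sketch, using nothing beyond the arithmetic:

```python
def keyspace(bits):
    """Number of distinct keys for a key of the given length in bits."""
    return 2 ** bits


advertised = keyspace(80)   # what the 80-bit key length implies
effective = keyspace(32)    # what the reduced key actually offers

print(f"80-bit: {advertised:.3e} keys")
print(f"32-bit: {effective:,} keys")
print(f"reduction factor: 2**{(advertised // effective).bit_length() - 1}")
```

The gap between the two is a factor of 2**48, which is why the reduced key falls to brute force on ordinary hardware.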


Damn, ghidra is so huge, but its results are quite useful. At least with these I’ll pinpoint the “public” functions (borrowed GPL code), because almost all the functions I disassembled are unnamed and not tied to anything.


Running Ghidra, in Window -> Memory Map:
I defined from 0x2a000000 to 0x2a006784 Length 0x6785 as readable and executable (code)

I defined from 0x2a006785 to 0x2a006fd0 Length 0x84c as readable (data). I then manually went through all of the data and defined any obvious strings of text as strings.

I initially defined from 0x2a006fd1 to 0x2a007fff Length 0x102f as not readable, not writable, not executable (null). And then I generated a new firmware with all the null data truncated, because it was only a distraction.
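
The three blocks can be sanity-checked for gaps and overlaps before entering them in Ghidra. A small Python sketch using only the start and end addresses quoted above; the block labels and function names are made up, and the ranges are treated as inclusive.

```python
# (label, first address, last address), both ends inclusive, as listed above.
REGIONS = [
    ("code", 0x2A000000, 0x2A006784),
    ("data", 0x2A006785, 0x2A006FD0),
    ("null", 0x2A006FD1, 0x2A007FFF),
]


def region_length(start, end):
    """Length in bytes of an inclusive address range."""
    return end - start + 1


def is_contiguous(regions):
    """True if each region starts right after the previous one ends."""
    return all(
        nxt[1] == cur[2] + 1 for cur, nxt in zip(regions, regions[1:])
    )


for label, start, end in REGIONS:
    print(f"{label}: {start:#x}-{end:#x} length {region_length(start, end):#x}")

total = sum(region_length(s, e) for _, s, e in REGIONS)
print(f"contiguous: {is_contiguous(REGIONS)}, total {total:#x}")
```

The lengths come out as 0x6785, 0x84c and 0x102f, and together they cover the whole 0x8000-byte ROM window.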


Thanks.
Although I’m amazed by the amount of work put into today’s disassemblers (compared to proprietary tools of the previous era like IDA Pro HexRays back in 2008, when I started getting interested in this topic), judging by ghidra’s output I feel it’s still very far from what I would like it to be.

The same obscure, hard-to-read, teeth-crushing casts-through-casts like *(int *)(long), mixed together with the decompiler’s messy naming of intermediates like uVar1 (why can’t they just adopt some common programming alphabet or even use register names?). Forget about decompiling libgcc/gcc jumptables right, these register calls confuse everyone. Meh, I’ve seen this already with r2ghidra, which appears to be just the decompiler extracted directly from it.

On the bright side, they seem to get function parameters counting done right, so I can polish my c-asm output there. And some strings references. But these I would get done even with my bare hands anyway.

EDIT: I was wrong, and I have to admit that ghidra is quite a valuable tool once you start reconstructing the logic of a function (refactoring all the gotos into readable if-else blocks). Once you provide all the required information, it starts doing some other magic, like dereferencing function pointers and at least trying to get through the jumptab garbage. Often not correct, but something at least.

Also, if you compare pieces of code against reference source you have on hand (like those GPL functions that got stuck into this bootrom), it actually helps to sieve them out so they will not annoy you.

Still, the default naming of variables can make you mad, but it’s to your taste.

Modern code doesn’t even need to be obfuscated much. Want to obfuscate the code and release the dump? Just compile it with insane optimization switches, then pass it through a decompiler and attach some C-asm header. Remember to turn off jumptabs with -fno-jump-tables or it won’t run.


I find myself continuously referring back to the raw assembly output when I see something odd,

e.g. references to memory addresses like 0x1100???, 0x16000???, 0x170????, 0x400?????, 0x800?????, etc.

I suspect that I need to define more blocks of memory to make Ghidra slightly happier. At a guess, most of these become inaccessible after the BOOTROM hands over control to the next boot stage, or at best are remapped to different memory addresses.


The funny thing is that most of these addresses are freely available even after OpenSBI. Even if 0x40000000 is not accessible in normal mode after OpenSBI has explicitly protected it, it is still accessible from M-mode (from U-Boot SPL) once you patch OpenSBI out or write your own micro SPL in place of U-Boot SPL, so I believe every address is freely accessible once the BootROM hands control to SPL. That’s good.

Even more: earlier, when I received my boards in Dec22, I discovered that 0x40000000 is actually simply read-only-aliased to 0x240000000 in S-mode, at least on my two 8GB boards of different revisions, v1.2a and v1.3b, and you can see the OpenSBI code contents at this address! (This address-wrapping phenomenon is actually common on embedded systems.)
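
The alias address is exactly what simple wrap-around would give. A minimal Python sketch of that arithmetic, assuming DRAM is mapped from 0x40000000 and that addresses past the end of an 8 GiB part wrap back to the start of DRAM; this is an interpretation of the observation, not something taken from the TRM, and the names are made up.

```python
DRAM_BASE = 0x40000000        # address quoted in the post above
DRAM_SIZE = 8 * 1024 ** 3     # 8 GiB boards (assumed to be exactly 2**33 bytes)


def canonical_address(addr, base=DRAM_BASE, size=DRAM_SIZE):
    """Fold an address at or above `base` back into the first `size` bytes of DRAM."""
    if addr < base:
        raise ValueError("address is below the DRAM base")
    return base + (addr - base) % size


print(hex(DRAM_BASE + DRAM_SIZE))             # first address past the end of DRAM
print(hex(canonical_address(0x240000000)))    # where that address would wrap to
```

Under these assumptions, 0x240000000 is the first address past the end of DRAM and wraps to 0x40000000, which matches what was seen on the 8GB boards.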

0x01100000 is not listed anywhere in the TRM nor in the U74 reference, though. Strange, because it actually contains some data and physically ends at 0x01102000; after that, a load fault occurs. But the BootROM is riddled with this offset almost everywhere!

0x16000000 and 0x17000000 are normally accessible from S-mode.

0x01700000 is also accessible from S-mode, and 0x80000000 is normally an S-mode boot base, also accessible.

I keep digging into the BootROM, slowly. Just locating strings and standard GPL functions for now, plus µ-printf at 0x2a0012a6. I fixed quite fundamental bugs in r2dec on my side. Hopefully there are no restrictions left, and I’ll finally reconstruct it to plain C code someday.


Interesting thread.

I think it’s quite common for SoCs to not protect anything in non-secure boot mode. I reverse engineered the Allwinner H3 secure boot, and in that chip it was the same.

When the secure mode bit was set in OTP flash, a completely different boot ROM was mapped to the reset vector for chip start-up. It locked down all the sensitive things for later boot stages (including itself).

Luckily there is a bug, and I could dump the secure boot rom.


Yeah, the VF2 and other JH7110-based boards are development boards, so what’s the point of locking it away?

To me all that secure boot fuss is quite useless. Anything man-made can be broken by someone else. There should be an open, community-reviewed mechanism if you want real security, like it’s done with cryptographic ciphers today, period. That said, I’m glad you found the bug.
