Boot with custom kernel and get "Unhandled exception: Illegal instruction"

deependra · September 3, 2023, 1:18pm

I am trying to boot from a simple custom kernel. The kernel consists of a few lines of RISC-V assembly.

I get this error when I try to boot via TFTP:

StarFive # setenv ipaddr 192.168.68.118
StarFive # setenv serverip 192.168.68.108
StarFive # setenv loadaddr 0x80100000
StarFive # tftpboot ${loadaddr} main.elf
Using ethernet@16040000 device
TFTP from server 192.168.68.108; our IP address is 192.168.68.118
Filename 'main.elf'.
Load address: 0x80100000
Loading: #
23.4 KiB/s
done
Bytes transferred = 1168 (490 hex)
StarFive # go 0x80100000

Starting application at 0x80100000 ...

Unhandled exception: Illegal instruction
EPC: 0000000080100000 RA: 00000000fff4795a TVAL: 00000000464c457f
EPC: ffffffffc03bb000 RA: 000000004020295a reloc adjusted

SP: 00000000ff734a60 GP: 00000000ff734e00 TP: 0000000000000001
T0: 00000000ff734b40 T1: 00000000fff46288 T2: ffffffffffffffff
S0: 0000000080100000 S1: 0000000000000002 A0: 0000000000000001
A1: 00000000ff74a368 A2: 00000000ff74a368 A3: fffffffffffffffe
A4: 0000000000000002 A5: 0000000080100000 A6: 000000000000000f
A7: 00000000fffa8308 S2: 00000000ff74a360 S3: 00000000ff74a360
S4: 0000000000000002 S5: 00000000ffff04f4 S6: 0000000000000000
S7: 00000000ff74a3c0 S8: 0000000000000000 S9: 0000000000000000
S10: 0000000000000000 S11: 0000000000000000 T3: 0000000000000010
T4: 0000000000000000 T5: 000000000001869f T6: 00000000ff734b20

Code: 0000 0000 0000 0000 0000 0000 0000 0000 (457f 464c)

rvalles · September 3, 2023, 11:54pm

It’s an ELF. Entry point is not at the start of the file.
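
The trap dump above supports this reading: TVAL holds the instruction word that faulted, and read as little-endian bytes it is the ELF magic number. A minimal Python sketch of that decoding follows; the helper name `tval_to_bytes` is made up for illustration, and the 4-byte width is an assumption based on the value shown in the log.

```python
def tval_to_bytes(tval: int, width: int = 4) -> bytes:
    """Return the bytes as they sit in memory on a little-endian hart."""
    return tval.to_bytes(width, "little")


# TVAL from the first crash log: 00000000464c457f
raw = tval_to_bytes(0x464C457F)
print(raw)  # b'\x7fELF'
```

So the CPU was asked to execute the ELF file header itself, which is not a valid instruction.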

Please review the U-Boot documentation for the proper way to start ELF kernels, and let us know what worked while you’re at it. (I have no idea off the top of my head.)

deependra · September 5, 2023, 6:42pm

I found the entry point using readelf and objdump and tried to load and run the kernel, but I did not succeed.

Error:

StarFive # setenv loadaddr 0x80100000
StarFive # tftpboot ${loadaddr} main.elf
Using ethernet@16040000 device
TFTP from server 192.168.68.108; our IP address is 192.168.68.118
Filename 'main.elf'.
Load address: 0x80100000
Loading: #
11.7 KiB/s
done
Bytes transferred = 1168 (490 hex)
StarFive # go 0x801100e8

Starting application at 0x801100E8 ...

Unhandled exception: Illegal instruction
EPC: 00000000801100e8 RA: 00000000fff4795a TVAL: 0000000000000000
EPC: ffffffffc03cb0e8 RA: 000000004020295a reloc adjusted

SP: 00000000ff734a60 GP: 00000000ff734e00 TP: 0000000000000001
T0: 00000000ff734b40 T1: 00000000fff46288 T2: ffffffffffffffff
S0: 00000000801100e8 S1: 0000000000000002 A0: 0000000000000001
A1: 00000000ff74a368 A2: 00000000ff74a368 A3: fffffffffffffffe
A4: 0000000000000002 A5: 00000000801100e8 A6: 000000000000000f
A7: 00000000fffa8308 S2: 00000000ff74a360 S3: 00000000ff74a360
S4: 0000000000000002 S5: 00000000ffff04f4 S6: 0000000000000000
S7: 00000000ff74a3c0 S8: 0000000000000000 S9: 0000000000000000
S10: 0000000000000000 S11: 0000000000000000 T3: 0000000000000010
T4: 0000000000000000 T5: 000000000001869f T6: 00000000ff734b20

Code: 0000 0000 0000 0000 0000 0000 0000 0000 (0000)

RiscVFan · September 5, 2023, 9:58pm

It’s odd to get an exception on the very first opcode. You should look at the other registers like MCAUSE to see what the exception is. I’d also dump memory from segment start to a few bytes after the opcode and verify that the bytes at address …0e8 match the opcode you THINK is at …0e8.

Also, “custom operating system” has red flags all over it. Maybe the binary isn’t getting generated the way you think it is. Verify this works with some other OS image.

It seems very odd for the loader to know enough about ELF to map the right sections into pages built with the correct mapping and yet not be able to read the 8 bytes of the e_entry field in the ELF header. That’s low-hanging fruit for a program loader.

I also don’t know what Code is, but I know from memory that 45 46 and 4c are ‘fXle’ which is “other endian” for ascii ELF followed by a non-ASCII character and THAT sets off yet more red flags.

Are you SURE that this bootloader is reading ELF?
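
For reference, reading the entry point from an ELF64 header takes only a few lines. The Python sketch below is an illustration rather than anything U-Boot does: the function name `read_elf_entry` and the entry address in the demo header are made up, and only 64-bit ELF is handled. It reads the same value that `readelf -h` reports as the entry point address.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def read_elf_entry(header: bytes) -> int:
    """Return e_entry from the first bytes of an ELF64 file."""
    if header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    if ei_class != 2:  # 2 = ELFCLASS64
        raise ValueError("only ELF64 is handled in this sketch")
    endian = "<" if ei_data == 1 else ">"  # 1 = little-endian data
    # e_ident is 16 bytes, then e_type (2), e_machine (2), e_version (4),
    # so the 8-byte e_entry field starts at offset 24.
    (entry,) = struct.unpack_from(endian + "Q", header, 24)
    return entry


# Synthetic header with a hypothetical entry address, for demonstration only.
demo_header = (
    ELF_MAGIC
    + bytes([2, 1, 1, 0])  # ELF64, little-endian, version 1, OS ABI 0
    + bytes(8)             # padding up to the end of e_ident
    + struct.pack("<HHIQ", 2, 243, 1, 0x40200000)  # ET_EXEC, EM_RISCV
)
print(hex(read_elf_entry(demo_header)))  # 0x40200000
```

On a real file, the first 64 bytes are enough, for example `read_elf_entry(open("main.elf", "rb").read(64))`.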

RiscVFan · September 5, 2023, 10:01pm

Hint: the first sentence from the first hit is "U-Boot uses special format for bootable images. "

You have more homework left to do…

mzs · September 5, 2023, 10:53pm

Maybe I’m reading the information you have provided wrong, but it looks to me like you have uploaded a binary file of size 1168 bytes into memory from 0x80100000 to 0x8010048f and then you are telling the CPU to start executing machine code at 0x801100e8. I could be wrong but it looks like it is beyond where the machine code was loaded into RAM.
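
The arithmetic can be checked quickly. This is a small Python sketch using only the numbers from the logs above (load address 0x80100000, 1168 bytes transferred, jump target 0x801100e8); the helper names are made up for illustration.

```python
def loaded_range(load_addr: int, size: int) -> tuple[int, int]:
    """First and last address covered by a transfer of `size` bytes."""
    return load_addr, load_addr + size - 1


def in_loaded_range(addr: int, load_addr: int, size: int) -> bool:
    """True if `addr` falls inside the bytes that were transferred."""
    return load_addr <= addr < load_addr + size


LOAD_ADDR = 0x80100000
SIZE = 1168          # "Bytes transferred = 1168 (490 hex)"
JUMP = 0x801100E8    # address passed to `go`

start, end = loaded_range(LOAD_ADDR, SIZE)
print(hex(start), hex(end))                      # 0x80100000 0x8010048f
print(in_loaded_range(JUMP, LOAD_ADDR, SIZE))    # False
print(hex(JUMP - LOAD_ADDR))                     # 0x100e8
```

The jump target is 0x100e8 bytes past the load address, while only 0x490 bytes were transferred, which is consistent with the all-zero `Code:` line in the second dump.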

The output from the following three commands might be useful for people trying to help you.

$ file main.elf
$ ls -l main.elf
$ hexdump -C main.elf | head -40

A website that is useful for encoding/decoding one RISC-V instruction at a time is: https://luplab.gitlab.io/rvcodecjs/

RiscVFan · September 6, 2023, 12:59am

@mzs, you’re looking too low. So was I. We both assumed the developer had done their homework.

The loader is looking for a format that isn’t ELF at all. (This makes a bit of sense for cases like netboot where you may want the loaded image to also contain a filesystem…and a module to uncompress it or otherwise have more than just an exe to start.) I linked to relevant doc.

deependra · October 28, 2023, 1:05pm

I managed to boot my VisionFive2 with the help of several blogs.

I have committed my effort here.

Thank you all.

bwooster0 · October 28, 2023, 2:15pm

Your source includes a line with the comment:

make all HARTS except 0 wait

but the actual code makes all HARTS except #1 wait.

deependra · October 28, 2023, 2:52pm

Thank you, I fixed it now.

The 32-bit core 0 could not access memory, so I had to switch to one of the 64-bit cores.

bwooster0 · November 1, 2023, 1:50pm

According to the datasheet here are the specs of the cores (all are 64 bits):

• 4 × RV64GC U74 Application Cores
• 1 × RV64IMAC S7 Core

deependra · November 1, 2023, 2:19pm

Thank you for pointing that out.

The S7 core could not access memory. I did not dig deeper.

mzs · November 1, 2023, 2:26pm

Technically you missed one, the E24 core is RV32IMFC

The E24 is accessed via a bus, it is separate from the fully-coherent U74MC+S7 cores inside the JH7110. Anyhow there is one RV32 core inside the JH7110. But with the default dts/dtb (currently) used, it is not normally accessible.

bwooster0 · November 1, 2023, 4:06pm

This code works fine on hart 0, what do you mean that hart 0 can’t access memory?

.section .text

.equ MSTATUS_MIE_BIT_MASK,8
.equ UART0_THR_ADDRESS,0x10000000
.equ UART0_LSR_ADDRESS,0x10000014

.global _start

.balign 4

_start:

csrr	t0,mhartid
li		t1,0
bne		t0,t1,hartWFILoop

csrci	mstatus,MSTATUS_MIE_BIT_MASK

mainLoop:

# delay a bit

li		t1,0x100000 * 1500

0:
addi t1,t1,-1
bnez t1,0b

li		t0,UART0_LSR_ADDRESS
li		t3,UART0_THR_ADDRESS
li		t4,'!'

0:
lbu t1,(t0)
andi t2,t1,0x20
beqz t2,0b

sb		t4,(t3)

j		mainLoop

hartWFILoop:

wfi

j		hartWFILoop

deependra · November 1, 2023, 4:57pm

I will try this. Thank you.

bwooster0 · November 2, 2023, 12:53am

If you have access to the right hardware (I use a JLink Edu) you can use the JLink / openocd / gdb software to:

• write ELF files into the memory of the VF2
• issue a reset instruction
• set the $pc to where you wrote the ELF file into memory
• use openocd to start and stop cores and change each core's $pc so that they run your code

This is much faster than repeatedly flashing your custom image onto the board and allows for the use of a debugger.