Boot with custom kernel and get "Unhandled exception: Illegal instruction"

I am trying to boot a simple custom kernel. The kernel consists of a few lines of RISC-V assembly.

I get this error when I try to boot via TFTP:

StarFive # setenv ipaddr 192.168.68.118
StarFive # setenv serverip 192.168.68.108
StarFive # setenv loadaddr 0x80100000
StarFive # tftpboot ${loadaddr} main.elf
Using ethernet@16040000 device
TFTP from server 192.168.68.108; our IP address is 192.168.68.118
Filename 'main.elf'.
Load address: 0x80100000
Loading: #
23.4 KiB/s
done
Bytes transferred = 1168 (490 hex)
StarFive # go 0x80100000

Starting application at 0x80100000 ...

Unhandled exception: Illegal instruction
EPC: 0000000080100000 RA: 00000000fff4795a TVAL: 00000000464c457f
EPC: ffffffffc03bb000 RA: 000000004020295a reloc adjusted

SP: 00000000ff734a60 GP: 00000000ff734e00 TP: 0000000000000001
T0: 00000000ff734b40 T1: 00000000fff46288 T2: ffffffffffffffff
S0: 0000000080100000 S1: 0000000000000002 A0: 0000000000000001
A1: 00000000ff74a368 A2: 00000000ff74a368 A3: fffffffffffffffe
A4: 0000000000000002 A5: 0000000080100000 A6: 000000000000000f
A7: 00000000fffa8308 S2: 00000000ff74a360 S3: 00000000ff74a360
S4: 0000000000000002 S5: 00000000ffff04f4 S6: 0000000000000000
S7: 00000000ff74a3c0 S8: 0000000000000000 S9: 0000000000000000
S10: 0000000000000000 S11: 0000000000000000 T3: 0000000000000010
T4: 0000000000000000 T5: 000000000001869f T6: 00000000ff734b20

Code: 0000 0000 0000 0000 0000 0000 0000 0000 (457f 464c)

It’s an ELF. Entry point is not at the start of the file.

Please check the U-Boot documentation for the proper way to start ELF kernels, and let us know what worked. (I don’t know the answer off the top of my head.)
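
The "it’s an ELF" diagnosis can be checked from the first dump alone: TVAL holds the instruction word that trapped at 0x80100000. A minimal sketch, assuming the word is stored little-endian (as on RISC-V); the variable names are illustrative only:

```python
import struct

# TVAL from the first exception dump: the instruction word that trapped.
tval = 0x464C457F

# Little-endian: the lowest byte of the word is the first byte in memory.
first_bytes = struct.pack("<I", tval)
print(first_bytes)  # b'\x7fELF'

# The ELF magic number is 0x7f followed by the ASCII letters "ELF".
is_elf_magic = first_bytes == b"\x7fELF"
print(is_elf_magic)  # True
```

In other words, the CPU tried to execute the ELF header itself, which is consistent with the file having been loaded verbatim at the load address.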

I found the entry point using readelf and objdump, then tried to load and run the kernel, but it still fails.

Error:

StarFive # setenv loadaddr 0x80100000
StarFive # tftpboot ${loadaddr} main.elf
Using ethernet@16040000 device
TFTP from server 192.168.68.108; our IP address is 192.168.68.118
Filename 'main.elf'.
Load address: 0x80100000
Loading: #
11.7 KiB/s
done
Bytes transferred = 1168 (490 hex)
StarFive # go 0x801100e8

Starting application at 0x801100E8 ...

Unhandled exception: Illegal instruction
EPC: 00000000801100e8 RA: 00000000fff4795a TVAL: 0000000000000000
EPC: ffffffffc03cb0e8 RA: 000000004020295a reloc adjusted

SP: 00000000ff734a60 GP: 00000000ff734e00 TP: 0000000000000001
T0: 00000000ff734b40 T1: 00000000fff46288 T2: ffffffffffffffff
S0: 00000000801100e8 S1: 0000000000000002 A0: 0000000000000001
A1: 00000000ff74a368 A2: 00000000ff74a368 A3: fffffffffffffffe
A4: 0000000000000002 A5: 00000000801100e8 A6: 000000000000000f
A7: 00000000fffa8308 S2: 00000000ff74a360 S3: 00000000ff74a360
S4: 0000000000000002 S5: 00000000ffff04f4 S6: 0000000000000000
S7: 00000000ff74a3c0 S8: 0000000000000000 S9: 0000000000000000
S10: 0000000000000000 S11: 0000000000000000 T3: 0000000000000010
T4: 0000000000000000 T5: 000000000001869f T6: 00000000ff734b20

Code: 0000 0000 0000 0000 0000 0000 0000 0000 (0000)

It’s odd to get an exception on the very first opcode. You should look at the other registers like MCAUSE to see what the exception is. I’d also dump memory from segment start to a few bytes after the opcode and verify that the bytes at address …0e8 match the opcode you THINK is at …0e8.

Also, “custom operating system” has red flags all over it. Maybe the binary isn’t getting generated the way you think it is. Verify this works with some other OS image.

It seems very odd for the loader to know enough about ELF to map the right sections into pages built with the correct mapping and yet not be able to read the 8 bytes of the e_entry field from the ELF header. That’s low-hanging fruit for a program loader.
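
For reference, reading that field really is only a few lines. The following is a sketch, not anything U-Boot does: it assumes a little-endian ELF64 file, where e_entry is the 8-byte value at offset 24 of the header. The function name `elf64_entry` and the synthetic `demo` header are made up for illustration; the demo header is not the poster’s main.elf.

```python
import struct

def elf64_entry(header: bytes) -> int:
    """Return e_entry from the start of a little-endian ELF64 file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] == 2 means ELF64, e_ident[5] == 1 means little-endian.
    if header[4] != 2 or header[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    # Layout: e_ident (16 bytes), e_type (2), e_machine (2), e_version (4),
    # then e_entry (8 bytes) at offset 24.
    (entry,) = struct.unpack_from("<Q", header, 24)
    return entry

# Synthetic 64-byte header for demonstration (hypothetical values).
demo = bytearray(64)
demo[:4] = b"\x7fELF"
demo[4] = 2  # ELFCLASS64
demo[5] = 1  # ELFDATA2LSB
struct.pack_into("<Q", demo, 24, 0x801100E8)

print(hex(elf64_entry(bytes(demo))))  # 0x801100e8
```

On a real file, passing the first 64 bytes (for example `open("main.elf", "rb").read(64)`) should give the same address that `readelf -h` reports as the entry point.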

I also don’t know what Code is, but I know from memory that 45, 46, and 4c are ASCII ‘E’, ‘F’, and ‘L’, which is “other endian” for ASCII ELF next to a non-ASCII character (7f), and THAT sets off yet more red flags.

Are you SURE that this bootloader is reading ELF?

Hint: the first sentence from the first hit is "U-Boot uses special format for bootable images."

You have more homework left to do…

Maybe I’m reading the information you have provided wrong, but it looks to me like you have uploaded a binary file of size 1168 bytes into memory from 0x80100000 to 0x8010048f and then you are telling the CPU to start executing machine code at 0x801100e8. I could be wrong but it looks like it is beyond where the machine code was loaded into RAM.
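
The arithmetic behind that observation, as a small sketch using only the numbers from the console log above (the variable names are illustrative):

```python
load_addr = 0x80100000   # address passed to tftpboot
size = 1168              # "Bytes transferred = 1168 (490 hex)"
jump_addr = 0x801100E8   # address passed to "go" in the second attempt

# Last byte actually written by the TFTP transfer.
last_loaded = load_addr + size - 1
print(hex(last_loaded))  # 0x8010048f

# Is the jump target inside the loaded region?
inside = load_addr <= jump_addr <= last_loaded
print(inside)  # False

# Distance from the load address to the jump target.
print(hex(jump_addr - load_addr))  # 0x100e8
```

The jump target lies 0x100e8 bytes past the load address, far beyond the 0x490 bytes that were transferred, which matches the all-zero Code dump in the second exception.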

The output from the following three commands might be useful for people trying to help you.

$ file main.elf
$ ls -l main.elf
$ hexdump -C main.elf | head -40

A website that is useful for encoding/decoding one RISC-V instruction at a time is: https://luplab.gitlab.io/rvcodecjs/

@mzs, you’re looking too low. So was I. We both assumed the developer had done their homework.

The loader is looking for a format that isn’t ELF at all. (This makes a bit of sense for cases like netboot, where you may want the loaded image to also contain a filesystem and a module to uncompress it, or otherwise carry more than just an executable to start.) I linked to the relevant doc.
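
One quick way to see what kind of file you are about to send over TFTP is to look at its first four bytes. This is a rough sketch, under the assumption that the formats of interest are ELF, the legacy U-Boot image header (magic 0x27051956), and a flattened device tree blob as used by FIT images (magic 0xd00dfeed). The thread does not say which of these the linked documentation covers, and the function name `sniff_image` is made up for illustration.

```python
def sniff_image(head: bytes) -> str:
    """Guess an image format from the first four bytes of a file."""
    magic = head[:4]
    if magic == b"\x7fELF":
        return "ELF"
    if magic == bytes.fromhex("27051956"):
        return "legacy U-Boot image"
    if magic == bytes.fromhex("d00dfeed"):
        return "flattened device tree (possibly a FIT image)"
    return "unknown (possibly a raw binary)"

# The bytes U-Boot tripped over at 0x80100000 in the first attempt.
print(sniff_image(b"\x7fELF"))  # ELF
```

A file that reports "ELF" here is not something `go` can jump into at its load address, since `go` simply branches to the address it is given.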
