This shows the memory bandwidth for the Pine64 Star64, based on the JH7110 SoC, configured with 8GB RAM.
Very interesting, thanks for sharing.
The two steps would be explained by:
32-KiB L1 Data cache
2 MiB LIM data/instruction cache
The only reason I can think of for the copy being half that of sequential/random reads or writes is that both the incoming new data and the old data that is waiting to be written to SDRAM need to be cached, which in effect halves the available cache for both.
That would also explain why the throughput is half. But I remember seeing that there were two ports to the memory at different memory addresses.
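To make the cache-halving idea concrete, here is a toy model in Python. It only illustrates the hypothesis above and is not a measurement: the function name `fastest_level` and the `streams` parameter are made up for this sketch, and it ignores associativity, write-allocate policy and prefetching.

```python
L1D_BYTES = 32 * 1024        # 32 KiB L1 data cache (size quoted above)
L2_BYTES = 2 * 1024 * 1024   # 2 MiB second-level cache (size quoted above)

def fastest_level(buffer_bytes, streams=1):
    """Return the smallest level that can hold `streams` buffers of this size.

    streams=1 models a pure read or write of one buffer.
    streams=2 models a copy, where source and destination compete for cache.
    """
    footprint = buffer_bytes * streams
    if footprint <= L1D_BYTES:
        return "L1"
    if footprint <= L2_BYTES:
        return "L2"
    return "DRAM"

# Print where each buffer size would be served from under this model.
for size_kib in (16, 32, 1024, 2048):
    size = size_kib * 1024
    print(size_kib, "KiB:", fastest_level(size), "read,",
          fastest_level(size, streams=2), "copy")
```

Under this model the copy falls out of each cache level at half the buffer size of a plain read or write, which is what the steps in the graph would look like if the hypothesis holds.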
System Memory Map
The following table shows the memory map of the JH7110 system.
Table 1. System Memory Map

|Start Address|End Address|Size|Attribute|Device/Description|
|---|---|---|---|---|
|0x00_4000_0000|0x02_3FFF_FFFF|8GB|RWX C A N|DDR for memory port|
|0x02_4000_0000|0x04_3FFF_FFFF|8GB| |Reserved|
|0x04_4000_0000|0x06_3FFF_FFFF|8GB|RWX N|DDR for system port|
|0x06_4000_0000|0x08_3FFF_FFFF|8GB| |Reserved|
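As a quick sanity check on the table, the sizes can be recomputed from the start and end addresses. This is only arithmetic on the quoted addresses; the helper names `region_size` and `window_offset` are made up for this sketch, and it says nothing about whether the two DDR windows actually alias the same physical RAM.

```python
GIB = 1 << 30

# (start, end, description) copied from the memory map table above.
REGIONS = [
    (0x00_4000_0000, 0x02_3FFF_FFFF, "DDR for memory port"),
    (0x02_4000_0000, 0x04_3FFF_FFFF, "Reserved"),
    (0x04_4000_0000, 0x06_3FFF_FFFF, "DDR for system port"),
    (0x06_4000_0000, 0x08_3FFF_FFFF, "Reserved"),
]

def region_size(start, end):
    """Size in bytes of an inclusive [start, end] address range."""
    return end - start + 1

def window_offset(regions, name_a, name_b):
    """Distance in bytes between the start addresses of two named regions."""
    starts = {desc: start for start, _end, desc in regions}
    return starts[name_b] - starts[name_a]

# Print each region's size in GiB, then the distance between the DDR windows.
for start, end, desc in REGIONS:
    print(f"{start:#013x}  {region_size(start, end) // GIB} GiB  {desc}")

offset = window_offset(REGIONS, "DDR for memory port", "DDR for system port")
print(f"system port window starts {offset // GIB} GiB above the memory port window")
```

Every row works out to exactly 8 GiB, and the two DDR windows start 16 GiB apart (an offset of 0x4_0000_0000).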
I wonder: in theory, is it possible to use some kernel function to double all the throughput shown above?
Write to both “DDR for memory port” and “DDR for system port” in parallel.
Read from both “DDR for memory port” and “DDR for system port” in parallel.
For copy, read from “DDR for memory port”/“DDR for system port” and write to “DDR for system port”/“DDR for memory port”.
Looking at the spec for the JH7110, I don’t see two ports to RAM. Maybe that was about there being two RAM chips, each 16 bits wide? I do see where they say the DDR memory bus is only 32 bits wide.
It does appear to have two ports in this diagram, but one bus looks good (AXI4, 128 bit @ 800 MHz) and the other access would be through the apbs bus (possibly 32 bit @ 100 MHz), which would probably cause problems for all the other hardware on that bus. You would also push 32 times as much data through the AXI4 bus as through the apbs bus, so it would just cause a lot of problems for very little benefit (~3% extra throughput).
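The 32x and ~3% figures follow from the bus widths and clocks quoted above. A small sketch of the arithmetic, assuming one transfer per clock and no protocol overhead (so these are theoretical peaks, and the 32 bit @ 100 MHz apbs numbers are themselves a guess):

```python
def peak_bandwidth_bytes(width_bits, clock_hz):
    """Theoretical peak in bytes/s, assuming one full-width transfer per clock."""
    return width_bits // 8 * clock_hz

axi = peak_bandwidth_bytes(128, 800_000_000)  # AXI4: 128 bit @ 800 MHz
apb = peak_bandwidth_bytes(32, 100_000_000)   # apbs: 32 bit @ 100 MHz (guessed)

ratio = axi // apb   # how many times more data the AXI4 path can move
extra = apb / axi    # fraction gained by adding the apbs path on top

print(f"AXI4 peak: {axi / 1e9:.1f} GB/s")
print(f"apbs peak: {apb / 1e9:.1f} GB/s")
print(f"ratio: {ratio}x, extra throughput: {extra:.1%}")
```

That gives 12.8 GB/s against 0.4 GB/s, a ratio of 32 and about 3.1% extra at best.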
The devices under the apbs bus are low bandwidth, so I do not think that will cause problems.
The apbs and AXI buses are both ports of the NoC bus.
What we need to care about is how much bandwidth the NoC bus can provide.