So, final thoughts for gcc 12.2.1:
COMMON_FLAGS="-O2 -pipe -fomit-frame-pointer"
OPT_FLAGS="--param l1-cache-size=32 --param l2-cache-size=2048"
CFLAGS="-mabi=lp64d -march=rv64imafdc_zicsr_zba_zbb -mcpu=sifive-u74 -mtune=sifive-7-series ${COMMON_FLAGS} ${OPT_FLAGS}"
CXXFLAGS="${CFLAGS}"
nbench shows no measurable performance difference between versions 0.93 and 1.0 of the zba / zbb extensions.
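
To double-check that zba / zbb are really picked up from the -march string, one option is to dump the compiler's predefined macros and look for the extension test macros. This is only a rough sketch: the helper name list_riscv_ext_macros is made up for illustration, and it assumes the compiler defines macros of the form __riscv_zba / __riscv_zbb for enabled extensions, so treat an empty result as "check further" rather than proof.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): read a preprocessor macro dump
# on stdin and print the unique RISC-V Z-extension test macros found in it.
list_riscv_ext_macros() {
    grep -o '__riscv_z[a-z0-9]*' | sort -u
}

# Intended use on a RISC-V host or with a cross compiler (CC is an assumption):
#   echo | ${CC:-gcc} -march=rv64imafdc_zicsr_zba_zbb -mabi=lp64d -dM -E - \
#       | list_riscv_ext_macros
```

If __riscv_zba and __riscv_zbb show up in the output, the extensions are at least enabled for code generation with those flags.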