Optimizing ncnn performance on VisionFive

StarFive VisionFive V1, RV64GC, 1.5GHz x 2 cores

20220529
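On the compute-bound mobilenet models it still cannot beat the single-core D1, which has the V extension.
However, thanks to its memory-access advantage, it beats the D1 on the shufflenet models.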

model                 baseline     pr3857     diff
squeezenet             1697.78     609.41  -64.11%
mobilenet              2239.54     993.87  -55.62%
mobilenet_v2           1468.94     728.17  -50.43%
mobilenet_v3           1191.72     674.72  -43.38%
shufflenet              623.26     358.98  -42.40%
shufflenet_v2           608.18     303.73  -50.06%
mnasnet                1379.11     735.94  -46.64%
proxylessnasnet           1523     871.64  -42.77%
efficientnet_b0         2147.2    1355.94  -36.85%
efficientnetv2_b0      3967.31    1325.71  -66.58%
regnety_400m           1718.47     878.48  -48.88%
blazeface               192.36     106.69  -44.54%
googlenet              9846.36    2484.78  -74.76%
resnet18              12075.38    2198.83  -81.79%
alexnet                4575.12    2252.52  -50.77%
vgg16                146658.11   18000.95  -87.73%
resnet50              20093.45    5499.88  -72.63%
squeezenet_ssd         8924.95    1904.07  -78.67%
mobilenet_ssd          4967.77    2230.24  -55.11%
mobilenet_yolo        13190.93    6003.25  -54.49%
mobilenetv2_yolov3     5637.48    2685.14  -52.37%
yolov4-tiny           29967.02    4873.81  -83.74%
nanodet_m              1589.48     836.73  -47.36%
yolo-fastest-1.1         849.2     437.98  -48.42%
yolo-fastestv2          670.07     393.94  -41.21%
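
For anyone who wants to check the diff column: it appears to be the relative change from baseline to pr3857, i.e. (pr3857 - baseline) / baseline. Below is a minimal sketch under that assumption. The function name `percent_diff` and the three sample rows are only for illustration and are not part of ncnn or its benchmark tool.

```python
def percent_diff(baseline: float, optimized: float) -> float:
    """Relative change from baseline to optimized, in percent (negative = faster)."""
    return (optimized - baseline) / baseline * 100.0


# A few rows copied from the table above: model -> (baseline, pr3857)
rows = {
    "squeezenet": (1697.78, 609.41),
    "shufflenet": (623.26, 358.98),
    "vgg16": (146658.11, 18000.95),
}

for name, (baseline, optimized) in rows.items():
    diff = percent_diff(baseline, optimized)
    speedup = baseline / optimized  # how many times faster than baseline
    print(f"{name:<12} {diff:+.2f}%  ({speedup:.2f}x)")
```

The same numbers can also be read as a speedup factor (baseline / pr3857), which some people find easier to compare across models than a negative percentage.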