Flops: a (quite) ancient but portable quick & dirty FPU benchmark

In comparison:
gcc -O2 -DUNIX -Wall -Wextra -pedantic flops.c -s -o flops.gcc.rv64gc (most software, including GPU binaries, is probably built this way):

   FLOPS C Program (Double Precision), V2.0 18 Dec 1992

   Module     Error        RunTime      MFLOPS
                            (usec)
     1     -7.6739e-13      0.0627    223.2070
     2     -5.7021e-13      0.0399    175.6143
     3     -2.4314e-14      0.0434    391.9548
     4      6.8612e-14      0.0400    374.6673
     5     -1.6209e-14      0.0832    348.5732
     6      1.3961e-13      0.0447    648.6781
     7     -3.6152e-11      0.1311     91.5045
     8      8.9373e-15      0.0494    607.5640

   Iterations      =  256000000
   NullTime (usec) =     0.0000
   MFLOPS(1)       =   214.2803
   MFLOPS(2)       =   190.3339
   MFLOPS(3)       =   321.1960
   MFLOPS(4)       =   512.7001

clang --driver-mode=gcc -O2 -DUNIX -Wall -Wextra -pedantic flops.c -s -o flops.clang.rv64gc:

   FLOPS C Program (Double Precision), V2.0 18 Dec 1992

   Module     Error        RunTime      MFLOPS
                            (usec)
     1     -7.6739e-13      0.0681    205.7057
     2     -5.7021e-13      0.0412    169.9322
     3     -2.4314e-14      0.0434    391.9756
     4      6.8612e-14      0.0400    374.6767
     5     -1.6209e-14      0.1072    270.4843
     6      1.3961e-13      0.0687    421.9714
     7     -3.6152e-11      0.1388     86.4404
     8      8.9373e-15      0.0734    408.7417

   Iterations      =  256000000
   NullTime (usec) =     0.0000
   MFLOPS(1)       =   208.5551
   MFLOPS(2)       =   172.1992
   MFLOPS(3)       =   270.5593
   MFLOPS(4)       =   403.5019

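See the difference: the error columns are identical, and modules 3 and 4 run at practically the same speed, but gcc is clearly ahead in modules 5, 6 and 8 (module 6: 648.7 vs 422.0 MFLOPS), which also shows in MFLOPS(4): 512.7 vs 403.5.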
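To put the gap between the two runs above into numbers, here is a minimal sketch in Python. The per-module MFLOPS values are copied by hand from the two tables; the helper name `speedup_percent` and the list names are only illustrative and are not part of flops.c.

```python
# Per-module MFLOPS, copied by hand from the gcc and clang runs above.
gcc = [223.2070, 175.6143, 391.9548, 374.6673,
       348.5732, 648.6781, 91.5045, 607.5640]
clang = [205.7057, 169.9322, 391.9756, 374.6767,
         270.4843, 421.9714, 86.4404, 408.7417]


def speedup_percent(a, b):
    """Return how much faster a is than b, in percent (hypothetical helper)."""
    return (a / b - 1.0) * 100.0


# One entry per module: positive means gcc is faster than clang.
ratios = [speedup_percent(g, c) for g, c in zip(gcc, clang)]

for module, (g, c, r) in enumerate(zip(gcc, clang, ratios), start=1):
    print(f"module {module}: gcc {g:9.4f}  clang {c:9.4f}  gcc vs clang {r:+6.1f}%")
```

This only compares one run of each binary, so small differences of a few percent may well be noise; repeating the runs would be needed to say more.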
