Compilers and CPU benchmarks

Data

The first number in each cell is the timing in seconds (lower is better).
The second number is the factor relative to the best timing (in red) for each of A, B, C, D, and JAC, regardless of the platform.
The number in angle brackets <> is the speedup over the single-CPU timing.
OS is GNU/Linux-2.4.2X, various distributions
CHARMM is version c31a2, which includes the 12 DEC 2003 (R2) version of GAMESS for QM calculations
pref.dat was used
Altix (ia64): 16 CPUs
Pentium4 (ia32): P4 3.2GHz, 8 boxes (CPUs), GigE
AMD Opteron (x86_64): 2 X Opteron 244
MDGRAPE-2S: GRAPE (the ia32 line after each MDGRAPE line is the time for no cutoff on the host only)
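
The cell layout described in the legend above (timing, then factor, then an optional speedup in angle brackets) can be read mechanically. The following is a minimal Python sketch, not part of the original benchmark tooling; the names `Cell` and `parse_cell` are hypothetical and were chosen only for this illustration.

```python
import re
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    """One table cell (hypothetical helper type, not from the benchmark page)."""
    seconds: float            # timing in seconds
    factor: float             # factor relative to the best timing
    speedup: Optional[float]  # speedup over 1 CPU, absent in the 1-CPU rows


# Matches "78.7,1.00" as well as "6.4,1.00<12.30>".
_CELL = re.compile(
    r"^\s*(\d+(?:\.\d+)?),(\d+(?:\.\d+)?)(?:<(\d+(?:\.\d+)?)>)?\s*$"
)


def parse_cell(text: str) -> Cell:
    """Split a 'seconds,factor<speedup>' cell into its numeric parts."""
    match = _CELL.match(text)
    if match is None:
        # Cells such as "N/A" or "RT/E" carry no timing and are rejected here.
        raise ValueError(f"not a timing cell: {text!r}")
    seconds, factor, speedup = match.groups()
    return Cell(
        float(seconds),
        float(factor),
        float(speedup) if speedup is not None else None,
    )


if __name__ == "__main__":
    print(parse_cell("78.7,1.00"))
    print(parse_cell("6.4,1.00<12.30>"))
```

Cells that hold only a bare timing, or a marker such as N/A, are deliberately rejected rather than guessed at.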

NOTE: None of the relative performance factors are set yet
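
For illustration, the relative factor could be filled in as the ratio of a timing to the best (smallest) timing for the same benchmark. This reading is an assumption, but it is consistent with the MDGRAPE comparison rows, where 712.1 s against 33.8 s is listed with a factor of 21.07. The function names below are hypothetical.

```python
from typing import Dict


def relative_factor(seconds: float, best_seconds: float) -> float:
    """Ratio of a timing to the best timing, rounded to two decimals."""
    if best_seconds <= 0:
        raise ValueError("best timing must be positive")
    return round(seconds / best_seconds, 2)


def factors(timings: Dict[str, float]) -> Dict[str, float]:
    """Relative factor of every entry against the smallest timing."""
    best = min(timings.values())
    return {name: relative_factor(t, best) for name, t in timings.items()}


if __name__ == "__main__":
    # 1-CPU timings from the first column of the MDGRAPE comparison rows.
    first_column = {
        "MDGRAPE-2S ifort-8.0": 33.8,
        "ia32-3.2GHz ifort-8.0 (host only, no cutoff)": 712.1,
    }
    for name, factor in factors(first_column).items():
        print(f"{name}: {factor:.2f}")
```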

1 CPU
machine | compiler | A | B | C | D | JAC | SHC5
x86_64-2.2GHz | gcc-3.4 | 37.1,1.00 | 65.1,1.00
ia32-3.2GHz | gcc-3.4 | 45.8,1.00 | 89.5,1.00 | 515.3,1.00 | 2592.4,1.00 | 707.6,1.00 | 804.6,1.00
ia32-3.2GHz | ifort-8.0 | 40.9,1.00 | 83.1,1.00 | 399.7,1.00 | 2208.0,1.00 | 672.1,1.00 | 768.5,1.00
ia64-1.4GHz | gcc-3.4 | 99.7,1.00 | 146.3,1.00 | 1061.0,1.00 | 7832.5.7,1.00 | 1406.4,1.00 | 1372.9,1.00
ia64-1.4GHz | ifort-8.0 | 78.7,1.00 | 107.2,1.00 | 698.3,1.00 | 2769.3,1.00 | 619.0,1.00 | 1120.7,1.00
x86_64-1.8GHz | gcc-3.4 | 48.0,1.00 | 81.9,1.00 | 452.3,1.00 | 2725.2,1.00 | 772.3,1.00 | 779.5,1.00
x86_64-1.8GHz | pathf90-1.3 | 51.7,1.00 | 76.4,1.00 | 702.7,1.00
x86_64-1.8GHz | pgf77-5.1 | 48.7 | 85.2 | RT/E | RT/E | 712.9,1.00 | 786.9,1.00
x86_64-1.8GHz | ifort-8.0 | 54.5 | 97.5 | 465.6 | | 897.3,1.00 | 950.5,1.00
Mac-G5-2.0GHz | xlf-8.1 | 99.9
Mac-G5-2.0GHz | gcc-3.4 | 114.3
IBM-Pwr4-1GHz | xlf-8.1 | 100.5,1.00 | 159.9,1.00 | | 1267.9,1.00 | 1282.5,1.00
IBM-Pwr4-1GHz | gcc-3.2(64) | 150.9,1.00 | 248.1,1.00
IBM-Pwr4-1GHz | gcc-3.2(32) | 164.7,1.00 | 251.7,1.00
MDGRAPE-2S | ifort-8.0 | 33.8,1.00 | | N/A | N/A | | 2294.9,1.00
ia32-3.2GHz | ifort-8.0 | 712.1,21.07 | | N/A | N/A | | 60148.4,26.21
2 CPUs
x86_64-2.2GHz | gcc-3.4 | 18.7,1.00<1.98> | 33.0,1.00<1.97>
ia32-3.2GHz | gcc-3.4 | 24.3,1.00<1.88> | 52.1,1.00<1.72> | 270.8,1.00<1.90> | 1286.1,1.00<2.01> | 429.5,1.00<1.65> | 410.0,1.00<1.96>
ia32-3.2GHz | ifort-8.0 | 21.8,1.00<1.88> | 48.3,1.00<1.72> | 207.0,1.00<1.93> | 1141.5,1.00<1.93> | 407.9,1.00<1.65> | 385.0,1.00<2.00>
ia64-1.4GHz | gcc-3.4 | 50.5,1.00<1.97> | 74.6,1.00<1.96> | 537.7,1.00<1.97> | 4050.1,1.00<1.93> | 728.6,1.00<1.93> | 699.5,1.00<1.96>
ia64-1.4GHz | ifort-8.0 | 39.8,1.00<1.98> | 53.6,1.00<2.00> | 354.4,1.00<1.97> | 1458.6,1.00<1.90> | 331.9,1.00<1.87> | 580.0,1.00<1.93>
x86_64-1.8GHz | gcc-3.4 | 24.6,1.00<1.95> | 44.1,1.00<1.86> | 244.0,1.00<1.85> | 1376.7,1.00<1.97>
Mac-G5-2.0GHz | gcc-3.4 | 63.5,1.00<1.80>
IBM-Pwr4-1GHz | xlf-8.1 | 51.8,1.00<1.94> | 83.3,1.00<1.92> | | 658.0,1.00<1.93> | 657.8,1.00<1.95>
MDGRAPE-2S | ifort-8.0 | 18.5,1.00<1.83> | | N/A | N/A | | 1161.1,1.00<1.98>
ia32-3.2GHz | ifort-8.0 | 360.3,19.48<1.98> | | N/A | N/A | | 30348.0,26.09<1.98>
4 CPUs
x86_64-2.2GHz | gcc-3.4 | 9.6,1.00<3.86> | 17.4,1.00<3.74>
ia32-3.2GHz | gcc-3.4 | 14.0,1.00<3.27> | 32.1,1.00<2.79> | 133.7,1.00<3.85> | 656.2,1.00<3.95> | 274.7,1.00<2.58> | 219.0,1.00<3.67>
ia32-3.2GHz | ifort-8.0 | 12.6,1.00<3.25> | 30.4,1.00<2.73> | 106.4,1.00<3.76> | 578.0,1.00<3.82> | 264.8,1.00<2.54> | 200.9,1.00<3.83>
ia64-1.4GHz | gcc-3.4 | 26.0,1.00<3.83> | 38.2,1.00<3.83> | 275.4,1.00<3.85> | 1997.6,1.00<3.92> | 379.4,1.00<3.71> | 362.1,1.00<3.79>
ia64-1.4GHz | ifort-8.0 | 20.3,1.00<3.88> | 28.2,1.00<3.80> | 182.0,1.00<3.84> | 719.2,1.00<3.85> | 176.1,1.00<3.52> | 295.5,1.00<3.79>
IBM-Pwr4-1GHz | xlf-8.1 | 27.3,1.00<3.68> | 44.1,1.00<3.63> | | 362.6,1.00<3.50> | 408.9,1.00<3.14>
MDGRAPE-2S | ifort-8.0 | 11.1,1.00<3.05> | | N/A | N/A | | 593.2,1.00<3.87>
ia32-3.2GHz | ifort-8.0 | 184.8,16.65<3.85> | | N/A | N/A | | 15077.0,25.18<3.99>
8 CPUs
x86_64-2.2GHz | gcc-3.4 | 5.6,1.00<6.63> | 11.3,1.00<5.76>
ia32-3.2GHz | gcc-3.4 | 9.0,1.00<5.09> | 23.2,1.00<3.86> | 71.2,1.00<7.24> | 350.4,1.00<7.40> | 204.5,1.00<3.46> | 125.3,1.00<6.42>
ia32-3.2GHz | ifort-8.0 | 8.3,1.00<4.92> | 22.2,1.00<3.74> | 58.5,1.00<6.83> | 301.5,1.00<7.32> | 198.1,1.00<3.39> | 119.8,1.00<6.41>
ia64-1.4GHz | gcc-3.4 | 13.5,1.00<7.39> | 20.8,1.00<7.03> | 143.7,1.00<7.38> | 1031.9,1.00<7.59> | 211.6,1.00<6.65> | 188.0,1.00<7.30>
ia64-1.4GHz | ifort-8.0 | 10.8,1.00<7.29> | 16.0,1.00<6.70> | 94.9,1.00<7.36> | 369.3,1.00<7.50> | 107.7,1.00<5.75> | 154.9,1.00<7.23>
MDGRAPE-2S | ifort-8.0 | 7.7,1.00<4.40> | | N/A | N/A | | 315.3,1.00<7.29>
ia32-3.2GHz | ifort-8.0 | 93.7,12.17<7.60> | | N/A | N/A | | 7533.0,23.29<7.98>
16 CPUs
ia64-1.4GHz | gcc-3.4 | 7.7,1.00<12.95> | 13.1,1.00<11.17> | 78.1,1.00<13.59> | 519.8,1.00<15.07> | 135.4,1.00<10.39> | 105.3,1.00<13.04>
ia64-1.4GHz | ifort-8.0 | 6.4,1.00<12.30> | 10.8,1.00<9.93> | 50.8,1.00<13.75> | 191.3,1.00<14.46> | 85.0,1.00<7.28> | 88.4,1.00<12.68>
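
The speedups in angle brackets can be reproduced from the timings, and dividing by the CPU count gives a parallel efficiency. The short Python sketch below does this for the first timing column of the ia64-1.4GHz ifort-8.0 rows (78.7, 39.8, 20.3, 10.8, and 6.4 seconds on 1, 2, 4, 8, and 16 CPUs). Parallel efficiency is not reported on this page; it is added here only as an illustration, and the function names are hypothetical.

```python
def speedup(t_single: float, t_parallel: float) -> float:
    """Speedup over the single-CPU timing."""
    return t_single / t_parallel


def efficiency(t_single: float, t_parallel: float, n_cpus: int) -> float:
    """Parallel efficiency: speedup divided by the number of CPUs."""
    return speedup(t_single, t_parallel) / n_cpus


if __name__ == "__main__":
    # First timing column, ia64-1.4GHz with ifort-8.0, keyed by CPU count.
    timings = {1: 78.7, 2: 39.8, 4: 20.3, 8: 10.8, 16: 6.4}
    t1 = timings[1]
    for n_cpus, seconds in timings.items():
        s = speedup(t1, seconds)
        e = efficiency(t1, seconds, n_cpus)
        print(f"{n_cpus:2d} CPUs: speedup {s:5.2f}, efficiency {e:4.0%}")
```

For 16 CPUs this gives a speedup of about 12.30, matching the table entry, which corresponds to an efficiency of roughly 77%.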

Notes:


Compile options:
See also the older page and Spatial decomposition benchmarks
Milan Hodoscek
Last modified: Sun Feb 20 09:47:27 CEST 2005