Performance on Cray versus IBM
General
- Compared to IBM, performance of Cray is consistently better at
higher core counts
- Higher optimisation levels on Cray are still to be tested
ATMOS+CLASSIC
- Best with 2 threads (I understand from JC that Paul Selwood & Co
are working on running UM at 4 threads)
- For N216, Cray/IBM = 1.7 (256 cores) - 0.67 (4,096 cores)
- For N96, Cray/IBM = 4.62 (128 cores) - 1.5 (256 cores) -
0.91 (1,344 cores)
Offline Oxidants
- Best with 1 thread at ≤512 cores
- Probably better with 2 threads at some core count not far above
this
- Mark Richardson has added OpenMP to GLOMAP-mode, but it needs
debugging
- For N216, timings not done yet
- For N96, Cray/IBM = 1.6 (128 cores) - 1.43 (896 cores)
- Almost every job with a 30 min Δt is failing; David Water's job
is the exception.
Full chemistry
- Best with 1 thread
- For N216, Cray/IBM = 1.3 (256 cores) - 0.96 (2,048 cores)
- For N96, Cray/IBM = 1.1 (128 cores) - 0.83 (1,344 cores)
NEMO+CICE
- For ORCA025, Cray/IBM ~ 1.5 (392 - 608 cores)
- ORCA1 unavailable at NEMO3.6
MEDUSA (NEMO3.4)
- For ORCA025, Cray/IBM = 0.85 (192 cores) - 1.2 (1,080 cores). (I
don't understand why performance compared to IBM seems to drop off
for MEDUSA at ORCA025, and not for any other model)
- For ORCA025, Cray/IBM = 0.97 (96 cores) - 0.92 (320 cores)
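
Several of the Cray/IBM ratios above pass through 1.0 somewhere between the quoted core counts. As a rough illustration only, the sketch below estimates where that happens for the atmosphere configurations. It assumes the ratio varies linearly in log2(cores) between neighbouring quoted points, which these notes do not establish; the function name and the data layout are hypothetical.

```python
import math

# Cray/IBM ratios quoted in the notes above, as (cores, ratio) pairs.
RATIOS = {
    "ATMOS+CLASSIC N216": [(256, 1.7), (4096, 0.67)],
    "ATMOS+CLASSIC N96": [(128, 4.62), (256, 1.5), (1344, 0.91)],
    "Full chemistry N216": [(256, 1.3), (2048, 0.96)],
    "Full chemistry N96": [(128, 1.1), (1344, 0.83)],
}


def crossover_cores(points, target=1.0):
    """Estimate the core count at which the ratio equals `target`.

    ASSUMPTION (not from the notes): the ratio is linear in log2(cores)
    between neighbouring quoted points. Returns None if no neighbouring
    pair brackets `target`.
    """
    for (c0, r0), (c1, r1) in zip(points, points[1:]):
        brackets = (r0 - target) * (r1 - target) <= 0
        if brackets and r0 != r1:
            # Fraction of the way from the first point to the second.
            frac = (r0 - target) / (r0 - r1)
            log_c = math.log2(c0) + frac * (math.log2(c1) - math.log2(c0))
            return 2 ** log_c
    return None


for name, points in RATIOS.items():
    est = crossover_cores(points)
    label = "no crossover in range" if est is None else f"~{est:.0f} cores"
    print(f"{name}: {label}")
```

With only two or three quoted points per configuration, the estimates are indicative at best; actual timings at intermediate core counts would be needed to locate the crossover properly.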