Performance on Cray versus IBM
General
- Cray seems to scale better than IBM: the drop-off in performance with
more cores is less pronounced than on IBM.
- Higher optimisation on Cray is still to be tested
ATMOS+CLASSIC
- Best with 2 threads (I understand from JC that Paul Selwood & Co
are working on running the UM with 4 threads)
- For N216, Cray/IBM = 1.7 (256 cores) - 0.67 (4,096 cores)
- For N96, Cray/IBM = 4.62 (128 cores) - 1.5 (256 cores) -
0.91 (1,344 cores)
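These notes do not define the Cray/IBM ratio. A minimal sketch, assuming it is Cray wall-clock time divided by IBM wall-clock time (so values below 1 mean Cray is faster), shows how a ratio that falls with core count would match the General remark that Cray scales better. The function names are hypothetical and the only data used are the N216 figures quoted above.

```python
def time_ratio(cray_seconds, ibm_seconds):
    """Cray/IBM wall-clock ratio (ASSUMED convention: below 1 means Cray is faster)."""
    return cray_seconds / ibm_seconds


def ratio_change(ratios_by_cores):
    """Change in the ratio between the smallest and largest core counts."""
    cores = sorted(ratios_by_cores)
    return ratios_by_cores[cores[-1]] - ratios_by_cores[cores[0]]


# N216 ATMOS+CLASSIC ratios as quoted in these notes: {cores: Cray/IBM}
n216_classic = {256: 1.7, 4096: 0.67}

# Negative change: the ratio falls as cores are added
change = ratio_change(n216_classic)
```

Under the assumed convention, a negative change means Cray's run time relative to IBM shrinks as cores are added. If the ratio were instead a speed ratio, the reading would be reversed.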
Offline Oxidants
- It's still failing for many cores for both N216 and
N96 with dt=30mins (except David Water's job)
- 1 thread seems to be slightly better than 2 threads
- For N216, Cray/IBM = 1.7 (256 cores) - 1.4 (2,048 cores)
- For N96 (dt=20mins), Cray/IBM = 1.8 (256 cores) - 1.5 (1,344 cores)
- Mark Richardson has added OpenMP to GLOMAP-mode, but it needs
debugging
- For N216, timings have not been done yet
- For N96, Cray/IBM = 1.6 (128 cores) - 1.43 (896 cores)
Full chemistry
- Best with 1 thread
- For N216, Cray/IBM = 1.3 (256 cores) - 0.96 (2,048 cores)
- For N96, Cray/IBM = 1.1 (128 cores) - 0.83 (1,344 cores)
NEMO+CICE
- For ORCA025, Cray/IBM ~ 1.5 (392 - 608 cores)
- ORCA1 unavailable at NEMO3.6
MEDUSA (NEMO3.4)
- For ORCA025, Cray/IBM = 0.85 (192 cores) - 1.2 (1,080 cores). (I
don't understand why performance compared to IBM seems to drop off
for MEDUSA at ORCA025, and not any other models)
- For ORCA025, Cray/IBM = 0.97 (96 cores) - 0.92 (320 cores)