Performance on Cray versus IBM

General

  • Cray seems to scale better than IBM - the drop-off in performance with more cores is less pronounced than on IBM.
  • Higher optimisation levels on Cray are still to be tested

ATMOS+CLASSIC

  • Best with 2 threads (I understand from JC that Paul Selwood & Co are working on running UM at 4 threads)
  • For N216, Cray/IBM = 1.7 (256 cores) - 0.67 (4,096 cores)
  • For N96, Cray/IBM = 4.62 (128 cores) - 1.5 (256 cores) - 0.91 (1,344 cores)
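
A minimal sketch of how the ratios quoted above could be tabulated to see how much the Cray/IBM ratio changes across the core range. The numbers are copied from the two bullets above; the names `RATIOS`, `ratio_change` and `summarise` are made up for illustration. The notes do not say whether the ratio is of run times or of speeds, so the sketch only reports the change in the ratio and makes no claim about which machine is faster.

```python
# Cray/IBM ratios quoted in the notes, keyed by resolution.
# Each entry is a (cores, ratio) pair; no other data is assumed.
RATIOS = {
    "N216": [(256, 1.7), (4096, 0.67)],
    "N96": [(128, 4.62), (256, 1.5), (1344, 0.91)],
}


def ratio_change(points):
    """Return (core_factor, ratio_factor) between the smallest and largest core counts."""
    points = sorted(points)
    (cores_lo, ratio_lo), (cores_hi, ratio_hi) = points[0], points[-1]
    return cores_hi / cores_lo, ratio_hi / ratio_lo


def summarise(ratios):
    """Build one summary line per resolution."""
    lines = []
    for resolution, points in ratios.items():
        core_factor, ratio_factor = ratio_change(points)
        lines.append(
            f"{resolution}: cores x{core_factor:.1f}, Cray/IBM x{ratio_factor:.2f}"
        )
    return lines


if __name__ == "__main__":
    for line in summarise(RATIOS):
        print(line)
```

The same dictionary layout could hold the figures from the other sections if a side-by-side comparison is wanted.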

Offline Oxidants

  • It's still failing for many cores for both N216 and N96 with dt=30mins (except David Water's job)
  • 1 thread seems to be slightly better than 2 threads
  • For N216, Cray/IBM = 1.7 (256 cores) - 1.4 (2,048 cores)
  • For N96 (dt=20mins), Cray/IBM = 1.8 (256 cores) - 1.5 (1,344 cores)
  • Mark Richardson has added OpenMP to GLOMAP-mode, but it needs debugging
  • For N216, timings have not been done yet
  • For N96, Cray/IBM = 1.6 (128 cores) - 1.43 (896 cores)

Full chemistry

  • Best with 1 thread
  • For N216, Cray/IBM = 1.3 (256 cores) - 0.96 (2,048 cores)
  • For N96, Cray/IBM = 1.1 (128 cores) - 0.83 (1,344 cores)

NEMO+CICE

  • For ORCA025, Cray/IBM ~ 1.5 (392 - 608 cores)
  • ORCA1 unavailable at NEMO3.6

MEDUSA (NEMO3.4)

  • For ORCA025, Cray/IBM = 0.85 (192 cores) - 1.2 (1,080 cores). (I don't understand why performance compared to IBM seems to drop off for MEDUSA at ORCA025, and not for any other model)
  • For ORCA025, Cray/IBM = 0.97 (96 cores) - 0.92 (320 cores)