GC3 runs

Steve Rumbold needs to do some pre-industrial runs, and he is currently doing them with GC3, which is running on 22 Broadwell nodes (36 cores per node). He has 'urgent' queue access to 25 Broadwell nodes. What can we do to speed it up?

GC3 on 22 nodes

Steve's job is u-ab577 and we're running on XCE01. One node is used for XIOS. Steve was running ATMOS as 16*18, so I've swapped that to 18*16.
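As a quick check of the node budget, the sketch below converts a decomposition into a task count and a whole-node count. It assumes one MPI task per core with no OpenMP threading, which is my assumption and is not stated above. The helper name `nodes_needed` is hypothetical, and the ocean decomposition (18*26) is taken from the first timing table below.

```python
CORES_PER_NODE = 36  # Broadwell nodes, as stated above


def nodes_needed(procs_x, procs_y, cores_per_node=CORES_PER_NODE):
    """Return (total tasks, whole nodes) for a procs_x * procs_y decomposition.

    Assumes one MPI task per core and no node sharing between components.
    """
    total = procs_x * procs_y
    nodes = -(-total // cores_per_node)  # ceiling division
    return total, nodes


atmos = nodes_needed(18, 16)   # NMPPE * NMPPN
ocean = nodes_needed(18, 26)   # NEMO_IPROC * NEMO_JPROC
xios_nodes = 1                 # one node is used for XIOS
total_nodes = atmos[1] + ocean[1] + xios_nodes

print("ATMOS:", atmos)
print("NEMO: ", ocean)
print("Total nodes:", total_nodes)
```

Under these assumptions, 8 ATMOS nodes plus 13 NEMO nodes plus 1 XIOS node account for all 22 nodes.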

| NMPPE*NMPPN (total, nodes) | NEMO_IPROC*NEMO_JPROC (total, nodes) | Model run length | Total time (s) | Speed (model years per day) | Comment |
|---|---|---|---|---|---|
| 18 * 16 (288, 8) | 18 * 26 (468, 13) | 1 month | 4,476 | 1.72 | Run in evening |
| as above | as above | as above | 4,845 | 1.49 | Run during day |

The first run was made after work, while the second was made in the morning, so the load on the HPC was probably lower for the first run.
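The speed column can be cross-checked from the total time. The sketch below assumes that a model month is one twelfth of a model year and that speed is simply model years divided by wallclock days; neither is stated above, and the function name `model_years_per_day` is hypothetical.

```python
SECONDS_PER_DAY = 86400
MONTHS_PER_YEAR = 12  # assumption: each model month is 1/12 of a model year


def model_years_per_day(wallclock_seconds, model_months=1):
    """Model years simulated per wallclock day for a run of model_months."""
    model_years = model_months / MONTHS_PER_YEAR
    wallclock_days = wallclock_seconds / SECONDS_PER_DAY
    return model_years / wallclock_days


for seconds in (4476, 4845):
    print(seconds, "s ->", round(model_years_per_day(seconds), 2), "model years/day")
```

With these assumptions, 4,845 s gives 1.49 model years per day, matching the table. For 4,476 s the same formula gives about 1.61 rather than the 1.72 listed, so that entry may have been derived differently or from a different time.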

Estimated performance

Estimates for nemoCiceOrca025 are based on vn3.4, because I don't have timings for vn3.6.

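Expected speeds are in model years per day.

| offOxN96 cores | offOxN96 expected speed | offOxN96 expected speed when coupled* | nemoCiceOrca025 cores | nemoCiceOrca025 expected speed | nemoCiceOrca025 expected speed when coupled* | Comment |
|---|---|---|---|---|---|---|
| 288 | 2.13 | 1.92 | 468 | 2.82 | 2.54 | 1.92 yrs/day is a lot faster than I found |
| 390 | 2.43 | 2.19 | 364 | 2.43 | 2.19 | If nodes could be shared |
| 396 (22x18) | 2.61 | 2.35 | 360 (18x20) | 2.28 | 2.05 | Best configuration predicted by my code |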

* We're assuming that coupling will slow the model by 10% (I'm multiplying the expected speed by 0.9).
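The coupled estimates can be reproduced with a few lines. The 0.9 factor comes from the note above. Taking the slower of the two components as the speed of the coupled model is my assumption, made because the components have to wait for each other at coupling steps. The function name `coupled_speed` is hypothetical and this is not the code referred to in the table.

```python
COUPLING_FACTOR = 0.9  # coupling assumed to slow each component by 10%


def coupled_speed(atmos_standalone, ocean_standalone, factor=COUPLING_FACTOR):
    """Estimate coupled model years per day from standalone speeds.

    Assumption: the coupled model runs at the pace of the slower component.
    """
    atmos = atmos_standalone * factor
    ocean = ocean_standalone * factor
    return min(atmos, ocean)


# (atmosphere standalone, ocean standalone) speeds from the estimate table
configs = {
    "288 + 468 cores": (2.13, 2.82),
    "390 + 364 cores": (2.43, 2.43),
    "396 + 360 cores": (2.61, 2.28),
}

for name, (atmos, ocean) in configs.items():
    print(name, "->", round(coupled_speed(atmos, ocean), 2), "model years/day")
```

Under this assumption the 288 + 468 split is limited by the atmosphere at 1.92, while the 396 + 360 split is limited by the ocean at 2.05, which is consistent with the latter being the better whole-node configuration.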

Trying more nodes for ATMOS

| NMPPE*NMPPN (total, nodes) | NEMO_IPROC*NEMO_JPROC (total, nodes) | Model run length | Total time (s) | Speed (model years per day) | Comment |
|---|---|---|---|---|---|
| 22 * 18 (396, 11) | 18 * 20 (360, 10) | 1 month | 5,848 | 1.23 | Run during day (bizarre: this ran 10 mins slower than the same configuration with one node fewer for ATMOS) |
| as above | as above | as above | 6,085 | 1.18 | Run during day |

Looks like the ocean at NEMO3.6 is much slower than the timings we have from NEMO3.4 suggest.