GLOMAP
Stratospheric chemistry + GLOMAP
Details of run
- facei (mstringe) copied from aofni (frjy)
- Stratospheric chemistry + GLOMAP
- 192 PEs (16*12, rather than the 16*8 = 128 PEs used before).
- Grid size is 192*144*85
- Runs for 1 month.
I've run Backward-Euler with a similar configuration for comparison
(facej).
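As a rough check on the decomposition listed in the run details, the sketch below works out how many grid points each PE holds for the old and new layouts. It assumes the first factor (16) is the east-west PE count and the second (8 or 12) the north-south count, which the notes do not state; `local_columns` is a hypothetical helper, not a UM routine.

```python
def local_columns(nx, ny, pes_ew, pes_ns):
    """Return the (east-west, north-south) points per PE for an even split."""
    if nx % pes_ew or ny % pes_ns:
        raise ValueError("grid does not divide evenly over the PE layout")
    return nx // pes_ew, ny // pes_ns


NX, NY, NZ = 192, 144, 85  # grid size from the run details

# Assumed ordering: (east-west PEs, north-south PEs)
for pes_ew, pes_ns in [(16, 8), (16, 12)]:
    lx, ly = local_columns(NX, NY, pes_ew, pes_ns)
    print(f"{pes_ew}*{pes_ns} = {pes_ew * pes_ns} PEs: "
          f"{lx}*{ly}*{NZ} points per PE")
```

Under that assumption, moving from 128 to 192 PEs shrinks each PE's north-south extent from 18 to 12 rows while the east-west extent stays at 12 columns.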
Profiling for UM_SHELL
Backward-Euler
Routines |
UM_SHELL (3,103s) |
U_MODEL_4A (3,099s) |
ATM_STEP_4A* (2,117s) |
UKCA_MAIN1 (844s) |
MEANCTL (28s) |
ATMOS_PHYSICS1 (898s) |
EG_CORRECT_TRACERS (19s) |
ATMOS_PHYSICS2 (294s) |
EG_SL_HELMHOLTZ (165s) |
TR_SET_PHYS_4A* (64s) |
EG_CORRECT_TRACERS_UKCA (110s) |
SL_TRACER1_4A (121s) |
EG_SL_MOISTURE (52s) |
EG_SL_FULL_WIND (92s) |
⇓ |
UPDATE_M_STAR (59s) |
ATM_STEP_STASH (56s) |
⇓ |
⇓ |
See profiling for UKCA_MAIN1 below |
Profile for ATMOS_PHYSICS1 and EG_CORRECT_TRACERS not shown here |
See profile for ATMOS_PHYSICS2 and EG_SL_HELMHOLTZ below |
EG_SL_WIND_U, EG_SL_WIND_V & EG_SL_WIND_W (26 + 25 + 30 = 81s) |
⇓ |
EG_Q_TO_MIX (60s) |
⇓ |
STASH (160s) |
Itself (79s) |
EG_INTERPOLATION_ETA (157s) |
DEPARTURE_POINT_ETA (58s) |
EG_SWAP_BOUNDS_DP (131s) |
STWORK (160s) |
EG_CUBIC_LAGRANGE (64s, itself) |
EG_VERT_WEIGHTS_ETA (11s, itself) |
MONO_ENFORCE (13s, itself) |
Itself (24s) |
See profile for SWAP_BOUNDS_DP below |
SPATIAL (55s) |
PP_HEAD (54s) |
EXPPXI (34s, itself) |
*should also link to SWAP_BOUNDS_DP, like many other
routines.
Stratospheric chemistry + GLOMAP
Routines |
UM_SHELL (8,218s) |
U_MODEL_4A (8,214s) |
ATM_STEP_4A* (3,287s) |
UKCA_MAIN1 (3,195s) |
MEANCTL (83s) |
ATMOS_PHYSICS1 (923s) |
EG_CORRECT_TRACERS (19s) |
ATMOS_PHYSICS2 (456s) |
EG_SL_HELMHOLTZ (176s) |
TR_SET_PHYS_4A* (196s) |
EG_CORRECT_TRACERS_UKCA (612s) |
SL_TRACER1_4A (387s) |
EG_SL_MOISTURE (55s) |
EG_SL_FULL_WIND (106s) |
⇓ |
UPDATE_M_STAR (33s) |
ATM_STEP_STASH (81s) |
⇓ |
⇓ |
See profiling for UKCA_MAIN1 below |
ACUMPS (72s) |
Profile for ATMOS_PHYSICS1 and EG_CORRECT_TRACERS not shown here |
See profile for ATMOS_PHYSICS2 and EG_SL_HELMHOLTZ below |
EG_SL_WIND_U, EG_SL_WIND_V & EG_SL_WIND_W (26 + 25 + 31 = 82s) |
⇓ |
EG_Q_TO_MIX (34s) |
⇓ |
STASH (407s) |
GENERAL_GATHER_FIELD (66s) |
Itself (461s) |
EG_INTERPOLATION_ETA (342s) |
DEPARTURE_POINT_ETA (62s) |
EG_SWAP_BOUNDS_DP (118s) |
STWORK (406s) |
STASH_GATHER_FIELD (66s) |
EG_CUBIC_LAGRANGE (140s, itself) |
EG_VERT_WEIGHTS_ETA (11s, itself) |
MONO_ENFORCE (32s, itself) |
Itself (24s) |
See profile for SWAP_BOUNDS_DP below |
SPATIAL (110s) |
PP_HEAD (158s) |
EXPPXI (99s, itself) |
GATHER_FIELD (65s) |
GATHER_FIELD_MPL (65s, itself) |
*should also link to SWAP_BOUNDS_DP, like many other
routines.
The path below MEANCTL didn't show up in the earlier calling trees
because the time spent in the routine was small enough to be
ignored (i.e. below 50s).
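To see where the extra time in the stratospheric chemistry + GLOMAP run goes, the two UM_SHELL calling trees can be compared routine by routine. The sketch below does this for a hand-picked subset of routines, with the mean times copied from the two trees above; the `compare` helper and the choice of routines are illustrative only, and parent and child routines are listed side by side, so the increases are not additive.

```python
# Mean times in seconds, copied from the two UM_SHELL calling trees:
# (Backward-Euler, Stratospheric chemistry + GLOMAP)
MEAN_TIMES = {
    "UM_SHELL": (3103, 8218),
    "ATM_STEP_4A": (2117, 3287),
    "UKCA_MAIN1": (844, 3195),
    "ATMOS_PHYSICS1": (898, 923),
    "ATMOS_PHYSICS2": (294, 456),
    "EG_CORRECT_TRACERS_UKCA": (110, 612),
    "SL_TRACER1_4A": (121, 387),
    "TR_SET_PHYS_4A": (64, 196),
    "STASH": (160, 407),
}


def compare(times):
    """Return (routine, extra seconds, ratio) sorted by extra seconds."""
    rows = [(name, new - old, new / old) for name, (old, new) in times.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


for name, extra, ratio in compare(MEAN_TIMES):
    print(f"{name:<24} +{extra:>5}s  x{ratio:.2f}")
```

Sorting by the absolute increase rather than the ratio keeps small routines with large relative changes from dominating the list.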
Profiling for ATMOS_PHYSICS2
Backward-Euler
ATMOS_PHYSICS2 (294s) |
NI_CONV_CTL (129s) |
NI_IMP_CTL (59s) |
SWAP_BOUNDS, SWAP_BOUNDS_2D_MV & SWAP_BOUNDS_MV (see table below) |
GLUE_CONV_6A (94s) |
IMP_SOLVER (31s) |
Itself (37s) |
MID_CONV_6A (34s) |
Itself (11s) |
Stratospheric chemistry + GLOMAP
ATMOS_PHYSICS2 (456s) |
NI_CONV_CTL (231s) |
NI_IMP_CTL (87s) |
SWAP_BOUNDS, SWAP_BOUNDS_2D_MV & SWAP_BOUNDS_MV (see table below) |
GLUE_CONV_5A (175s) |
IMP_SOLVER (47s) |
Itself (95s) |
MID_CONV_5A (46s) |
Itself (20s) |
The difference appears to be largely in SWAP_BOUNDS, so it is
probably caused by the barrier call there, together with the
stratospheric chemistry + GLOMAP run being more imbalanced.
Profiling for UKCA_MAIN1
Backward-Euler
Routines |
Total mean time |
UKCA_MAIN* (843s) |
843s |
UKCA_AERO_CTL (572s) |
UKCA_ACTIVATE (95s) |
667s |
UKCA_AERO_STEP (539s) |
UKCA_ABDULRAZZAK_GHAN (88s) |
627s |
UKCA_COAGWITHNUCL (238s) |
UKCA_CONDEN (95s) |
UKCA_CHECK_MD_ND (48s, itself) |
UKCA_CALCNUCRATE (46s) |
UKCA_VOLUME_MODE (34s) |
Itself (84s) |
545s |
Itself (198s) |
UKCA_SOLVECOAGNUCL_V (40s, itself) |
UKCA_COND_COFF_V (61s, itself) |
Itself (31s) |
UKCA_BINAPARA (43s, itself) |
Itself (17s) |
522s |
*UKCA_MAIN also calls STASH
Stratospheric chemistry + GLOMAP
Routines |
UKCA_MAIN* (3,072s) |
Code in offline oxidants |
Code for the extra chemistry |
UKCA_AERO_CTL (539s) |
UKCA_ACTIVATE (95s) |
UKCA_CHEMISTRY_CTL (1,489s) |
UKCA_FASTJX (544s) |
UKCA_EMISSION_CTL (51s) |
UKCA_AERO_STEP (508s) |
UKCA_ABDULRAZZAK_GHAN (88s) |
ASAD_CDRIVE (1,381s) |
UKCA_STRAT_PHOTOL (51s) |
FASTJX_PHOTOJ (541s) |
UKCA_COAGWITHNUCL (209s) |
UKCA_CONDEN (94s) |
UKCA_CHECK_MD_ND (47s, itself) |
UKCA_CALCNUCRATE (46s) |
UKCA_VOLUME_MODE (33s) |
Itself (84s) |
ASAD_SPMJPDRIV (1,274s) |
⇓ |
INIJTAB (51s) |
FASTJX_OPMIE (251s) |
FLINT (136s, itself) |
Itself (105s) |
Itself (169s) |
UKCA_SOLVECOAGNUCL_V (40s, itself) |
UKCA_COND_COFF_V (63s, itself) |
Itself (30s) |
UKCA_BINAPARA (43s, itself) |
Itself (16s) |
ASAD_SPIMPMJP (1,261s) |
⇓ |
SETTAB (51s) |
FASTJX_MIESCT (155s) |
Itself (96s) |
SPLINSLV2 (546s, itself) |
SPFULJAC (479s, itself) |
Itself (99s) |
ASAD_DIFFUN (128s) |
Itself (13s) |
BLKSLV (154s) |
ASAD_PRLS (119s, itself) |
Itself (66s) |
MATINW (54s, itself) |
|
|
*UKCA_MAIN also calls STASH, which probably takes quite a long time (~300s)
10 day run with kcdt changed from 3600s to 1200s
I've carried out a run in which the run length is reduced to a third
(10 days rather than 1 month), but the frequency of most chemistry
calls is increased by a factor of three, so most of the chemistry is
called the same number of times. The aim is to see which routines
converge more quickly when they are called every timestep.
Time in ATM_STEP_4A is 1,084s, about a third of the 3,287s
shown for the run above.
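The claim that the chemistry is called the same number of times in both runs can be checked with simple arithmetic. The sketch below assumes a 30-day month for the original run (the notes only say "1 month") and that the chemistry is called once every kcdt seconds; `chemistry_calls` is a hypothetical helper for illustration.

```python
SECONDS_PER_DAY = 86400


def chemistry_calls(run_days, kcdt_seconds):
    """Number of chemistry calls if chemistry runs once every kcdt seconds."""
    return run_days * SECONDS_PER_DAY // kcdt_seconds


month_run = chemistry_calls(run_days=30, kcdt_seconds=3600)    # assumed 30-day month
ten_day_run = chemistry_calls(run_days=10, kcdt_seconds=1200)

print(f"1 month at kcdt=3600s: {month_run} calls")
print(f"10 days at kcdt=1200s: {ten_day_run} calls")

# ATM_STEP_4A times quoted in the text: 10 day run versus month run
atm_step_ratio = 1084 / 3287
print(f"ATM_STEP_4A ratio: {atm_step_ratio:.2f}")
```

With equal call counts, differences between the two UKCA_MAIN tables mainly reflect how much work each call does rather than how often it is made.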
Routines |
UKCA_MAIN* (2,729s) |
Code in offline oxidants |
Code for the extra chemistry |
UKCA_AERO_CTL (517s) |
UKCA_ACTIVATE (83s) |
UKCA_CHEMISTRY_CTL (1,405s) |
UKCA_FASTJX (546s) |
UKCA_EMISSION_CTL (17s) |
UKCA_AERO_STEP (486s) |
UKCA_ABDULRAZZAK_GHAN (77s) |
ASAD_CDRIVE (1,297s) |
UKCA_STRAT_PHOTOL (52s) |
FASTJX_PHOTOJ (543s) |
UKCA_COAGWITHNUCL (207s) |
UKCA_CONDEN (83s) |
UKCA_CHECK_MD_ND (47s, itself) |
UKCA_CALCNUCRATE (46s) |
UKCA_VOLUME_MODE (31s) |
Itself (72s) |
ASAD_SPMJPDRIV (1,190s) |
⇓ |
INIJTAB (51s) |
FASTJX_OPMIE (252s) |
FLINT (137s, itself) |
Itself (106s) |
Itself (169s) |
UKCA_SOLVECOAGNUCL_V (38s, itself) |
UKCA_COND_COFF_V (53s, itself) |
Itself (30s) |
UKCA_BINAPARA (43s, itself) |
Itself (16s) |
ASAD_SPIMPMJP (1,177s) |
⇓ |
SETTAB (51s) |
FASTJX_MIESCT (157s) |
Itself (95s) |
SPLINSLV2 (507s, itself) |
SPFULJAC (449s, itself) |
Itself (92s) |
ASAD_DIFFUN (121s) |
Itself (13s) |
BLKSLV (156s) |
ASAD_PRLS (112s, itself) |
Itself (67s) |
MATINW (54s, itself) |
|
|
Calling tree for Stratospheric chemistry + GLOMAP by maximum time
Until now, the calling trees have been shown by mean time, but
as the full chemistry is very imbalanced it may be instructive to
see the maximum time. The bold numbers show the maximum times
which are significantly larger than the mean times.
Routines |
UKCA_MAIN* (s) |
Code in offline oxidants |
Code for the extra chemistry |
UKCA_AERO_CTL (564s) |
UKCA_ACTIVATE (123s) |
UKCA_CHEMISTRY_CTL (1,671s) |
UKCA_FASTJX (583s) |
UKCA_EMISSION_CTL (53s) |
UKCA_AERO_STEP (532s) |
UKCA_ABDULRAZZAK_GHAN (117s) |
ASAD_CDRIVE (1,563s) |
UKCA_STRAT_PHOTOL (53s) |
FASTJX_PHOTOJ (580s) |
UKCA_COAGWITHNUCL (216s) |
UKCA_CONDEN (104s) |
UKCA_CHECK_MD_ND (50s, itself) |
UKCA_CALCNUCRATE (47s) |
UKCA_VOLUME_MODE (34s) |
Itself (112s) |
ASAD_SPMJPDRIV (1,456s) |
⇓ |
INIJTAB (52s) |
FASTJX_OPMIE (278s) |
FLINT (150s, itself) |
Itself (111s) |
Itself (175s) |
UKCA_SOLVECOAGNUCL_V (43s, itself) |
UKCA_COND_COFF_V (72s, itself) |
Itself (32s) |
UKCA_BINAPARA (45s, itself) |
Itself (17s) |
ASAD_SPIMPMJP (1,443s) |
⇓ |
SETTAB (52s) |
FASTJX_MIESCT (178s) |
Itself (119s) |
SPLINSLV2 (622s, itself) |
SPFULJAC (551s, itself) |
Itself (115s) |
ASAD_DIFFUN (149s) |
Itself (13s) |
BLKSLV (177s) |
ASAD_PRLS (136s, itself) |
Itself (80s) |
MATINW (59s, itself) |
|
|
*UKCA_MAIN also calls STASH
The imbalance here isn't as large as I expected given the wait
time after UKCA_MAIN1, which has a mean of 1,570s. This may
suggest that the imbalance is worse on a timestep-by-timestep
basis (for example, depending on which points are in sunlight)
than it appears when averaged over whole days.
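The gap between maximum and mean time can be put next to the quoted wait time to show how little of it the whole-run figures account for. The sketch below uses (mean, maximum) pairs copied from the month-long mean-time and maximum-time UKCA_MAIN tables for a few routines; the selection of routines and the `excess` helper are illustrative, not part of any profiling tool.

```python
# (mean, maximum) times in seconds for the month-long
# stratospheric chemistry + GLOMAP run, copied from the tables
TIMES = {
    "UKCA_CHEMISTRY_CTL": (1489, 1671),
    "ASAD_CDRIVE": (1381, 1563),
    "ASAD_SPIMPMJP": (1261, 1443),
    "SPLINSLV2": (546, 622),
    "UKCA_FASTJX": (544, 583),
    "UKCA_AERO_CTL": (539, 564),
    "UKCA_ACTIVATE": (95, 123),
}
WAIT_AFTER_UKCA_MAIN1 = 1570  # mean wait time quoted in the text, seconds


def excess(mean, maximum):
    """Seconds by which the slowest PE exceeds the mean over all PEs."""
    return maximum - mean


for name, (mean, maximum) in TIMES.items():
    print(f"{name:<20} max-mean = {excess(mean, maximum):>4}s  "
          f"max/mean = {maximum / mean:.2f}")

largest = max(excess(mean, maximum) for mean, maximum in TIMES.values())
print(f"largest excess {largest}s vs wait time {WAIT_AFTER_UKCA_MAIN1}s")
```

The largest excess among these routines is far smaller than the 1,570s wait, which fits the suggestion that the slowest PE changes from timestep to timestep and so is hidden in whole-run totals.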