GLOMAP
Newton-Raphson with GLOMAP
Details of run
- faceb (mstringe) copied from aofni (hrjy)
- 128 PEs
- Grid size is 192*144*85
- Runs for 1 month
Comparison with Backward-Euler
The wall time for this run (faceb) is 5,256s, which is 223s longer than
the Backward-Euler run facec (5,033s). All times in the table are in seconds.
Routines | faceb (s) | facec (s) | faceb - facec (s) |
UM_SHELL | 5,225 | 5,000 | 225 |
total | 5,225 | 5,000 | 225 |
U_MODEL_4A | 5,221 | 4,997 | 224 |
total | 5,221 | 4,997 | 224 |
ATM_STEP_4A | 3,425 | 3,374 | 51 |
UKCA_MAIN1 | 1,472 | 1,464 | 8 |
total | 4,897 | 4,838 | 59 |
SWAP_BOUNDS_EW_DP | 393 | 393 | 0 |
SWAP_BOUNDS_NS_DP | 340 | 340 | 0 |
total | 733 | 733 | 0 |
ATMOS_PHYSICS1 | 1,474 | 1,493 | -19 |
ATMOS_PHYSICS2 | 423 | 429 | -6 |
EG_SL_HELMHOLTZ | 238 | 227 | 11 |
EG_CORRECT_TRACERS | 214 | 214 | 0 |
STASH | 209 | 173 | 36 |
SL_TRACER1_4A | 154 | 162 | -8 |
EG_SL_FULL_WIND | 137 | 135 | 2 |
EG_SL_MOISTURE | 80 | 80 | 0 |
TR_SET_PHYS_4A | 76 | 76 | 0 |
EG_Q_TO_MIX | 0 | 75 | -75 |
ATM_STEP_STASH | 86 | 60 | 26 |
The extra branch added to UKCA_MAIN1 is described below:
UKCA_MAIN1 |
UKCA_CHEMISTRY_CTL (172s) |
ASAD_CDRIVE (158s) |
ASAD_SPMJPDRIV (149s) |
ASAD_SPIMPMJP (148s) |
SPLINSLV2 (104s) |
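The nesting of this branch can be made explicit by subtracting each routine's time from that of its caller. The sketch below assumes the bracketed figures are inclusive times and that each routine is called only from the one listed above it; neither is stated on this page, so treat the result as an estimate.

```python
# Times in seconds for the branch, outermost routine first (from the list above).
chain = [
    ("UKCA_CHEMISTRY_CTL", 172),
    ("ASAD_CDRIVE", 158),
    ("ASAD_SPMJPDRIV", 149),
    ("ASAD_SPIMPMJP", 148),
    ("SPLINSLV2", 104),
]

def time_outside_next(chain):
    """Time each routine spends outside the next routine in the chain.

    Assumes the listed times are inclusive and strictly nested.
    """
    result = []
    for i, (name, total) in enumerate(chain):
        child_total = chain[i + 1][1] if i + 1 < len(chain) else 0
        result.append((name, total - child_total))
    return result

breakdown = time_outside_next(chain)
for name, seconds in breakdown:
    print(f"{name:20s} {seconds:4d} s")
```

Under that assumption most of the branch is spent in SPLINSLV2 itself and in the part of ASAD_SPIMPMJP outside it, and the whole branch accounts for the 172s reported for UKCA_CHEMISTRY_CTL.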
While this branch clearly adds to the running time, time is being saved
elsewhere: UKCA_MAIN1 is only 8s slower for faceb than for facec, and
ATM_STEP_4A and UKCA_MAIN1 together are only 59s slower. Most of the
224s by which faceb is slower than facec in U_MODEL_4A (165s in fact)
comes from the extra small routines listed under U_MODEL_4A, which
aren't covered by ATM_STEP_4A or UKCA_MAIN1.
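The 165s figure follows directly from the comparison table. A minimal check of the arithmetic, using only the rounded values shown there (the variable names are illustrative):

```python
# Rounded times in seconds from the comparison table: (faceb, facec).
u_model = (5221, 4997)
atm_step = (3425, 3374)
ukca_main = (1472, 1464)

def diff(pair):
    """faceb minus facec for one routine."""
    faceb, facec = pair
    return faceb - facec

u_model_diff = diff(u_model)                   # slowdown of U_MODEL_4A
covered = diff(atm_step) + diff(ukca_main)     # part explained by the two big routines
uncovered = u_model_diff - covered             # part left to the smaller routines

print(f"U_MODEL_4A slowdown:       {u_model_diff} s")
print(f"ATM_STEP_4A + UKCA_MAIN1:  {covered} s")
print(f"Remaining small routines:  {uncovered} s "
      f"({100 * uncovered / u_model_diff:.0f}% of the slowdown)")
```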
The disk usage
Header fields | Min | Mean | Max | (Max-Min) |
Instrument overhead (%) | 2.23 (PE 82) | 3.44 | 4.6 (PE 76) | 2.37 |
Heap (MB) | 1468 (PE 5) | 1481.23 | 1610 (PE 0) | 142 |
RSS (MB) | 612 (PE 67) | 618.47 | 741 (PE 0) | 129 |
Stack (MB) | 244 (PE 0) | 244.00 | 244 (PE 0) | 0 |
Paging | 401 (PE 33) | 741.70 | 1587 (PE 82) | 1186 |
Wall Time (s) | 5227.23 (PE 2) | 5255.88 | 5357.79 (PE 51) | 130.56 |
Thread#1 (s) | 5224.63 (PE 0) | 5224.63 | 5224.63 (PE 0) | 0 |
Thread#1 (%) | 97.51 (PE 51) | 99.41 | 99.95 (PE 2) | 2.44 |
Thread#2 (s) | 556.98 (PE 0) | 813.26 | 1143.6 (PE 68) | 586.62 |
Thread#2 (%) | 10.65 (PE 0) | 15.47 | 21.78 (PE 68) | 11.13 |
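The (Max-Min) column is an absolute spread, which is hard to compare between fields with different units and magnitudes. As a rough illustration, the sketch below re-derives that column from the Min and Max values and adds a relative spread, (Max-Min)/Mean. The relative measure is an addition for illustration only and is not something the profiler reports here.

```python
import math

# (min, mean, max, reported max-min), copied from the header-fields table.
fields = {
    "Instrument overhead (%)": (2.23, 3.44, 4.6, 2.37),
    "Heap (MB)": (1468, 1481.23, 1610, 142),
    "RSS (MB)": (612, 618.47, 741, 129),
    "Paging": (401, 741.70, 1587, 1186),
    "Wall Time (s)": (5227.23, 5255.88, 5357.79, 130.56),
    "Thread#2 (s)": (556.98, 813.26, 1143.6, 586.62),
}

def relative_spread(lo, mean, hi):
    """Spread across PEs as a fraction of the mean (illustrative metric)."""
    return (hi - lo) / mean

for name, (lo, mean, hi, reported) in fields.items():
    # The reported column should match max - min up to rounding.
    consistent = math.isclose(hi - lo, reported, abs_tol=0.01)
    print(f"{name:24s} max-min {hi - lo:9.2f}  "
          f"relative {relative_spread(lo, mean, hi):6.1%}  "
          f"matches table: {consistent}")
```

On these numbers the wall time varies by only a few percent across PEs, while Thread#2 time and paging vary by a large fraction of their means.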
Routines by total time
Ordering routines by total: mean | Min | Mean | Max | (Max-Min) |
UM_SHELL@1 | 5224.63 (PE 1) | 5224.63 | 5224.635 (PE 114) | 0 |
U_MODEL_4A@1 | 5221.358 (PE 0) | 5222.58 | 5223.56 (PE 43) | 2.20 |
ATM_STEP_4A@1 | 3424.759 (PE 66) | 3425.01 | 3425.232 (PE 0) | 0.47 |
UKCA_MAIN1@1 | 1472.102 (PE 2) | 1610.01 | 1679.955 (PE 76) | 207.85 |
ATMOS_PHYSICS1@1 | 1474.237 (PE 0) | 1515.72 | 1550.137 (PE 122) | 75.90 |
UKCA_AERO_CTL@1 | 837.511 (PE 2) | 862.63 | 887.744 (PE 64) | 50.23 |
UKCA_AERO_STEP@1 | 790.897 (PE 2) | 816.03 | 840.997 (PE 64) | 50.10 |
SWAP_BOUNDS@1 | 370.029 (PE 68) | 588.72 | 907.36 (PE 81) | 537.33 |
RAD_CTL@1 | 366.521 (PE 2) | 489.55 | 590.868 (PE 84) | 224.35 |
RADIANCE_CALC@2 | 348.466 (PE 2) | 467.36 | 573.493 (PE 68) | 225.03 |
RADIANCE_CALC@1 | 351.513 (PE 1) | 466.33 | 576.996 (PE 84) | 225.48 |
ATMOS_PHYSICS2@1 | 423.008 (PE 32) | 441.43 | 456.956 (PE 102) | 33.95 |
HALO_EXCHANGE:SWAP_BOUNDS_EW_DP@1 | 245.302 (PE 117) | 392.65 | 661.832 (PE 82) | 416.53 |
UKCA_COAGWITHNUCL@1 | 352.003 (PE 2) | 359.72 | 365.956 (PE 61) | 13.95 |
LW_RAD@2 | 276.992 (PE 2) | 355.01 | 435.467 (PE 68) | 158.47 |
LW_RAD@1 | 280.194 (PE 1) | 353.92 | 436.341 (PE 84) | 156.15 |
NI_GWD_CTL@1 | 108.57 (PE 68) | 347.99 | 656.098 (PE 82) | 547.53 |
HALO_EXCHANGE:SWAP_BOUNDS_NS_DP@1 | 157.542 (PE 67) | 339.93 | 537.404 (PE 4) | 379.86 |
MICROPHYS_CTL@1 | 81.528 (PE 82) | 326.58 | 466.075 (PE 68) | 384.55 |
LS_PPN@1 | 74.393 (PE 82) | 319.31 | 458.916 (PE 68) | 384.52 |
G_WAVE_5A@1 | 79.676 (PE 68) | 313.85 | 628.434 (PE 82) | 548.76 |
LS_PPNC@1 | 66.054 (PE 82) | 309.78 | 448.558 (PE 68) | 382.50 |
HALO_EXCHANGE:SWAP_BOUNDS_EW_H1_DP@1 | 142.173 (PE 117) | 279.24 | 550.243 (PE 82) | 408.07 |
UKCA_RADAER_BAND_AVERAGE@2 | 168.927 (PE 1) | 272.03 | 361.764 (PE 68) | 192.84 |
UKCA_RADAER_BAND_AVERAGE@1 | 172.18 (PE 1) | 269.00 | 365.513 (PE 84) | 193.33 |
EG_SL_HELMHOLTZ@1 | 221.913 (PE 116) | 237.50 | 257.537 (PE 32) | 35.62 |
EG_INTERPOLATION_ETA@1 | 226.292 (PE 103) | 236.39 | 260.305 (PE 5) | 34.01 |
EG_CORRECT_TRACERS@1 | 211.783 (PE 90) | 213.53 | 215.371 (PE 63) | 3.59 |
UKCA_READ_OFFLINE_OXIDANTS_CTL@1 | 205.395 (PE 122) | 212.17 | 217.041 (PE 0) | 11.65 |
UKCA_OFFLINE_OXIDANTS_UPDATE@1 | 204.726 (PE 122) | 211.53 | 216.371 (PE 0) | 11.65 |
STASH@1 | 206.692 (PE 4) | 208.72 | 211.216 (PE 64) | 4.52 |
STWORK@1 | 206.178 (PE 4) | 208.20 | 210.683 (PE 64) | 4.50 |
NI_CONV_CTL@1 | 105.202 (PE 4) | 198.50 | 295.486 (PE 84) | 190.28 |
LSP_ICE@1 | 32.378 (PE 82) | 195.24 | 329.083 (PE 68) | 296.71 |
LSP_ICE@2 | 21.271 (PE 82) | 191.44 | 342.467 (PE 68) | 321.20 |
UKCA_CHEMISTRY_CTL@1 | 159.885 (PE 4) | 171.54 | 184.956 (PE 86) | 25.07 |
EG_CORRECT_TRACERS_UKCA@1 | 165.945 (PE 76) | 167.70 | 169.122 (PE 82) | 3.18 |
SL_TRACER1_4A@1 | 154.424 (PE 36) | 159.65 | 163.898 (PE 72) | 9.47 |
ASAD_CDRIVE@1 | 145.95 (PE 4) | 157.52 | 170.989 (PE 86) | 25.04 |
SOLVE_BAND_K_EQV@1 | 114.574 (PE 82) | 153.08 | 178.775 (PE 118) | 64.20 |
SOLVE_BAND_K_EQV@2 | 122.032 (PE 37) | 151.07 | 179.389 (PE 118) | 57.36 |
ASAD_SPMJPDRIV@1 | 138.137 (PE 4) | 149.28 | 162.386 (PE 86) | 24.25 |
ASAD_SPIMPMJP@1 | 136.972 (PE 4) | 148.12 | 161.218 (PE 86) | 24.25 |
GLUE_CONV_5A@1 | 59.057 (PE 10) | 147.44 | 260.191 (PE 84) | 201.13 |
HALO_EXCHANGE:SWAP_BOUNDS_DP@1 | 84.665 (PE 125) | 145.81 | 205.938 (PE 44) | 121.27 |
UKCA_CONDEN@1 | 129.926 (PE 119) | 142.70 | 157.483 (PE 64) | 27.56 |
UKCA_ACTIVATE@1 | 47.787 (PE 2) | 142.31 | 181.268 (PE 68) | 133.48 |
GLUE_CONV_5A@2 | 63.004 (PE 4) | 142.13 | 253.887 (PE 51) | 190.88 |
EG_BICGSTAB@1 | 137.405 (PE 122) | 138.19 | 139.512 (PE 53) | 2.11 |
EG_SL_FULL_WIND@1 | 126.708 (PE 71) | 136.84 | 146.641 (PE 26) | 19.93 |
EG_MASS_CONSERVATION@1 | 135.017 (PE 65) | 136.71 | 138.43 (PE 90) | 3.41 |
UKCA_ABDULRAZZAK_GHAN@1 | 37.264 (PE 2) | 131.75 | 170.694 (PE 68) | 133.43 |
EG_SWAP_BOUNDS_DP@1 | 81.792 (PE 125) | 130.61 | 181.49 (PE 44) | 99.70 |
MCICA_SAMPLE@1 | 83.078 (PE 82) | 121.38 | 145.69 (PE 118) | 62.61 |
MCICA_SAMPLE@2 | 91.16 (PE 63) | 119.54 | 145.887 (PE 118) | 54.73 |
SW_RAD@1 | 77.719 (PE 1) | 119.27 | 147.43 (PE 84) | 69.71 |
SW_RAD@2 | 77.872 (PE 2) | 119.22 | 144.885 (PE 68) | 67.01 |
GET_EMFILE_REC@1 | 112.8 (PE 122) | 118.11 | 122.598 (PE 0) | 9.80 |
EM_GET_TIME_REC@1 | 112.718 (PE 122) | 118.03 | 122.522 (PE 0) | 9.80 |
EM_GET_DATA_REAL@1 | 111.06 (PE 122) | 116.39 | 120.882 (PE 0) | 9.82 |
SWAP_BOUNDS_MV@1 | 28.396 (PE 84) | 115.38 | 199.869 (PE 81) | 171.47 |
MONOCHROMATIC_RADIANCE@1 | 78.448 (PE 82) | 106.25 | 126.666 (PE 118) | 48.22 |
LSP_SUBGRID@1 | 12.441 (PE 82) | 105.15 | 189.266 (PE 68) | 176.82 |
MONOCHROMATIC_RADIANCE@2 | 83.904 (PE 63) | 104.92 | 125.631 (PE 118) | 41.73 |
ASAD_SPARSE_VARS:SPLINSLV2@1 | 95.805 (PE 58) | 103.88 | 113.311 (PE 85) | 17.51 |
LSP_SUBGRID@2 | 8.904 (PE 82) | 103.17 | 199.939 (PE 68) | 191.03 |
UKCA_SYNC@1 | 32.847 (PE 76) | 102.83 | 240.835 (PE 2) | 207.99 |
eg_CUBIC_LAGRANGE@1 | 96.001 (PE 47) | 97.68 | 103.408 (PE 6) | 7.41 |
EG_PRECON@1 | 90.502 (PE 1) | 93.57 | 98.272 (PE 96) | 7.77 |
TRI_SOR_DP_DP@1 | 90.385 (PE 1) | 93.44 | 98.146 (PE 96) | 7.76 |
MONOCHROMATIC_RADIANCE_TSEQ@1 | 65.447 (PE 82) | 93.23 | 113.064 (PE 118) | 47.62 |
MCICA_COLUMN@1 | 65.245 (PE 82) | 93.01 | 112.823 (PE 118) | 47.58 |
UKCA_COND_COFF_V@1 | 81.884 (PE 119) | 92.41 | 103.16 (PE 64) | 21.28 |
MONOCHROMATIC_RADIANCE_TSEQ@2 | 70.663 (PE 63) | 91.99 | 111.883 (PE 118) | 41.22 |
MCICA_COLUMN@2 | 70.453 (PE 63) | 91.78 | 111.644 (PE 118) | 41.19 |
ATM_STEP_STASH@1 | 84.996 (PE 22) | 86.03 | 86.957 (PE 80) | 1.96 |
NI_IMP_CTL@1 | 45.451 (PE 67) | 82.58 | 139.312 (PE 15) | 93.86 |
DEPARTURE_POINT_ETA@1 | 77.775 (PE 7) | 81.65 | 86.662 (PE 108) | 8.89 |
EG_SL_MOISTURE@1 | 78.181 (PE 16) | 80.09 | 81.465 (PE 67) | 3.28 |
LSP_QCLEAR@1 | 31.645 (PE 82) | 76.34 | 119.65 (PE 76) | 88 |
TR_SET_PHYS_4A@1 | 66.234 (PE 26) | 75.87 | 84.841 (PE 64) | 18.61 |
QWIDTH@1 | 22.66 (PE 82) | 75.49 | 124.74 (PE 68) | 102.08 |
UKCA_CHECK_MD_ND@1 | 69.194 (PE 75) | 70.27 | 75.388 (PE 6) | 6.19 |
UKCA_CALCNUCRATE@1 | 68.372 (PE 125) | 69.10 | 70.717 (PE 0) | 2.34 |
SPATIAL@1 | 67.004 (PE 30) | 67.71 | 68.473 (PE 64) | 1.47 |
EG_HELM_RHS_STAR@1 | 51.844 (PE 116) | 67.40 | 88.188 (PE 32) | 36.34 |
TWO_COEFF@1 | 43.54 (PE 82) | 66.72 | 82.824 (PE 118) | 39.28 |
TWO_COEFF@2 | 47.588 (PE 63) | 65.71 | 81.055 (PE 118) | 33.47 |
UKCA_BINAPARA@1 | 64.937 (PE 100) | 65.13 | 65.544 (PE 0) | 0.61 |
PP_HEAD@1 | 64.005 (PE 22) | 65.05 | 66.097 (PE 64) | 2.09 |
EM_GET_DATA_REAL3D@1 | 64.208 (PE 111) | 64.78 | 71.576 (PE 52) | 7.37 |
UKCA_SOLVECOAGNUCL_V@1 | 58.616 (PE 120) | 62.14 | 66.439 (PE 64) | 7.82 |
QWIDTH@2 | 4.984 (PE 82) | 59.15 | 115.522 (PE 68) | 110.54 |
GLOBAL_2D_SUMS@1 | 28.433 (PE 64) | 50.74 | 140.98 (PE 125) | 112.55 |
LSP_QCLEAR@2 | 4.328 (PE 82) | 50.56 | 99.799 (PE 68) | 95.47 |
EG_SISL_INIT@1 | 46.928 (PE 67) | 50.07 | 61.546 (PE 44) | 14.62 |
UKCA_VOLUME_MODE@1 | 47.083 (PE 119) | 49.73 | 52.24 (PE 79) | 5.16 |
EG_SISL_INIT_UVW@1 | 44.041 (PE 67) | 47.16 | 58.657 (PE 44) | 14.62 |
MID_CONV_5A@1 | 16.537 (PE 0) | 45.99 | 119.343 (PE 84) | 102.81 |
MID_CONV_5A@2 | 17.869 (PE 3) | 45.42 | 113.131 (PE 86) | 95.26 |
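Several routines in this table show a (Max-Min) spread larger than their mean, which points to load imbalance between PEs. As a hedged illustration, the sketch below takes six rows from the table, ranks them by (Max-Min)/Mean and counts which PEs show up as the extremes. The choice of rows and the ranking metric are assumptions made for this example.

```python
import re
from collections import Counter

# (routine, min cell, mean, max cell), copied from the table of total times.
rows = [
    ("SWAP_BOUNDS@1", "370.029 (PE 68)", 588.72, "907.36 (PE 81)"),
    ("NI_GWD_CTL@1", "108.57 (PE 68)", 347.99, "656.098 (PE 82)"),
    ("G_WAVE_5A@1", "79.676 (PE 68)", 313.85, "628.434 (PE 82)"),
    ("MICROPHYS_CTL@1", "81.528 (PE 82)", 326.58, "466.075 (PE 68)"),
    ("LS_PPN@1", "74.393 (PE 82)", 319.31, "458.916 (PE 68)"),
    ("EG_BICGSTAB@1", "137.405 (PE 122)", 138.19, "139.512 (PE 53)"),
]

CELL = re.compile(r"([\d.]+) \(PE (\d+)\)")

def parse_cell(cell):
    """Split a cell such as '108.57 (PE 68)' into (108.57, 68)."""
    value, pe = CELL.fullmatch(cell).groups()
    return float(value), int(pe)

def relative_spread(row):
    """(max - min) / mean for one table row."""
    _, min_cell, mean, max_cell = row
    lo, _ = parse_cell(min_cell)
    hi, _ = parse_cell(max_cell)
    return (hi - lo) / mean

ranked = sorted(rows, key=relative_spread, reverse=True)

# Count how often each PE is the fastest or slowest for a routine.
extreme_pes = Counter()
for _, min_cell, _, max_cell in rows:
    extreme_pes[parse_cell(min_cell)[1]] += 1
    extreme_pes[parse_cell(max_cell)[1]] += 1

for row in ranked:
    print(f"{row[0]:18s} relative spread {relative_spread(row):5.2f}")
print("PEs at the extremes:", extreme_pes.most_common(3))
```

In this sample PE 68 and PE 82 account for most of the extremes, and they swap roles between the gravity-wave-drag routines and the microphysics routines. The solver routine EG_BICGSTAB is almost perfectly balanced.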
The most expensive routines
Ordering routines by self: mean | Min | Mean | Max | (Max-Min) |
HALO_EXCHANGE:SWAP_BOUNDS_NS_DP@1 | 157.542 (PE 67) | 339.93 | 537.404 (PE 4) | 379.86 |
ATMOS_PHYSICS1@1 | 300.894 (PE 82) | 329.67 | 342.476 (PE 105) | 41.58 |
UKCA_COAGWITHNUCL@1 | 291.242 (PE 66) | 297.58 | 305.31 (PE 11) | 14.07 |
HALO_EXCHANGE:SWAP_BOUNDS_EW_H1_DP@1 | 142.173 (PE 117) | 279.24 | 550.243 (PE 82) | 408.07 |
UKCA_RADAER_BAND_AVERAGE@2 | 168.927 (PE 1) | 272.03 | 361.764 (PE 68) | 192.84 |
UKCA_RADAER_BAND_AVERAGE@1 | 172.18 (PE 1) | 269.00 | 365.513 (PE 84) | 193.33 |
UKCA_ABDULRAZZAK_GHAN@1 | 30.799 (PE 2) | 125.25 | 164.161 (PE 68) | 133.36 |
EG_CORRECT_TRACERS_UKCA@1 | 117.617 (PE 117) | 120.01 | 122.011 (PE 33) | 4.39 |
EM_GET_DATA_REAL@1 | 110.635 (PE 122) | 115.97 | 120.466 (PE 0) | 9.83 |
SWAP_BOUNDS_MV@1 | 28.396 (PE 84) | 115.38 | 199.869 (PE 81) | 171.47 |
LS_PPNC@1 | 33.676 (PE 82) | 114.54 | 150.187 (PE 64) | 116.51 |
HALO_EXCHANGE:SWAP_BOUNDS_EW_DP@1 | 100.696 (PE 114) | 113.41 | 124.73 (PE 57) | 24.03 |
ASAD_SPARSE_VARS:SPLINSLV2@1 | 95.805 (PE 58) | 103.88 | 113.311 (PE 85) | 17.51 |
UKCA_SYNC@1 | 32.847 (PE 76) | 102.83 | 240.835 (PE 2) | 207.99 |
eg_CUBIC_LAGRANGE@1 | 96.001 (PE 47) | 97.68 | 103.408 (PE 6) | 7.41 |
EG_MASS_CONSERVATION@1 | 95.715 (PE 76) | 97.01 | 98.023 (PE 32) | 2.31 |
UKCA_COND_COFF_V@1 | 81.884 (PE 119) | 92.41 | 103.16 (PE 64) | 21.28 |
QWIDTH@1 | 22.66 (PE 82) | 75.49 | 124.74 (PE 68) | 102.08 |
EG_INTERPOLATION_ETA@1 | 66.717 (PE 87) | 75.21 | 92.301 (PE 5) | 25.58 |
UKCA_CHECK_MD_ND@1 | 69.172 (PE 75) | 70.24 | 75.388 (PE 6) | 6.22 |
UKCA_BINAPARA@1 | 64.937 (PE 100) | 65.13 | 65.544 (PE 0) | 0.61 |
EM_GET_DATA_REAL3D@1 | 64.056 (PE 111) | 64.63 | 71.43 (PE 52) | 7.37 |
TRI_SOR_DP_DP@1 | 59.813 (PE 14) | 62.68 | 71.219 (PE 125) | 11.41 |
UKCA_SOLVECOAGNUCL_V@1 | 58.616 (PE 120) | 62.14 | 66.439 (PE 64) | 7.82 |
GLUE_CONV_5A@1 | 36.546 (PE 81) | 61.67 | 98.098 (PE 59) | 61.55 |
GLUE_CONV_5A@2 | 36.041 (PE 82) | 61.34 | 99.935 (PE 58) | 63.89 |
QWIDTH@2 | 4.984 (PE 82) | 59.15 | 115.522 (PE 68) | 110.54 |
GLOBAL_2D_SUMS@1 | 28.433 (PE 64) | 50.74 | 140.98 (PE 125) | 112.55 |
UKCA_CONDEN@1 | 47.715 (PE 1) | 50.29 | 54.727 (PE 82) | 7.01 |
UKCA_AERO_CTL@1 | 41.735 (PE 58) | 42.08 | 42.39 (PE 8) | 0.66 |
EXPPXI@1 | 40.945 (PE 4) | 41.73 | 42.388 (PE 52) | 1.44 |
ATM_STEP_4A@1 | 39.724 (PE 74) | 41.04 | 42.424 (PE 71) | 2.70 |
NI_CONV_CTL@1 | 24.042 (PE 84) | 40.04 | 84.945 (PE 49) | 60.90 |
EG_CORRECT_TRACERS@1 | 37.839 (PE 28) | 38.61 | 39.737 (PE 0) | 1.90 |
STWORK@1 | 25.975 (PE 0) | 38.34 | 39.188 (PE 37) | 13.21 |
DEPARTURE_POINT_ETA@1 | 32.487 (PE 54) | 35.72 | 39.443 (PE 124) | 6.96 |
UKCA_COAG_COFF_V@1 | 33.708 (PE 44) | 33.78 | 34.076 (PE 1) | 0.37 |
EG_SISL_INIT_UVW@1 | 31.962 (PE 29) | 33.19 | 34.644 (PE 121) | 2.68 |
ASAD_SPARSE_VARS:SPFULJAC@1 | 30.131 (PE 15) | 32.53 | 37.938 (PE 89) | 7.81 |
LSP_QCLEAR@1 | 12.383 (PE 82) | 29.81 | 46.248 (PE 76) | 33.86 |
PP_HEAD@1 | 29.068 (PE 13) | 29.40 | 29.75 (PE 40) | 0.68 |
GATHER_FIELD_MPL@1 | 22.108 (PE 0) | 29.04 | 31.048 (PE 32) | 8.94 |
EM_FOPEN@1 | 26.938 (PE 66) | 28.27 | 29.11 (PE 15) | 2.17 |
UKCA_MAIN1@1 | 25.417 (PE 66) | 27.36 | 28.104 (PE 8) | 2.69 |
SWAP_BOUNDS_2D_MV@1 | 5.698 (PE 84) | 26.32 | 68.979 (PE 15) | 63.28 |
SPATIAL@1 | 25.89 (PE 30) | 26.12 | 26.321 (PE 32) | 0.43 |
LSP_SUBGRID@1 | 3.138 (PE 82) | 25.09 | 44.37 (PE 68) | 41.23 |
UKCA_VOLUME_MODE@1 | 24.218 (PE 123) | 24.86 | 25.468 (PE 94) | 1.25 |
LSP_SUBGRID@2 | 2.178 (PE 82) | 24.42 | 46.087 (PE 68) | 43.91 |
STEXTC@1 | 22.891 (PE 5) | 23.23 | 23.661 (PE 64) | 0.77 |
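Comparing a routine's mean self time with its mean total time separates driver routines from routines that do the work themselves. The sketch below does this for a handful of routines that appear in both tables. It assumes the usual profiler convention that "total" includes callees and "self" excludes them, which this page does not state explicitly.

```python
# Mean times in seconds copied from the two tables: (total, self).
mean_times = {
    "ATM_STEP_4A@1": (3425.01, 41.04),
    "UKCA_MAIN1@1": (1610.01, 27.36),
    "ATMOS_PHYSICS1@1": (1515.72, 329.67),
    "UKCA_AERO_CTL@1": (862.63, 42.08),
    "UKCA_COAGWITHNUCL@1": (359.72, 297.58),
    "ASAD_SPARSE_VARS:SPLINSLV2@1": (103.88, 103.88),
}

def self_fraction(total, self_time):
    """Share of a routine's total time spent in its own code."""
    return self_time / total

# Routines with the largest self share first.
by_fraction = sorted(
    mean_times,
    key=lambda name: self_fraction(*mean_times[name]),
    reverse=True,
)

for name in by_fraction:
    total, self_time = mean_times[name]
    print(f"{name:30s} total {total:8.2f}  self {self_time:7.2f}  "
          f"self/total {self_fraction(total, self_time):6.1%}")
```

On these figures SPLINSLV2 and UKCA_COAGWITHNUCL spend most of their time in their own code, whereas ATM_STEP_4A and UKCA_MAIN1 are almost entirely drivers for the routines they call.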