What is the optimum design of a C++ numerical array library for large-scale scientific applications?

Robin J. Hogan (r.j.hogan at ecmwf.int), ECMWF and University of Reading

Abstract. As a language for scientific computing, C++ is at a disadvantage compared to many other languages due to its lack of a well-designed standard for multi-dimensional arrays supporting efficient whole-array expressions, expressive array-subsetting syntax and linear algebra. To support the development of such a standard, in this paper I review the interface, capabilities and weaknesses of a number of free C++ array libraries (Adept, Armadillo, Blaze, Blitz++, Eigen, MTL4, ra-ra, uBLAS and Xtensor) as well as other languages supporting multi-dimensional arrays (particularly Fortran, Python, Matlab, IDL and Julia). These are contrasted with the verbose and limited whole-array capabilities in the C++20 Standard Template Library. To help ensure the standard meets the needs of large-scale scientific applications, I also present an analysis of the array types, dimensions and features used in an Earth-system model for operational weather forecasting (2.2 million lines of code). I argue that an unlimited number of dimensions should be supported, not the limit of two imposed by many libraries focusing on linear algebra, and propose a solution to the lack of a matrix-multiplication operator in C++. A detailed investigation is presented of the problem that most C++ libraries cannot simply and efficiently pass a subset of an array to a function, and a solution is proposed. A total of 25 specific recommendations are made that will hopefully contribute to a discussion leading to the formulation of a standard.
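As a rough illustration of the verbosity referred to in the abstract, the sketch below computes the element-wise expression y = a*x + b in two ways using only the C++ standard library. It is not taken from the paper; the function names axpb_vector and axpb_valarray are hypothetical and chosen purely for this example. The first version uses std::vector with std::transform, and the second uses std::valarray, which does support whole-array expressions but only for one-dimensional data.

```cpp
#include <algorithm>
#include <valarray>
#include <vector>

// Hypothetical helper (not from the paper): element-wise y = a*x + b
// using std::vector. The loop over elements must be spelled out via
// std::transform and a lambda. Assumes x and b have the same length.
std::vector<double> axpb_vector(double a, const std::vector<double>& x,
                                const std::vector<double>& b) {
  std::vector<double> y(x.size());
  std::transform(x.begin(), x.end(), b.begin(), y.begin(),
                 [a](double xi, double bi) { return a * xi + bi; });
  return y;
}

// Hypothetical helper (not from the paper): the same computation with
// std::valarray, which accepts a whole-array expression directly but
// is limited to one dimension. Assumes x and b have the same length.
std::valarray<double> axpb_valarray(double a, const std::valarray<double>& x,
                                    const std::valarray<double>& b) {
  return a * x + b;
}
```

Calling either function with a = 2, x = {1, 2, 3} and b = {0.5, 0.5, 0.5} gives {2.5, 4.5, 6.5}. The array libraries reviewed in the paper aim to provide the concise second form for arrays of more than one dimension.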

Read paper: version 1.0, 18 August 2020 (PDF)

This paper is intended to contribute to a wider discussion, and may be updated in future. Feel free to email me if you have comments.
