
What is the optimum design of a C++ numerical array library for
large-scale scientific applications?
Robin J. Hogan (r.j.hogan at ecmwf.int)
ECMWF and University of Reading
Abstract. As a language for scientific computing, C++ is at a
disadvantage compared to many other languages due to its lack of a
well-designed standard for multi-dimensional arrays supporting
efficient whole-array expressions, expressive array-subsetting syntax
and linear algebra. To support the development of such a standard, in
this paper I review the interface, capabilities and weaknesses of a
number of free C++ array libraries (Adept, Armadillo, Blaze, Blitz++,
Eigen, MTL4, ra-ra, uBLAS and Xtensor)
as well as other languages supporting multi-dimensional arrays
(particularly Fortran, Python, Matlab, IDL and Julia). These are
contrasted with the verbose and limited whole-array capabilities in
the C++20 Standard Template Library. To help ensure the standard
meets the needs of large-scale scientific applications, I also present
an analysis of the array types, dimensions and features used in an
Earth-system model for operational weather forecasting (2.2 million lines of
code). I argue that an unlimited number of dimensions should be
supported, not the limit of two imposed by many libraries focusing on
linear algebra, and propose a solution to the lack of a
matrix-multiplication operator in C++. A detailed investigation is
presented of the problem that most C++ libraries cannot simply and
efficiently pass a subset of an array to a function, and a solution is
proposed. A total of 25 specific recommendations are made that will
hopefully contribute to a discussion leading to the formulation of a
standard.
Read paper: version 1.0, 18 August 2020 (PDF)
This paper is intended to contribute to a wider discussion, and may
be updated in future. Feel free to email me if you have comments.
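
As a rough illustration of the "verbose and limited whole-array capabilities" mentioned in the abstract, the sketch below shows two ways to write the element-wise sum c = a + b using only the C++ standard library. It is not taken from the paper; the function names add_vectors and add_valarrays are hypothetical and chosen only for this example. In Fortran or NumPy the same operation would be the single expression c = a + b.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <valarray>
#include <vector>

// Hypothetical helper: element-wise sum of two std::vector objects.
// std::vector has no arithmetic operators, so the operation must be
// spelled out with an algorithm and explicit iterators.
std::vector<double> add_vectors(const std::vector<double>& a,
                                const std::vector<double>& b) {
    assert(a.size() == b.size());
    std::vector<double> c(a.size());
    std::transform(a.begin(), a.end(), b.begin(), c.begin(),
                   std::plus<double>());
    return c;
}

// Hypothetical helper: the same sum with std::valarray, which does
// provide element-wise operators but is one-dimensional only.
std::valarray<double> add_valarrays(const std::valarray<double>& a,
                                    const std::valarray<double>& b) {
    return a + b;
}
```

The std::valarray version is concise, but it offers no built-in notion of rank, so a two-dimensional array has to be emulated with std::slice or manual index arithmetic.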
