Adept: A fast automatic differentiation library for C++

Robin Hogan, May 2012 (r.j.hogan@reading.ac.uk)

Overview

Adept (Automatic Differentiation using Expression Templates) is a software library that enables algorithms written in C and C++ to be automatically differentiated. It uses an operator overloading approach, so very little code modification is required. Differentiation can be performed in forward mode, in reverse mode (to compute the adjoint), or the full Jacobian matrix can be computed. Moreover, the use of expression templates and several other important optimizations mean that reverse-mode differentiation is 5-9 times faster, and uses less memory, than the leading libraries that provide equivalent functionality (ADOL-C and CppAD). In fact, Adept is only around 5-20% slower than an adjoint code you might write by hand, yet far faster in terms of your own time, since hand coding an adjoint is very time-consuming and error-prone. For further details, read the paper.

Functionality

Adept provides the following functionality:

Note that at present Adept is missing some functionality that you may require:

It is hoped that future versions will remedy these limitations (and that a future version of Fortran will support templates).

License

Adept is released under the GNU General Public License (GPL). This means that you can use and modify the library for any purpose, but that if you distribute a software package that incorporates the library or a modified version of it, you must release the source code for the entire software package under the conditions of the GNU GPL. If you would like to use Adept under other terms, please contact Robin Hogan.

If you use Adept in a publication, please cite the Hogan (2012) paper. This is a request and not a condition of the license.

How to use Adept to automatically differentiate your code

If you have used ADOL-C or CppAD then you will already be familiar with what is involved in applying an operator-overloading automatic differentiation package to your code. The user interface to Adept differs from these two only in the detail.

Step 1: Code preparation

Firstly, in all source files containing code to be differentiated, you need to include the adept.h header file and import the adouble type from the adept namespace. Assuming your code uses double precision, you then search and replace double with the "active" equivalent adouble, doing this only for those variables whose values depend on the independent input variables.

Consider the contrived example in the paper:

   double algorithm(const double x[2]) {
     double y = 4.0;
     double s = 2.0*x[0] + 3.0*x[1]*x[1];
     y *= sin(s);
     return y;
   }
The modified code would look like this:

   #include "adept.h"
   using adept::adouble;

   adouble algorithm(const adouble x[2]) {
     adouble y = 4.0;
     adouble s = 2.0*x[0] + 3.0*x[1]*x[1];
     y *= sin(s);
     return y;
   }

Changes like this need to be made in all source files that form part of an algorithm to be differentiated. Note that you may use adouble as the template argument of a Standard Template Library (STL) vector type (i.e. std::vector<adouble>), but unfortunately not for the STL complex and valarray types.
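
For example, active variables can be held in a std::vector, as in the following sketch (the sum_of_squares function here is purely illustrative and not part of Adept):

   #include <cstddef>
   #include <vector>
   #include "adept.h"
   using adept::adouble;

   // Illustrative only: sum the squares of active variables held in an STL vector
   adouble sum_of_squares(const std::vector<adouble>& x) {
     adouble sum = 0.0;
     for (std::size_t i = 0; i < x.size(); ++i) {
       sum += x[i]*x[i];                      // Recorded like any other adouble operation
     }
     return sum;
   }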

If you need to access the real number underlying an adouble variable a, for example in order to use it as an argument to the fprintf function, then use a.value() or adept::value(a). Of course, any mathematical operations performed on this real number will not be differentiated.
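
For example (a small sketch; the print_result function is illustrative only):

   #include <cstdio>
   #include "adept.h"
   using adept::adouble;

   void print_result(const adouble& a) {
     // a.value() returns the underlying double, suitable for passing to C I/O
     // functions; nothing done with this double is recorded for differentiation
     std::fprintf(stderr, "Result = %g\n", a.value());
     std::fprintf(stderr, "Result = %g\n", adept::value(a)); // Equivalent free-function form
   }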

Step 2: Interface for reverse-mode differentiation

Suppose you wanted to create a wrapper around algorithm that returned not only the result but also computed the adjoint; it would look like this:

   #include "adept.h"

   double algorithm_ad(const double x_val[2], // Input values
                       double* Y_ad,          // Input-output adjoint
                       double x_ad[2]) {      // Output adjoint
     using namespace adept;                   // Import Stack and adouble from adept
     static Stack stack;                      // Where the derivative information is stored
     stack.start();                           // Start recording
     adouble x[2] = {x_val[0], x_val[1]};     // Initialize adouble inputs
     adouble Y = algorithm(x);                // Call version overloaded for adouble args
     Y.set_gradient(*Y_ad);                   // Load the input-output adjoint
     stack.reverse();                         // Run the adjoint algorithm
     x_ad[0] = x[0].get_gradient();           // Extract the output adjoint for x[0]
     x_ad[1] = x[1].get_gradient();           //   ...and x[1]
     *Y_ad   = Y.get_gradient();              // Input-output adjoint will have changed too
     return Y.value();                        // Return the result of the simple computation
   }
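
A usage sketch (the input values are arbitrary; the output adjoint is seeded with unity so that the returned adjoints are the derivatives of Y):

   #include <cstdio>

   // Declared above (or in a header): the reverse-mode wrapper
   double algorithm_ad(const double x_val[2], double* Y_ad, double x_ad[2]);

   int main() {
     double x_val[2] = {2.0, 3.0};  // Example input values (arbitrary)
     double Y_ad = 1.0;             // Seed the input-output adjoint with unity...
     double x_ad[2];
     double Y = algorithm_ad(x_val, &Y_ad, x_ad);
     // ...so that x_ad[i] now contains the derivative of Y with respect to x[i]
     std::printf("Y = %g, dY/dx = [%g, %g]\n", Y, x_ad[0], x_ad[1]);
     return 0;
   }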

The component parts of algorithm_ad deserve some comment:

To do forward-mode differentiation in this example would involve setting the initial gradients of x instead of Y, calling the member function forward() instead of reverse(), and extracting the final gradients from Y instead of x.
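
A sketch of such a forward-mode wrapper, following the description above (the name algorithm_tl and the choice of seeding only the gradient of x[0] are illustrative), might look like this:

   #include "adept.h"

   // Sketch: tangent-linear (forward-mode) wrapper returning the result
   // and the derivative of Y with respect to x[0]
   double algorithm_tl(const double x_val[2], double* dY_dx0) {
     using namespace adept;                   // Import Stack and adouble from adept
     static Stack stack;                      // Where the derivative information is stored
     stack.start();                           // Start recording
     adouble x[2] = {x_val[0], x_val[1]};     // Initialize adouble inputs
     adouble Y = algorithm(x);                // Call version overloaded for adouble args
     x[0].set_gradient(1.0);                  // Seed the gradient of x[0] with unity
     x[1].set_gradient(0.0);                  //   ...and x[1] with zero
     stack.forward();                         // Run the tangent-linear algorithm
     *dY_dx0 = Y.get_gradient();              // Extract the resulting gradient of Y
     return Y.value();                        // Return the result of the simple computation
   }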

Step 3: Compiling

Under a Unix-like environment, compile the code ensuring that adept.h can be found (e.g. put it in /home/me/include and compile with -I/home/me/include) and link the code to the compiled Adept library file libadept.a (e.g. put it in /home/me/lib and compile with -L/home/me/lib -ladept).

Calculating the full Jacobian matrix

If the full Jacobian matrix is required then the interface code can be a little simpler:

   #include "adept.h"

   double algorithm_jac(const double x_val[2],// Input values
                        double jac[2]) {      // Output Jacobian matrix
     using namespace adept;                   // Import Stack and adouble from adept
     static Stack stack;                      // Where the derivative information is stored
     stack.start();                           // Start recording
     adouble x[2] = {x_val[0], x_val[1]};     // Initialize adouble inputs
     adouble Y = algorithm(x);                // Call version overloaded for adouble args
     stack.independent(x, 2);                 // Identify independent variables
     stack.dependent(Y);                      // Identify dependent variable(s)
     stack.jacobian(jac);                     // Compute the Jacobian and store in jac
     return Y.value();                        // Return the result of the simple computation
   }

The changes from the previous example are that, instead of setting the gradient of Y and calling reverse(), the independent (input) variables are identified with stack.independent(x, 2), the dependent (output) variable is identified with stack.dependent(Y), and stack.jacobian(jac) then computes the Jacobian matrix and copies it into jac.
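
For this single-output example the Jacobian reduces to the gradient vector of Y, so jac[0] and jac[1] simply contain the derivatives of Y with respect to x[0] and x[1]. A usage sketch (the input values are arbitrary):

   #include <cstdio>

   // Declared above (or in a header): the Jacobian wrapper
   double algorithm_jac(const double x_val[2], double jac[2]);

   int main() {
     double x_val[2] = {2.0, 3.0};  // Example input values (arbitrary)
     double jac[2];                 // One dependent and two independents: two entries
     double Y = algorithm_jac(x_val, jac);
     std::printf("Y = %g, dY/dx = [%g, %g]\n", Y, jac[0], jac[1]);
     return 0;
   }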

Tips for the best performance

See also