This version of the tutorial will use C++. We hope to provide a version of this tutorial using python in the near future. Until then, this tutorial assumes that you are already familiar with C++ and have your own toolchain/IDE. We have tested the code in this tutorial using recent versions of gcc and MSVC. This tutorial uses the Eigen library for linear algebra and uses Boost to create a C++ version of Matlab's tinv function. For instructions regarding installing Eigen, click here. For instructions regarding installing Boost, click here.
Create a new C++ project, and place HelperFunctions.hpp in the include directory. This file contains some simple functions that are not specific to Seldonian algorithms, but which will be useful to us:
Function Name  Description 

tinv  double tinv(double p, unsigned int nu) returns the 100(1p ) percentile of the Student t distribution with nu degrees of freedom. This is a C++ implementation of Matlab's tinv function. This is the only function that uses Boost. If you have your own tinv implementation that does not use Boost, then you could use that here and avoid installing Boost. 
stddev  double stddev(const VectorXd& v) provides the sample standard deviation of the vector v , with Bessel's correction: \(\operatorname{stddev}(v)\sqrt{\frac{1}{v1} \sum_{i=1}^{v}(v_i\operatorname{mean}(v))^2}\), where \(\operatorname{mean}(v)=\frac{1}{v}\sum_{i=1}^{v} v_i\) is the sample mean of v . 
ttestUpperBound  double ttestUpperBound(const VectorXd& v, const double& delta, const int numPoints = 1) returns a (1delta )confidence upper bound on the mean of the distribution that generated the independent and identically distributed values in the vector v , assuming that \(\operatorname{mean}(v)\) is normally distributed. If numPoints is provided as an argument, then this function predicts what its output would be if it were called with a new vector v that contains numPoints samples from the same distribution that generated v . This functionality will be useful when the candidate selection mechanism is predicting what the future output of the safety test will be. 
CMAES  VectorXd CMAES(...) is an implementation of the blackbox optimization algorithm CMAES. Its inputs are:

Create main.cpp and copy in the following code:
// Standard includes that should come with your compiler #include <iostream> // For console i/o #include <vector> // For vectors, not including linear algebra #include <random> // For random number generation // Additional library that you should have downloaded. #include <Eigen/Dense> #include "HelperFunctions.hpp" // General functions not specific to our Seldonian algorithm implementation using namespace std; // To avoid writing std::vector all the time using namespace Eigen; // To avoid writing Eigen::VectorXd all the time // We will place code for the Seldonian algorithm here. // Entry point for the program int main(int argc, char* argv[]) { mt19937_64 generator(0); // Create the random number generator to use, with seed zero unsigned int numPoints = 5000; // Let's use 5000 points vector<Point> Data = generateData(numPoints, generator); // Generate the data // Create the behavioral constraints  each is a gHat function and a confidence level delta. Put these in vector objects, one element per constraint vector<VectorXd(*)(const VectorXd&, const vector<Point>&)> gHats(2); // The array of gHat functions to use gHats[0] = gHat1; // This gHat requires the MSE to be less than 2.0 gHats[1] = gHat2; // This gHat requires the MSE to be at least 1.25 vector<double> deltas(2, 0.1); // The array of confidence levels, delta, to use. Initialize with two values, both equal to 0.1. pair<VectorXd, bool> result = QSA(Data, gHats, deltas, generator); // Run the Seldonian algorithm. if (result.second) cout << "A solution was found: " << result.first.transpose() << endl; else cout << "No solution found." << endl; }
The code above includes the Eigen library and HelperFunctions.hpp (which in turn includes Boost). The function main(...)
is set up to run a simple experiment. Read through the function main, which relies on some things that we will need to write. Point
will be an object that holds one data point, generateData
will be a function that generates a data set for us to run the algorithm on, gHat1
will be \(\hat g_1\), gHat2
will be \(\hat g_2\), and QSA
will be our quasiSeldonian algorithm. The pair of objects returned by QSA
is the solution (first element) and a Boolean flag indicating whether a solution was found (second element). The first element is only guaranteed to be set if the second element is true (if a solution was found).
We now implement the problem that we defined earlier in the tutorial [go back]. The following functions should all be placed above the function int main(...)
. First, we define a Point
object that will contain a single data point:
// We will store individual data points in this object struct Point { double x; // Input value double y; // Output value };
Next, we write the generateData
function, which samples data as described in the problem description:
// Generate numPoints data points. Here generator is the random number generator to use vector<Point> generateData(int numPoints, mt19937_64 & generator) { vector<Point> result(numPoints); // Create the vector of data points that we will return, of length numPoints normal_distribution<double> d(0, 1); // Create a standard normal distribution (mean 0, variance 1) for (Point& p : result) { // Loop over each point p in result p.x = d(generator); // Sample x from a standard normal distribution p.y = p.x + d(generator); // Set y to be x, plus noise from a standard normal distribution } return result; }
Next, we write the function that takes in a solution \(\theta\) and an input \(X\), and produces as output the prediction of \(Y\), i.e., this function implements \(\hat y(X,\theta)\):
// Using the weights in theta, predict the yvalue associated with the provided x. // This function assumes we are performing linear regression, so that theta has // two elements, the slope (first parameter) and yintercept (second parameter) double predict(const VectorXd & theta, const double & x) { return theta[0] * x + theta[1]; }
Next, we write the function \(\hat f\), which specifies the primary objective: minimize the sample mean squared error. Since we attempt to maximize \(\hat f\), we return the negative sample mean squared error, so that maximizing \(\hat f\) corresponds to minimizing the sample mean squared error:
// Estimator of the primary objective, in this case, the negative sample mean squared error double fHat(const VectorXd& theta, const vector<Point> & Data) { double result = 0, prediction; // We will store the sample MSE in result. Prediction will store the prediction for each point in the data set for (const Point& p : Data) { // Loop over points p in the data set (Data) prediction = predict(theta, p.x); // Get the prediction using theta result += (prediction  p.y) * (prediction  p.y); // Add the squared error to result } result /= (double)Data.size(); // We want the sample mean squared error, not the sum of squared errors, so divide by the number of samples return result; // Return the value that we have computed }
Next, we write the functions \(\hat g_1\) and \(\hat g_2\), which will be provided as input to the Seldonian algorithm. Recall from the Simple Problem tutorial that for every \(i\) and \(j\): $$ \hat g_{1,j}(\theta,D) = (\hat y(X_j,\theta)Y_j)^22.0, $$ and $$ \hat g_{2,j}(\theta,D) = 1.25(\hat y(X_j,\theta)Y_j)^2. $$ Compare the above definitions to the code below.
// Returns unbiased estimates of g_1(theta), computed using the provided data VectorXd gHat1(const VectorXd& theta, const vector<Point>& Data) { VectorXd result(Data.size()); // We will get one estimate per data point, so initialize the result to have length equal to the number of data points for (unsigned int i = 0; i < Data.size(); i++) { // Loop over the data points double prediction = predict(theta, Data[i].x); // Compute the prediction for the i'th data point result[i] = (Data[i].y  prediction) * (Data[i].y  prediction); // Compute the squared error for the i'th data point, and store in the i'th element of result. } result.array() = 2.0; // We want the MSE to be less than 2.0, so g(theta) = MSE2.0. return result; // Return the result that we have computed } // Returns unbiased estimates of g_2(theta), computed using the provided data VectorXd gHat2(const VectorXd& theta, const vector<Point>& Data) { VectorXd result(Data.size()); // We will get one estimate per data point, so initialize the result to have length equal to the number of data points for (unsigned int i = 0; i < Data.size(); i++) { // Loop over the data points double prediction = predict(theta, Data[i].x); // Compute the prediction for the i'th data point result[i] = (Data[i].y  prediction) * (Data[i].y  prediction); // Compute the squared error for the i'th data point, and store in the i'th element of result. } result.array() = 1.25  result.array(); // We want the MSE to be at least 1.25, so g(theta) = 1.25MSE. return result; // Return the result that we have computed }
Finally, later we will want the ordinary leastsquares solution to use as a starting point during the search for a candidate solution. This code implements least squares linear regression:
// Run ordinary least squares linear regression VectorXd leastSquares(const vector<Point> & Data) { // Put data into an input matrix X (Data.size() rows, 2 cols), and vector y (of Data.size()). MatrixXd X = MatrixXd::Ones(Data.size(), 2); // Initialize X to be a matrix of Data.size() rows and 2 cols, filled with ones. VectorXd y(Data.size()); // Initialize y to be a vector of length Data.size() for (unsigned int i = 0; i < Data.size(); i++) { // Loop over data points X(i, 0) = Data[i].x; // Copy the xvalue over the entry in the first colum of the i'th row. y[i] = Data[i].y; // Copy the target value into the yvector } return X.jacobiSvd(ComputeThinU  ComputeThinV).solve(y); // Return the least squares solution using Eigen's Jacobi SVD function. }
We now have all of the libraries that we need and all of the functions to implement the problem that we specified earlier. Now we're ready to start writing our Seldonian algorithm! From here it's easy—by linecount, you've already written most of the code.