AI Safety · Tutorial 6

Candidate Selection

Recall the pseudocode for computing the candidate solution:

2. Candidate Selection: Use a black-box optimization algorithm to compute $\theta_c$ that approximates a solution to: $$ \theta_c \in \arg\max_{\theta \in \Theta} \hat f(\theta,D_1)\\ \text{s.t. } \forall i \in \{1,2,\dotsc,n\}, \quad \hat \mu(\hat g_i(\theta,D_1)) + 2\frac{\hat \sigma(\hat g(\theta,D_1))}{\sqrt{|D_2|}}t_{1-\delta_i,|D_2|-1} \leq 0 $$

We will use CMA-ES to search for the candidate solultion. However, CMA-ES is not a constrained algorithm. We will therefore incorporate the constraint into the objective function as a barrier function. That is, we will find an approximate solution to the unconstrained problem:

$$ \theta_c \in \arg \max_{\theta \in \mathbb R^2} \begin{cases} \hat f(\theta,D_1) &\mbox{if } \forall i \in \{1,2,\dotsc,n\}, \,\, \operatorname{ub}_i \leq 0\\ -100,\!000 - \sum_{i=1}^n \operatorname{max}\left(0,\operatorname{ub}_i\right ) &\mbox{otherwise,} \end{cases} $$ where $\operatorname{ub}_i$ is the conservative prediction of what the upper bound will be in the safety test for the $i^\text{th}$ constraint: $$ \operatorname{ub}_i=\hat \mu(\hat g_i(\theta,D_1) + 2\frac{\hat \sigma(\hat g(\theta,D_1))}{\sqrt{|D_2|}}t_{1-\delta_i,|D_2|-1}. $$

Here $-100,\!000$ is a constant small enough to ensure that $\hat f$ will be larger for any solution that is predicted to pass the safety test, and we include the term $\sum_{i=1}^n \max(0,\operatorname{ub}_i)$ to shape the barrier function to encourage CMA-ES to tend towards solutions that will pass the safety test.

To implement candidate selection in this way, we begin by writing the objective function that CMA-ES will attempt to maximize. Recall that CMA-ES is a black-box optimization algorithm. It does not know that our objective function depends on the data, the $\hat g_i$ functions, the values of each $\delta_i$, and the size of $D_2$ (the size of the safety data set). So, these terms will be passed within the "params" argument. CMA-ES only knows that we are passing an array of pointers to constant objects. Here inside of our objective function, we will unpack this array of pointers to constant objects of unknown types into pointers to objects of the types we need. When we call CMA-ES later, we will need to pack these relevant objects into the params object in the same order/way that we are unpacking them here.

// The objective function maximized by getCandidateSolution.
double candidateObjective(
  const VectorXd& theta,                // The solution to evaluate
  const void * params[],                // Other terms that we need to compute the objective value, packed into one object - an array of const pointers to objects of unknown types (unknown to CMA-ES, but known to us here as we packed this object)
  mt19937_64& generator)                // The random number generator to use
{
  // Unpack the variables in params into their different types. See how they were packed in getCandidateSolution
  const vector<Point>* p_Data = (const vector<Point>*)params[0];            // The first pointer in params is a pointer to the data.
  const vector<VectorXd(*)(const VectorXd&, const vector<Point>&)>* p_gHats = (const vector<VectorXd(*)(const VectorXd&, const vector<Point>&)>*)params[1]; // The second pointer in params is a pointer to the std::vector of pointers to the gHat functions.
  const vector<double>* p_deltas = (const vector<double>*)params[2];        // The third pointer in params is a pointer to std::vector of delta_i values.
  const unsigned int safetyDataSize = *((const unsigned int*)params[3]);    // The fourth pointer in params is a pointer to the number of points in the safety data set.

  double result = fHat(theta, *p_Data); // Get the primary objective
  bool predictSafetyTest = true;        // Prediction of what the safety test will say - initialized to "true" = pass.
  for (unsigned int i = 0; i < p_gHats->size(); i++) {                      // Loop over the behavioral constraints
    double ub = ttestUpperBound((*p_gHats)[i](theta, *p_Data), (*p_deltas)[i], safetyDataSize); // Get the prediction of what the upper bound on g_i(theta) will be in the safety test.
    if (ub > 0) {                       // We don't think the i'th behavioral constraint will pass the safety test if we return theta as the candidate solution
      if (predictSafetyTest) {
        predictSafetyTest = false;      // We don't think the safety test will pass
        result = -100000;               // Put a barrier in the objective - any solution that we think will pass all tests should have a value greater than what we return for this solution. Also, remove any shaping due to the primary objective so that we focus on the behavioral constraint.
      }
      result -= ub;                     // Add a shaping to the objective function that will push the search toward solutions that will pass the prediction of the safety test.
    }
  }
  return result;
}

In the above code, the computation of the variable ub can be scary looking. Let's look at this line in more detail. p_gHats is a pointer to an array of function pointers (our $\hat g_i$ functions). The code (*p_gHats)[i] is getting the $i^\text{th}$ function pointer—the pointer to the function gHat1 if i=1 and gHat2 if i=2. This function pointer is then called on the inputs theta, *p_Data, i.e., we are computing $\hat g_i(\theta,D_1)$, where p_Data is the pointer to $D_1$, which we are dereferencing. The result of this call is a vector of unbiased estimates of $g_i(\theta)$. This vector is then the first argument to the call to ttestUpperBound. The second argument is $\delta_i$, and the third is $|D_2|$.

Now that we have our candidate objective function, we can write getCandidateSolution, which uses CMA-ES to search for a solution that maximizes candidateObjective. Within getCandidateSolution we will have to pack the object params in a way that is consistent with how we unpack it in candidateObjective.

// Use the provided data to get a solution expected to pass the safety test
VectorXd getCandidateSolution(const vector<Point> & Data, const vector<VectorXd(*)(const VectorXd&, const vector<Point>&)>& gHats, const vector<double>& deltas, const unsigned int& safetyDataSize, mt19937_64 & generator) {
  VectorXd initialSolution = leastSquares(Data);    // Where should the search start? Let's just use the linear fit that we would get from ordinary least squares linear regression
  double initialSigma = 2.0*(initialSolution.dot(initialSolution) + 1.0); // A heuristic to select the width of the search based on the weight magnitudes we expect to see.
  int numIterations = 100;                          // Number of iterations that CMA-ES should run. Larger is better, but takes longer.
  bool minimize = false;                            // We want to maximize the candidate objective.
  // Pack parameters of candidate objective into params. In candidateObjective we need to unpack in the same order.
  const void* params[4];
  params[0] = &Data;
  params[1] = &gHats;
  params[2] = &deltas;
  params[3] = &safetyDataSize;
  // Use CMA-ES to get a solution that approximately maximizes candidateObjective
  return CMAES(initialSolution, initialSigma, numIterations, candidateObjective, params, minimize, generator);
}

That's it! If you compile and run the program, you should get the following console output: A solution was found: 1.28802 0.737138. While an exciting result, it doesn't look very exciting. In the next tutorial we show how you can change the code in int main(...) to repeatedly run your QSA algorithm using different amounts of data and printing results to files that can be used to create plots like Fig. 3 in our paper.

The final main.cpp file with all of the code up until this point can be downloaded here:
Download main.cpp