AI Safety · Tutorial 5

Safety Test

Before we implement the safety test, let us write a shell for our quasi-Seldonian algorithm, which we will call QSA. This shell code shows how the safety test is used. At a high level, we simply partition the data, get a candidate solution, and run the safety test.

Notice that in the code below we use more descriptive names: candidateData for $D_1$ and safetyData for $D_2$.
Notice also that we are placing 40% of the data in candidateData and 60% in safetyData. This is an arbitrary choice and it remains an open question how best to optimize this partitioning of the data.
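To make the partition concrete, here is a minimal plain-Python sketch of an unshuffled 40/60 split. The helper name split_unshuffled is hypothetical, and the rounding rule (safety set rounded up, candidate set gets the rest) is an assumption about how such a split is usually sized. The point is only that, without shuffling, the first block of points becomes candidateData and the remaining points become safetyData.

```python
import math

def split_unshuffled(X, Y, candidate_fraction=0.40):
    """Hypothetical helper: first block -> candidate data, remainder -> safety data."""
    n = len(X)
    # Assumption: the safety set size is rounded up and the candidate set gets the rest.
    n_safety = math.ceil((1 - candidate_fraction) * n)
    n_candidate = n - n_safety
    return X[:n_candidate], X[n_candidate:], Y[:n_candidate], Y[n_candidate:]

X = list(range(10))
Y = [2 * x for x in X]
candidate_X, safety_X, candidate_Y, safety_Y = split_unshuffled(X, Y)
print(candidate_X)  # [0, 1, 2, 3]
print(safety_X)     # [4, 5, 6, 7, 8, 9]
```

Because the order of the data is preserved, any ordering in the original data set (for example, sorting by time) carries over into the two partitions.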

from sklearn.model_selection import train_test_split

# Our Quasi-Seldonian linear regression algorithm operating over data (X,Y).
# The pair of objects returned by QSA is the solution (first element) 
# and a Boolean flag indicating whether a solution was found (second element).
def QSA(X, Y, gHats, deltas):
  # Put 40% of the data in candidateData (D1), and the rest in safetyData (D2)
  candidateData_len = 0.40
  candidateData_X, safetyData_X, candidateData_Y, safetyData_Y = train_test_split(
                X, Y, test_size=1-candidateData_len, shuffle=False)
  
  # Get the candidate solution (it is also given the number of elements in safetyData_X)
  candidateSolution = getCandidateSolution(candidateData_X, candidateData_Y, gHats, deltas, safetyData_X.size)

  # Run the safety test
  passedSafety      = safetyTest(candidateSolution, safetyData_X, safetyData_Y, gHats, deltas)

  # Return the result and success flag
  return [candidateSolution, passedSafety]

Now recall the pseudocode for the safety test:

3. Safety Test: Return $\theta_c$ if $$ \forall i \in \{1,2,\dotsc,n\}, \quad \hat \mu(\hat g_i(\theta_c,D_2)) + \frac{\hat \sigma(\hat g_i(\theta_c,D_2))}{\sqrt{|D_2|}}t_{1-\delta_i,|D_2|-1} \leq 0, $$ and No Solution Found (NSF) otherwise.
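As a worked instance of this inequality for a single constraint, the sketch below evaluates the left-hand side on five made-up values of $\hat g_i(\theta_c,D_2)$. It assumes $\hat \sigma$ is the sample standard deviation with Bessel's correction, and it takes the Student t quantile as an argument; the value 2.132 for $t_{0.95,4}$ is read from a standard t-table. The function name and the data are illustrative only.

```python
import math
import statistics

def t_upper_bound(g_samples, t_quantile):
    """Sample mean + (sample std / sqrt(n)) * t quantile, as in the safety test."""
    n = len(g_samples)
    mean = statistics.fmean(g_samples)
    sigma = statistics.stdev(g_samples)  # uses the n - 1 denominator
    return mean + sigma / math.sqrt(n) * t_quantile

# Made-up estimates of g(theta_c) on a safety set of size |D2| = 5
g_samples = [-1.0, -0.5, -1.5, -1.0, -1.0]

# delta = 0.05 and |D2| - 1 = 4 degrees of freedom: t_{0.95,4} is about 2.132
bound = t_upper_bound(g_samples, t_quantile=2.132)
print(round(bound, 4))  # -0.6629
print(bound <= 0.0)     # True: this constraint would pass
```

Here the sample mean is -1.0 and the confidence term adds roughly 0.337, so the bound stays below zero.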

Given the helper functions that we already have, this function is straightforward to write:

# Run the safety test on a candidate solution. Returns True if the test is passed.
#   candidateSolution: the solution to test. 
#   (safetyData_X, safetyData_Y): data set D2 to be used in the safety test.
#   (gHats, deltas): vectors containing the behavioral constraints and confidence levels.
def safetyTest(candidateSolution, safetyData_X, safetyData_Y, gHats, deltas):

  for i in range(len(gHats)):  # Loop over behavioral constraints, checking each
    g         = gHats[i]  # The current behavioral constraint being checked
    delta     = deltas[i] # The confidence level of the constraint

    # This is a vector of unbiased estimates of g(candidateSolution)
    g_samples = g(candidateSolution, safetyData_X, safetyData_Y) 

    # Check if the i-th behavioral constraint is satisfied
    upperBound = ttestUpperBound(g_samples, delta) 

    if upperBound > 0.0: # If the current constraint was not satisfied, the safety test failed
      return False

  # If we get here, all of the behavioral constraints were satisfied      
  return True
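The following self-contained sketch runs the same loop on toy data to show why the test checks a confidence bound and not just the sample mean. Everything in it is hypothetical: the linear model in predict, the constraint factory g_mse_below, the data, and the choice to pass precomputed t quantiles in place of calling ttestUpperBound with deltas. It assumes each constraint returns one value per safety point.

```python
import math
import statistics

def predict(theta, x):
    # Hypothetical linear model: theta = (intercept, slope)
    return theta[0] + theta[1] * x

def g_mse_below(threshold):
    # Hypothetical constraint g(theta) = MSE(theta) - threshold.
    # Each returned value is one point's squared error minus the threshold.
    def g(theta, X, Y):
        return [(predict(theta, x) - y) ** 2 - threshold for x, y in zip(X, Y)]
    return g

def t_upper_bound(g_samples, t_quantile):
    # Stand-in for ttestUpperBound, with the t quantile supplied by the caller
    n = len(g_samples)
    return (statistics.fmean(g_samples)
            + statistics.stdev(g_samples) / math.sqrt(n) * t_quantile)

def safety_test_sketch(theta, X, Y, g_hats, t_quantiles):
    # Fail as soon as any constraint's upper bound exceeds zero
    for g, tq in zip(g_hats, t_quantiles):
        if t_upper_bound(g(theta, X, Y), tq) > 0.0:
            return False
    return True

theta = (0.0, 1.0)
X = [0.0, 1.0, 2.0, 3.0, 4.0]
Y = [0.2, 0.9, 2.1, 2.7, 4.0]
tq = 2.132  # t_{0.95,4} from a standard t-table (delta = 0.05, five points)

loose_only = safety_test_sketch(theta, X, Y, [g_mse_below(2.0)], [tq])
with_tight = safety_test_sketch(
    theta, X, Y, [g_mse_below(2.0), g_mse_below(0.05)], [tq, tq])
print(loose_only)  # True
print(with_tight)  # False
```

The sample MSE here is about 0.03, which is below the tighter threshold of 0.05, yet the upper bound on that constraint is about 0.015 above zero, so the test fails. With so few points the data do not support the constraint at the requested confidence.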

We're almost there. All that's left is the function getCandidateSolution!