Policy Primer: The Limits of Thresholds
AUTHORS
Cohere For AI team
ABSTRACT
Prominent AI governance frameworks around the world have specified thresholds based on the amount of computing power used to train an AI model, measured in floating-point operations (FLOPs). Models that exceed these thresholds are assumed to pose a level of risk that warrants additional reporting and scrutiny. However, the appropriateness of this approach is under debate in the scientific community, in response to growing evidence that increased training compute does not necessarily equate to increased risk. This is due to several factors:

(1) Model capabilities and performance are shaped by factors beyond training compute, including data quality, downstream model optimization techniques, and algorithmic architectures.

(2) The risks associated with AI models depend on factors that training compute measures do not capture, such as the characteristics of the datasets used in model training, the deployment context, and safety optimization.

Further complexities compound the limitations of compute-based thresholds, including technical uncertainty about how FLOPs should be calculated, and the fact that current training compute thresholds are unlikely to be met by any existing model, meaning that immediate and near-term risks may be overlooked.

The limitations of compute-based thresholds may have the following consequences, which hinder the ultimate goal of managing AI risk (in no particular order, and not an exhaustive list):

- Incentives could be created for model developers to game compute thresholds rather than meaningfully address risks.
- Regulatory scrutiny could fall disproportionately on models over the threshold that may pose no greater risk than models under it.
- The resources and capacity of those working on AI safety could be diverted away from near-term, real-world risks.

To address these limitations, policymakers could consider: adopting alternative or complementary approaches to assessing which AI models should face greater scrutiny; developing dynamic rather than static thresholds; and more clearly defining approaches to calculating and measuring FLOPs.

This policy primer provides an overview of the evidence on the limitations of compute-based thresholds, to support policymakers implementing risk-based governance of AI models. The primer references technical concepts that are explored in greater depth in an essay by the Head of Cohere For AI, Sara Hooker.
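To illustrate why the calculation of FLOPs is itself a source of ambiguity, the sketch below applies one common back-of-the-envelope estimate, in which training compute scales as roughly six FLOPs per parameter per training token (C ≈ 6ND). The threshold value, the approximation itself, and the example model are illustrative assumptions, not a definitive accounting method; real estimates vary with architecture, numerical precision, and which operations are counted.

# A minimal sketch, assuming the widely used "6ND" approximation for
# training compute: C ~ 6 * N (parameters) * D (training tokens).
# The 1e26 threshold below reflects a figure cited in some governance
# frameworks; both it and the approximation are illustrative assumptions.

def estimate_training_flops(num_parameters: float, num_tokens: float) -> float:
    """Estimate total training compute in FLOPs via C ~ 6 * N * D."""
    return 6.0 * num_parameters * num_tokens

def exceeds_threshold(flops: float, threshold: float = 1e26) -> bool:
    """Check whether an estimated compute budget crosses a threshold."""
    return flops >= threshold

if __name__ == "__main__":
    # Hypothetical model: 70 billion parameters trained on 2 trillion tokens.
    flops = estimate_training_flops(70e9, 2e12)
    print(f"Estimated training compute: {flops:.2e} FLOPs")  # ~8.40e+23
    print(f"Exceeds 1e26 FLOPs threshold: {exceeds_threshold(flops)}")

Even this toy calculation shows how much rides on convention: changing the counting rule, the token count, or the threshold by a small constant factor can move a model from one side of a regulatory line to the other without any change in its actual risk profile.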