Extract from Matthew G. White, Alexander F. Koskey, Madison “MJ” McMahan’s article “California’s New Generative AI Law – What Your Organization Needs to Know”
California is making waves with its new AI law, Assembly Bill 2013 (AB 2013), set to take effect in 2026. This groundbreaking legislation (again) puts the state at the forefront of tech regulation by tackling one of AI’s biggest challenges: the “black box” problem. AB 2013 demands transparency, requiring AI companies to disclose detailed information about the data they use to train their generative models, shedding light on a previously hidden layer of machine learning. With this bold move, California is leading the charge for accountability in artificial intelligence, pressing developers to explain what’s going on inside their systems.
Unpacking the Basics: Algorithms, Training Data, and Models
To understand the implications of AB 2013, it is useful to understand the fundamentals of how machine learning works. At its core, machine learning consists of three primary components: (1) an algorithm (or set of algorithms); (2) training data; and (3) the resulting model. The algorithm is essentially a set of instructions or procedures that can be fine-tuned to find patterns. During the training phase, the algorithm is fed a vast array of examples – known as training data – which allows it to recognize these patterns on its own. Once this training phase is complete, the outcome is a machine-learning model. This model is what users actually interact with; it’s the tool that applies the algorithm’s learned patterns to real-world data.
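The three components described above can be sketched in a few lines of code. This is an illustrative toy example, not anything drawn from the article or the statute: the "algorithm" is ordinary least-squares line fitting, the "training data" is a handful of (x, y) examples, and the "model" is the fitted function a user would actually call.

```python
# Toy illustration of the machine-learning triad (illustrative only):
# (1) the algorithm, (2) the training data, (3) the resulting model.

def fit_line(data):
    """The *algorithm*: ordinary least squares for y = a*x + b."""
    n = len(data)
    sx = sum(x for x, _ in data)
    sy = sum(y for _, y in data)
    sxx = sum(x * x for x, _ in data)
    sxy = sum(x * y for x, y in data)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    # The *model*: a function applying the learned pattern to new inputs.
    return lambda x: a * x + b

# The *training data*: the examples the algorithm learns from. Any bias
# or error in these examples is baked directly into the model.
training_data = [(1, 2.1), (2, 3.9), (3, 6.0), (4, 8.1)]

model = fit_line(training_data)
print(model(5))  # the model generalizes the learned pattern to unseen input
```

The point of the sketch is the one AB 2013 turns on: users only ever see the model, while the training data that shaped its behavior stays out of view unless it is disclosed.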
AB 2013 focuses primarily on the training data piece of this triad. Since training data is fundamental to a model’s behavior, any hidden biases or issues in the data directly impact the resulting model, often in ways that are hard to detect or understand. Under AB 2013, developers will need to disclose extensive details about their training data, including its sources, types, and whether it includes copyrighted or sensitive information. This type of documentation offers insight into how models are shaped by the data they’re built on – and turns that black box into a somewhat clearer shade of gray.
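To make the disclosure idea concrete, here is a hypothetical sketch of what such training-data documentation might look like in machine-readable form. The field names and the completeness check are my own illustration of the categories the article mentions (sources, types, copyrighted or sensitive content), not the statute's actual terminology or format.

```python
# Hypothetical training-data disclosure record (field names illustrative,
# not statutory) reflecting the categories AB 2013 contemplates.
training_data_disclosure = {
    "sources": ["licensed news archive", "public web crawl"],
    "data_types": ["text"],
    "contains_copyrighted_material": True,
    "contains_personal_information": False,
}

# Fields a reviewer might treat as mandatory (again, illustrative only).
REQUIRED_FIELDS = [
    "sources",
    "data_types",
    "contains_copyrighted_material",
    "contains_personal_information",
]

def is_complete(disclosure, required_fields):
    """Check that a disclosure record covers every required field."""
    return all(field in disclosure for field in required_fields)

print(is_complete(training_data_disclosure, REQUIRED_FIELDS))  # True
```

Structured documentation like this is what turns the black box into that "clearer shade of gray": it doesn't reveal the model's internals, but it records what the model was built from.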