Black Box

When developers can’t explain how their AI works, the laws of robotics no longer apply

In 1942, the writer Isaac Asimov introduced a code of fictional laws so elegant and compact that their relevance in subsequent science fiction literature has scarcely ebbed. The Three Laws of Robotics—also known in the real world as Asimov’s Laws—are a built-in safety feature governing the behavior of nearly all artificial intelligences in Asimov’s substantial canon. The laws, in order of priority, dictate that a robot must not harm humans, must obey human orders, and must protect its own existence; together, they allow the AIs of Asimov’s stories to safely interact with humans and, crucially, to earn humans’ trust.

In interviews, Asimov strongly implied not only that the laws were essentially self-evident and intuitive, but that they could ideally be applied to humans as well. It’s not hard to picture this, at least in theory. Humans are notoriously rule-happy, even among social animals, and their rule systems proliferate in tandem with the size and complexity of their social groups. In many of these systems, the ability to follow the rules is itself at least as valuable as the system’s purported goal. Legal codes, for example, function as prescriptive rule-based systems in that violations of the law are themselves cause for punishment, with only limited regard for how the violations do or don’t contribute to the outcome of justice. Inexpert insider trading that earns the trader no profit is still criminal; even ignorance of the law and lack of malicious intent are no excuse for violations. The laws themselves have undeniable value: like Asimov’s Laws, they foster trust among members of an interdependent society.

Those with high-stakes decision-making power over other humans must demonstrate even more rigorous adherence to the codes that authorize their judgments. Without these rules—even assuming a general shared aspiration toward benevolence or some other goal—the objects of such judgment have no recourse for justice when they are wronged, and the agents of judgment, human or machine, cannot be held accountable for errors or corruption. But in an age where deep machine learning and neural networks are slowly overtaking human judgment in key social arenas, artificial intelligences are bypassing principled behavior altogether, pursuing their human developers’ goals according to rules they discover and refine on their own. This method unleashes the full force of AI power to collect, identify, and analyze data at processing rates far beyond the capacity of the most assiduous of human brains. It also cloaks deeply important decision-making at unprecedented, systemic scales in a mythical opacity that developers themselves cannot penetrate.

The way machines have traditionally “judged” under the old computing model is uniformly deductive. Programmers communicate with software via direct commands, which amount to a series of hardline rules for operation, dependent on context. Give a program an input—a search term, a formatting rule, a statistical figure—and the program will pore over its internal rulebook to return an acceptable output. Since each rule is deliberately constructed by the programmer, it can be retrieved, examined, and altered in order to fix an error, simplify a process, or adapt to a new purpose altogether.

Programming an artificial intelligence requires a rulebook, too—but only the first chapter is written by a human. The hallmark of AI programs is the ability to learn directly from raw data sets (hence “machine learning”), as well as from the outcome of existing rules, in order to extract new sets of rules that will further refine results in pursuit of a goal. This model closely mirrors the familiar ways in which animals, children, and even some plants learn from data and develop internal, often unconscious, sets of rules about how the world works. Toddlers amass enormous amounts of rule-based knowledge about their surroundings by interpreting the results of their earnest experiments on everything in sight. A pre-verbal child who cannot say or recognize words like up or down will still peer at the floor when her spoon falls from her high chair, following a rule she can only have learned by induction—from careful and repeated observation of her world. Complex sets of rules, like the syntax of her first language, will require that the child collect several years of experimental data before she can reliably express herself. Even as a literate adult, she will likely be unable to identify the subtle laws she is obeying when she describes the big red barn, never the red big barn—but obey them she will.
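The contrast between a rule written by hand and a rule learned by induction can be made concrete with a toy sketch. Everything in it is hypothetical: the function names, the temperature readings, and the fever labels are invented for illustration, and the learner is a deliberately tiny stand-in for real machine learning methods, which are far larger.

```python
# Toy sketch only: all names and data below are invented for illustration.

def handwritten_rule(temperature_c: float) -> str:
    """A rule a programmer wrote down; it can be read, audited, and edited."""
    return "fever" if temperature_c >= 38.0 else "normal"


def learn_threshold(examples: list[tuple[float, str]]) -> float:
    """Induce a cutoff from labeled examples.

    Tries the midpoint between each pair of neighboring readings and keeps
    the one that misclassifies the fewest examples.
    """
    values = sorted({temp for temp, _ in examples})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cut: float) -> int:
        return sum(
            ("fever" if temp >= cut else "normal") != label
            for temp, label in examples
        )

    return min(candidates, key=errors)


# Hypothetical observations: nobody tells the learner where the line is.
observations = [
    (36.5, "normal"), (36.9, "normal"), (37.4, "normal"),
    (38.2, "fever"), (38.8, "fever"), (39.5, "fever"),
]

learned_cut = learn_threshold(observations)
print(f"hand-written cutoff: 38.0, learned cutoff: {learned_cut:.2f}")
```

A single learned number like this is still easy to inspect. Deep neural networks typically hold millions of learned values with no individual meaning, so the same inspection stops being practical.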

This impenetrability of rule systems is exactly what makes machine learning so powerful, and so frightening. Today’s most exciting AI projects are often meant to replace human reasoning in much more impactful spheres than the minutiae of English syntax. The Top Stories feature run by Google’s AI spread fake news about the gunman who massacred country music festival attendees in Las Vegas in October 2017. Facebook’s advertising algorithms, designed to cull spam and individualize marketing, have famously promoted hate speech and organizations known to incite violence against minorities. Palantir’s predictive policing AI was accused of replicating and even magnifying racial bias during its secretive testing in New Orleans; in China’s Xinjiang province, home to the persecuted Uyghur ethnic minority group, AI will soon analyze vast troves of personal data from citizens’ social activities to predict and preempt acts of defiance against the state.

The risks associated with turning over such enormously important tasks to artificial decision-making are sobering. As in the case of Palantir and Chinese predictive policing, an AI that uses records of biased human judgments (like arrest records in countries known for racial profiling) as learning fodder is highly likely to replicate those biases, only now on system-wide scales. Even when the learning data are presumed to be objective, AI analysis suffers from its inability to articulate any internal reasoning when its methods are probed. At New York’s Mount Sinai Hospital, researchers have trained an AI called Deep Patient to scan electronic health records and predict the onset of a wide range of diseases with impressive accuracy. But as soon as the machine’s diagnostic power outstrips the capacity of human doctors, its recommendations are suddenly impossible to justify and, therefore, impossible to trust. While describing Deep Patient’s mysterious ability to predict the notoriously unforeseeable onset of schizophrenia in at-risk patients, the lead Mount Sinai researcher Joel Dudley lamented, “We can build these models, but we don’t know how they work.”
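How biased records turn into biased predictions can be shown with a deliberately simple sketch. The numbers are invented, the neighborhoods are hypothetical, and the "model" is nothing more than a per-neighborhood arrest frequency; it is not a description of any real policing system. The sketch assumes two neighborhoods with the same underlying offense rate, one of which was patrolled twice as heavily and so produced twice as many recorded arrests.

```python
# Toy sketch only: neighborhoods, counts, and the "model" are all invented.
from collections import Counter

# (neighborhood, arrested) records. Assumption: A and B have the same true
# offense rate, but B was patrolled twice as heavily, so twice as many
# offenses there ended up in the records.
records = (
    [("A", True)] * 10 + [("A", False)] * 90
    + [("B", True)] * 20 + [("B", False)] * 80
)


def train_risk_model(rows):
    """Learn a 'risk score' per neighborhood as its recorded arrest rate."""
    totals = Counter(name for name, _ in rows)
    arrests = Counter(name for name, arrested in rows if arrested)
    return {name: arrests[name] / totals[name] for name in totals}


risk = train_risk_model(records)

# Allocating future patrols in proportion to learned risk sends more patrols
# to B, which would generate still more recorded arrests there.
total_risk = sum(risk.values())
patrol_share = {name: score / total_risk for name, score in risk.items()}

print(risk)
print(patrol_share)
```

The model never sees how the records were collected, so it has no way to tell a heavily patrolled neighborhood from a dangerous one.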

Because the opacity of machine learning algorithms is so inextricable from their utility, any attempt to force these AIs into some version of ethical or legal conformity must sacrifice some amount of computing power in the name of accountability. Forbidding the use of training data tainted with human bias is a logical starting point; beyond that, only strict external regulation in the form of mandatory accountability measures could hope to intervene. Such control could only issue from an international authority, possibly one that does not yet exist. As daunting a project as that may seem, its undertaking is of utmost importance. If humans are to benefit from the untapped potential of machine learning technologies, we will soon be forced to reconcile the jarring distinctions between AI and human intelligence, or else consent to a world in which systemic injustice is an accepted price for efficiency.