Above: Designing new information systems | Photo by Cecilie Arcurs
A-I, yi, yi. AI is everywhere, it seems, and into everything.
Artificial Intelligence—Siri on your phone, Alexa in your kitchen, self-driving vehicles on the highway. Machines are outsmarting us, and it’s just not fair, right?
It isn’t fair, critics say, pointing to examples of unfairness and bias where artificial intelligence and its underlying technology, machine learning, are used to simplify or automate human decision-making.
A program called COMPAS is often cited as a prime offender. COMPAS—for Correctional Offender Management Profiling for Alternative Sanctions—is a software tool used to predict recidivism in offenders that’s been used more than 2 million times since 2000. A 2016 ProPublica investigation found it was accurate only 61 percent of the time and that “Blacks are almost twice as likely as whites to be labeled a higher risk but not actually re-offend.”
COMPAS algorithms made “the opposite mistake among whites,” ProPublica found. “They are much more likely than Blacks to be labeled lower-risk but go on to commit other crimes.”
Garbage in, garbage out
A well-known computer science maxim—“garbage in, garbage out”—is especially relevant to big data. If a system’s inputs are skewed by selection bias, institutional bias, or societal bias, its outputs will be skewed too.
Amazon discovered it had a problem when it built an AI tool to review and rank job candidates. The company had trained the system on résumés of previous successful hires, but that data set was 95 percent male, reflecting who already worked at Amazon, and the tool routinely rejected female candidates. Amazon shut the system down.
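The garbage-in, garbage-out dynamic can be sketched in a few lines of code. The numbers below are toy values, not Amazon’s actual data: a model that simply learns hiring rates from a biased history reproduces that bias exactly.

```python
import random

random.seed(42)

# Toy "historical hires" data set. The past process was biased:
# women needed a much higher skill score to get hired.
history = []
for _ in range(10_000):
    gender = "M" if random.random() < 0.5 else "F"
    skill = random.random()
    cutoff = 0.5 if gender == "M" else 0.8   # the historical bias
    history.append((gender, skill, skill > cutoff))

# A naive model that simply learns P(hired | gender) from history.
def hire_rate(gender):
    rows = [hired for g, _, hired in history if g == gender]
    return sum(rows) / len(rows)

print(f"learned score for men:   {hire_rate('M'):.2f}")
print(f"learned score for women: {hire_rate('F'):.2f}")
```

Equally skilled candidates get very different scores depending on gender alone, because the model faithfully encodes the bias in its training data.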
In October 2021, whistleblower Frances Haugen, a former Facebook employee, made extraordinary assertions that Facebook’s algorithms harm teens’ mental health and amplify misinformation, hate speech, and even ethnic violence.
“The real danger is not artificial general intelligence or super intelligence; the real danger is undue trust put into limited or flawed systems,” said Lise Getoor, professor of computer science and engineering at the Jack Baskin School of Engineering at UC Santa Cruz. She was recently named the Baskin Chair of Computer Science and Engineering.
Getoor, who joined UCSC in 2013, is a leading advocate for responsible data science and known widely for her research in machine learning and reasoning under uncertainty. Her upper-division class on ethics and algorithms begins its third year this fall. It’s a mix of both social impact studies and technical applications.
“I’ve worked hard to blend the two,” Getoor said, acknowledging “it may give students a bit of whiplash.
“Ten years ago—even seven or eight years ago—barely anybody talked about ethics and algorithms. Now there’s been a complete explosion of interest,” she said, “from things we’ve seen in the news—particularly around criminal justice, around hiring, around computer vision,” which is a field of computer science that works on enabling computers to see, identify, and process images in the same way human vision does.
Responsible data science
Getoor and her colleagues at UC Santa Cruz are tackling these and other sensitive issues of big data, artificial intelligence, and ethics in technology. Their efforts to promote responsible data science address both the technical and societal issues in emerging data-driven technologies.
Getoor heads two major data science projects at UC Santa Cruz. In 2017, the National Science Foundation (NSF) awarded her and a group of UCSC computer scientists, statisticians, and mathematicians $1.5 million as part of its Transdisciplinary Research in Principles of Data Science (TRIPODS) program, an effort to develop the theoretical principles of the field.
In TRIPODS’s second phase, NSF granted $12.5 million in 2020 to fund the Institute for Foundations of Data Science (IFDS), a collaboration between UC Santa Cruz, University of Washington, University of Wisconsin–Madison, and University of Chicago.
IFDS is a transdisciplinary research institute bringing together mathematicians, statisticians, and theoretical computer scientists. Its mission is to develop a principled approach to the analysis of the ever-larger, more complex, and potentially biased data sets that play an increasingly important role in industry, government, and academia.
Getoor also directs the D3 (for data, discovery, decisions) Data Science Research Center, a collaboration between academia and industry designed to develop open-source tools for collecting data, discovering patterns, and making decisions.
UCSC’s central role
It’s no surprise that UC Santa Cruz is at the center of this work, Getoor said.
She added that UCSC also has a strong reputation for research on algorithms in computer science and engineering.
“There is a history of expertise, dating back to the 1980s, in theoretical machine learning,” said Getoor.
That combination is what drew Yang Liu to the campus.
“It’s a big deal,” Liu, assistant professor of computer science and engineering, said of UCSC’s role in TRIPODS. Getoor “was able to put Santa Cruz on the map.”
Liu joined UC Santa Cruz in January 2019 from a postdoc at Harvard, where he first became interested in data and fairness during a quarter-long discussion group.
“The fairness question is much bigger than we understood before,” Liu said. “We all make biased decisions all the time. If the data is historically biased, then we repeat or reproduce the bias.”
Another reason he decided Santa Cruz was for him, Liu said, is that the city was one of the first to test an artificial intelligence predictive policing tool, PredPol, which uses algorithms to predict how crime patterns will evolve (in 2011). Santa Cruz was also one of the first to stop using the tool (in 2017) after it was accused of being disproportionately biased against people of color.
UC Santa Cruz “is the perfect equilibrium for me,” Liu said. “I like the campus, it’s close to Silicon Valley and its technology,” he said, “but also away, pretty open-minded, where we don’t only talk about technology.”
Liu received $1 million in funding from the NSF and Amazon earlier this year for research on the long-term effects of human interactions with artificial intelligence systems used to support decision making. His project is “Fairness in Machine Learning with Humans in the Loop.”
How humans interact with and respond to an AI system is a critical part of the equation leading to fairness, he said.
Liu cited an example of a loan service that uses a model to judge or rank applicants.
“We want the model to offer ‘recourse’ so that human agents can interact and respond to their received outcomes. What we don’t want is a model that prohibits an agent from improving their profiles to get approved in the future,” he said.
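Liu’s notion of recourse can be sketched with a toy linear scoring model. All of the weights and feature names below are hypothetical, invented for illustration rather than drawn from his project: a model offers recourse when a denied applicant can be told exactly how much to improve a feature to get approved.

```python
# Hypothetical linear loan-scoring model: approve if w . x + b >= 0.
weights = {"income": 0.6, "debt": -0.8, "years_employed": 0.3}
bias = -1.2

def score(applicant):
    return sum(weights[f] * applicant[f] for f in weights) + bias

def recourse(applicant, feature):
    """Minimal increase in one feature that flips a denial into an
    approval -- the 'recourse' a fair model should offer."""
    gap = -score(applicant)
    if gap <= 0:
        return 0.0                  # already approved
    if weights[feature] <= 0:
        raise ValueError("increasing this feature cannot help")
    return gap / weights[feature]

applicant = {"income": 1.0, "debt": 0.5, "years_employed": 2.0}
print(f"score: {score(applicant):.2f}")                # negative => denied
print(f"needed income boost: {recourse(applicant, 'income'):.2f}")
```

A model without recourse, by contrast, would lean on features the applicant cannot change, leaving them no path to a better outcome.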
Another newcomer who joined UC Santa Cruz to focus on machine learning and ethics research is Xin (Eric) Wang, who arrived in summer 2020 as an assistant professor of computer science and engineering.
His interests include natural language processing, computer vision, and machine learning, with an emphasis on building artificial intelligence tools that can communicate with humans using natural language to perform real-world tasks. Natural language processing is used in search and smart agents such as Siri and Alexa. Wang has also worked at Google AI, Facebook AI Research, Microsoft Research, and Adobe Research.
“Internet search is critically important,” he said. “It shapes people’s minds; we want a fair machine learning model.”
As an example, Wang pointed to a Google image search for basketball players. The likely results will be populated by males.
“The data itself is often biased, so machines learn the biases, and even worse, machines exaggerate such biases,” Wang said.
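Wang’s point about machines exaggerating bias can be illustrated with a toy sketch (the 70/30 split here is hypothetical): a model that simply predicts the most common label minimizes its training error, and in doing so turns a skewed data set into a fully skewed output.

```python
from collections import Counter

# Toy labels for images tagged "basketball player": a 70/30 skew.
train_labels = ["male"] * 70 + ["female"] * 30

# A majority-class model is the simplest way to minimize training
# error -- and it amplifies the 70/30 skew into 100/0.
majority = Counter(train_labels).most_common(1)[0][0]
predictions = [majority for _ in range(100)]

print(f"training data: {train_labels.count('male')}% male")
print(f"model output:  {predictions.count('male')}% male")
```

Real models are subtler than this, but the same pressure to exploit a majority pattern is one mechanism behind the bias amplification Wang describes.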
Interdisciplinarity, a UC Santa Cruz hallmark, is key to mitigating, if not solving, bias in artificial intelligence and to promoting fairness.
“We want to encourage diverse teams at the Baskin School of Engineering,” said Abigail Kaun, executive adviser to Dean Alex Wolf. “We want to broaden how we can be part of the solution and inspire agents of change in the tech community.”
Kaun is working with Jon Ellis, associate professor of philosophy and director of the Center for Public Philosophy at UCSC, to create a deck of cards where each card will present an ethical issue.
“We’re putting together the most pressing 52 questions we need to be asking about technology,” she said.
Such questions might include:
- What kind of intelligence or consciousness would a robot or artificial machine need to have for it to be wrong to destroy it?
- Should tech companies be legally required to put time or money toward investigating the ethical implications of what they create?
- When, if ever, should a self-driving car sacrifice its passengers for the safety of pedestrians?
The cards are modeled after a similar project focused on microaggressions created at Harvey Mudd College. Kaun and Ellis have been crowdsourcing ideas for the cards at Tech Futures 52.
UCSC’s researchers emphasize the importance of maintaining human autonomy over artificial intelligence autonomy.
“We want to raise awareness of how much people are already being influenced by algorithms,” Getoor said, “whether ranking, recommender systems, search, friendship links, news posts. Trustworthy AI requires not just machine learning and ethics—it requires understanding how technology impacts people and society, and vice versa.”