Dr. Abeba Birhane: The Researcher Holding AI Accountable for Its Data

Jun 18, 2026

Most AI researchers build things. Abeba Birhane takes them apart.

Born in Ethiopia and trained as a cognitive scientist, Dr. Birhane occupies an unusual position in the AI world. While much of the field races to scale up models and ship products, she has spent years doing the slower, less glamorous work of actually examining what those models are trained on. The question driving her research is deceptively simple: if AI systems learn from data, what happens when that data is deeply flawed?

The answer, she has found, is exactly what you would expect: the systems inherit the flaws.

The Audit That Pulled 80 Million Images Off the Internet

In 2020, Birhane and researcher Vinay Prabhu published a paper that sent shockwaves through the computer vision community. They had audited two of the most widely used image training datasets in AI research: ImageNet and MIT's 80 Million Tiny Images. What they found was not a minor calibration issue. The datasets carried racist and misogynistic labels on a massive scale, encoded into the foundations of tools that had already been deployed in real-world systems.

MIT responded by taking 80 Million Tiny Images down entirely. The withdrawal was a direct consequence of the audit, and it demonstrated something important: this kind of work has real teeth.

The paper earned Birhane VentureBeat's AI Innovation Award in computer vision. But more than any award, it established her as someone willing to do the work that the AI industry tends to skip over in its rush to build.

Scale Does Not Fix Bias. It Amplifies It.

One of the most counterintuitive findings in Birhane's career challenges a core assumption of modern AI development: that bigger datasets produce better, fairer models.

In a paper she co-authored on what she calls "hate scaling laws," Birhane and her colleagues found the opposite. As datasets grow, hateful content scales proportionally with them. The claim that scale alone solves bias problems is, in her words, something to take "with a big bucket of salt."

This matters because scaling has become the dominant strategy in AI. Companies pour resources into gathering more data, training larger models, and chasing benchmark improvements. Birhane's research suggests that without systematic auditing, this approach bakes discrimination into AI at industrial quantities.

Algorithmic Colonization and the Question of Whose Values Get Encoded

Birhane's work extends beyond dataset audits. Her 2020 paper "Algorithmic Colonization of Africa" examined how AI systems developed with corporate agendas and Western assumptions get exported to African communities with little regard for local context. The technology arrives not as a neutral tool but as something that carries inherited assumptions about whose experiences are normal and whose are edge cases.

Her thinking here draws on Afro-feminist theory, which she applies to data analysis and computational sciences. She has noted that combining dataset auditing with that theoretical lens is "not a common combination," but she finds it productive precisely because it forces questions that purely technical frameworks miss: not just whether a system works, but who it works for, and who built the assumptions inside it.

Building the Field of AI Accountability

In 2022, a paper Birhane co-authored, "The Values Encoded in Machine Learning Research," won a Distinguished Paper Award at FAccT, one of the leading conferences on fairness and accountability in AI. The paper examined the research culture of machine learning itself, arguing that the field's priorities and reward structures encode particular values, whether or not researchers are conscious of them.

The paper is part of a broader effort not just to audit AI outputs but to examine the systems, institutions, and incentives that produce them.

That effort now has a physical home. Birhane founded and leads the AI Accountability Lab (AIAL) at Trinity College Dublin, where she is a Research Fellow in the School of Computer Science and Statistics. The lab focuses on systematic audits of AI models and training datasets. She completed her PhD at University College Dublin in 2022; her doctoral research examined the challenges of automating human behavior and the pitfalls of large-scale datasets.

Her work has drawn attention well beyond academia. In 2023, TIME magazine named her one of the 100 most influential people in AI. She has served on the United Nations Secretary-General's AI Advisory Body and on Ireland's national AI Advisory Council.

Why This Work Is Hard

There is a human cost to what Birhane does that is easy to overlook. Auditing datasets for harmful content means spending hours looking at material that is, as she has put it, "not safe for work" most of the time. The work is not abstract.

She has also pushed back on the idea that technical fixes alone can resolve these problems. When harmful racial associations were discovered in AI image systems, some researchers proposed simply retraining the models on cleaner data. Birhane's position is more demanding. "These problems are not just a matter of 'fixing' training data," she has argued. They are rooted in history and social context. You have to understand where the problems came from before you can meaningfully address them. That requires going back in history, which, as she notes, is something the AI field does not often do.

What She Represents in the AI Debate

The AI industry has a complicated relationship with the kind of work Birhane does. Auditing, accountability, and ethical scrutiny slow things down. They produce uncomfortable findings. They involve perspectives that traditional computer science training does not emphasize.

But Birhane came to AI research precisely because she noticed the gap. Models were being trained on larger and larger datasets gathered from across the internet, including its most harmful corners, and almost nobody was systematically checking them. She stepped into that space and helped build a new research discipline around it.

The field she has helped create does not get as much attention as model releases or benchmark records. It probably should.
