When AI Gets It Wrong: How High-Stakes Failures Expose the Accountability Vacuum

In 2020, Robert Williams was arrested in Detroit based on a match generated by a facial recognition system. He hadn't committed the crime. He was the first person in the United States publicly documented to have been wrongfully arrested based on AI. He would not be the last.

In 2024, a Cambridge man was wrongfully convicted based on a criminal risk-assessment algorithm that a judge relied on during sentencing. Healthcare systems have begun using AI to diagnose cancer, approve insurance coverage, and predict patient risk—with error rates varying wildly across racial groups. And hiring algorithms have systematically discriminated against women, disabled people, and ethnic minorities, while operating invisibly to both job applicants and the hiring managers using them.

The common thread: when these systems fail, almost no one is held responsible. That's not a legal gray area. It's an accountability vacuum—and it's growing.

The Healthcare Crisis: 14% More Lawsuits, Still No Clear Liability

Data from 2024 showed a 14% increase in malpractice claims involving AI tools compared to 2022. The majority stemmed from diagnostic AI used in radiology, cardiology, and oncology. Missed cancer diagnoses by machine-learning software have become a central focus in several high-profile lawsuits.

But here's what's telling: as AI-driven malpractice claims rise, insurance companies have started adding AI-specific exclusions to policies. Some now require physicians to undergo AI training to remain covered. In other words, the industry is pricing in the risk while the legal system hasn't yet figured out who bears responsibility.

The problem is structural. AI systems in healthcare fail in ways that are difficult to attribute. If a physician misdiagnoses cancer, the fault is clear—the physician is liable. But if an AI system flags an abnormality and the physician misses it, whose error was it? The algorithm's output? The physician's interpretation? The software vendor's claims about accuracy? The hospital's decision to deploy the system without sufficient validation?

In most cases, the answer is: all of them, and none of them. Hospitals increasingly promote the integration of AI technologies and may therefore be held liable in the event of accidents or patient harm. But the legal standard for what constitutes negligence with AI hasn't been established. A physician might have followed every protocol, understood the system's limitations, and still caused harm if the AI system was defective in ways the vendor didn't disclose.

The bigger issue: informed consent. Patients often aren't aware that an AI system plays a role in their diagnosis or treatment. They're making medical decisions without knowing that AI is involved. And the law hasn't caught up to require that they be told.

The Hiring System: Invisible Algorithms, Visible Harm

The hiring world offers a clearer case study. AI-based hiring tools have discriminated against women and racial minorities for years, but enforcement has been spotty until recently.

In July 2024, the U.S. District Court for the Northern District of California allowed claims of unlawful discrimination to proceed against Workday Inc. over its AI-driven recruitment software, which is used by thousands of companies. The plaintiff is now seeking to bring the case as a class action. The ruling signals a crucial shift: courts may not let companies avoid liability for decisions made by their AI systems.

But that's only part of the problem. The Workday case asks one question: Can vendors be held liable for algorithmic disparate impact? Another lawsuit asks a different one: Are candidates entitled to know they're being scored, see the data used, and challenge inaccuracies?

Together, they expose complementary accountability gaps. AI hiring systems wield substantial influence over employment outcomes while remaining largely invisible to the candidates they evaluate and a black box to the talent acquisition (TA) professionals who use them. The current implementation model is legally vulnerable: algorithms make consequential decisions using data candidates don't know about, and they produce assessments that candidates can't see and TA professionals may not understand.

Candidates have no idea why they were rejected. TA professionals may not understand what the algorithm is doing. And vendors claim proprietary rights over their systems, shielding them from examination. So when the system discriminates, the accountability question becomes: If an AI tool rejects a candidate, who's responsible?

Is it the vendor who built the algorithm? The company that deployed it? The hiring manager who relied on it without understanding it? The answer remains unsettled, which means the incentives are all wrong. Vendors have little pressure to audit their systems for bias. Companies have little reason to implement guardrails. And candidates have no recourse.

The result: AI-based hiring screens now process millions of applications globally, most of them rejected by systems the candidates will never see, based on criteria they'll never understand, with no mechanism for appeal.

Criminal Justice: Liberty on Trial, Expertise Absent

The stakes are highest in criminal justice. AI systems now influence who is stopped, searched, charged, detained, sentenced, supervised, or released.

The Robert Williams case from Detroit is instructive. Facial recognition technology incorrectly matched him to a surveillance photo. He was arrested, spent 30 hours in custody, and was only released after he could prove he wasn't the person in the photo. But the Detroit Police Department didn't disclose how the technology works, the accuracy rate, the known failure modes, or the circumstances under which it was used. Williams couldn't cross-examine the algorithm in court. He couldn't test its reliability. He could only hope that a human eventually reviewed the AI's recommendation.

That's not justice. That's trust in a system that doesn't deserve it.

The problem compounds. Criminal-justice entities—thousands of under-resourced police departments, prosecutors' offices, courts, and probation units—lack the technical expertise to evaluate these tools rigorously, while vendors market directly to practitioners. This creates a governance gap: even well-intentioned actors cannot reliably apply emerging standards amid rapid technological changes.

Errors may not be obvious or isolated; they may be subtle, systematic, and persistent—especially for users without technical expertise. Facial recognition systems have been shown to have higher error rates for people with darker skin tones. Predictive policing algorithms have disproportionately targeted low-income and minority neighborhoods for increased surveillance. Bail and sentencing recommendation systems encode biases from historical data, amplifying patterns of racial discrimination in the criminal legal system.
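"Higher error rates" for one group can be made concrete by counting, per group, how often a system flags a match that turns out to be wrong. The following is a minimal Python sketch of that calculation, not a description of how any real system is evaluated. The records, group labels, and function name are all hypothetical and invented for illustration.

```python
from collections import defaultdict

# Hypothetical audit records: (group, system_said_match, actually_same_person).
# These values are invented for illustration only.
records = [
    ("group_a", True, True),
    ("group_a", False, False),
    ("group_a", False, False),
    ("group_a", True, False),
    ("group_a", False, False),
    ("group_b", True, False),
    ("group_b", True, False),
    ("group_b", False, False),
    ("group_b", True, True),
    ("group_b", False, False),
]

def false_positive_rates(rows):
    """Return, per group, the share of true non-matches the system flagged as matches."""
    flagged = defaultdict(int)    # true non-matches wrongly flagged as matches
    negatives = defaultdict(int)  # all true non-matches
    for group, predicted, actual in rows:
        if not actual:
            negatives[group] += 1
            if predicted:
                flagged[group] += 1
    return {group: flagged[group] / negatives[group] for group in negatives}

rates = false_positive_rates(records)
print(rates)
```

With these made-up records, group_a has a false positive rate of 0.25 and group_b of 0.5. A system can look accurate overall and still be twice as likely to wrongly implicate members of one group, which is the kind of subtle, systematic error a non-technical user would not see in a single case.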

And when these systems fail, the defendant often can't see the code, can't challenge the training data, and can't get a second opinion. Courts have struggled with how to handle evidence generated by AI. If a defendant's liberty hinges on evidence generated by an AI system, their legal team must be able to scrutinize that evidence. Yet many AI companies claim proprietary rights over their algorithms, shielding them from examination. This creates a troubling scenario where justice becomes contingent on evidence that cannot be challenged.

The Accountability Vacuum: Who's Responsible?

The fundamental problem is that AI creates a responsibility gap. In traditional systems, accountability is clear. A doctor diagnoses a patient. A judge sentences a defendant. A hiring manager hires a candidate. These are human acts, and the humans bear responsibility.

But AI systems obscure responsibility. The algorithm's creators claim they didn't intend the discriminatory outcome—the training data did. The companies deploying the system claim they're just using the vendor's tool. The vendors claim the systems are complex and they can't fully explain emergent behaviors. And meanwhile, the person harmed has no clear defendant.

Even well-designed AI systems can fail unpredictably. Unlike traditional software, many AI systems can produce outputs that seem plausible but are incorrect. Criminal justice applications demand exceptional reliability because errors can result in wrongful detention, inappropriate sanctions, or public safety failures. Yet many of these systems are deployed with minimal testing, no clear failure protocols, and no accountability mechanism if they fail.

The EU's AI Act, approved by the European Parliament in March 2024, attempts to address this by classifying medical AI systems as high-risk, and a separately proposed AI Liability Directive was meant to make it easier for individuals to seek compensation. But the United States has no comprehensive federal legislation regulating AI systems, particularly in the context of safeguarding against algorithmic discrimination. Individual states are moving forward: California, New York, Illinois, and Colorado have all passed some form of AI regulation. But the gaps remain enormous.

Consider what a robust accountability framework would require:

Transparency: Vendors should be required to disclose how their systems work, what data they use, and their known failure modes. This would allow independent auditors and defendants to scrutinize the systems.
Validation: Before deploying AI in high-stakes contexts, agencies should require rigorous, independent validation rather than relying solely on vendor claims.
Auditing: Algorithms should be audited regularly for bias and performance drift. When disparate impacts are found, alternatives should be considered before continued deployment.
Human Oversight: Human review should be meaningful, which means reviewers need access to the relevant information, sufficient time, training, the authority to override the system, and a documented record of their decisions.
Accountability: Clear legal liability for vendors when their systems cause foreseeable harm.

None of these are guaranteed in most contexts where AI is deployed today.
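The auditing requirement above can be illustrated with one common screening heuristic: the "four-fifths rule" from U.S. employee-selection guidelines, which treats a group's selection rate below 80% of the highest group's rate as a sign of possible adverse impact. The Python sketch below assumes that heuristic as a starting point, not a legal test. The function names and the screening counts are hypothetical, and it assumes at least one group has a nonzero selection rate.

```python
def selection_rates(outcomes):
    """outcomes maps group -> (number selected, total applicants)."""
    return {group: selected / total for group, (selected, total) in outcomes.items()}

def adverse_impact_ratios(outcomes):
    """Each group's selection rate divided by the highest group's selection rate."""
    rates = selection_rates(outcomes)
    top = max(rates.values())  # assumes at least one group has a nonzero rate
    return {group: rate / top for group, rate in rates.items()}

def flag_groups(outcomes, threshold=0.8):
    """Groups whose ratio falls below the threshold (four-fifths by default)."""
    ratios = adverse_impact_ratios(outcomes)
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)

# Hypothetical screening results, invented for illustration.
screening = {"group_a": (50, 100), "group_b": (30, 100)}
ratios = adverse_impact_ratios(screening)
flagged = flag_groups(screening)
print(ratios, flagged)
```

Here group_b's ratio is 0.6, below the 0.8 threshold, so it is flagged for review. A check like this is cheap to run on every batch of decisions, which is why regular audits are a reasonable thing to require; what it cannot do is explain why the disparity exists or whether it is justified.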

The Pattern: Biased Data, Biased Systems, Biased Outcomes

Cases like risk scoring in criminal justice, algorithmic hiring screens, welfare and fraud detection, and exam grading illustrate how models can encode institutional bias, produce unequal error rates, and then get treated as "neutral" because they're automated.

This happens through several mechanisms. First, training data reflects historical discrimination. If past hiring decisions were biased against women, the AI system will learn those biases. If past criminal justice decisions were biased against minorities, the prediction algorithm will amplify that bias. The system isn't more objective—it's just more scalable.

Second, the objectives are often wrong. Hiring systems are optimized for whatever proxy the company chose for "good employee," such as resume keywords or interview performance scores. But these proxies may not predict actual job performance, and they may well encode bias.

Third, oversight is weak. Most organizations deploying AI in hiring or criminal justice don't have technical teams to evaluate the systems. They rely on the vendor's assurances. When a system discriminates, they have little way to detect it until lawsuits appear.

And finally, there's no accountability. The vendor can claim the training data was biased. The deploying organization can claim they used the system as designed. The hiring manager or judge can claim they were just following the algorithm's recommendation. Meanwhile, the job applicant didn't get hired, or the defendant got a harsher sentence than they deserved.
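The first mechanism above, a system learning bias from historical decisions, can be shown with a deliberately tiny example. This Python sketch is a toy under stated assumptions, not a model of any real hiring product: the history is invented, the "model" simply memorizes past hire rates, and group membership is used directly as a feature for simplicity (in practice the same effect can arise through proxy features).

```python
from collections import defaultdict

# Hypothetical past decisions: (group, qualified, hired). Invented for illustration:
# qualified candidates in group_b were hired far less often than those in group_a.
history = (
    [("group_a", True, True)] * 8 + [("group_a", True, False)] * 2
    + [("group_b", True, True)] * 3 + [("group_b", True, False)] * 7
    + [("group_a", False, False)] * 5 + [("group_b", False, False)] * 5
)

def train(rows):
    """'Learn' the historical hire rate for each (group, qualified) pair."""
    hired = defaultdict(int)
    seen = defaultdict(int)
    for group, qualified, was_hired in rows:
        seen[(group, qualified)] += 1
        hired[(group, qualified)] += int(was_hired)
    return {key: hired[key] / seen[key] for key in seen}

def predict(model, group, qualified):
    """Recommend hiring when similar past candidates were hired at least half the time."""
    return model[(group, qualified)] >= 0.5

model = train(history)
recommend_a = predict(model, "group_a", True)
recommend_b = predict(model, "group_b", True)
print(recommend_a, recommend_b)
```

Two equally qualified candidates get different recommendations (True for group_a, False for group_b) purely because the past decisions differed. Nothing in the code "intends" to discriminate, which is exactly how each party in the chain can disclaim responsibility for the outcome.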

The Way Forward: Accountability Without Perfection

The hard truth: we can't make AI perfect. We can't eliminate bias entirely, or guarantee that complex systems will behave as expected in all contexts.

But we can do better than the current vacuum, where systems make high-stakes decisions, fail in ways that harm people, and no one bears responsibility.

It starts with treating AI in high-stakes domains as what it is: a powerful tool that requires exceptional oversight, not neutral automation that can replace human judgment.

In healthcare, doctors should be required to know what AI systems are telling them, when those systems fail, and what alternatives exist.
In hiring, candidates should know they're being evaluated by an algorithm, see the data used, and have a right to challenge inaccurate assessments.
In criminal justice, defendants should be able to scrutinize the algorithms that influence their liberty, and judges should have clear protocols for when and how to override algorithmic recommendations.
Across all high-stakes domains, there should be clear legal liability when AI systems cause foreseeable harm.

None of this would make AI perfect. But it would make it accountable. And right now, that's the difference between a technology we can trust and a system that's rigged against the people it affects most.