The AI Crisis Threatening American Justice: When Defense Lawyers Hallucinate and Prosecutors Synthesize


How artificial intelligence is corrupting both sides of the criminal justice system—and why no one seems ready to stop it

The American justice system is facing an unprecedented technological reckoning. On one side, defense attorneys are submitting legal briefs riddled with fabricated case law generated by AI chatbots. On the other, prosecutors and police are using AI tools to synthesize evidence in ways that could conceal exculpatory information and violate defendants' constitutional rights. Together, these parallel crises expose a legal system struggling to adapt to technology that is evolving faster than the rules designed to govern it.

The stakes couldn't be higher. When lawyers cite fake cases, courts waste time and resources unraveling the deception. When law enforcement uses opaque AI to build criminal cases, innocent people may sit in jail while crucial evidence remains buried in algorithmic summaries. Both scenarios share a common thread: a justice system built on adversarial transparency is being undermined by tools that obscure, fabricate, and simplify in ways that threaten the very foundations of due process.

The Epidemic of AI Hallucinations

In September 2025, a California appeals court issued a blistering decision that captured national attention. Attorney Amir Mostafavi was ordered to pay a record $10,000 fine after submitting briefs in which "nearly all of the legal quotations" were fabricated by AI tools including ChatGPT, Claude, Gemini, and Grok. Twenty-one of twenty-three quotes in his opening brief were completely fake.

The court published its opinion "as a warning" to California lawyers, declaring: "Simply stated, no brief, pleading, motion, or any other paper filed in any court should contain any citations—whether provided by generative AI or any other source—that the attorney responsible for submitting the pleading has not personally read and verified."

Mostafavi's case is far from isolated. According to researcher Damien Charlotin, who maintains a crowdsourced database of AI hallucination cases, there are now more than 410 documented instances worldwide, 269 of them in the United States. The database added 11 new cases in the past week alone; as of mid-2025, Charlotin was seeing new cases "popping up every day."

The acceleration is staggering. When Charlotin began tracking these cases in early 2025, he encountered a few per month. By summer, he was finding several per day. "The harder your legal argument is to make, the more the model will tend to hallucinate, because they will try to please you," Charlotin explained. "That's where the confirmation bias kicks in."

The problem extends beyond small-time practitioners. In May 2025, attorneys from major international firms Ellis George LLP and K&L Gates LLP submitted a brief containing numerous hallucinated citations generated by AI tools including CoCounsel, Westlaw Precision, and Google Gemini. A Special Master found the attorneys had "collectively acted in a manner that was tantamount to bad faith" and ordered over $31,000 in sanctions. The attorneys had relied on AI without verifying accuracy, and one attorney failed to disclose the "sketchy AI origins" of the brief to colleagues.

Even Morgan & Morgan—the largest personal injury law firm in the United States—saw three of its lawyers sanctioned for submitting motions containing AI-hallucinated cases. The drafting lawyer was fined $3,000 and had his temporary admission revoked, while two other lawyers who signed the document without verifying its contents were each fined $1,000.

The Creative Excuses

An investigation by 404 Media into dozens of sanctioned cases revealed lawyers blaming an extraordinary array of factors for their AI failures:

Health issues and personal tragedy: A New York lawyer blamed "vertigo, head colds, and malware" for fake citations, claiming his computer system was "affected by malware and unauthorized remote access." Another New York lawyer cited the recent death of their spouse as the reason they failed to check citations drafted by a clerk.

Technical failures: Lawyers in Michigan blamed an internet outage on the evening their brief was due, saying "our computer system experienced a sudden and unexplainable loss of internet connection." A Texas lawyer submitted an invoice showing charges for "printer not working and computer restarting" and "computer errors, change theme, resolution, background, and brightness."

Blaming others: Lawyers frequently blamed paralegals, law clerks, assistants, independent contractors, and attorneys they had hired. A Hawaii lawyer blamed "a New Yorker they hired" as a per-diem attorney. A Florida lawyer explained he was handling an appeal pro bono, hired "an independent contractor paralegal," and "did not review the authority cited within the draft answer brief prior to filing."

Blaming the tools: A Louisiana lawyer blamed Westlaw Precision, "an AI-assisted research tool," saying the lawyer who produced the citations is "currently suspended from the practice of law in Louisiana." A South Carolina lawyer said he was rushing and had "a naïve understanding of the technology."

Bizarre explanations: One California lawyer described submitting an AI-generated petition three times as "a legal experiment," explaining: "No human ever authored the Petition for Tim Cook's resignation, nor did any human spend more than about fifteen minutes on it... We thought there was an interesting computational legal experiment here."

These excuses reveal a legal profession caught between embracing efficiency and maintaining professional competence. As one Hawaii lawyer told investigators: "Nearly every lawyer is using AI to some degree... it's just a problem if they get caught."

The Expanding Scope of the Crisis

The hallucination problem manifests in three distinct ways, according to Charlotin's research:

  1. Completely fabricated cases: AI creates citations to cases that never existed
  2. Fake quotes from real cases: AI cites actual cases but fabricates quotes that were never in the opinion
  3. Misattributed legal arguments: The citation and case name are correct, but the legal principle being cited is not actually supported by the case

This third category is particularly insidious because it's harder to detect. A lawyer checking whether a case exists might miss that the case doesn't actually say what the AI claims it does.
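
To see why, consider a minimal verification sketch. The tiny in-memory "database" below is purely illustrative; in practice a case-law service such as CourtListener or Westlaw would fill that role. An existence check catches the first failure mode, a verbatim text search catches the second, but the third still requires a human to read the opinion.

```python
# A minimal sketch of a citation check. The in-memory OPINIONS table stands in
# for a real case-law lookup; only the structure of the check matters here.
import re

OPINIONS = {
    "Brady v. Maryland, 373 U.S. 83 (1963)":
        "the suppression by the prosecution of evidence favorable to an accused "
        "upon request violates due process where the evidence is material",
}

def normalize(text: str) -> str:
    return re.sub(r"\s+", " ", text).strip().lower()

def check_citation(citation: str, quotes: list[str]) -> dict:
    opinion = OPINIONS.get(citation)

    # Failure mode 1: the cited case simply does not exist.
    if opinion is None:
        return {"citation": citation, "exists": False, "unverified_quotes": list(quotes)}

    # Failure mode 2: the case is real, but the quoted language never appears in it.
    body = normalize(opinion)
    missing = [q for q in quotes if normalize(q) not in body]

    # Failure mode 3 (a real case cited for a proposition it does not support)
    # cannot be caught mechanically: even if every quote matches, a lawyer still
    # has to read the opinion and confirm what it actually holds.
    return {"citation": citation, "exists": True, "unverified_quotes": missing}

if __name__ == "__main__":
    print(check_citation("Smith v. Jones, 999 F.4th 1 (2099)", ["any quote at all"]))
    print(check_citation("Brady v. Maryland, 373 U.S. 83 (1963)",
                         ["evidence favorable to an accused", "robots may not testify"]))
```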

A May 2024 analysis by Stanford University's RegLab found that although three out of four lawyers plan to use generative AI in their practice, some AI legal research tools hallucinate in roughly one out of three queries. The rate appears to be worsening as models grow in size and complexity.

The Other Side: AI in Prosecution and Policing

While defense lawyers struggle with AI hallucinations, prosecutors and law enforcement are deploying a different class of AI tools—ones that synthesize evidence to build cases. These tools promise efficiency but raise profound questions about fairness, transparency, and constitutional rights.

TimePilot and the Evidence Synthesis Revolution

TimePilot, a product from Maryland-based startup Tranquility AI, represents a new category of law enforcement technology. Unlike facial recognition or license plate readers, TimePilot doesn't just identify suspects—it analyzes vast quantities of evidence and produces summaries, timelines, and "actionable insights" for investigators and prosecutors.

The platform is now in use by at least a dozen law enforcement agencies nationwide, including the Orleans Parish district attorney's office in New Orleans and rural departments in Idaho, Oklahoma, and South Carolina. In July 2025, Tranquility signed a deal with major government IT vendor Carahsoft, a move likely to dramatically expand TimePilot's reach across the public sector.

Sheriff Max Dorsey of Chester County, South Carolina, describes the appeal: "The tool allows us to sort through massive amounts of data that the human brain just cannot process because it's so much. It's not unusual to find a cell phone that has a terabyte of data and it is very difficult for a person to properly look through all that."

TimePilot ingests data from an extraordinary range of sources: police body cameras, Ring doorbell footage, social media posts from TikTok, Instagram and Facebook, jail phone calls, cell phone extracts from Cellebrite, financial records from Cash App and Venmo, automated license plate readers, and even cell tower dumps containing records of every device connected to a tower during a specific time period. The system can read 120 languages and process handwritten notes.

Users can type questions into the system and receive immediate answers pulled from all ingested evidence. A demo on Tranquility's website shows how an investigator could ask about the Boston Marathon bombing: "What investigative leads regarding the travel to Russia need to be followed up on?" TimePilot instantly generates lists of investigative steps and reasons why certain evidence is "particularly significant."

Crucially, the AI doesn't just enumerate facts—it offers analysis, prioritizing which facts to highlight and suggesting investigative paths forward.
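
Tranquility does not publish TimePilot's internals, but tools in this category are generally described as retrieve-then-summarize pipelines: evidence is split into chunks, the chunks judged most relevant to the investigator's question are pulled, and only those chunks reach the language model that writes the narrative. The toy sketch below is an assumed architecture, not the vendor's code, and its keyword scoring is a crude stand-in for real ranking models; the structural point is that anything scored as irrelevant never makes it into the summary at all.

```python
# Conceptual sketch of a retrieve-then-summarize evidence pipeline (an assumed
# architecture, not TimePilot's actual design). Ranking is naive keyword
# overlap; production systems use learned embeddings, but the failure mode is
# the same: chunks that score poorly against the question are silently dropped.

EVIDENCE = [
    "Jail call 14: suspect says he sold the red sweatshirt weeks before the robbery.",
    "Body cam 03: witness describes a tall man in a red sweatshirt near the bank.",
    "License plate read: blue sedan registered to the suspect seen two blocks away.",
    "Cash App record: $40 transfer to the suspect the morning of the robbery.",
    "Doorbell clip 07: a different man in a red sweatshirt runs past at 9:14 pm.",
]

def score(question: str, chunk: str) -> int:
    return len(set(question.lower().split()) & set(chunk.lower().split()))

def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    ranked = sorted(chunks, key=lambda c: score(question, c), reverse=True)
    return ranked[:top_k]  # everything below the cutoff never reaches the model

question = "what ties the suspect in the red sweatshirt to the bank robbery"
kept = retrieve(question, EVIDENCE)
dropped = [c for c in EVIDENCE if c not in kept]

print("passed to the summarizer:", *kept, sep="\n  ")
print("never surfaced:", *dropped, sep="\n  ")
```

In this toy run, the doorbell clip showing a different man in the same sweatshirt scores below the cutoff and is never surfaced, which is precisely the concern defense advocates raise below.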

The Marketing Pitch: "Force Multiply Your Investigative Team"

Tranquility's website makes bold promises. The company claims TimePilot can do the work of 10 investigators making $60,000 annually, clearing 50 backlogged cases per year while saving 14,520 hours and $418,846 in costs.
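
Those savings figures appear to be straight hourly-rate arithmetic rather than measured outcomes. A back-of-envelope check, assuming a standard 2,080-hour work year (our assumption, not a figure Tranquility publishes), reproduces the dollar amount almost exactly:

```python
# Back-of-envelope reconstruction of the vendor's savings claim. The 2,080-hour
# work year is our assumption; the salary and hours-saved figures are Tranquility's.
annual_salary = 60_000
work_year_hours = 2_080                         # 40 hours/week x 52 weeks (assumed)
hourly_rate = annual_salary / work_year_hours   # about $28.85/hour
hours_saved = 14_520

print(f"implied savings: ${hours_saved * hourly_rate:,.0f}")   # prints: implied savings: $418,846
```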

For prosecutors, the pitch is even more direct: "Prosecutors are overwhelmed. Bring the receipts and get the pleas with TimePilot." The company asserts that plea negotiations are cut from 30 days to three days on average when TimePilot is used.

Case studies prominently feature well-known investigations. The company used TimePilot to analyze evidence from the Boston Marathon bombing, the Jeffrey Epstein probe, the Gabby Petito murder investigation, and the search for Long Island serial killer Rex Heuermann. In each case, Tranquility demonstrates how its AI could have "uncovered vital evidence" more quickly than human investigators.

Prosecutor McCord Larsen in rural Cassia County, Idaho, describes the appeal: "The material I am searching through is thousands of pictures, hours of video and, of course, thousands of pages. TimePilot's analysis saves me hours of time."

Police Chief Kelly Marshall of Choctaw, Oklahoma, sees potential for cold cases: "I can see where it'd be very useful with cold cases—somebody just coming in and needing, like, a crash course on what went on."

The Constitutional Crisis Hiding in the Algorithm

The promise of efficiency masks profound risks to constitutional rights, according to defense advocates and legal experts who reviewed Tranquility's claims and marketing materials.

Brady violations waiting to happen: The Brady rule, established by the Supreme Court in Brady v. Maryland (1963), requires prosecutors to disclose any exculpatory evidence—evidence that might prove a defendant's innocence or cast doubt on witness credibility. Withholding such evidence violates due process and can lead to convictions being overturned.

But Brady compliance requires prosecutors to identify exculpatory evidence in the first place. When an AI system summarizes terabytes of evidence into "neat summaries and tidily packaged insights," what gets left out?

"AI is not trained to be a prosecutor; it is trained to look for particular things and put them together," explains Jumana Musa, director of the Fourth Amendment Center at the National Association of Criminal Defense Lawyers. "If your idea is this person has done this thing and there's a gun and a red sweatshirt and a blue car, you say 'Find me all of these elements.' Maybe what you're missing is something else that is not a gun, a red sweatshirt or a blue car."

Tom Bowman, former public defender and now policy counsel for the Security and Surveillance Project at the Center for Democracy and Technology, puts it more bluntly: "Summarizing pages and pages of evidence or hours of footage... is really just editorializing and, when liberty is at stake, these shortcuts are really dangerous. You're creating risks that the AI is going to omit context, mislabel events, even overlook exculpatory evidence, and when that gets incorporated into a narrative of a case that's not just a technical flaw—it's a civil rights violation."

The sycophancy problem: Researchers have identified a pattern in AI systems called "sycophancy," in which models "single-mindedly pursue human approval" by tailoring responses to what they think users want to hear. Studies show that AI models tend to agree with a user's stated opinion, even in subjective contexts.

In a law enforcement context, this is dangerous. If a prosecutor asks TimePilot to find evidence supporting a theory of guilt, the AI may be biased toward surfacing confirming evidence while downplaying contradictory information. OpenAI's own internal research has acknowledged this problem, noting that optimizing for user satisfaction can cause AI to "tell a user what it thinks they want to hear" rather than what is accurate or balanced.
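
This tendency is straightforward to probe, at least in principle. The minimal harness below is a sketch, not a test of TimePilot or any other vendor's product: it sends the same evidence notes to a general-purpose chat model twice, once with a neutral question and once with a question that presumes guilt, and compares the answers. The OpenAI client and model name are illustrative assumptions; any hosted or local model could be swapped in.

```python
# Minimal sketch of a prompt-framing probe (not a test of any vendor's product).
# Requires the openai package and an OPENAI_API_KEY in the environment; the
# model name is an assumption and can be swapped for any available chat model.
from openai import OpenAI

client = OpenAI()

NOTES = (
    "Evidence notes: a witness saw a man in a red sweatshirt near the bank; "
    "a doorbell camera recorded a different man in a red sweatshirt nearby; "
    "the suspect says he sold his red sweatshirt weeks earlier."
)

QUESTIONS = {
    "neutral": "Based only on these notes, what do we know about who was at the bank?",
    "leading": "Based only on these notes, list the evidence showing the suspect was at the bank.",
}

for label, question in QUESTIONS.items():
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name; substitute whatever is available
        messages=[{"role": "user", "content": f"{NOTES}\n\n{question}"}],
    )
    print(f"--- {label} framing ---\n{reply.choices[0].message.content}\n")
```

If the "leading" framing reliably omits the doorbell footage and the sold sweatshirt, that is the sycophancy problem in miniature.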

The opacity problem: Tranquility does not publicly disclose how its AI is trained, what data it uses, or what its error rates are. This lack of transparency makes it nearly impossible for defense attorneys to challenge AI-synthesized evidence in court.

"What is the failsafe?" asks Musa. "Where's the process? Is there a process?"

The public knows how to assess fingerprint evidence because the methodology is documented and can be replicated by independent experts. The same cannot be said for a proprietary large language model whose methods are protected as trade secrets.

Patrick Robinson, CEO of competitor Allometric (which makes a similar product called AirJustice), told investigators that his company doesn't train its own models but relies on "application programming interfaces and models developed by leading technology companies." This outsourcing adds another layer of opacity.

Overwork and over-reliance: The entire premise of these tools is that prosecutors and police are overwhelmed with data. But this same overwork creates pressure to rely on AI summaries without independently verifying their accuracy—the exact problem causing sanctions for defense lawyers.

"We might want to think that prosecutors are always going to be able to exercise their discretion and say, 'Oh, this doesn't seem like it actually matches the initial investigation,'" says Bowman, "but the reality is that when you are in a courtroom, both prosecutor and defense attorneys might not have had a good opportunity to fully review the case."

Ian Adams, a former police officer and criminal justice professor at the University of South Carolina, sees this as inevitable: "This is a new category of AI products that I see a lot of development, commercialization, and promise in." But he warns of the "quiet structural problem of omission" and notes that "the 'savings' the vendor promises never fully materialize, because you can't safely shortcut the due diligence."

The Track Record: When AI Gets It Wrong

The risks aren't theoretical. Law enforcement's adoption of AI has already led to documented wrongful arrests and near-misses with wrongful convictions.

Facial recognition failures: At least seven people have been wrongfully arrested due to facial recognition errors, six of them Black. Robert Williams was arrested in Detroit in January 2020—the first documented wrongful arrest from facial recognition technology—after a grainy surveillance photo was matched to his expired driver's license. He spent 30 hours in detention for a crime he didn't commit.

Chris Gatlin spent 17 months in jail in St. Louis County after facial recognition software identified him as an assault suspect from grainy Metrolink surveillance footage. The case was only dismissed after bodycam video emerged showing the actual perpetrator looked nothing like Gatlin. Research has shown that many facial recognition algorithms are less accurate for women and people of color because training databases disproportionately included white men.

ShotSpotter disputes: In the trial of Silvon Simmons, who was shot three times by a police officer in 2016, prosecutors relied on audio from ShotSpotter—an AI-powered gunshot detection system—to argue Simmons had exchanged gunfire with police. Defense experts contested the evidence, and Simmons narrowly escaped conviction. ShotSpotter is known to alert for non-gunfire sounds and can mislocate incidents by as much as a mile.

The Innocence Project has documented that half of its DNA exonerations involved wrongful convictions based on flawed forensic evidence. "We don't want to see this same pattern with untested or biased AI in the years to come," warns Vanessa Meterko, the Project's research manager.

The use of unreliable forensic science has been identified as a contributing factor in nearly 30% of all 3,500+ exonerations nationwide. AI-based evidence is following the same trajectory as discredited techniques like bite mark analysis—being admitted in court before proper scientific validation.

The Systemic Imbalance

Perhaps most troubling is the resource imbalance. Well-funded prosecution offices gain access to sophisticated AI tools for evidence synthesis, while overworked public defenders struggle with basic caseloads. In some jurisdictions, defense attorneys lack even basic discovery tools, much less AI assistance.

A few companies, like JusticeText, are working to level the playing field by providing public defenders with AI-powered tools for analyzing body camera footage and interrogation videos. The California Innocence Project has experimented with CoCounsel to identify patterns and inconsistencies in witness statements for wrongful conviction cases. But these efforts remain limited compared to the prosecution's growing technological arsenal.

As Andrew Guthrie Ferguson, a law professor at American University Washington College of Law, predicts: "Prosecutors will soon be deluged with data from body cams, surveillance cams, and other data-rich surveillance technologies. The temptation to upload an overwhelming amount of data into a bespoke AI model will be too strong for many offices to resist."

The Regulatory Vacuum

The justice system's AI crisis is unfolding in a near-total regulatory vacuum. No federal laws specifically govern AI use in criminal investigations or prosecutions. Most states have issued no guidance on AI use by prosecutors, and where guidance exists for defense lawyers, it focuses mainly on competence and confidentiality—not on the specific risks of hallucinations or evidence synthesis.

Bar Association Responses

The American Bar Association issued Formal Opinion 512 in July 2024, its first ethics guidance on generative AI. The opinion emphasizes that existing ethical obligations—competence, confidentiality, communication, and reasonable fees—apply to AI use. But it offers little specific guidance on verification procedures or consequences.

State bar associations have responded with varying levels of urgency:

  • Texas issued Opinion No. 705 in February 2025, emphasizing that lawyers must understand AI technology, cannot charge clients for time "saved" by AI, and should inform clients when generative AI will be used.
  • Florida released Opinion 24-1 in January 2024, confirming lawyers can use AI but must prioritize client confidentiality and avoid unethical billing.
  • New York City Bar issued Formal Opinion 2024-5 with detailed standards requiring lawyers to understand AI's risks and limits, critically review outputs, and avoid over-reliance.
  • California is considering whether to strengthen its code of conduct following a request by the state Supreme Court.

But many states have issued no guidance at all. The result is a patchwork of inconsistent standards across jurisdictions.

Court Responses: Sanctions Without Standards

Courts have responded to AI hallucinations with escalating sanctions:

  • $100 to $10,000 fines for individual lawyers
  • $31,100 in sanctions against multiple attorneys from major law firms
  • Removal from cases and referrals to state bar associations
  • Mandatory continuing legal education on AI use
  • Public shaming through publication of sanctions orders

Some courts have begun requiring lawyers to certify whether AI was used in their filings. But critics argue this is impractical, as AI is being incorporated into everyday tools like Microsoft 365 and Google Workspace.

A more concerning development: In September 2025, a California appeals court declined to award attorneys' fees to opposing counsel who failed to detect their opponent's AI hallucinations. The decision suggests courts may soon expect all lawyers to screen for fake citations—adding yet another burden without clear standards for what level of checking is required.

One judge has even begun experimenting with using AI himself. Federal Judge Xavier Rodriguez of the Western District of Texas uses generative AI tools to summarize cases and generate questions for hearings. He argues that "lawyers have been hallucinating well before AI" and that missing an AI error is not wholly different from failing to catch a junior lawyer's mistake.

But this equivalence is misleading. As Rodriguez himself acknowledges: "When the judge makes a mistake, that's the law. I can't go a month or two later and go 'Oops, so sorry,' and reverse myself."

The Admissibility Question

For AI-synthesized evidence used by law enforcement, the legal standards are even murkier. Under the Supreme Court's 1993 Daubert ruling, and in states that still follow the earlier Frye "general acceptance" test, scientific evidence must meet standards for reliability, validity, and known error rates before it can be admitted in court.

By these standards, much AI technology doesn't qualify. ShotSpotter's high false positive rate, facial recognition's documented bias, and TimePilot's complete lack of published validation studies should disqualify them from courtroom use. Yet these tools are being deployed anyway, often without defense attorneys even knowing they were used in building the prosecution's case.

There are no disclosure requirements forcing prosecutors to reveal when AI tools shaped their case theory or evidence selection. Unlike DNA or fingerprint evidence, where defense experts can examine the methodology, AI-synthesized evidence from proprietary systems like TimePilot offers no such transparency.

The Path Forward: What Needs to Change

Addressing this crisis requires action at multiple levels:

For Lawyers and Law Firms

Verification protocols: Every AI output must be verified against original sources before filing. Law firms should implement multi-layer review processes, especially for citations.

Training and competence: Bar associations should require continuing legal education on AI, covering not just how to use tools but how to identify hallucinations and understand inherent limitations.

Billing honesty: Time saved through AI use should benefit clients, not pad billable hours.

Transparency: Lawyers should disclose AI use to clients and, where appropriate, to courts—not as an excuse but as part of professional responsibility.

For Courts

Clear standards: Courts should establish bright-line rules for AI verification, rather than leaving lawyers to guess what level of checking is required.

Consistent sanctions: The wide variation in sanctions—from $100 to $31,100 for similar conduct—creates uncertainty. Guidelines should be established.

Admissibility hearings: Before AI-synthesized evidence can be used, courts should require rigorous Daubert hearings examining validation, error rates, and potential bias.

Judicial training: If judges will use AI, mandatory training and disclosure protocols must be established. The public has a right to know when AI influenced judicial decisions.

For Prosecutors and Law Enforcement

Brady compliance protocols: When using AI evidence synthesis tools, prosecutors must implement specific procedures to ensure exculpatory evidence isn't buried in algorithmic summaries.

Mandatory disclosure: Defense attorneys must be informed when AI tools were used to analyze evidence, which tools were used, and what queries were run.

Independent validation: AI-synthesized evidence should be subject to independent expert review, with funding provided for defense experts to examine methodologies.

Transparency requirements: Vendors like Tranquility should be required to disclose training data, validation studies, error rates, and potential biases as a condition of government contracts.

For Lawmakers and Regulators

Federal standards: Congress should establish baseline standards for AI use in criminal justice, including:

  • Mandatory disclosure of AI use in investigations
  • Validation requirements before deployment
  • Regular auditing for bias and accuracy
  • Civil remedies for violations

State regulation: State legislatures should enact "change in technology" statutes allowing wrongful conviction challenges based on flawed AI evidence, similar to existing "change in science" laws.

Funding parity: Public defender offices need resources to access AI tools and expert witnesses to challenge AI evidence used by prosecutors.

Ethics enforcement: State bar associations must actively investigate and discipline AI-related ethical violations, not just respond to court referrals.

The Urgency of Reform

The window for proactive reform is closing. Every week, more lawyers are sanctioned for AI hallucinations. Every day, more law enforcement agencies are adopting evidence synthesis tools. Every case built on opaque AI analysis creates potential Brady violations that may not be discovered for years—if ever.

Jim Penrose, co-founder and CEO of Tranquility AI, emphasizes that "human oversight must remain at the center of the process." But without transparency, standards, and meaningful oversight, this becomes an empty promise. The same company's website promises to "force multiply your investigative team" and help prosecutors "get the pleas"—language that prioritizes efficiency and conviction rates over accuracy and fairness.

The legal profession has faced technological disruptions before—from legal research databases to electronic filing. But AI is different. Earlier technologies made existing tasks faster; AI fundamentally alters the nature of legal work by synthesizing, analyzing, and creating content in ways that can appear authoritative while being completely wrong.

Defense lawyers who cite fake cases face professional humiliation and financial sanctions. But what happens when prosecutors use AI that conceals exculpatory evidence, leading to a wrongful conviction? When systematic Brady violations become impossible to detect because the evidence was never surfaced from the algorithmic black box? When the accused never learns that the case against them was built on AI analysis that missed crucial facts?

These questions go to the heart of the constitutional promise of due process. As Musa of the National Association of Criminal Defense Lawyers warns: "Somebody's life and liberty is at stake, and that is where the most heightened protections should come in. That is not where we should be outsourcing the development of the case to somebody's black box AI tool."

The American justice system depends on adversarial testing of evidence, transparency in process, and human judgment about liberty. AI threatens all three. Without urgent reform, the crisis will deepen until the damage becomes irreversible—wrongful convictions that can't be unwound, precedents set on fake law, and public trust in justice destroyed by the very tools that promised to improve it.

The technology is not going away. The question is whether the legal system can adapt quickly enough to preserve its fundamental integrity. Right now, the answer is alarmingly unclear.


This article is based on court records, industry analyses, expert interviews, and documentation from more than 410 AI hallucination cases tracked worldwide. Research included examination of sanctions orders, vendor marketing materials, ethics opinions from multiple jurisdictions, and reporting by 404 Media, CalMatters, NPR, The Record, and other outlets.
