Everyone is talking about AI risk.

Security conferences are full of it. Board decks have a slide for it. Your legal team has sent at least one memo about it. The CISO is spinning up a framework. Somewhere in your organization, a working group has been formed, a policy has been drafted, and a consultant has been retained to evaluate your “AI governance posture.”

And it all feels important, vaguely urgent, and just slightly out of reach. Like a storm system on the weather map three states away. Real, probably. Coming, maybe. Here? Not yet.

Here is the problem with that posture: you sign the certifications.

Not the CISO. Not the CTO. Not the working group. You. Every quarter, under Sections 302 and 404 of Sarbanes-Oxley, you certify that your organization’s internal controls over financial reporting are effective, that material weaknesses have been disclosed, and that the financial statements are not materially misleading. That certification carries your name and your liability.

Sarbanes-Oxley was written in 2002, when the foundational assumption of every financial control was simple and unquestioned: somewhere in the chain of every significant financial transaction, a human being makes a decision. A manager approves the invoice. A controller reviews the journal entry. A treasurer releases the payment. Systems record it. Auditors sample it. You certify it.

AI is dismantling that assumption. Not in theory. In production environments, right now, inside organizations that have certified those controls effective.

The storm is not three states away. It is the rain you are standing in and have not yet looked up to see.

The False Comfort of “We Have Controls”

Most CFOs I have spoken with believe their AI risk is managed. They have a vendor review process. They have a data governance policy. Legal reviews the contracts. The AI tools in the finance stack were approved by IT.

That is not the same thing as having controls over AI as a financial actor.

The controls you have were designed for humans operating systems. What is emerging, and in many cases what has already arrived, is systems operating autonomously inside financial workflows. The distinction is not semantic. It is the entire architecture of accountability.

Traditional SOX testing asks: did the right human approve this transaction? Auditors sample activity and verify the approval chain. The evidence is an approval record, a signature, a workflow timestamp.

What do you do when the approver is software?

When an AI system ingests an invoice, validates it against policy, and routes it to payment without human review, there is no human approval to sample. There is a confidence score. There is a policy rule satisfied. There is a log entry that says the system evaluated the transaction and executed it.

That is a different kind of evidence, and it points to a different kind of control. Not whether the right human approved, but whether the system making the decision is governed correctly. Whether it was trained on appropriate data. Whether it has drifted from its original classification behavior. Whether the policy it enforces actually maps to the policy you intended to enforce.
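What could that evidence look like in practice? Below is a minimal Python sketch of an auditable decision record for an automated invoice approval. The field names, thresholds, policy IDs, and model version are illustrative assumptions, not a standard or any vendor's actual schema.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import hashlib
import json

# Illustrative sketch only: field names, thresholds, and policy IDs are
# assumptions for this article, not a standard or a vendor schema.

@dataclass
class AIDecisionRecord:
    """Audit evidence for a decision made by software rather than a person."""
    transaction_id: str
    model_name: str
    model_version: str          # which model made the call, and which build of it
    policy_id: str              # the written policy the rule is meant to implement
    rule_satisfied: str         # the specific rule the system evaluated
    confidence: float           # model confidence at decision time
    decision: str               # "auto_approved", "routed_to_human", "rejected"
    input_hash: str             # fingerprint of the invoice data that was evaluated
    decided_at: str

def record_decision(invoice: dict, confidence: float, threshold: float = 0.95) -> AIDecisionRecord:
    """Route low-confidence cases to a human and emit sampleable evidence either way."""
    decision = "auto_approved" if confidence >= threshold else "routed_to_human"
    return AIDecisionRecord(
        transaction_id=invoice["id"],
        model_name="invoice-approval-model",      # hypothetical model name
        model_version="2024.11.3",                # hypothetical version pin
        policy_id="AP-POLICY-017",                # hypothetical policy reference
        rule_satisfied="three_way_match",
        confidence=confidence,
        decision=decision,
        input_hash=hashlib.sha256(json.dumps(invoice, sort_keys=True).encode()).hexdigest(),
        decided_at=datetime.now(timezone.utc).isoformat(),
    )

# Each record goes to an append-only log that auditors can sample,
# the same way they once sampled approval signatures.
print(asdict(record_decision({"id": "INV-10442", "amount": 18250.00}, confidence=0.97)))
```

The design point is not the code. It is that every automated decision produces evidence tied to a specific model version and a specific written policy, so there is something to sample and someone accountable when the two diverge.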

Most organizations have not built that control layer yet. Many do not know they need it. And some have already had it fail in ways that are now visible in court filings, SEC settlements, and police reports.

What Has Already Happened

These are not hypotheticals. These are documented, litigated, and in one case, reported to Hong Kong police.

Arup, January 2024: $25.6 Million Gone in 15 Transfers

British engineering firm Arup, the company behind the Sydney Opera House and the Bird’s Nest stadium, lost $25.6 million in a single week. The mechanism was not a system breach. The mechanism was a deepfake.

A finance employee in the Hong Kong office received what appeared to be a request from the company’s UK-based CFO. The employee was suspicious. So he joined a video conference call. On that call, the CFO appeared, along with several other recognizable colleagues. The CFO directed him to execute a series of confidential transfers. The employee complied. He executed fifteen transactions into five bank accounts totaling HK$200 million, roughly $25.6 million USD.

Every person on that call was an AI-generated deepfake, built from publicly available video and audio of Arup employees from prior online conferences.

The CFO Dive reporting on the Arup incident.

Think carefully about what this means for your control environment. The authorization system worked exactly as designed. A senior executive issued the instruction. The employee followed proper protocol by verifying via video conference. The approval was given. The transfers were executed. The control did not fail. The human layer in the control framework was the counterfeit.

SOX Section 302 asks executives to certify that they have designed controls that provide reasonable assurance. What assurance, exactly, do your current controls provide when the authorizing executive can be recreated in software from a LinkedIn video?

Arup’s Chief Information Officer later told Fortune that the number and sophistication of these attacks had been rising sharply. The company confirmed fake voices and images were used, stated that financial stability was not materially affected, and declined further comment pending investigation.

The $25.6 million was not recovered.

Opendoor, 2022-2025: When the Model Drifts and the Executives Don’t Notice

Opendoor built its business on a specific proposition: its AI pricing algorithm could dynamically evaluate and price homes more accurately than traditional real estate methods, and critically, it could adjust to changing market conditions. That capability was not just a product feature. It was the core investment thesis, described in regulatory filings, on the website, in press releases, and in earnings calls.

Officers of the company signed certifications on financial statements while those representations were in the market.

Then the housing market shifted in 2022. The model did not adjust the way the company had represented it would. Revenue collapsed from $15 billion in 2022 to $5 billion in 2024. Investors filed securities class action lawsuits in federal court alleging violations of Section 10(b) of the Exchange Act and SEC Rule 10b-5, claiming materially false and misleading statements about the effectiveness of the company’s pricing algorithm.

The SEC filing detailing the consolidated class action.

In June 2025, Opendoor agreed to a $39 million settlement. The company did not admit wrongdoing.

The accountability question embedded in this case is the one that should be keeping CFOs awake: if the model drifted and no one was monitoring it, is that a known weakness or an undisclosed one? At what point does a degraded model become a material weakness in internal controls over financial reporting? These questions do not yet have clean regulatory answers. They are being answered, slowly and expensively, in court.

SEC AI Washing Enforcement: Delphia and Global Predictions, March 2024

On March 18, 2024, the Securities and Exchange Commission settled charges against two investment advisers for what it called “AI washing”: claiming to use artificial intelligence in ways they were not actually using it.

Delphia, a Toronto-based robo-adviser, represented in its SEC filings, on its website, and in investor communications that it used machine learning to analyze collective client data to make investment decisions. According to the SEC, it did not. It collected client data intermittently between 2019 and 2023 but never used that data as input into its investment algorithms. The SEC’s own examination found this. Delphia was informed. Delphia updated its Form ADV. Then Delphia continued making similar representations in emails to investors and in a press release anyway.

Global Predictions described itself as “the first regulated AI financial advisor” and promoted “AI-driven forecasts” in its advisory services. When the SEC asked for documentation to substantiate these claims, Global Predictions could not produce it.

The SEC’s official press release on the enforcement actions.

The combined penalties, $225,000 for Delphia and $175,000 for Global Predictions, are not the point. The doctrine is the point. The SEC established, using existing statutes with no new regulation required, that misrepresenting the nature or capability of AI systems to investors is securities fraud. The same sections of the Advisers Act, the same antifraud provisions, the same marketing rules that have governed investment adviser representations for decades apply with full force to what you say about your AI systems.

SEC Chair Gary Gensler said at the time of the settlements: “Investment advisers should not mislead the public by saying they are using an AI model when they are not. Such AI washing hurts investors.”

That principle does not stop at investment advisers. It is a legal template being actively applied to public companies by a regulatory agency that has named AI disclosure as a formal examination priority. If your investor materials, earnings calls, or SEC filings describe AI capabilities that are overstated, or that have degraded without disclosure, you are in the territory this doctrine was built to address.

UnitedHealth and the 1.2-Second Decision

Healthcare may not be your industry, but the control failure is universal.

UnitedHealth Group deployed an AI algorithm called nH Predict to evaluate post-acute care claims. The system reviewed and made coverage decisions at scale. A class action lawsuit, now moving forward in federal court, contains the detail that makes this relevant to any CFO who has deployed automation in a financial decision workflow: the model had a documented 90% error rate on appeals. Nine out of ten times a human reviewed the AI’s denial, they overturned it.

The reporting on the nH Predict lawsuit.

Separately, a filing against Cigna documented that its algorithm reviewed and rejected more than 300,000 claims in two months. Average time per claim: 1.2 seconds.

The legal question being advanced in these cases is precise: at what point does a known error rate in an automated decision system become a material fact that should have been disclosed? At what point does the absence of monitoring constitute a control failure rather than simply a gap?

Apply that question to your accounts payable automation. Your expense classification engine. Your invoice approval workflow. If your AI system has an error rate you are not measuring, you do not have a control. You have an assumption dressed up as a process.
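Measuring that error rate does not require exotic tooling. Here is a minimal Python sketch of one approach: pull a sample of automated decisions for independent human review and track how often the human overturns the system. The sample rate, alert threshold, and field names are assumptions for illustration only.

```python
import random

# Hedged sketch: sample a slice of automated decisions for human re-review and
# track how often the human overturns the system. Names and thresholds are
# illustrative assumptions, not a prescribed control design.

SAMPLE_RATE = 0.05          # fraction of auto-decided items pulled for human review
OVERTURN_ALERT = 0.10       # escalate if humans reverse more than 10% of sampled calls

def sample_for_review(auto_decisions: list[dict]) -> list[dict]:
    """Pull a random sample of automated decisions for independent human review."""
    k = max(1, int(len(auto_decisions) * SAMPLE_RATE))
    return random.sample(auto_decisions, k)

def overturn_rate(reviewed: list[dict]) -> float:
    """Share of sampled decisions where the human reviewer disagreed with the system."""
    overturned = sum(1 for d in reviewed if d["human_decision"] != d["system_decision"])
    return overturned / len(reviewed) if reviewed else 0.0

def check_control(reviewed: list[dict]) -> None:
    rate = overturn_rate(reviewed)
    if rate > OVERTURN_ALERT:
        # In a real environment this would open an issue with the control owner,
        # not just print. The point is that the rate is measured at all.
        print(f"ALERT: overturn rate {rate:.0%} exceeds threshold {OVERTURN_ALERT:.0%}")
    else:
        print(f"Overturn rate {rate:.0%} within tolerance")

# Example: nine of ten sampled denials overturned, the ratio alleged in the nH Predict filings.
reviewed = [{"system_decision": "deny", "human_decision": "approve"}] * 9 + \
           [{"system_decision": "deny", "human_decision": "deny"}]
check_control(reviewed)
```

A measured overturn rate, reviewed on a schedule and owned by someone with a name, is a control. An unmeasured one is the assumption the paragraph above describes.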

The Accountability Gap Is Closing

For years, the working assumption in most organizations was that AI liability was diffuse, technically complex, and unlikely to land directly on the CFO. The CTO built it. The vendor provided it. The algorithm made the decision. Accountability would be hard to pin.

That working assumption is now demonstrably wrong.

The Financial Stability Oversight Council named AI as a systemic financial risk in its December 2024 Annual Report, sharpening its prior 2023 concern into explicit guidance demanding enhanced oversight. FSOC Annual Report 2024.

The Government Accountability Office published formal findings on AI in financial services in May 2025, noting specifically that most regulators confirmed AI outputs must inform, not replace, human decisions, and identifying gaps in regulatory authority to examine technology service providers. GAO Report GAO-25-107197.

Securities class action filings involving AI claims hit 12 in the first half of 2025 alone, on pace to exceed all of 2024’s total of 15. The Disclosure Dollar Loss Index, which measures market capitalization impact from securities class actions, reached $403 billion in H1 2025, a 56% increase from the prior period. Cornerstone Research and Stanford Law School midyear report.

Courts are allowing claims to proceed under Section 11, Section 10(b), and the Investment Advisers Act based specifically on misrepresentations about algorithmic performance. The SEC has served explicit notice that AI washing is securities fraud under existing law. The FSOC has named AI a systemic risk. The GAO has mapped the oversight gaps.

The question being constructed in these cases, built one filing at a time, is simple: did you know, or should you have known, that the system was not performing as represented?

That is not a CTO question. The CTO does not sign the Section 302 certification. You do.

The Scorecard, Just for the Cases in This Article

Add up only the matters discussed here:

The Arup deepfake loss: $25.6 million, not recovered.

The Opendoor investor settlement: $39 million, for misrepresentations about algorithmic capability.

The SEC AI washing penalties: $400,000 combined, establishing the legal doctrine.

The UnitedHealth and Cigna algorithmic claim denial litigation: unresolved, with multi-billion dollar exposure across multiple class actions still working through federal court.

Total confirmed losses and settlements in these four cases alone: roughly $65 million, excluding the pending insurance litigation.

These are only the cases that became public, were litigated, and generated a documentary record. The Bank of England found that 47% of financial firms reported at least one negative consequence from AI use. Most of those did not generate a news cycle. Most of those are sitting somewhere inside organizations that have certified their controls effective.

The Question Worth Asking Before Your Next Certification

AI will not eliminate financial controls. In organizations that design carefully, it will strengthen them. Well-designed guardrails, continuous monitoring, automated anomaly detection, and policy enforcement at scale can build a tighter control environment than manual sampling ever achieved.
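To make "continuous monitoring" concrete, here is one hedged sketch of a simple drift check: compare the current mix of automated classification decisions against a baseline period and flag categories whose share has shifted. The categories, baseline figures, and threshold are illustrative assumptions, not a recommended configuration.

```python
from collections import Counter

# Hedged sketch of one slice of continuous monitoring: compare this period's
# expense-classification mix against a baseline and flag drift. Category names,
# baseline figures, and the threshold are illustrative assumptions.

DRIFT_THRESHOLD = 0.10   # flag if any category's share moves more than 10 points

def category_shares(labels: list[str]) -> dict[str, float]:
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def drift_report(baseline: list[str], current: list[str]) -> list[str]:
    """List categories whose share of decisions has shifted beyond the threshold."""
    base, cur = category_shares(baseline), category_shares(current)
    flagged = []
    for label in set(base) | set(cur):
        shift = abs(cur.get(label, 0.0) - base.get(label, 0.0))
        if shift > DRIFT_THRESHOLD:
            flagged.append(f"{label}: share moved {shift:.0%} versus baseline")
    return flagged

# Example: a model that classified 5% of spend as capex last quarter now
# classifies 25% that way. That shift belongs in front of a human.
baseline = ["opex"] * 95 + ["capex"] * 5
current = ["opex"] * 75 + ["capex"] * 25
for finding in drift_report(baseline, current):
    print(finding)
```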

But only if someone designed the governance layer deliberately. Only if someone mapped the decision authority structure and asked, at each point where software replaced human judgment: what is the evidence of this decision, who can explain it to a judge, and what happens when it drifts?

That is not a technology question. That is a governance question, and it belongs to you.

The CFOs who are ahead of this are not waiting for the next enforcement action to motivate the conversation. They are auditing their AI control registries. They are asking their audit teams to test model governance the way they used to test approval workflows. They are having the uncomfortable internal conversation that surfaces the actual list of places where software now makes decisions that affect financial statements without human review.

They are doing this because they know something that the cases above confirm: the signature on the certification does not belong to the algorithm.

It belongs to the person who chose to trust it without verifying it was trustworthy.

If the question of whether that trust has been verified does not yet have a clean answer in your organization, the time to develop one is before the failure, not after.

If this raises questions worth exploring for your organization, the conversation is worth having sooner than the regulatory calendar demands.