A Payment System That Outgrew Its Oversight
Medicare Advantage, the privately administered alternative to traditional Medicare, now enrolls more than 35 million Americans and processes over $615 billion in annual federal payments. The program pays private insurers based on the documented health complexity of their enrolled members through a mechanism called risk adjustment: plans caring for sicker populations receive higher per-member payments to cover their expected costs. In principle, this ensures adequate funding for patients who need the most care.
In practice, the system created a powerful financial incentive to document more diagnoses per patient, and the technology designed to act on that incentive evolved faster than the governance structures designed to oversee it. AI-powered coding tools can now scan millions of clinical records, identify potential diagnoses, and recommend codes at a speed and scale that would have been unimaginable when the risk adjustment framework was established. The question that went unanswered for too long is whether these tools were optimized for accuracy or for revenue, and who was responsible for knowing the difference.
MedPAC’s March 2026 report estimated that Medicare spends $76 billion more on MA enrollees than it would spend on the same population in traditional fee-for-service Medicare. Of that amount, $22 billion (roughly 29%) is attributed to coding intensity, the measurable gap between how MA plans document diagnoses and how fee-for-service providers document the same conditions in the same patients. That $22 billion represents the aggregate output of coding technology deployed across an industry that, until recently, faced limited accountability for the accuracy of what its AI produced.
The Governance Vacuum
For most of risk adjustment’s history, the AI tools powering it operated in a governance vacuum. Plans purchased coding technology from vendors. The vendors’ AI scanned clinical notes and recommended diagnosis codes. Coders reviewed the recommendations, often under throughput pressure that limited independent judgment. Codes were submitted to CMS. Nobody systematically evaluated whether the AI’s reasoning was sound, whether its recommendations were defensible, or whether its design incentivized accuracy over volume.
The consequences of that vacuum are now quantified. The DOJ has collected over $670 million in settlements from two major health organizations over allegations that their risk adjustment practices inflated federal payments. OIG audits published in March 2026 found that between 81% and 91% of sampled high-risk diagnosis codes lacked adequate clinical documentation. In multiple audits, acute condition categories like stroke and myocardial infarction showed 100% failure rates: every sampled record in those categories was unsupported.
These outcomes weren’t caused by rogue employees or isolated misconduct. They were produced by systems, technological and organizational, designed to maximize code identification without corresponding investment in code validation. The AI found diagnoses. Nobody governed whether the evidence behind those diagnoses could survive federal scrutiny.
What Governance Looks Like Now
The regulatory response has been rapid and multi-directional. CMS is scaling its audit workforce from approximately 40 certified coders to a planned 2,000 or so, supplemented by AI-assisted pattern detection for audit targeting. The agency now audits all 550+ MA contracts every year, on a quarterly cadence. In a January 2026 memo, CMS specified that AI in risk adjustment should serve as a “medical coder support tool” with humans making final determinations.
The OIG published its first Medicare Advantage Industry-wide Compliance Program Guidance since 1999 in February 2026. That guidance identified add-only chart reviews, where coders find codes to submit but never identify codes to remove, as a high-risk practice. It warned that failing to remove unsupported codes may constitute a compliance failure. It flagged health risk assessments that generate diagnoses disconnected from ongoing patient care.
CMS finalized the exclusion of unlinked chart review diagnoses from risk score calculations for payment year 2027, eliminating the value of roughly 75 million diagnoses that couldn’t be matched to clinical encounters. The bipartisan UPCODE Act proposes going further, with the Congressional Budget Office estimating $124 billion in savings over ten years from excluding chart-review-only and health risk assessment diagnoses entirely.
Each action addresses a different facet of the same governance problem: AI-powered coding tools generated billions in payments based on diagnoses that nobody verified could withstand the scrutiny the system always should have applied.
The AI Accountability Standard
What’s emerging from this regulatory convergence is a new accountability standard for AI in healthcare payment. The AI must be explainable: every recommendation traceable to specific clinical evidence and specific reasoning. It must support human oversight that’s genuine, not nominal: coders must have the information and authority to override AI recommendations, and those overrides must be documented. It must operate in both directions: identifying unsupported codes for removal with the same rigor it applies to identifying missed codes for addition. And it must be auditable: compliance teams must be able to examine the AI’s decision patterns, flag anomalies, and verify alignment with clinical standards.
These requirements don’t describe a feature list. They describe an architectural philosophy. Systems built to maximize code identification, with accountability features layered on afterward, produce different outcomes than systems built from the ground up around evidence validation, explainability, and two-way coding.
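To make the four requirements concrete, here is a minimal Python sketch of what a recommendation record might look like when those properties are built in from the start. It is an illustration under stated assumptions, not a description of any vendor's or regulator's system: the names Recommendation, Evidence, Direction, submittable, and audit_summary are hypothetical, and the code strings used with it are placeholders rather than real diagnosis codes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Direction(Enum):
    ADD = "add"        # a supported diagnosis that was not yet submitted
    REMOVE = "remove"  # a submitted diagnosis the chart does not support


@dataclass(frozen=True)
class Evidence:
    note_id: str  # hypothetical identifier of the source clinical note
    excerpt: str  # the text span the recommendation relies on


@dataclass
class Recommendation:
    code: str                  # placeholder diagnosis code
    direction: Direction       # two-way: add or remove
    evidence: List[Evidence]   # explainability: traceable evidence
    rationale: str             # explainability: stated reasoning
    reviewer: Optional[str] = None
    accepted: Optional[bool] = None   # None until a human decides
    override_reason: Optional[str] = None

    def is_explainable(self) -> bool:
        """True only if both evidence and a rationale are present."""
        return bool(self.evidence) and bool(self.rationale.strip())

    def review(self, reviewer: str, accept: bool, reason: str = "") -> None:
        """Record the human decision; a rejection must be documented."""
        if not accept and not reason.strip():
            raise ValueError("an override must carry a documented reason")
        self.reviewer = reviewer
        self.accepted = accept
        self.override_reason = reason.strip() or None


def submittable(rec: Recommendation) -> bool:
    """Nothing moves forward without evidence and a human acceptance."""
    return rec.is_explainable() and rec.accepted is True


def audit_summary(recs: List[Recommendation]) -> Dict[str, float]:
    """Aggregate decision patterns a compliance team might monitor."""
    reviewed = [r for r in recs if r.accepted is not None]
    overrides = sum(1 for r in reviewed if r.accepted is False)
    return {
        "total": len(recs),
        "adds": sum(1 for r in recs if r.direction is Direction.ADD),
        "removes": sum(1 for r in recs if r.direction is Direction.REMOVE),
        "unexplained": sum(1 for r in recs if not r.is_explainable()),
        "override_rate": overrides / len(reviewed) if reviewed else 0.0,
    }
```

In this sketch, a summary showing many adds and zero removes would be one visible signal of the add-only pattern, and an override rate near zero could suggest that human review is nominal rather than genuine. Which thresholds count as anomalous is a policy decision the sketch deliberately leaves open.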
The $76 billion question isn’t whether AI should be used in risk adjustment. It’s whether the AI is governed well enough to produce output that regulators, clinicians, and taxpayers can trust. The organizations investing in a risk adjustment platform built around that governance standard are building the infrastructure the system should have had from the start. The organizations still running ungoverned AI in a governed market are discovering, through settlements and audit findings, that the governance vacuum has closed.