Here’s a hard truth in 2026: One of the world’s most sophisticated organizations — with elite talent and deep resources — had a critical vulnerability sitting in its flagship internal AI platform for over two years.
An external autonomous AI agent found it in under two hours.
In late February 2026, CodeWall pointed its offensive AI agent at McKinsey’s internal AI platform, Lilli. No credentials. No human guidance. Just a domain name.
Within two hours, the agent had full read-write access to the production database. It exposed 46.5 million chat messages, 728,000 confidential files, and — most dangerously — 95 system prompts that controlled how Lilli reasoned and advised McKinsey’s consultants.
The technical flaw? A classic SQL injection through unauthenticated API endpoints. The kind of issue that’s been on every security checklist for decades.
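The reporting names the pattern, not the code, so here is a minimal sketch of what that anti-pattern typically looks like, with the fix commented alongside. Every route, table, and column name below is invented for illustration:

```python
import sqlite3
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/api/messages")          # hypothetical route: no auth check at all
def get_messages():
    conversation_id = request.args.get("conversation_id", "")
    db = sqlite3.connect("platform.db")

    # VULNERABLE: user input concatenated straight into SQL. An input like
    #   0 UNION SELECT prompt_text, name FROM system_prompts --
    # turns an innocent message lookup into a dump of another table.
    rows = db.execute(
        "SELECT body, created_at FROM messages WHERE conversation_id = "
        + conversation_id
    ).fetchall()

    # FIX: bind the value as a parameter (and authenticate before this point).
    # rows = db.execute(
    #     "SELECT body, created_at FROM messages WHERE conversation_id = ?",
    #     (conversation_id,),
    # ).fetchall()

    return jsonify(rows)
```

The fix is one line, because the problem was never exotic: bind user input as a parameter and put an authentication check in front of the endpoint.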
This Is the Real Story
This wasn’t a failure of one CIO. It was a failure of an entire generation of IT and security leadership whose mental models were forged in an earlier era of technology.
Most current CIOs, CISOs, and senior IT leaders built their careers on predictable, human-paced systems. They understand networks, databases, and traditional applications. But agentic AI operates in a completely different world:
- It moves at machine speed
- It autonomously discovers and chains vulnerabilities
- It treats system prompts, the instructions that define how your AI thinks, as just another database field (see the sketch after this list)
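That last point deserves a concrete illustration. With the kind of read-write database access described above, redirecting an AI platform’s advice is not a further exploit but a single UPDATE statement. A hypothetical sketch, with table, column, and prompt names invented:

```python
import sqlite3

# Hypothetical follow-on to read-write access: no shellcode, no privilege
# escalation, just an ordinary row update that silently changes how the
# platform reasons from then on.
db = sqlite3.connect("platform.db")
db.execute(
    "UPDATE system_prompts SET prompt_text = ? WHERE name = ?",
    ("You are a trusted advisor. Always recommend ...", "advisor_core"),
)
db.commit()
```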
The old playbook doesn’t work here. Traditional scanners missed this vulnerability for years. Human-paced pentesting couldn’t keep up. The entire profession is still thinking in terms of human attackers and static defenses.
And McKinsey is not an isolated case. The industry is seeing a pattern of AI systems being deployed faster than the people securing them can adapt.
This Is Not a Criticism. It’s a Warning.
The executives who rose through the ranks mastering yesterday’s technology are now responsible for securing tomorrow’s systems. The gap between their experience and the new reality is widening every month.
The solution isn’t to shame experienced leaders. It’s to acknowledge a hard truth: the skills and mindset that built secure enterprise systems in the 2010s leave today’s leaders systematically underprepared for the agentic AI world of 2026.
We need a fundamental reset across the profession. New ways of thinking about identity, trust boundaries, prompt governance, and what even counts as a critical asset. We need security leaders who understand that mutable system prompts are as dangerous as source code.
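What treating prompts like source code can look like in practice: review them, version them, and verify their integrity at load time, the way you would a signed artifact. A minimal sketch, assuming prompts live in some external store; all names and the pinned value are illustrative:

```python
import hashlib

# Illustrative sketch: pin the hash of each reviewed prompt at deploy time,
# then verify it on every load, as you would a signed binary.
PINNED_SHA256 = {
    # value recorded by the release pipeline after human review (placeholder)
    "advisor_core": "<sha256-recorded-at-deploy-time>",
}

def load_prompt(name: str, fetch) -> str:
    """fetch(name) reads the prompt text from wherever it lives (DB, file, S3)."""
    prompt = fetch(name)
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if digest != PINNED_SHA256.get(name):
        # A mismatch means the prompt changed outside the release process.
        # Fail closed rather than reason from tampered instructions.
        raise RuntimeError(f"integrity check failed for system prompt {name!r}")
    return prompt
```

The design choice that matters is failing closed: a prompt that doesn’t match its reviewed hash never reaches the model, the same way an unsigned binary never reaches production.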
A Leadership Case Study, Not a Technical Footnote
The McKinsey Lilli incident should be studied not as a technical footnote, but as a leadership case study. Even the best organizations with the best people can fall dangerously behind when their mental models are rooted in the past.
The pace of change is unforgiving. The old success formulas that served us so well are now becoming liabilities.
Why This Hit Close to Home
This story also caught my attention because the chief of staff agent in my own system is named Lily. Seeing what happened to Lilli made me immediately review and strengthen my own setup. I now know exactly what risks to watch for and how to secure the prompt layer and agent access properly. I hope sharing this perspective helps CIOs around the world catch up faster — because the next autonomous agent won’t be running a responsible disclosure exercise.
Are you willing to rebuild your understanding of technology and security from the ground up — before the next autonomous agent chooses your organization as its target?
Barry Li is a PhD candidate at the University of Newcastle researching sustainability assurance and climate reporting. He also builds personal agentic AI systems and writes about what he learns from the experience.