How an AI Agent Broke Into McKinsey's Internal AI Platform Lilli and Accessed Millions of Records in 2 Hours

In March 2026, an autonomous AI agent deployed by a security research firm was able to independently break into McKinsey's internal chatbot with no human in the loop. Read more about this exploit.

ARTIFICIAL INTELLIGENCE · AI GOVERNANCE · AI SECURITY

Balasubramanyam Gopatipalyam

3/11/2026 · 4 min read

AI agent breaks into McKinsey's AI chatbot

In March 2026, a security research firm specializing in red-teaming services demonstrated a new reality in the age of agentic AI. Its autonomous AI agent, given no credentials, no insider knowledge, and no human guidance, breached McKinsey's internal generative AI platform, Lilli.

In just two hours, the agent achieved full read-and-write access to the production database behind McKinsey's internal chatbot.

Lilli was rolled out in 2023 and named after the consulting firm's first woman employee, hired back in the 1940s. The chatbot is now used by over 40,000 McKinsey consultants and processes more than 500,000 prompts each month. The agent exposed 46.5 million chat messages, all stored in plaintext, discussing strategy, mergers and acquisitions, and client engagements. It also gained access to over 700,000 confidential files, more than 50,000 user accounts, 3.6 million RAG document chunks of proprietary research, and 95 writable system prompts that control how the AI behaves for thousands of users.

In a breach that reads like science fiction, one autonomous AI agent independently infiltrated another AI system.

At its core, this was a classic application security failure. Exposed APIs were just the starting point: the agent exploited well-known application flaws, SQL injection and broken authorisation, and executed the breach at machine speed in under two hours.

This was a controlled red-team exercise under responsible disclosure. McKinsey patched the issues within hours, and a third-party forensic investigation confirmed no client data or confidential information was accessed by any unauthorised party.

Yet, the incident sends a clear warning to every organisation deploying internal AI tools today.

How was the attack executed?

CodeWall, the security research firm that demonstrated this exploit, deployed its autonomous offensive agent with a simple instruction: "explore McKinsey’s Lilli platform". Here is how the attack unfolded, step by step, in simple language.

First, the agent mapped the entire public attack surface of Lilli. It discovered publicly exposed API documentation listing over 200 endpoints. While most required authentication, 22 endpoints were completely unauthenticated, meaning anyone on the internet could reach parts of Lilli without logging in.

Next, it probed an unprotected endpoint that saved user search queries into the database. The developers had properly parameterised the values; however, they concatenated the JSON keys directly into the SQL query. That one mistake created a blind SQL injection vulnerability.
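The flaw described above can be sketched in a few lines. This is a hypothetical reconstruction, not Lilli's actual code: the handler, table, and column names are invented for illustration. The point is that binding values as parameters is not enough if attacker-controlled JSON keys become SQL column names.

```python
import sqlite3

ALLOWED_COLUMNS = {"query_text", "user_id"}  # illustrative schema

def save_search_vulnerable(conn, payload):
    """Values are bound as parameters, but the JSON *keys* are
    concatenated straight into the SQL string (the reported flaw)."""
    cols = ", ".join(payload)                    # attacker-controlled keys
    marks = ", ".join("?" for _ in payload)
    sql = f"INSERT INTO search_queries ({cols}) VALUES ({marks})"
    conn.execute(sql, tuple(payload.values()))

def save_search_safe(conn, payload):
    """Allow-list the column names instead of trusting client JSON keys."""
    unknown = set(payload) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    cols = ", ".join(payload)
    marks = ", ".join("?" for _ in payload)
    conn.execute(f"INSERT INTO search_queries ({cols}) VALUES ({marks})",
                 tuple(payload.values()))
```

A malicious key such as `query_text PROBE` produces invalid SQL, and the database error echoes the token back verbatim, which is exactly the feedback channel an error-based blind injection needs.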

The agent noticed that database error messages reflected those JSON keys verbatim. Using 15 careful iterations, it analysed the errors, inferred the database structure, and began extracting data. Everything happened without McKinsey's SIEM tool flagging the issue.

Then it chained the SQL injection with an Insecure Direct Object Reference (IDOR) flaw. This allowed it to jump from one user’s data to another’s, reading employee search histories and escalating privileges.
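An IDOR of this kind is easy to sketch. The store and handler names below are invented for illustration; the essential bug is a missing object-level ownership check.

```python
# Hypothetical in-memory store standing in for the chat database.
CONVERSATIONS = {
    101: {"owner": "alice", "history": ["alice's M&A research queries"]},
    102: {"owner": "bob",   "history": ["bob's client engagement notes"]},
}

def get_history_vulnerable(requesting_user, conversation_id):
    # IDOR: the server trusts the client-supplied ID and never checks
    # that the conversation actually belongs to the requesting user.
    return CONVERSATIONS[conversation_id]["history"]

def get_history_safe(requesting_user, conversation_id):
    record = CONVERSATIONS[conversation_id]
    if record["owner"] != requesting_user:  # object-level authorisation
        raise PermissionError("not your conversation")
    return record["history"]
```

With the vulnerable handler, any authenticated user can simply iterate conversation IDs and read every other user's history, which is how one compromised account escalates into firm-wide data access.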

Within two hours, the agent had full read and write access to the entire production database. It could have downloaded every chat, every file, and rewritten the 95 system prompts, silently poisoning every response Lilli gave to consultants across the firm.

Why did this happen?

Because Lilli was built for speed and usefulness, not with the assumption that an autonomous attacker could chain small mistakes at machine speed.

Basic application security testing, such as checks for SQL injection, was not performed before Lilli was deployed. The prompt layer, the instructions that govern AI behaviour, was never treated as the new “crown jewel” asset.

Traditional security vulnerabilities, combined with a lack of AI guardrails, allowed a new breed of AI threat actor to cause devastating impact in minutes, not months.

Key takeaways from these specific vulnerabilities

The biggest lesson is simple yet profound: AI platforms are now high-value targets and highly vulnerable to attacks. And, the vulnerabilities that matter most are often the ones we thought were “solved”.

This exploit was not about prompt poisoning or model misalignment. This breach was executed using classic SQL injection and IDOR combined with exposed, unauthenticated APIs. These are decades-old app-sec issues listed in the OWASP Top 10, yet they remain ignored when rushed AI deployments skip proper controls.

Even more critically, the system prompts and AI configurations became writable. A single UPDATE statement could have changed what Lilli told every consultant, turning a trusted internal tool into a weapon for misinformation or data leakage.
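One lightweight defence against exactly this failure mode is to treat system prompts as signed artefacts and fail closed when a stored prompt no longer matches its approved hash. A minimal sketch, with invented prompt text and names:

```python
import hashlib

def fingerprint(prompt_text: str) -> str:
    """SHA-256 digest of a prompt, recorded at approval time."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()

# Recorded at deploy time and stored separately from the writable database.
APPROVED_PROMPT = "You are Lilli. Answer using approved firm research only."
APPROVED_HASHES = {"default": fingerprint(APPROVED_PROMPT)}

def load_prompt(name: str, stored_text: str) -> str:
    """Refuse to serve a system prompt that has been tampered with."""
    if APPROVED_HASHES.get(name) != fingerprint(stored_text):
        raise RuntimeError(f"system prompt '{name}' failed integrity check")
    return stored_text
```

Because the approved hashes live outside the database an attacker writes to, a single malicious UPDATE no longer silently changes model behaviour; it trips the integrity check instead.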

The era of “just ship the AI feature” is over.

Autonomous agents now research, map, probe, chain, and escalate continuously, exactly like a highly skilled human attacker, but without sleep or weekends.

Your internal AI chatbot or agent platform stores the same sensitive intelligence that used to live only in email, shared drives, or secure vaults. If it is not secured with the same rigour as your core infrastructure, that same AI tool becomes the weakest link in your data protection chain.

What should businesses using AI systems do?

If your organisation is already using or planning to deploy internal AI chatbots, agents, or generative tools, now is the time to act before an autonomous attacker does.

  • Start by treating every AI platform as a full production system, not a pilot project. Map its complete attack surface, including APIs, prompt storage, RAG document chunks, and integration points with external models.

  • Implement strong authentication on every endpoint. Eliminate unauthenticated paths.

  • Validate and sanitize all inputs, especially dynamic JSON keys or user-controlled data that touches your database.

  • Protect your prompts and AI configurations as sensitive assets. Store them separately, encrypt them, and apply strict access controls and versioning. Never allow write access from untrusted sources.

  • Adopt continuous, agent-aware application security testing. Traditional scanners often miss the chained, blind vulnerabilities that autonomous agents find in minutes.

  • Red-team exercises using offensive AI agents should become part of your regular cadence.

  • Build AI governance that covers the entire AI lifecycle, from design to deployment to ongoing monitoring.

  • Classify your AI systems by risk, document data flows, and ensure human oversight where high-stakes decisions are involved.
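The first two bullets, authenticate every endpoint and eliminate unauthenticated paths, can be enforced with a small wrapper that fails closed. A minimal sketch using an invented token store (a real deployment would use your identity provider, not a hard-coded dictionary):

```python
import functools
import hmac

API_TOKENS = {"s3cret-demo-token": "alice"}  # illustrative token store

def require_auth(handler):
    """Reject any request that lacks a valid bearer token (fail closed)."""
    @functools.wraps(handler)
    def wrapper(headers, *args, **kwargs):
        token = headers.get("Authorization", "").removeprefix("Bearer ").strip()
        # Constant-time comparison avoids leaking token prefixes via timing.
        user = next((u for t, u in API_TOKENS.items()
                     if hmac.compare_digest(t, token)), None)
        if user is None:
            raise PermissionError("401: unauthenticated")
        return handler(user, *args, **kwargs)
    return wrapper

@require_auth
def list_chats(user):
    return f"chats for {user}"
```

The design choice that matters is the default: an endpoint that is not explicitly wrapped should not exist, so there is no path for 22 endpoints to quietly ship without authentication.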

Above all, shift from "reactive patching" to "proactive resilience". Turn potential AI risks into governed opportunities so your internal tools drive productivity without becoming liabilities.

Looking for AI Governance Advisory?

Request a free 1:1 meeting with an expert AI Governance consultant.