ai-agents · Signal: 85/100

AI intel digest

Anthropic and OpenAI Just Admitted the Model Isn't Enough.

The video reframes McKinsey's Lily AI platform breach ($20 SQL injection, 22 unauthenticated endpoints out of 200, full read/write access to 40,000 consultants' data) as a procurement and organizational design failure rather than a security lapse.

2026-05-10 · 23 min read · 4,692 words · 11 facts · 0 assumptions
Start here

Executive summary

The video reframes McKinsey's Lily AI platform breach ($20 SQL injection, 22 unauthenticated endpoints out of 200, full read/write access to 40,000 consultants' data) as a procurement and organizational design failure rather than a security lapse. It argues that traditional SaaS procurement sequences break down with AI agents because agents cross permission boundaries in ways humans never do. The speaker notes six major vendor announcements in one week (Anthropic, OpenAI, SAP, Pinecone, Salesforce, ServiceNow), all addressing the same gap: implementation complexity, not model capability, is the real bottleneck. The core prescription is moving technical teams earlier in the buying process.
Sources mentioned

McKinsey / Lily: AI platform used by ~70% of 40,000 consultants; breached via SQL injection with 22 unauthenticated endpoints; patched within an hour of responsible disclosure.
Codewall: Security research startup that discovered and disclosed the Lily vulnerability responsibly on March 9th.
Anthropic: Stood up an enterprise services company with billions in backing to embed engineers inside customer buildrooms.
OpenAI: Same as Anthropic; an enterprise services company for implementation support.
SAP: Acquired Dreo and Prior Labs for a unified data layer and tabular foundation models.
Pinecone: Launched Nexus so agents stop rebuilding business context on every run.
Salesforce: Shipped Headless 360, exposing the platform as APIs, tools, and CLI commands for agent consumption.
ServiceNow: Opened Action Fabric for external agents to trigger governed workflows with identity and audit.
Nate's Newsletter / Substack: Publisher of the six-question developer checklist, vendor rubric, and repair playbook.

Verdict

This video carries unique signal for enterprise AI decision-makers because it connects a specific breach (Lily) to a structural procurement pattern and validates that pattern with six concurrent vendor pivots. Most coverage of the Lily incident stopped at "security lapse"; this speaker traces it to organizational design: technical voices absent from buying decisions, default-unsafe configurations under pressure, and the mismatch between SaaS procurement sequences and agentic system requirements. The value is not in the breach details (widely reported) but in the framework: implementation as strategy, the screen-as-permissions model, and the two pre-signing questions. For anyone building an AI roadmap or evaluating vendors this quarter, this is actionable structural thinking that security checklists and vendor demos won't surface. The signal density is high because every claim ties to either verifiable facts (22/200 endpoints, six vendor announcements) or a coherent causal model with testable implications.

Count: 11 facts, 0 assumptions, 0 demonstrations · Signal density: 85

What matters

Signal points

  1. The Lily breach was SQL injection (known since 1998) at a company with competent engineers; the failure mode is systemic, not a lapse in technical hygiene.

  2. 22 unauthenticated endpoints out of 200 (11%), including writable production access, indicate an organizational default failure, not individual error.

  3. AI agents cross permission boundaries that humans never notice, because humans use the screen as a permissions model; agents require explicit, code-level permission engineering.

  4. Six major vendors (Anthropic, OpenAI, SAP, Pinecone, Salesforce, ServiceNow) announced agent implementation infrastructure in one week, confirming the bottleneck is integration, not models.

  5. The traditional procurement sequence puts developers last; for agents, this commits capital to untested strategies because implementation viability cannot be validated by a demo.

  6. Two critical pre-signing questions: (1) Does your platform distinguish humans from agents? (2) What is the technical default when your team is under pressure?

  7. If an agent cannot be unplugged in five minutes from a console, incident response has a hole that won't be found until 3 a.m. (a minimal kill-switch sketch follows this list).

  8. The speaker publishes a six-question technical checklist with a vendor answer rubric and a repair playbook for already-signed contracts.
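
To make the five-minute unplug test in point 7 concrete, here is a minimal sketch assuming nothing about any particular agent framework: the loop re-checks a revocable flag before every action and fails closed. The flag path and agent name are hypothetical, not from the talk.

```python
from pathlib import Path

KILL_SWITCH = Path("/tmp/agents/order-bot.disabled")  # hypothetical flag path

def agent_enabled() -> bool:
    # Fail closed: the mere existence of the flag file halts the agent,
    # and any error while checking is treated as "disabled".
    try:
        return not KILL_SWITCH.exists()
    except OSError:
        return False

def run_step(action) -> None:
    # Every iteration of the agent loop re-checks the switch, so flipping
    # it from a console stops the agent within one step.
    if not agent_enabled():
        raise SystemExit("kill switch engaged; agent halted")
    action()
```

The point is operational, not clever: an operator who can create one file from a console can stop the agent without redeploying anything.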

Interpretation

Key ideas

1. The Lily incident is a procurement/build failure that surfaced as a security incident, not a security failure per se

Why: McKinsey has competent engineers; SQL injection is trivial to prevent; 22/200 unauthenticated endpoints indicates systemic pattern, not individual error

Implication: Organizations must redesign procurement processes for AI, not just add security checklists

2. Traditional SaaS procurement sequence (strategy → contract → security review → IT planning → developer build) breaks with AI agents

Why: SaaS is bounded (admin console, published API, role-based permissions); agents are unbounded (cross systems, no "screen" as permission boundary, require code-level permission mediation)

Implication: Implementation is the strategy, not downstream of it; technical teams must be involved before purchase, not after

3. Agents lack the "screen as permissions model" that humans implicitly use

Why: Humans don't see what they can't access; agents ask each system "am I allowed to read this?" and every system must answer clearly, be auditable, and compose with others

Implication: Every cross-system agent workflow requires explicit engineering work for permissions, audit, and cost control that doesn't exist by default
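
A minimal sketch of what code-level permission mediation can look like, assuming a simple in-process ACL; the principal names, resources, and ACL shape are hypothetical, not anything the talk specifies. Every read an agent makes passes through an explicit allow/deny check and leaves an audit record.

```python
import datetime
import json

# (principal, resource) -> allowed actions; hypothetical data, not from the talk
ACL = {
    ("agent:research-bot", "crm:accounts"): {"read"},
    ("agent:research-bot", "finance:invoices"): set(),
}

def mediated_read(principal: str, resource: str, fetch):
    allowed = "read" in ACL.get((principal, resource), set())
    # Audit every decision, allowed or denied, in a form other systems
    # can compose with.
    print(json.dumps({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "principal": principal,
        "resource": resource,
        "action": "read",
        "allowed": allowed,
    }))
    if not allowed:
        raise PermissionError(f"{principal} may not read {resource}")
    return fetch()

# Usage: the deny case raises instead of silently returning nothing.
rows = mediated_read("agent:research-bot", "crm:accounts",
                     lambda: [{"account": "Acme"}])
```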

4. Vendor announcements signal that "the model was never the hard part"

Why: Six vendors simultaneously launched services for reachable surfaces, governed action, permission-aware data, cheaper context assembly, forward-deployed engineers

Implication: The competitive moat and risk surface in enterprise AI has shifted from model capability to implementation/integration infrastructure

5. Organizational defaults under pressure determine security posture more than policies or documentation

Why: 22 unauthenticated endpoints suggests default state was "not authenticated," not that individuals failed; what happens when teams move fast matters more than what the docs say

Implication: Technical architecture must be a first-class business concern with default-safe configurations, not something deferred to incident response
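
A sketch of what "default-safe under pressure" can mean in practice, assuming a Flask-style web service (the talk names no framework, and check_token() is a hypothetical placeholder): authentication is enforced globally, so an endpoint added in a hurry is protected unless someone deliberately exempts it.

```python
from flask import Flask, abort, request

app = Flask(__name__)
PUBLIC_PATHS = {"/healthz"}  # the only deliberate exceptions

def check_token(header_value) -> bool:
    # Placeholder for real token validation (JWT, mTLS, ...); hypothetical.
    return header_value == "Bearer expected-token"

@app.before_request
def require_auth():
    # Deny by default: every route, including one added later under
    # deadline pressure, passes through this check first.
    if request.path in PUBLIC_PATHS:
        return None
    if not check_token(request.headers.get("Authorization")):
        abort(401)

@app.route("/reports")
def reports():
    # A new endpoint is protected without its author doing anything.
    return {"ok": True}
```

Under this design, shipping 22 unauthenticated endpoints would require 22 explicit opt-outs rather than 22 omissions.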

Evidence

Key facts

A $20 exploit gave full read/write access to McKinsey's Lily AI platform used by ~70% of 40,000 consultants

Confidence: HIGH

Evidence: $20, two hours to get full read and write access to the AI platform that 70% of McKinsey's 40,000 consultants use every single day

The exploit was SQL injection, first documented in 1998

Confidence: HIGH

Evidence: this exploit wasn't exotic. It was SQL injection. The first documented case of SQL injection is in 1998
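
For readers who have not met the 1998-era bug class: SQL injection happens when user input is concatenated into a query string, and a parameterized query removes it. A minimal, self-contained illustration using sqlite3; table and column names are illustrative, not from the Lily platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (owner TEXT, body TEXT)")
conn.execute("INSERT INTO docs VALUES ('alice', 'secret plan')")

def vulnerable(owner: str):
    # Passing "' OR '1'='1" as owner returns every row: classic injection.
    return conn.execute(
        f"SELECT body FROM docs WHERE owner = '{owner}'").fetchall()

def safe(owner: str):
    # The placeholder makes the driver treat input as data, never as SQL.
    return conn.execute(
        "SELECT body FROM docs WHERE owner = ?", (owner,)).fetchall()

assert vulnerable("' OR '1'='1") == [("secret plan",)]  # leaks everything
assert safe("' OR '1'='1") == []                        # does not
```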

22 of 200 API endpoints shipped with no authentication, including writable production endpoints

Confidence: HIGH

Evidence: "22 of 200 endpoints shipped with no authentication" and "endpoints writable to production that were unauthenticated"

Lily had been in production for more than 2 years at time of breach

Confidence: HIGH

Evidence: Lily had been in production for more than 2 years

Codewall disclosed responsibly on March 9th; McKinsey patched within an hour

Confidence: HIGH

Evidence: Codewall disclosed responsibly on March 9th... McKinsey patched it within an hour. Credit to them

Six vendor announcements occurred within one week addressing agent implementation gaps

Confidence: HIGH

Evidence: Anthropic and OpenAI have both stood up enterprise services companies with billions of dollars behind them... SAP acquired Dreo and Prior Labs... Pinecone launched Nexus... Salesforce shipped Headless 360... ServiceNow opened up Action Fabric

Anthropic and OpenAI both established enterprise services companies with billions in backing

Confidence: HIGH

Evidence: Anthropic and OpenAI have both stood up enterprise services companies with billions of dollars behind them to put engineers inside customer buildrooms


SAP acquired Dreo and Prior Labs for unified data layer and tabular foundation models

Confidence: HIGH

Evidence: SAP acquired Dreo and Prior Labs to bring a unified data layer and tabular foundation models

Pinecone launched Nexus to prevent agents from rebuilding business context on every run

Confidence: HIGH

Evidence: Pinecone launched Nexus, which is essentially stop making your agent rebuild the business from scratch every time it runs
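
As a rough guess at the shape of the problem Nexus targets (this is not Pinecone's actual API; all function and key names are hypothetical), caching the expensively assembled business context and reusing it until its inputs change looks something like this:

```python
import hashlib
import json

_context_cache = {}  # key -> assembled context; in-memory for illustration

def assemble_context(sources):
    # Stand-in for the expensive step: crawling wikis, CRM records,
    # tickets, and so on, then flattening into prompt-ready text.
    return "\n".join(f"{k}: {v}" for k, v in sorted(sources.items()))

def get_context(sources):
    # Key the cache on a hash of the inputs so the context is rebuilt
    # only when something underneath actually changed.
    key = hashlib.sha256(
        json.dumps(sources, sort_keys=True).encode()).hexdigest()
    if key not in _context_cache:
        _context_cache[key] = assemble_context(sources)
    return _context_cache[key]

ctx = get_context({"crm": "Acme renews in Q3", "wiki": "pricing tiers: 3"})
ctx_again = get_context({"crm": "Acme renews in Q3", "wiki": "pricing tiers: 3"})
assert ctx is ctx_again  # the second run reuses the cached assembly
```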

Salesforce shipped Headless 360 exposing platform as APIs/tools/commands for agent use

Confidence: HIGH

Evidence: Salesforce shipped headless 360 which exposes their platform as APIs and tools and command line commands because agents don't click through screens
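
To illustrate "APIs and tools instead of screens" in generic terms (this is a sketch, not Salesforce's actual Headless 360 contract; the schema shape and function are assumptions), a capability exposed as a machine-readable tool an agent can call might look like:

```python
# A JSON-schema-style tool definition an agent runtime can present to a
# model, plus the function it maps to.
UPDATE_OPPORTUNITY_TOOL = {
    "name": "update_opportunity_stage",
    "description": "Move a sales opportunity to a new pipeline stage.",
    "parameters": {
        "type": "object",
        "properties": {
            "opportunity_id": {"type": "string"},
            "stage": {"type": "string",
                      "enum": ["prospect", "proposal", "closed_won"]},
        },
        "required": ["opportunity_id", "stage"],
    },
}

def update_opportunity_stage(opportunity_id: str, stage: str) -> dict:
    # Stand-in for the real platform call; an agent invokes this directly
    # instead of navigating the UI a human would click through.
    return {"id": opportunity_id, "stage": stage, "ok": True}
```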

ServiceNow opened Action Fabric for outside agents to trigger governed workflows with identity/audit

Confidence: HIGH

Evidence: ServiceNow opened up Action Fabric so outside agents can trigger governed workflows, playbooks, approvals, catalogs through a controlled surface with identity and audit attached
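
A rough sketch of a "controlled surface with identity and audit attached", which also answers the first pre-signing question from the signal points by distinguishing human from agent callers. Nothing here is ServiceNow's actual Action Fabric; the policy table and identity shape are hypothetical.

```python
import datetime
import json

# principal kind -> workflows it may trigger; hypothetical policy
POLICY = {
    "human": {"refund", "reset_password"},
    "agent": {"reset_password"},
}

def trigger_workflow(identity: dict, workflow: str) -> str:
    kind = identity.get("kind")  # "human" or "agent"; anonymous callers lack it
    allowed = workflow in POLICY.get(kind, set())
    # Every attempt, allowed or denied, leaves an audit record.
    print(json.dumps({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "who": identity.get("id", "anonymous"),
        "kind": kind,
        "workflow": workflow,
        "allowed": allowed,
    }))
    if not allowed:
        raise PermissionError(f"{kind or 'anonymous'} cannot run {workflow}")
    return f"started:{workflow}"

# Usage: the same workflow is allowed for a human and denied for an agent.
trigger_workflow({"kind": "human", "id": "u-17"}, "refund")
```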

Memorable lines

Quotes

$20, two hours to get full read and write access to the AI platform that 70% of McKinsey's 40,000 consultants use every single day
22 of 200 endpoints shipped with no authentication... That's not a random mistake. That's a pattern.
The implementation question isn't downstream of a strategic decision. It's effectively the strategic decision itself.
The signal is that the model was never the hard part. The hard part is exactly what the Lily incident surfaced.
You cannot design your software for a world where humans click through screens and where agents are bolted on afterward.
The cheapest thing you can do this quarter is to move the technical developer review... earlier in the process.