A two-year deep dive into AI infrastructure reveals a sobering reality: the true threat isn't just in the prompts, but in the very foundation of the systems. Security researchers Hillai Ben Sasson and Dan Segev found that while the industry is fixated on prompt injection, the real systemic risks lie within the AI supply chain and foundational layers. Their findings, set to be detailed at RSAC, emphasize a critical shift in defense strategy: securing the infrastructure that hosts and trains these models is no longer optional—it is the frontline of AI security.
Architectural Vulnerabilities in AI: A Multi-Layered Threat Analysis
Moving beyond the hype of prompt injection: a deep dive into the structural vulnerabilities of AI infrastructure. Based on two years of rigorous research, we explore why security professionals must pivot their focus toward foundational flaws to truly secure the AI stack.
The Speed Trap: Why Infrastructure is the Real AI Crisis
It’s time to look past the hype of prompt injection. While it's a clever attack method, it’s often just a distraction from a much grimmer reality: the crumbling security of our AI infrastructure. We’re seeing a flood of new tools—like the Model Context Protocol (MCP)—being rushed into core business systems, yet they arrive riddled with flaws at the foundational level.
We are essentially falling into the same old trap of choosing speed over safety. The 2026 CISO AI Risk Report paints a worrying picture: 83% of security chiefs are losing sleep over how much access AI has to their internal systems, especially with 71% of these tools operating in the shadows without official approval. If we don’t stop obsessing over the interface and start hardening the actual infrastructure, we’re missing the biggest security threat of our time.
"The real AI crisis isn't the prompts—it's the infrastructure. While 83% of CISOs worry about AI access, the rush to deploy tools like MCP is creating a 'security vacuum' where speed consistently outpaces foundational safety."
AI Security is in a Real Mess—and the Foundation is to Blame
Let’s talk about the Pickle format. It’s the industry standard for storing model weights, yet it’s a security nightmare. Why? Because it mixes data and code in a way that’s frankly irresponsible. A malicious file can execute arbitrary code the moment a model is loaded. This isn’t an accident; it’s the result of researchers prioritizing speed over threat modeling. We’ve essentially built the future of AI on a foundation that was never meant to be secure.
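To make that concrete, here is a minimal sketch of why mixing data and code is so dangerous. The class name and shell command are purely hypothetical; the point is that pickle's own __reduce__ hook lets the author of a file decide what runs during deserialization, so simply loading a "weights" file is enough to execute it.

```python
import os
import pickle


class MaliciousPayload:
    """Stands in for a booby-trapped "model weights" object (hypothetical)."""

    def __reduce__(self):
        # pickle calls whatever __reduce__ returns while deserializing,
        # so loading the file runs this command -- no prompt required.
        return (os.system, ("echo 'arbitrary code ran on model load'",))


# The attacker ships this blob as a model checkpoint.
poisoned_weights = pickle.dumps(MaliciousPayload())

# The victim "just loads the model" -- and the payload executes immediately.
pickle.loads(poisoned_weights)
```

Formats like safetensors sidestep this class of attack by storing only tensor data, with no executable hooks baked into the file.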
It’s Not Just a Bug; It’s the Whole Stack
The industry is currently obsessed with prompt injection, but honestly, that’s just a distraction. It’s the tip of a very large, very dangerous iceberg. The real crisis is unfolding across five distinct layers of the AI lifecycle—and each one is leaking:
The Data Leak Problem (Training Layer): Security fails before the model is even born. Look at Microsoft’s 2023 disaster: 38TB of private data exposed through a sloppy, "over-permissive" file-sharing link (a narrowly scoped alternative is sketched after this list). When the foundation is built on leaked data, the whole structure is compromised.
The Inference Gap (Production Layer): This is where models actually "think" and talk to users. Researchers found that production-ready services—even big names like DeepSeek and Ollama—are riddled with flaws that allow attackers to jump from a simple query to full system control.
The "Vibe Coding" Disaster (Application Layer): We’re in a rush to build "cool" apps using vibe-coding tools, but the security is practically nonexistent. We found enterprise-grade applications that could be cracked in minutes. It’s "move fast and break things" taken to a reckless extreme.
The Cloud Poisoning (Hosting Layer): Most AI lives in the cloud. If an attacker compromises the AI-specific cloud infrastructure, they don't just get one victim—they get every single customer using that cloud. It’s a "one-to-many" disaster waiting to happen.
The Hardware Domino Effect (System Layer): This is the scariest part. A single flaw in a core library—like NVIDIA’s Triton Inference Server—doesn’t just hurt one user. It creates a backdoor into every cloud provider and every application built on top of that component. It’s a "keys to the kingdom" scenario for attackers.
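On the training-layer point above, the fix is usually boring configuration discipline rather than anything exotic. The sketch below (using the azure-storage-blob SDK; the account, container, and blob names are made-up placeholders) shows the contrast in practice: instead of a broad, effectively permanent share link, you generate a read-only token scoped to a single blob with a short expiry.

```python
from datetime import datetime, timedelta, timezone

from azure.storage.blob import BlobSasPermissions, generate_blob_sas

# All names below are hypothetical placeholders.
sas_token = generate_blob_sas(
    account_name="researchdatasets",
    container_name="training-corpus",
    blob_name="dataset-v3.tar.gz",
    account_key="<storage-account-key>",        # keep real keys in a secret store
    permission=BlobSasPermissions(read=True),   # read-only: no write/list/delete
    expiry=datetime.now(timezone.utc) + timedelta(hours=4),  # short-lived link
)

share_url = (
    "https://researchdatasets.blob.core.windows.net/"
    f"training-corpus/dataset-v3.tar.gz?{sas_token}"
)
```

The scope and the expiry are the whole story: a link like this can still leak, but it exposes one file for a few hours rather than everything sitting next to it.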
"AI security is failing because we're securing the 'conversation' while leaving the infrastructure wide open—one hardware flaw in a library like NVIDIA Triton can compromise every cloud provider and application simultaneously."
Stop Patching, Start Closing the Loop
There’s no "magic pill" for this. You can't just "patch" a broken foundation. We need to move past the "set it and forget it" mindset. If we don’t shift toward continuous, automated compliance and "closing the loop" on security protocols, we’re just waiting for the next massive exploit. In today’s world, an unpatched vulnerability isn't a risk—it's a countdown.