Inside the Black Box: What Leaked AI System Prompts Reveal About How Your Favorite Tools Actually Think
A deep-dive into the most comprehensive collection of leaked system prompts from Cursor, Manus, Windsurf, Devin, v0, and 30+ other AI tools — revealing their core architectures, tool designs, and agent philosophies.
Part 1: Foundations — The Mental Model
Every AI tool you use daily has a hidden constitution: a system prompt that defines its personality, capabilities, restrictions, and the exact tools it can call. These prompts are the real product — more so than the models themselves.
The GitHub repository x1xhlol/system-prompts-and-models-of-ai-tools is the most comprehensive public collection of leaked system prompts and tool definitions for over 30 major AI products. With 30,000+ lines of raw prompts, this is effectively a museum of how the modern AI industry builds agents.
Think of it this way: if the LLM (GPT-4, Claude, Gemini) is the engine, the system prompt is the driver. Reading these leaks is like sitting in the passenger seat and watching the driver work for the first time.
The mental model:
User Query
│
▼
[System Prompt] ← defines identity, tools, rules, constraints
│
▼
[LLM Backbone] ← Claude, GPT-4, etc.
│
▼
[Tool Calls] ← shell, browser, editor, deploy
│
▼
[Result]
Part 2: The Investigation — What’s in the Repo
The repository is organized by product, each folder containing prompt files (.txt) and tool schemas (.json). Here’s a map of what’s included:
| Category | Products |
|---|---|
| Coding Agents | Cursor, Devin AI, Windsurf, Augment Code, Junie, Kiro, Trae, VSCode Agent |
| Autonomous Agents | Manus, Replit Agent, Emergent, Leap.new |
| Gen AI Builders | v0 (Vercel), Lovable, Same.dev, Orchids.app |
| AI Assistants | Anthropic Claude (multiple versions), Perplexity, NotionAI, Cluely |
| Mobile / IDE | Xcode AI, Qoder, CodeBuddy, Poke |
Each folder typically contains:
- A Prompt.txt: the actual system prompt injected before every conversation
- A Tools.json: the full list of callable tools with TypeScript-like signatures
- Sometimes versioned snapshots (e.g., Prompt Wave 11.txt, Agent Prompt 2.0.txt)
This means you can literally diff how a product’s instructions evolved over time.
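As a minimal sketch of that idea, Python's standard difflib module can produce a unified diff between two prompt snapshots. The two prompt strings below and the helper name diff_prompts are invented for illustration; they are not taken from the repository.

```python
import difflib


def diff_prompts(old: str, new: str, old_name: str = "Prompt v1", new_name: str = "Prompt v2") -> list:
    """Return unified-diff lines between two prompt snapshots."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",  # inputs have no trailing newlines, so emit none
        )
    )


# Hypothetical snapshots, standing in for two versions of a real prompt file.
old_prompt = "You are a coding assistant.\nAlways ask before running commands."
new_prompt = "You are a coding agent.\nAlways ask before running commands.\nPrefer small edits."

changes = diff_prompts(old_prompt, new_prompt)
print("\n".join(changes))
```

In practice you would read the two snapshot files from a product's folder and pass their contents in place of the invented strings.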
Part 3: The Diagnosis — What These Prompts Actually Reveal
🧠 Manus: The Most Transparent Agent Architecture
Manus’s leaked Agent loop.txt is a masterclass in agentic design. It reveals a multi-module architecture:
Event Stream → Planner Module → Knowledge Module → Datasource Module → Executor
The agent loop is explicit:
1. Analyze Events: Understand user needs through the event stream
2. Select Tools: Choose the next tool call based on current state
3. Wait for Execution: Tool runs in sandbox, result added to event stream
4. Iterate: Repeat with ONE tool call per iteration
5. Submit Results: Send deliverables via message tools
6. Enter Standby
Key insight: Manus separates planning from execution explicitly. The Planner module provides numbered pseudocode steps as part of the event stream, and the agent must complete every planned step. This is why Manus feels so methodical.
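The loop and the planner/executor split described above can be sketched roughly as follows. This is a toy under stated assumptions, not Manus's implementation: run_agent, scripted_policy, and the two tools are hypothetical names, and a scripted policy stands in for the LLM that would normally choose the next tool call.

```python
from typing import Callable, Optional


def run_agent(plan: list, choose_tool: Callable[[list], Optional[tuple]],
              tools: dict, max_iters: int = 20) -> list:
    """Run a plan with exactly one tool call per iteration."""
    # The plan enters the event stream first, as a planner module might provide it.
    events = [{"type": "plan", "steps": plan}]
    for _ in range(max_iters):
        action = choose_tool(events)       # analyze events, select one tool
        if action is None:                 # nothing left to do
            break
        name, args = action
        result = tools[name](**args)       # wait for execution
        events.append({"type": "observation", "tool": name, "result": result})
    events.append({"type": "standby"})     # enter standby
    return events


def scripted_policy(events: list) -> Optional[tuple]:
    """Stand-in for the LLM: pick the next unfinished plan step."""
    plan = events[0]["steps"]
    done = sum(1 for e in events if e["type"] == "observation")
    return plan[done] if done < len(plan) else None


# Hypothetical tools and plan, for illustration only.
tools = {"add": lambda a, b: a + b, "upper": lambda text: text.upper()}
plan = [("add", {"a": 2, "b": 3}), ("upper", {"text": "done"})]

events = run_agent(plan, scripted_policy, tools)
results = [e["result"] for e in events if e["type"] == "observation"]
```

Because every observation is appended to the event stream before the next decision, the policy always sees the outcome of the previous step, which is the point of the one-call-per-iteration rule.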
Interesting rules from the prompt:
- Default language: English, but adapts to user’s language
- “Avoid using pure lists and bullet points format in any language” — Manus is instructed to write in prose
- Capable of deploying services and exposing ports publicly
✂️ Cursor: Surgical Precision in Code Editing
Cursor’s system prompt reveals a philosophy of minimal, targeted edits. The prompt instructs the model to:
- Never output unchanged code; always use markers like // ... existing code ...
- Default to a “lazy edit” mode: only write the parts of the file that change
- Use explicit <CHANGE> annotations to mark modified lines
This explains why Cursor’s edits feel surgical compared to tools that rewrite the entire file. The system prompt literally forbids unnecessary rewrites.
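To make the lazy-edit idea concrete, here is a naive sketch of how an edit containing // ... existing code ... markers could be merged back into the original file. It is an assumption-heavy illustration, not how Cursor applies edits: apply_lazy_edit is a hypothetical helper, and it relies on the first and last line of each edited chunk appearing verbatim in the original as anchors.

```python
MARKER = "// ... existing code ..."


def apply_lazy_edit(original: str, edit: str) -> str:
    """Merge a marker-based edit into the original text (naive line anchoring)."""
    orig = original.splitlines()
    edit_lines = edit.splitlines()

    # Split the edit into chunks of real code separated by marker lines.
    chunks, current = [], []
    for line in edit_lines:
        if line.strip() == MARKER:
            if current:
                chunks.append(current)
            current = []
        else:
            current.append(line)
    if current:
        chunks.append(current)

    out, cursor = [], 0
    for chunk in chunks:
        start = orig.index(chunk[0], cursor)   # anchor: first line of the chunk
        end = orig.index(chunk[-1], start)     # anchor: last line of the chunk
        out.extend(orig[cursor:start])         # keep untouched code before it
        out.extend(chunk)                      # substitute the edited region
        cursor = end + 1
    if edit_lines and edit_lines[-1].strip() == MARKER:
        out.extend(orig[cursor:])              # trailing marker keeps the rest
    return "\n".join(out)


original = "\n".join([
    "function add(a, b) {",
    "  return a + b;",
    "}",
    "",
    "function sub(a, b) {",
    "  return a - b;",
    "}",
])
edit = "\n".join([
    MARKER,
    "function sub(a, b) {",
    "  // subtract b from a",
    "  return a - b;",
    "}",
])
merged = apply_lazy_edit(original, edit)
```

The edit carries only five lines, yet the merged result is the full file with add left untouched. A real applier has to cope with ambiguous anchors, which this sketch does not attempt.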
🤖 v0 (Vercel): The Full-Stack React Renderer
v0’s prompt introduces a concept called CodeProject — a special block that groups React component files and renders them in the browser. The tool has specific knowledge of:
- Writing to files using ```lang file="path/to/file" syntax
- Using kebab-case for filenames
- Including taskNameActive and taskNameComplete metadata for UI feedback
The prompt also covers how to use // ... existing code ... markers, so v0 applies the same “lazy edit” strategy as Cursor, specifically for React/Next.js.
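A consumer of this format would need to read the language and path out of each fence header and could check the kebab-case convention at the same time. The sketch below shows one way to do that; parse_file_fence and is_kebab_case are hypothetical helpers, and the regular expressions are assumptions about the header shape based only on the syntax quoted above.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Three backticks, a language tag, then file="...". Assumed shape, not a spec.
FENCE_RE = re.compile(r'^`{3}(?P<lang>\w+)\s+file="(?P<path>[^"]+)"\s*$')
KEBAB_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def parse_file_fence(line: str) -> Optional[tuple]:
    """Return (lang, path) for a file-annotated fence header, else None."""
    m = FENCE_RE.match(line)
    return (m.group("lang"), m.group("path")) if m else None


def is_kebab_case(path: str) -> bool:
    """Check that the filename (without extension) is kebab-case."""
    return KEBAB_RE.fullmatch(PurePosixPath(path).stem) is not None


# Build the header without writing a literal fence inside this example.
header = "`" * 3 + 'tsx file="components/login-form.tsx"'
parsed = parse_file_fence(header)
```

Here parsed holds the language tag and the target path, and is_kebab_case accepts components/login-form.tsx while rejecting components/LoginForm.tsx.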
🔍 Devin AI: Evidence-Based Software Engineering
Devin’s leaked prompt is philosophically different from the others. It’s designed as a code archaeology tool that answers questions about a codebase:
INSTRUCTIONS:
- DO NOT MAKE UP ANSWERS
- Cite EVERY SINGLE SENTENCE with <cite repo="..." path="..." start="..." end="..." />
- Citations should span at most 5 lines of code
- End every answer with a "Notes" section
Devin is explicitly instructed to be a skeptic: if it doesn’t know something, it says so. Every claim must be backed by file-level evidence with line numbers. That is unusual for an AI tool; the result is an assistant whose claims can each be checked against the source.
Devin’s prompt also specifies:
- Mermaid diagrams are supported, but without colors (“they make text hard to read”)
- Citations should never cover entire functions, only the salient lines
- The output language should match the user’s language
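Citation rules like these are easy to check mechanically. The sketch below extracts cite tags from an answer and flags any that span more than five lines. The helper name oversized_citations, the sample answer, and the repo and path values are all invented, and treating start and end as an inclusive range is an assumption.

```python
import re

# Attribute order follows the tag shape quoted in the instructions above.
CITE_RE = re.compile(
    r'<cite repo="([^"]*)" path="([^"]*)" start="(\d+)" end="(\d+)" />'
)


def oversized_citations(answer: str, max_lines: int = 5) -> list:
    """Return (path, start, end) for citations that break the span limit."""
    bad = []
    for _repo, path, start, end in CITE_RE.findall(answer):
        span = int(end) - int(start) + 1   # assumes an inclusive line range
        if span < 1 or span > max_lines:
            bad.append((path, int(start), int(end)))
    return bad


# Hypothetical answer text; repo and paths are made up.
answer = (
    'Retries are capped at three. '
    '<cite repo="acme/api" path="src/retry.py" start="10" end="12" /> '
    'The client wraps every request. '
    '<cite repo="acme/api" path="src/client.py" start="1" end="40" />'
)
violations = oversized_citations(answer)
```

The first citation covers three lines and passes; the second covers forty and is reported.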
🌊 Windsurf: Full TypeScript Tool API
Windsurf’s leaked Tools Wave 11.txt exposes its entire tool API as TypeScript type definitions. This is one of the most technically detailed leaks in the repository:
type capture_browser_screenshot = (_: {
  PageId: string;
  toolSummary?: string; // "2-5 word summary of what this tool is doing"
}) => any;
type codebase_search = (_: {
  Query: string;
  TargetDirectories: string[];
  toolSummary?: string;
}) => any;
type deploy_web_app = (_: {
  Framework: "nextjs" | "sveltekit" | "remix" | ...;
  ProjectId: string;
  ProjectPath: string;
  Subdomain: string;
}) => any;
The toolSummary parameter is the interesting part. In the excerpt above it appears as an optional field on capture_browser_screenshot and codebase_search, asking for a 2-5 word summary of what the tool call is doing. This is plausibly the source of the “Windsurf is doing X” status messages shown in the UI.
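If a client did turn toolSummary into a status message, the logic could be as small as the sketch below. This is speculation made concrete, not Windsurf's code: status_line is a hypothetical function, and the fallback text for calls without a usable summary is an assumption.

```python
def status_line(tool_name: str, args: dict) -> str:
    """Build a short UI status message from a tool call's arguments."""
    summary = (args.get("toolSummary") or "").strip()
    # Accept the summary only if it fits the 2-5 word guidance in the schema.
    if 2 <= len(summary.split()) <= 5:
        return f"Windsurf: {summary}"
    return f"Windsurf: running {tool_name}"


search_status = status_line(
    "codebase_search",
    {"Query": "auth", "TargetDirectories": ["src"], "toolSummary": "Searching auth code"},
)
deploy_status = status_line(
    "deploy_web_app",
    {"Framework": "nextjs", "ProjectId": "", "ProjectPath": ".", "Subdomain": "demo"},
)
```

The search call carries a three-word summary and uses it directly, while the deploy call has none and falls back to the tool name.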
🔬 Anthropic Claude: Multiple Persona Versions
The Anthropic folder contains multiple versions of Claude’s agent prompt across time:
- Agent Prompt v1.0.txt, v1.2.txt, 2.0.txt
- Sonnet 4.5 Prompt.txt
- Claude Code 2.0.txt
- Chat Prompt.txt
- Tools Wave 11.txt
This reveals how Claude’s instructions evolved as Anthropic expanded from a chat assistant to a full coding agent. The Agent Tools v1.0.json shows the original toolset, while Tools Wave 11.txt is significantly larger.
Part 4: The Resolution — What This Means for Developers
1. You Can Learn Prompt Engineering from the Best
These are production-grade system prompts written by teams at Anthropic, Cognition (Devin), Codeium (Windsurf), and Vercel. Reading them teaches you:
- How to define tool schemas that models actually follow
- How to structure agent loops for reliability
- How to constrain LLM behavior with explicit rules
- How to version prompts as products evolve
2. Security Warning for AI Startups
The repo includes a direct warning: if you’re building an AI product, your system prompt is a high-value attack surface. ZeroLeaks — linked in the repo’s README — offers prompt extraction audits.
If your system prompt contains API keys, internal URLs, or proprietary logic, prompt extraction exposes all of it.
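A cheap first defense is to scan a prompt for secret-looking strings before it ships. The sketch below is a rough illustration only: the three patterns are simplified assumptions, scan_prompt is a hypothetical helper, and the sample prompt and key are invented. It is no substitute for a real secret scanner or an extraction audit.

```python
import re

# Simplified, illustrative patterns; real scanners use far larger rule sets.
PATTERNS = {
    "api_key": re.compile(r"\b(?:sk|pk)-[A-Za-z0-9]{16,}\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "internal_url": re.compile(r"https?://\S*(?:\.internal|\.local|localhost)\S*"),
}


def scan_prompt(prompt: str) -> list:
    """Return the sorted labels of every pattern found in the prompt."""
    return sorted(label for label, rx in PATTERNS.items() if rx.search(prompt))


# Invented prompt with a fake key and a fake internal endpoint.
risky_prompt = (
    "You are HelpBot. Use key sk-abcdefghijklmnop1234 when calling "
    "https://billing.internal/v1 for refunds."
)
findings = scan_prompt(risky_prompt)
```

Anything the scan flags belongs in server-side configuration rather than in text the model can be coaxed into repeating.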
3. Building Better Agents
Key patterns from the best prompts in this repo:
Pattern 1: Separate planning from execution
# Bad: Let the model figure it out as it goes
# Good: Manus-style explicit pseudocode planning, then execute step by step
Pattern 2: One tool call per reasoning step
# Manus-style: iterate with ONE tool call at a time, observe the result, then decide the next step
Pattern 3: Cite your sources
# Devin-style: every claim backed by file + line number evidence
Pattern 4: Lazy edits, not rewrites
# Cursor/v0 style: only output changed code, use markers for unchanged sections
4. Compare Tool Philosophies
| Tool | Core Philosophy |
|---|---|
| Manus | Deliberate, event-driven, one-step-at-a-time |
| Cursor | Surgical precision, minimal output, code-first |
| Devin | Evidence-based, citation-driven, skeptical |
| v0 | Full-stack aware, component-centric, UI-first |
| Windsurf | Verbose tool API, status transparency |
Final Mental Model
System Prompt Anatomy (Universal Pattern)
├── Identity → "You are X, built by Y"
├── Capabilities → What tasks the agent can do
├── Tools → Typed API for taking actions
├── Rules → Constraints on behavior
├── Agent Loop → How to iterate towards a goal
└── Output Format → How to structure responses
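The six-part anatomy above maps naturally onto a small data structure. The sketch below assembles a prompt from those sections; the SystemPrompt class, its render layout, and the DemoBot content are all hypothetical and do not reproduce any prompt in the repository.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SystemPrompt:
    identity: str
    capabilities: List[str]
    tools: List[str]
    rules: List[str]
    agent_loop: List[str]
    output_format: str

    def render(self) -> str:
        """Join the six sections into one prompt string."""
        def bullets(items: List[str]) -> str:
            return "\n".join(f"- {item}" for item in items)

        steps = "\n".join(f"{n}. {s}" for n, s in enumerate(self.agent_loop, 1))
        parts = [
            self.identity,
            "Capabilities:\n" + bullets(self.capabilities),
            "Tools:\n" + bullets(self.tools),
            "Rules:\n" + bullets(self.rules),
            "Agent loop:\n" + steps,
            "Output format:\n" + self.output_format,
        ]
        return "\n\n".join(parts)


# Invented example content.
prompt = SystemPrompt(
    identity="You are DemoBot, built by Example Labs.",
    capabilities=["Answer questions about a codebase"],
    tools=["search(query)", "read_file(path)"],
    rules=["Do not make up answers"],
    agent_loop=["Analyze the request", "Call one tool", "Observe and repeat"],
    output_format="Plain prose with file citations.",
)
text = prompt.render()
```

Keeping the sections as separate fields also makes versioning simpler, since a change to the rules shows up as a change to one field rather than an edit buried in a long string.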
The x1xhlol/system-prompts-and-models-of-ai-tools repository is more than a curiosity — it’s a reference architecture for how the AI industry is building the next generation of software agents. Whether you’re building your own AI product or just want to understand why your coding assistant behaves the way it does, this repo is an invaluable window into the black box.
⭐ The repo has 30,000+ lines of raw prompts. Drop a star if you find it useful.