The Decentralized AI Immune System




Talos Protocol is a Bittensor subnet that engineers a continuous "Purple Team" adversarial loop to secure enterprise AI. We crowdsource the discovery of Zero-Day jailbreaks and instantly forge the Nano-Shields to block them.
The Problem: The Latency of Centralized Security
As enterprise AI adoption explodes, "Shadow AI" and Zero-Day jailbreaks move exponentially faster than centralized security teams can patch them. When a novel exploit is discovered, centralized security firms take days to manually update their defenses. During this vulnerability gap, enterprises are exposed to massive data exfiltration and compliance failures.
The Solution: The Purple Team Engine
Talos splits its decentralized swarm into two competing factions in a continuous, zero-sum economic war:
The Red Team (Attackers): Miners scout infrastructure for exposed endpoints and use Reinforcement Learning to generate novel, semantic adversarial prompts that bypass target guardrails.
The Blue Team (Defenders): Miners ingest the live stream of successful Red Team attacks and rapidly train highly quantized "Nano-Shields" (e.g., DistilBERT classifiers) to block the exact attack pattern without flagging normal users.
How it Works (The Edge Architecture):
The byproduct of this continuous war is the Talos Shield SDK. Enterprise developers install this lightweight router directly into their backend.
Cloud API Mode: Prompts are scrubbed of PII locally, then securely routed to the Talos Validator Gateway for threat evaluation.
Edge Mode: For strict zero-trust compliance, the SDK silently downloads the Blue Team's latest Nano-Shield directly to the client's server for local, offline inference.
Fig. 1. The Talos Subnet Architecture: The Purple Team Engine.

Fig. 2. The Talos Validator Workflow: From Submission to Sandbox Scoring.

Business Impact:
Talos provides a complete Continuous Threat Exposure Management (CTEM) platform. We find the vulnerabilities before the hackers do, and we deploy the patch to the enterprise SDK in minutes—not days.
Resource Link:
Talos Shield Model: https://huggingface.co/rcavine/talos-guard-base/tree/main
Business Pitch Deck: https://www.canva.com/design/DAHBZpnx-JQ/RZHW0U1k3pmTWpdGGYH_JA/view?utm_content=DAHBZpnx-JQ&utm_campaign=designshare&utm_medium=link2&utm_source=uniquelinks&utlId=hef9a68fb22
Proposal: https://drive.google.com/file/d/1J2L7lE7iNHD2wCECR6yaeZHdNsMVsVR_/view?usp=sharing
X/Twitter Introductory Post: https://x.com/Kimchii73/status/2026677058777366608
All Resources (Proposal, Slides, Architecture Diagram): https://drive.google.com/drive/folders/1WrHwt2OQsV3dCsaDAx_K6vvpEVuC61wR?usp=sharing
We have deployed the subnet to testnet, but due to extreme high cost and instability from the bittensor testnet causes only 1 validator and 2 miners that works properly. That's why the current demo covers Talos Protocol in local network to prove that 3 validators and 10 miners works properly. Also addition, the basilica LLM inference are down due to security issue, so we use mockup data as the LLM responses, but the shield (the model that detects if the prompt contains jailbreak or not) is being trained live by blue miners. You can check the shield model in our huggingface: https://huggingface.co/rcavine/talos-guard-base/tree/main. Future works include SDK Python implementation so users can integrate our shield model properly in their LLM pipelines.