Test Forge

Forging bulletproof test cases through decentralized AI and mutation-verified unit tests. Proven code quality, rewarded by the network.

Tech stack

Web3

Description

# TestForge — Project Description

---

## 🎯 One-Liner

TestForge is a Bittensor subnet where AI miners compete to generate battle-tested unit tests, verified through mutation testing to ensure tests actually catch bugs.

---

## 📋 Executive Summary

| | |
|---|---|
| Problem | 70% of open-source code has zero tests. Untested code kills people (Therac-25), crashes markets (Knight Capital $440M), and breaks the internet (Cloudflare, Log4j). |
| Solution | Decentralized AI competition to generate high-quality unit tests with cryptographic proof of usefulness. |
| Innovation | Three-gate verification with mutation testing: the only ungameable way to prove a test actually works. |
| Why Bittensor | Test outcomes are binary and verifiable. Perfect fit for incentive-driven competition. Best AI wins, bad AI earns nothing. |

---

## 🔬 How It Works

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   CODE IN   │────▶│  64 MINERS  │────▶│   3-GATE    │────▶│  BEST TEST  │
│             │     │   COMPETE   │     │  VALIDATOR  │     │  WINS TAO   │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
```

| Gate | Question | Method | Threshold |
|------|----------|--------|-----------|
| Gate 1 | Do tests run? | pytest execution | 0 failures |
| Gate 2 | Do tests cover code? | coverage.py | ≥80% lines |
| Gate 3 | Do tests catch bugs? | Mutation injection | ≥60% kills |

Score = (G1 + G2 + G3) / 3 → Rewards distributed proportionally
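
The gate thresholds and the proportional payout can be sketched in a few lines of Python. This is a minimal illustration, not the TestForge implementation: the `Measurement` container and the function names `gate_results` and `proportional_shares` are hypothetical, and the sketch takes each miner's final score as given rather than assuming how gate values are combined.

```python
from dataclasses import dataclass

# Thresholds copied from the gate table above.
MIN_LINE_COVERAGE = 0.80
MIN_KILL_RATE = 0.60


@dataclass
class Measurement:
    """Hypothetical record of what a validator measured for one submission."""
    failures: int         # failing tests reported by the test run
    line_coverage: float  # fraction of lines covered, 0.0 to 1.0
    kill_rate: float      # fraction of injected mutants killed, 0.0 to 1.0


def gate_results(m: Measurement) -> tuple[bool, bool, bool]:
    """Return pass/fail for Gate 1, Gate 2 and Gate 3, in that order."""
    return (
        m.failures == 0,
        m.line_coverage >= MIN_LINE_COVERAGE,
        m.kill_rate >= MIN_KILL_RATE,
    )


def proportional_shares(scores: dict[str, float]) -> dict[str, float]:
    """Turn per-miner scores into reward shares that sum to 1."""
    total = sum(scores.values())
    if total == 0:
        # Nobody scored: nobody is rewarded.
        return {miner: 0.0 for miner in scores}
    return {miner: score / total for miner, score in scores.items()}


gates = gate_results(Measurement(failures=0, line_coverage=0.85, kill_rate=0.50))
shares = proportional_shares({"miner_a": 1.0, "miner_b": 0.5, "miner_c": 0.0})
```

In this example the submission clears Gates 1 and 2 but misses the 60% kill threshold, and `miner_a` receives twice the share of `miner_b`.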

---

## 🛡️ Why Mutation Testing is Ungameable

The Cheat (without mutation testing):

```python
def test_fake():
    my_function("x")  # No assertion: always passes, 100% coverage
```

Why It Fails Gate 3:

```python
# Original: return a + b
# Mutant:   return a - b   ← AI injects this bug
test_fake()  # Still passes! Mutation SURVIVES.
# Kill rate: 0% → Gate 3 FAILS → Score: 0
```

You cannot fake catching a bug. The mutation either dies or it doesn't.
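
To make the idea concrete, here is a self-contained sketch of a single mutation being injected and checked against a weak and a strong test. It is an illustration only, not the TestForge mutation engine: the `AddToSub` operator, the `load_add` and `is_killed` helpers, and both test functions are hypothetical names, and the mutation is performed with Python's standard `ast` module.

```python
import ast

SOURCE = "def add(a, b):\n    return a + b\n"


class AddToSub(ast.NodeTransformer):
    """Hypothetical mutation operator: rewrite every `+` as `-`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node


def load_add(tree):
    """Compile a module AST and return the `add` function it defines."""
    namespace = {}
    exec(compile(tree, "<generated>", "exec"), namespace)
    return namespace["add"]


def weak_test(add):
    add(2, 3)  # Executes the code but asserts nothing


def strong_test(add):
    assert add(2, 3) == 5


def is_killed(test, candidate):
    """A mutant is killed when the test fails against it."""
    try:
        test(candidate)
    except AssertionError:
        return True
    return False


original = load_add(ast.parse(SOURCE))
mutant_tree = ast.fix_missing_locations(AddToSub().visit(ast.parse(SOURCE)))
mutant = load_add(mutant_tree)

weak_killed = is_killed(weak_test, mutant)      # mutant survives
strong_killed = is_killed(strong_test, mutant)  # mutant is killed
```

Both tests pass against the original function, so both would clear a run-only check. Only the test with a real assertion fails against the mutant, which is what a kill-rate threshold measures.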

---

## 📊 Key Metrics

| Metric | Value |
|--------|-------|
| Prototype Status | ✅ Complete |
| Test Suite | 47/47 passing |
| Simulation | 5 miners × 10 epochs working |
| API | FastAPI server ready |
| Benchmarks | SWE-bench, GitBug-Java, Defects4J |

---

## 🏗️ Architecture

```
TESTFORGE SUBNET
├── Task Generator ──────▶ Pulls from real-world benchmarks
├── Miner Pool ──────────▶ LLM agents generate pytest files
├── Validator Pipeline ──▶ 3-gate verification + mutation engine
├── Score Engine ────────▶ Proportional reward distribution
└── Bittensor Chain ─────▶ On-chain weight updates
```

---

## 💰 Economics

| Role | Emission Share | Incentive |
|------|----------------|-----------|
| Miners | 41% | Generate better tests → earn more |
| Validators | 41% | Run honest verification → earn stake |
| Subnet Owner | 18% | Grow the network → grow value |
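
As a quick arithmetic check, the split above can be applied to any emission amount. The sketch below is illustrative only: `split_emission` is a hypothetical helper, and the amount is a plain number in whatever unit the emission is denominated in.

```python
# Emission shares by role, in percent, copied from the table above.
EMISSION_SHARES = {"miners": 41, "validators": 41, "subnet_owner": 18}


def split_emission(total: float) -> dict[str, float]:
    """Split one emission amount across roles according to EMISSION_SHARES."""
    if sum(EMISSION_SHARES.values()) != 100:
        raise ValueError("emission shares must add up to 100%")
    return {role: total * pct / 100 for role, pct in EMISSION_SHARES.items()}


payout = split_emission(1.0)
```

For a unit emission, miners and validators each receive 0.41 and the subnet owner 0.18.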

---

## 🚀 Go-To-Market

| Phase | Timeline | Target | Goal |
|-------|----------|--------|------|
| 1 | Months 1-6 | Bittensor devs | 10 miners, 3 validators |
| 2 | Months 6-12 | Open source repos | GitHub Action, 100 repos |
| 3 | Year 2 | Enterprise | Self-hosted, $100k contracts |

---

## 🏆 Competitive Advantage

| vs Copilot/ChatGPT | vs Other Subnets |
|--------------------|------------------|
| ✅ Verified quality | ✅ Binary scoring (not subjective) |
| ✅ Mutation-tested | ✅ Focused on ONE task |
| ✅ Decentralized | ✅ Real benchmarks |
| ✅ Continuously improving | ✅ Ungameable mechanism |

---

## 📦 Deliverables

| Artifact | Status | Link |
|----------|--------|------|
| GitHub Repo | ✅ Live | [Test-Forge](https://github.com/manjeetsharma0796/Test-Forge) |
| Core Engine | ✅ Complete | `testforge`, 8 modules |
| Validation Pipeline | ✅ Complete | 3-gate system |
| Mutation Engine | ✅ Complete | 6 mutation operators |
| Tests | ✅ 47 passing | `/tests/` |
| Simulation | ✅ Working | `python main.py simulate` |
| Documentation | ✅ Complete | `/docs/`, 5 guides |
| API Server | ✅ Ready | `python main.py serve` |

---

## ⚡ Quick Start

```bash
git clone https://github.com/manjeetsharma0796/Test-Forge.git
cd Test-Forge
pip install -r requirements.txt
python -m pytest tests/ -v   # 47 passed ✅
python main.py simulate      # Watch miners compete
```

---

## 🎯 The Ask

Select TestForge for Round 2.

We will deploy a fully functional subnet on Bittensor testnet demonstrating:

- Working miner competition

- Live 3-gate validation

- Real TAO emissions to the best test generator

---

> "The best test is one that catches bugs before users do. TestForge makes AI prove it."

---

GitHub: [github.com/manjeetsharma0796/Test-Forge](https://github.com/manjeetsharma0796/Test-Forge)

Progress during this hackathon

Simulation built.
Ideation and top-level architecture ready.
Integration planned for the coming months.

Fundraising status

0

Team leader
Manjeet Sharma
Track
AI, Infra, Other, DeFi