ARBuilder — Final Report
Project: ARBuilder — AI-Powered Development Assistant for the Arbitrum Ecosystem
Team: Quantum3 Labs
Grant Program: Arbitrum Foundation Developer Tooling
Period: January – March 2026
Milestones Completed: 5/5
1. Overview
ARBuilder is an open-source MCP (Model Context Protocol) server that provides AI-powered code generation for the Arbitrum ecosystem. It transforms natural language prompts into production-ready code for:
- Stylus smart contracts (Rust/WASM) — SDK 0.10.0
- Cross-chain bridging (ETH/ERC20 across L1, L2, L3)
- Full-stack dApps (NestJS + Next.js + The Graph + Chainlink)
- Orbit chain deployment (config, rollup, validators, node setup)
The tool uses a RAG (Retrieval-Augmented Generation) pipeline with hybrid BM25 + vector search over 40+ curated Arbitrum repositories, ensuring generated code follows current SDK patterns rather than relying on outdated LLM training data.
19 MCP tools work inside Cursor, VS Code, and Claude Desktop — or via the hosted playground at arbuilder.app.
2. Deliverables & Milestone Completion
All 5 milestones have been completed and merged to the main branch.
| # | Milestone | Description | Status | PRs |
|---|---|---|---|---|
| M1 | Proposal + PoC Improvement | Updated proposal, improved PoC, documentation | Complete | - |
| M2 | Stylus + SDK RAG | Stylus code generation (6 tools), Arbitrum SDK bridging (3 tools), MCP integration, tutorial video | Complete | #4, #5, #6, #7 |
| M3 | Full dApp Builder | Backend, frontend, indexer, oracle, orchestration (5 tools), hybrid RAG, ABI extraction | Complete | #9, #10, #11, #12, #13 |
| M4 | Orbit Integration | Chain config, rollup deployment, validator setup, Q&A, orchestration (5 tools) | Complete | #14 |
| M5 | Final Report + Metrics | ARBuilder v1.0, final report, KPIs | Complete | #15, #16, #17, #19 |
Key Technical Deliverables
| Component | Details |
|---|---|
| MCP Tools | 19 tools across 4 modules (M1: 6, M2: 3, M3: 5, M4: 5) |
| Templates | 7 Stylus contract templates + 9 Orbit deployment templates |
| RAG Pipeline | Hybrid BM25 + vector search (BGE-M3, 1024 dimensions) + cross-encoder reranking |
| Code Validation | Docker-based cargo check with up to 3 auto-fix attempts, 6-phase fix architecture |
| Hosted Service | Cloudflare Workers (Workers AI, Vectorize, D1, KV, Queue) |
| Test Suite | 451 unit/integration tests including TypeScript compilation and bash syntax validation |
| Ingestion Pipeline | Worker-native cron (every 6 hours): scrape, chunk, embed, upsert |
3. Developer Adoption & Usage Metrics
3.1 Platform Usage
Data source: Cloudflare D1 usage_logs (MCP endpoint)
| Metric | Value |
|---|---|
| Total code generation requests | 1,921 |
| Total LLM tokens consumed | 5,340,565 |
| Overall success rate | 99.7% (1,915/1,921) |
| Average latency | 15.6 seconds |
| Peak daily usage | 333 requests (March 5, 2026) |
| Active tracking period | Feb 24 – Mar 8, 2026 |
3.2 Requests by Tool
| Tool | Requests | Category |
|---|---|---|
| generate_stylus_code | 488 | M1: Stylus |
| generate_bridge_code | 327 | M2: SDK |
| generate_messaging_code | 183 | M2: SDK |
| ask_bridging | 183 | M2: SDK |
| orchestrate_dapp | 94 | M3: dApp |
| generate_indexer | 84 | M3: dApp |
| generate_tests | 69 | M1: Stylus |
| ask_stylus | 68 | M1: Stylus |
| generate_frontend | 60 | M3: dApp |
| generate_backend | 57 | M3: dApp |
| get_workflow | 51 | M1: Stylus |
| ask_orbit | 44 | M4: Orbit |
| generate_orbit_deployment | 42 | M4: Orbit |
| get_stylus_context | 38 | M1: Stylus |
| generate_oracle | 35 | M3: dApp |
| generate_validator_setup | 33 | M4: Orbit |
| generate_orbit_config | 33 | M4: Orbit |
| orchestrate_orbit | 32 | M4: Orbit |
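As a cross-check, the per-tool counts can be summed and compared against the totals in Section 3.1. The sketch below copies the 18 rows listed in the table above and derives the per-module totals, the success rate, and the multiple of the ≥100 request target from the KPI table. The variable names are illustrative and not part of ARBuilder.

```python
from collections import defaultdict

# (tool, requests, category) rows copied from the table in Section 3.2
rows = [
    ("generate_stylus_code", 488, "M1: Stylus"),
    ("generate_bridge_code", 327, "M2: SDK"),
    ("generate_messaging_code", 183, "M2: SDK"),
    ("ask_bridging", 183, "M2: SDK"),
    ("orchestrate_dapp", 94, "M3: dApp"),
    ("generate_indexer", 84, "M3: dApp"),
    ("generate_tests", 69, "M1: Stylus"),
    ("ask_stylus", 68, "M1: Stylus"),
    ("generate_frontend", 60, "M3: dApp"),
    ("generate_backend", 57, "M3: dApp"),
    ("get_workflow", 51, "M1: Stylus"),
    ("ask_orbit", 44, "M4: Orbit"),
    ("generate_orbit_deployment", 42, "M4: Orbit"),
    ("get_stylus_context", 38, "M1: Stylus"),
    ("generate_oracle", 35, "M3: dApp"),
    ("generate_validator_setup", 33, "M4: Orbit"),
    ("generate_orbit_config", 33, "M4: Orbit"),
    ("orchestrate_orbit", 32, "M4: Orbit"),
]

# Grand total across all listed tools
total = sum(count for _, count, _ in rows)

# Requests grouped by module
by_category = defaultdict(int)
for _, count, category in rows:
    by_category[category] += count

# Success rate from the 1,915 successful requests reported in Section 3.1
success_rate = round(100 * 1915 / total, 1)

# Multiple of the >=100 request target from the KPI table
target_multiple = round(total / 100, 1)

print(total, dict(by_category), success_rate, target_multiple)
```

The rows sum to 1,921, which matches the reported total. By module, that is 714 requests for Stylus, 693 for SDK, 330 for dApp, and 184 for Orbit.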
3.3 Developer Reach
| Channel | Details |
|---|---|
| Registered users (arbuilder.app) | 5 accounts with API keys |
| External developers with mainnet deployments | 5 unique developers |
| GitHub maintainers | 4 (wms2537, DiegoFloresWenHao, macoloye, ngjupeng) |
| GitHub stars | 21 |
| YouTube videos | 3 tutorials published |
| Glama quality score | AAA (highest tier) |
4. Mainnet Deployments — Case Studies
Five projects built with ARBuilder have been independently deployed to Arbitrum One mainnet by external developers:
| # | Project | Developer | Contract Address | Description |
|---|---|---|---|---|
| 1 | Arbitrum TipJar | jaysun15 | 0xba39c706dc01a10c974d5dc355e0725def232e2e | On-chain tipping |
| 2 | ArbiDrop | Jalaa532 | 0x27ccb5c8da8ae0e2c8b349a0a7d14627429c9569 | Token airdrop distribution |
| 3 | ArbTokenLaunch | benedicther | 0x1Da2ce71c789E43775c374920b345998B1F72abC | Token launch platform |
| 4 | Credential Revocation Registry | ajeelll | 0x5c31858bd9769cfc29dc36f0cb40bc90cdff321a | Verifiable credential registry |
| 5 | DisasterReliefLedger | samdasak | 0xff5a49998772f123c26333fa3a033bdb9cd1fdc7 | Disaster relief fund tracking |
Testnet projects (Arbitrum Sepolia):
- test-erc20-stylus — ERC20 token on Stylus SDK 0.10.0
- arcadedao-orbit — Orbit chain deployment
- prediction-market-dapp — Full-stack prediction market
5. KPI Achievement
| KPI | Target | Achieved | Evidence |
|---|---|---|---|
| Unique developers | ≥10 | 12 | Registered users + GitHub contributors + mainnet deployers (deduplicated) |
| Code generation requests | ≥100 | 1,921 (19.2x target) | Cloudflare D1 usage_logs, tracked Feb 24 – Mar 8, 2026 (schema) |
| GitHub stars | ≥20 | 21 | GitHub |
| Community contributions | ≥5 | 8 issues, 18 merged PRs | Issues / PRs |
| Case studies | ≥3 | 5 | See Section 4 (all with on-chain contract addresses) |
| Projects deployed to mainnet | ≥5 | 5 | See Section 4 (verified on Arbitrum One) |
6. Community Engagement & Distribution
6.1 Open Source
| Item | Details |
|---|---|
| License | MIT |
| README | Comprehensive with architecture, quick start, 19-tool reference |
| CONTRIBUTING.md | Development guide, code style, PR process |
| Issue templates | Bug report + feature request |
| Good first issues | 7 open issues labeled for new contributors |
6.2 Directory Listings
6.3 Content & Tutorials
| Content | Link |
|---|---|
| Setup & usage walkthrough | YouTube |
| Prediction market dApp demo | YouTube |
| L3 Orbit chain demo | YouTube |
6.4 Website & SEO
arbuilder.app includes:
- Landing page with all 4 milestone feature sections
- Interactive playground with all 19 tools
- Transparency page (data sources, pipeline status)
- SEO: JSON-LD structured data, Open Graph, Twitter cards, robots.txt, sitemap.xml
- LLM discovery endpoint (/llms.txt) for AI search engines
7. Technical Architecture
RAG Pipeline
Query → BM25 + Vector Search (BGE-M3) → RRF Fusion → Cross-Encoder Reranking → LLM Generation
- Hybrid search over 40+ curated Arbitrum repos and docs
- Cross-encoder reranking for precision
- 65 Stylus compilation rules in system prompts
- Compile verification via Docker `cargo check` with up to 3 auto-fix attempts
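The RRF Fusion step in the pipeline above merges the BM25 and vector rankings into one list. A minimal sketch of Reciprocal Rank Fusion follows, assuming the common smoothing constant k = 60. The function name, document IDs, and sample rankings are hypothetical and are not taken from the ARBuilder codebase.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is a commonly used default (an assumption here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for deterministic output
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical retrieval results for one query
bm25_hits = ["sdk_api_ref", "stylus_erc20", "orbit_docs"]        # exact-term matches
vector_hits = ["stylus_erc20", "stylus_storage", "sdk_api_ref"]  # semantic matches

fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, so a chunk with an exact API signature match and a semantically similar chunk both reach the reranker.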
Code Quality Pipeline
6-phase fix architecture (Python + TypeScript, kept in sync):
- Structural corrections (attributes, extern crate)
- Import resolution (alloc, alloy_sol_types)
- Storage accessor patterns (getter/setter)
- API migration (SDK 0.9.x → 0.10.0)
- Type corrections (B256, U256, as_u64)
- Output sanitization (garbled LLM output)
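The interaction between compile verification and the fix phases can be sketched as a bounded retry loop. This is a simplified illustration, not the project's implementation: `stub_check` stands in for the Docker-based `cargo check` step, and the single fixer is a hypothetical string rewrite for the `self.vm()` migration.

```python
from typing import Callable, List, Tuple

Check = Callable[[str], List[str]]  # returns a list of error messages
Fixer = Callable[[str], str]        # returns rewritten source code

def verify_with_autofix(code: str, check: Check, fixers: List[Fixer],
                        max_attempts: int = 3) -> Tuple[str, bool, int]:
    """Run check; while errors remain, apply all fixers in order and re-check.

    Returns (final_code, passed, attempts_used).
    """
    errors = check(code)
    attempts = 0
    while errors and attempts < max_attempts:
        for fix in fixers:  # each fixer models one fix phase
            code = fix(code)
        attempts += 1
        errors = check(code)
    return code, not errors, attempts

def stub_check(code: str) -> List[str]:
    # Stand-in for `cargo check`: flags two legacy call patterns (assumption)
    legacy = ("msg::sender()", "block::timestamp()")
    return [f"legacy call: {p}" for p in legacy if p in code]

def migrate_msg_sender(code: str) -> str:
    # Hypothetical API-migration rule for the self.vm() accessor style
    return code.replace("msg::sender()", "self.vm().msg_sender()")

fixed, ok, attempts = verify_with_autofix(
    "let caller = msg::sender();", stub_check, [migrate_msg_sender])
print(fixed, ok, attempts)

# A pattern with no matching fixer exhausts the attempt budget
_, ok_unfixed, attempts_unfixed = verify_with_autofix(
    "let t = block::timestamp();", stub_check, [migrate_msg_sender])
print(ok_unfixed, attempts_unfixed)
```

Capping the attempts keeps latency bounded when a generated contract contains an error that none of the phases can repair.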
Deployment
| Mode | Stack |
|---|---|
| Self-hosted | Python MCP server via stdio → Cursor / VS Code / Claude Desktop |
| Hosted | Cloudflare Workers, Workers AI, Vectorize, D1, KV, Queue |
8. Key Learnings
- RAG outperforms fine-tuning for fast-moving SDKs. Stylus SDK went from 0.6 → 0.10.0 in months. RAG allows knowledge base updates in hours, not weeks of retraining.
- Compile verification is non-negotiable. The `cargo check` + auto-fix loop catches errors that static analysis cannot, especially Stylus-specific patterns like `self.vm()` migration.
- Hybrid search matters. BM25 alone misses semantic similarity; vector search alone misses exact API signatures. The combination with cross-encoder reranking significantly improves code quality.
- Template + LLM hybrid works best. Templates provide reliable scaffolding (100% success rate); LLM fills in custom logic. This gives consistency for common patterns while maintaining flexibility.
- Testing catches real bugs. Our TypeScript compilation tests caught a template substitution ordering bug where `CHAIN_ID_PLACEHOLDER` was being replaced before `PARENT_CHAIN_ID_PLACEHOLDER`, corrupting parent chain IDs in generated code.
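The substitution ordering bug from the last point is easy to reproduce, because `CHAIN_ID_PLACEHOLDER` is a substring of `PARENT_CHAIN_ID_PLACEHOLDER`. The sketch below shows the failure and one possible remedy, replacing longer placeholders first. The report does not say how the fix was actually implemented, and the template string and chain ID values are illustrative.

```python
def substitute_naive(template: str, values: dict) -> str:
    # Replaces in dict order; a short key that is a substring of a
    # longer key rewrites part of the longer key before it is handled
    for key, value in values.items():
        template = template.replace(key, value)
    return template

def substitute_longest_first(template: str, values: dict) -> str:
    # Replacing longer keys first prevents a shorter key from
    # matching inside a longer one
    for key in sorted(values, key=len, reverse=True):
        template = template.replace(key, values[key])
    return template

template = "chainId: CHAIN_ID_PLACEHOLDER, parentChainId: PARENT_CHAIN_ID_PLACEHOLDER"
values = {
    "CHAIN_ID_PLACEHOLDER": "412346",         # illustrative child chain ID
    "PARENT_CHAIN_ID_PLACEHOLDER": "421614",  # illustrative parent chain ID
}

broken = substitute_naive(template, values)
correct = substitute_longest_first(template, values)
print(broken)
print(correct)
```

With the naive order, the parent field becomes `PARENT_412346` instead of `421614`. A generated-code compilation test catches this kind of corruption because the result is no longer a valid numeric literal.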
9. Repository Statistics
| Metric | Value |
|---|---|
| Total commits | 218 |
| Lines of code (Python + TypeScript) | 56,700+ |
| Test cases | 451 |
| Pull requests merged | 18 |
| Issues created | 8 (7 open “good first issue”) |
| Data sources in knowledge base | 40+ |
10. Future Plans
- Continue maintaining the knowledge base as Stylus SDK evolves
- Respond to community issues and contributions via the 7 open “good first issue” items
- Monitor and update directory listings (awesome-mcp-servers, awesome-stylus, mcp.so)
- Explore additional Arbitrum ecosystem integrations based on community feedback
- Track mainnet deployments from new users
ARBuilder is open-source under the MIT license. Built by Quantum3 Labs for the Arbitrum ecosystem.