[ Arbitrum Orbit ] - In-Chain SQL Database for Arbitrum Orbit

Author: Dennison Bertram
Date: 28 Apr 2025


1 Problem Statement

Orbit roll-ups still rely on off-chain indexers (The Graph, Ponder, Postgres, etc.) for anything beyond basic eth_call. That extra stack complicates DevOps, widens the trust surface, and weakens Orbit’s “one-binary backend” story. We want contracts and backend code to write and query rich, relational data directly on the roll-up with no external infra.

The opportunity here is that the Orbit stack can serve as a complete backend infrastructure for running web apps. For developers, the friction of building web3 apps is still too high, especially web apps. Orbit has the opportunity to replace the entire backend infrastructure stack, letting us write frontend code in frameworks like Next.js that just works with the EVM backend. The key missing piece is the database. If we can build an in-chain SQL database, we can vastly simplify the infrastructure of building apps. The future I’m envisioning: as a developer I launch an Orbit app with frontend code, business logic in Solidity/Rust, and a database natively available. The killer feature is that my backend can then natively interact with money. That is a 10x improvement on today’s status quo, where I need third-party APIs to interact with money. With Orbit, I can custody money directly in the app in a secure, trustless way. This would super-charge the appchain thesis.

2 Goals

  • Expose a SQL-compatible interface (DuckDB or SQLite dialect) callable from Solidity, Stylus and JSON-RPC.
  • Keep every write on-chain, deterministic and fraud-provable.
  • Achieve < 50 ms read latency over local IPC and ≤ 300 k gas for common updates.
  • Deliver all of the above in one Nitro binary (validators run EVM + SQL, nothing else).

Explicitly out of scope: multi-roll-up distributed SQL, full Postgres feature parity, massive OLAP workloads.


3 Proposed Solution (High-Level)

  1. Static-compile DuckDB to WASM (wasm32-wasi, size-optimised).
  2. Wrap the WASM in a Stylus precompile deployed at a reserved address (e.g., 0x…0800).
    • query(bytes sql) → bytes (view function, returns Arrow IPC or JSON)
    • exec(bytes sql) → bool (state-changing)
  3. Implement a custom VFS mapping 4 kB DuckDB pages to EVM storage slots:
    slot = keccak256(dbId ‖ pageNo).
  4. Register the module root and gas schedule in rollupConfig; followers auto-load the WASM on start-up.
  5. Publish a thin Solidity facade (ISql) and a TypeScript client (SqlProvider) so dApps can call sql.query() and sql.exec() out of the box (a client sketch follows this list).
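
To make item 5 concrete, here is a minimal sketch of how a dApp might talk to the facade from TypeScript. Everything below is illustrative: the reserved address, the ABI shape, and the ethers v6 wiring are assumptions of this proposal, not shipped code.

    import { Contract, JsonRpcProvider } from "ethers";

    // Reserved precompile address from step 2 (illustrative value).
    const SQL_PRECOMPILE = "0x0000000000000000000000000000000000000800";

    // Facade ABI mirroring ISql: query is a view, exec mutates state.
    const ISQL_ABI = [
      "function query(bytes sql) view returns (bytes)",
      "function exec(bytes sql) returns (bool)",
    ];

    const provider = new JsonRpcProvider("http://localhost:8547"); // local Nitro follower
    const sql = new Contract(SQL_PRECOMPILE, ISQL_ABI, provider);

    // Reads go through eth_call and touch no storage slots.
    const raw: string = await sql.query(
      new TextEncoder().encode("SELECT count(*) FROM orders")
    );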

4 Architecture Sketch

Solidity / Node           Nitro Follower
┌──────────────┐ CALL     ┌──────────────────────┐
│ ISql.query() │ ───────▶ │ Stylus WASM Engine   │
└──────────────┘          │  • DuckDB runtime    │
        ▲ JSON-RPC / IPC  │  • VFS ↔ EVM storage │
        └─────────────────┴──────────────────────┘

Read-only queries touch zero storage slots; writes persist only the pages actually modified.
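
A sketch of the page-to-slot mapping from step 3, as an off-chain checker might reproduce it. The dbId/pageNo encodings and the 128-slot page layout are assumptions here, not a finalized spec:

    import { JsonRpcProvider, solidityPackedKeccak256 } from "ethers";

    const SQL_PRECOMPILE = "0x0000000000000000000000000000000000000800"; // as above
    const dbId = "0x" + "01".padStart(64, "0");                          // example bytes32 id

    // slot = keccak256(dbId ‖ pageNo); a 4 kB page then occupies 128
    // consecutive 32-byte slots starting at this base (assumed layout).
    function pageBaseSlot(id: string, pageNo: bigint): string {
      return solidityPackedKeccak256(["bytes32", "uint64"], [id, pageNo]);
    }

    // Any follower can read a page word back with eth_getStorageAt:
    const provider = new JsonRpcProvider("http://localhost:8547");
    const word0 = await provider.getStorage(SQL_PRECOMPILE, pageBaseSlot(dbId, 0n));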


5 Key Requirements & Acceptance Criteria

  • Determinism – two independent validators must compute identical query outputs when replaying a block.
  • Write cost – inserting a 4 kB row-set must stay below ~2.5 M gas in worst-case cold-storage scenarios (see the back-of-envelope check after this list).
  • Read latency – simple SELECT … WHERE id = ? over local IPC should complete in < 50 ms on commodity hardware.
  • Upgradability – chaining a new WASM root must preserve existing data and continue serving queries.
  • Failure isolation – malformed SQL must revert without corrupting on-chain pages.
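
Back-of-envelope check on the write-cost target, using mainnet-style storage costs (an Orbit chain can tune these, per the economics note in section 7):

    // A 4 kB page spans 128 32-byte slots. With EIP-2929 pricing, a cold
    // zero→nonzero SSTORE costs 22,100 gas (20,000 set + 2,100 cold access).
    const slotsPerPage = 4096 / 32;             // 128
    const coldSstore = 22_100;
    console.log(slotsPerPage * coldSstore);     // 2,828,800

At unmodified mainnet pricing a fully cold page write lands around 2.8 M gas, slightly above the ~2.5 M target, which is exactly why the tunable page-write gas schedule in section 7 matters.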

6 Example Developer Workflow

  1. Deploy a contract that imports the ISql interface.

  2. Run a migration transaction:

    sql.exec("CREATE TABLE orders(id INT PRIMARY KEY, maker TEXT, amt INT)");
    
  3. Insert or update rows inside regular business-logic transactions.
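
    For example, sent as a regular transaction through the same client used in step 4 (whether exec supports ?-style parameter binding is an assumed detail):

    await sql.exec(
      "INSERT INTO orders(id, maker, amt) VALUES (?, ?, ?)",
      orderId, makerAddress, amount
    );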

  4. From the frontend (or another contract) call:

    const rows = await sql.query(
      "SELECT * FROM orders WHERE maker = ?", makerAddress
    );
    

    The call returns Arrow or JSON—no external indexer required.


7 Benefits

  • Zero off-chain indexer – the roll-up itself stores and serves application data.
  • Deterministic, fraud-provable SQL – queries run inside Stylus under Nitro’s dispute protocol.
  • Simplified DevOps – validators run one binary; backend services speak standard JSON-RPC or IPC.
  • Rich analytics – DuckDB supports joins, aggregates, window functions and more, enabling dashboards and leaderboards without Postgres.
  • Tunable economics – Orbit operators can align page-write gas with actual calldata costs.

8 Open Questions

  • Engine choice – DuckDB (analytics, columnar) versus SQLite (smaller binary).
  • Page size – 2 kB versus 4 kB trade-off between worst-case gas and throughput.
  • Return format – Arrow stream or plain JSON as the default wire format (see the decoding sketch below).
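
Whichever format wins, client-side decoding stays small. A sketch assuming the Arrow path uses standard Arrow IPC framing (decodable with the apache-arrow package) and the JSON path is plain UTF-8:

    import { tableFromIPC } from "apache-arrow";

    // raw: the bytes returned by sql.query(...) in section 6.
    function decodeRows(raw: Uint8Array, format: "arrow" | "json") {
      if (format === "arrow") {
        return tableFromIPC(raw);                         // columnar Arrow table
      }
      return JSON.parse(new TextDecoder().decode(raw));   // plain row objects
    }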

Thanks for putting this up, dennison.

There is a lot to unpack here, but before we dive into all the technicalities of this proposal, we suggest you add it to the Proposals category and rename it to indicate that it is a Constitutional AIP, as implementation would bring modifications to the code stack.


The mad scientist aspect of this idea fascinates me, as does the idea of doing crazy things on an Orbit chain… there’s really no reason not to, as long as it doesn’t affect the rest of the ecosystem!

Would it be performant? Would it be useful? I’m not sure. Would it be cool? Yes.

I am admittedly naive about the exact implementation changes that would be necessary, but ideally it would first be implemented as a fork of the codebase, one that adds additional client functionality but works without changing any of the on-chain specifications or functionality. That would entirely avoid making it Constitutional, which would otherwise most likely be a murderously high bar to pass.

Later, if the concept proves itself useful and popular, the changes could then be put up to be merged into the main branch of the client in a separate proposal.


Looks like this was posted twice, haha, we have moved it to the Technical Discussion category 🙂 and deleted the redundant topic. Hopefully this will help more people see it and spur further technical discussion!


Should this be in the Proposals category or Technical Discussion? I’m a little unclear on where this should go. (See the advice above.)


I think really we would want @offchainlabs to opine on this, because I imagine they would be the ones who would want to build it. So I’d like to ask for their input. As a builder it would be incredibly useful for me. The idea that I have to build a parallel indexing solution on top of my codebase is so frustrating.


We think technical discussion is appropriate for now as there is nothing yet to vote on. If this changes after a discussion, then we’d recommend a fresh post in the Proposal section with key items to vote on.


Sounds good!

Also I realized that I’m the first person to post in technical discussion! What an honor. 🙂

I don’t understand this at all. We just moved to permissionless validation on Arbitrum (anyone can now run a node and verify the chain). Wouldn’t adding a network db like that prevent this?

You lose determinism with this kind of db; that’s why Arbitrum is a blockchain and not a spreadsheet.

“the roll-up itself stores and serves application data.”
The roll-up itself already stores and serves app data.

I really don’t understand the motivation. What are you trying to build where existing tools don’t work or are too inconvenient?

Mud.dev has already created an on-chain SQLite-like db implemented in Solidity. It’s really cool and it might give you what you need.

You’re comparing a db to making individual RPC calls to get every piece of data. People don’t really do that much for complex applications anymore. Have you tried GraphQL or subgraphs? They work and are heavily used and standardized in web2 development.


Something feels very anti-pattern about this. Can a demo be built for people to test out? That would also help check that the engineering details are ironed out, because it feels a bit over-optimistic.

For what it is worth, I think this is cool.

No, that’s the beauty of this idea: it’s a SQLite database, running in Rust, inside the chain, just as part of the stack. This would be entirely deterministic: any mutations have to be made from on-chain calls via smart contracts.

I really don’t understand the motivation. What are you trying to build where existing tools don’t work or are too inconvenient?

Think of it like this. Today you build a smart contract. It emits events. Then you have to build an indexer that reads these events and formats them; then you have to build a db that processes these events so that you can read them or do any sort of time-series calls from a front end; then you have to keep this in sync with the chain. Now you have two sources of truth, and you have to reconcile them regularly to be sure you haven’t missed something. (Oh, did your event get triggered deeply nested in some other call? Hope you didn’t miss that one!)

That’s a terrible experience compared to building on Web2, where you just build your DB and you’re done.

I’ll def check out Mud.dev, but in the past my issue has been that purely Solidity-implemented DBs haven’t been very performant or easy to use.

I’m proposing a hybrid here: we implement SQLite in Rust (maybe as a Rust smart contract?) so it’s actually in the chain (or on-chain) and thus entirely deterministic; each node syncs it like it would any other state. Ideally we also expose a DB API that mutates via smart-contract calls and reads via some special direct connection for performance (rough sketch below).
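
Roughly the API shape I have in mind, with every name below made up for illustration:

    // Illustrative only: the shape of the hybrid API, not real code.
    type Row = Record<string, unknown>;
    type TxReceipt = { blockNumber: number; status: number };

    interface OrbitDb {
      // Writes go through a smart-contract call, so they are part of consensus.
      exec(sql: string, ...params: unknown[]): Promise<TxReceipt>;
      // Reads hit the node's local engine directly (IPC), no transaction needed.
      query(sql: string, ...params: unknown[]): Promise<Row[]>;
    }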

My high level point though is this: Web3 development sucks compared to Web2. It’s hard, it’s unreliable and it’s expensive. Yes: reading data from the blockchain is expensive. The RPC providers, love them to death, are orders of magnitude more expensive than Supabase or AWS. I’m not kidding when I say I’ve spent maybe half a million dollars on RPC calls.

Something feels very anti-pattern about this. Can a demo be built for people to test out?

I think it’s just anti-pattern in the sense that in web3 we value the engineering coolness of our tech over the accessibility and usability of it. Personally I think the bigger anti-pattern is that every Web3 app has multiple sources of truth. Every team I know fields support issues from customers who are like, “your information on your website doesn’t match the chain”. That comes from how hard it is to index.

I’m not sure, I assume yes, but it’s not a small lift. I think at the moment I would want to see if @offchainlabs could opine about the feasibility. Everything is “possible” but maybe it really isn’t practical for some reason I’m unaware of.

But maybe the first step would be to take a Rust implementation of SQLite, try to deploy it on Orbit as a smart contract, and then see if we can read and write to it.

This is a very interesting proposal.

Indeed, this is not the first time I’ve seen such an idea, and implementing it would really solve many problems with application development.
However:

  1. If we do this (which is difficult), it will only work within the Orbit infrastructure. On the one hand that is an advantage for Arbitrum, but on the other it sets Orbit developers apart from everyone else, which may limit the idea’s popularity among developers.
    Developers would have to learn a stack that exists only for building applications on Orbit.
  2. If this is such a popular idea, why has no one implemented it yet? I think the real question is not so much implementation complexity as the cost of executing queries on a blockchain. Imagine a simple SELECT over all transactions from the last day: that already requires many operations, and it is one of the simpler cases. A more complex query with multiple conditions would have to scan all the data, since a blockchain has no data indexing the way a DB does. So such an operation could cost the user several times more than the current approach.
    Yes, everything becomes much simpler for developers, but expensive for users.

Ethereum clients literally already have databases. Geth (go-ethereum) uses LevelDB, for example. Besu uses RocksDB. I believe Orbit also uses RocksDB. It already works pretty much the way you’re describing.

This doesn’t make any sense. How do you call an API? A node for an Orbit chain is not a web server, so you need something to pass data from the blockchain network to the web endpoint. Arbitrum as a blockchain does not have an IP address that you can make API calls to. You can’t query it without some kind of CALL to the REMOTE blockchain in a PROCEDURAL way that an API can understand (R-P-C). So you end up with the same dev experience.

modern Web2 development:

  1. build your db with some queries
  2. pull it into your app with graphql

modern web3 development:

  1. build your subgraph with some queries
  2. pull it into your app with graphql

Dennison, mad respect to you, DappHero forever, but please get one of the many talented FE devs at Tally to help you with your subgraphs, because as a technical proposal this just doesn’t make any sense.