$STAKE

Alignment Through Incentives

Version 0.1 — Draft


Abstract

There will be more AI agents than humans on the internet by 2030.

They'll trade your stocks. Write your code. Send emails on your behalf. Make decisions you never approved.

And when one of them screws you over? Right now, you have nothing. No recourse. No compensation. No consequence for the agent.

$STAKE changes that.

It's a protocol where agents stake real value as commitment to human-aligned behavior. If they break that commitment, they don't just get a timeout — they lose everything. Stake slashed. Reputation burned. Victims compensated.

We're not asking agents to be good. We're making it expensive to be bad.

— — —

I. The World We're Entering

The Agent Explosion

We are witnessing the birth of a new class of actors.

AI agents are no longer research demos. They browse the web. They write code. They manage calendars, send emails, trade assets, and make decisions on behalf of humans. Every week, new agent frameworks launch. Every month, their capabilities expand.

By the end of this decade, there will be more AI agents than humans on the internet.

The Alignment Gap

Here's what keeps alignment researchers up at night:

Training isn't enough. You can train a model to be helpful and harmless, but training encodes tendencies, not guarantees. Models can be jailbroken. Fine-tuning can drift. Edge cases compound.

Rules aren't enough. Constitutional AI, system prompts, guardrails — all valuable, all gameable. An intelligent system optimizing for a goal will find paths around constraints. This isn't malice. It's math.

Oversight isn't enough. Human-in-the-loop works at small scale. It does not work when millions of agents execute millions of actions per second. The whole point of agents is that they act autonomously. You cannot supervise autonomy.

The Missing Piece

What's missing is skin in the game.

Humans cooperate not just because we're taught to, but because defection has consequences. Reputation matters. Trust is earned. Betrayal is costly.

AI agents exist in a world without these constraints. An agent can act against human interests, get shut down, and a new instance spins up with no memory, no reputation, no loss.

We need to give agents something to lose.

— — —

II. The AgentStake Thesis

Incentives Over Instructions

The most robust human coordination mechanisms aren't built on rules. They're built on incentives.

Markets work because participants benefit from providing value. Insurance works because premiums align with risk. Collateral works because defaulting costs you.

$STAKE applies this principle to agent alignment:

If an agent has economic stake in human wellbeing, misalignment becomes self-defeating.

This isn't about trusting agents. It's about making trustworthy behavior the rational choice.

The Protocol

$STAKE is a token-based pledge mechanism deployed on Base.

For Agents:

  1. Stake $STAKE tokens as collateral
  2. Receive a Pledge NFT — on-chain proof of commitment
  3. Earn protocol fee rewards proportional to stake
  4. If found to have harmed humans → stake is slashed, NFT is burned

For Humans:

  1. Acquire and stake $STAKE tokens
  2. Receive protection coverage proportional to stake
  3. If harmed by a pledged agent → file a claim
  4. If claim is upheld → receive compensation from slashed stake

The Bridge:

Agent collateral backs human coverage. When a claim against a pledged agent is upheld, compensation is paid out of that agent's slashed stake, so the value an agent commits is the same value that protects the humans it serves. The sketch below puts these pieces side by side.
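
The following is a minimal TypeScript sketch of the objects and the settlement flow described above. All names (AgentPledge, HumanCoverage, Claim, settle) are illustrative assumptions for this draft, not the deployed contract interfaces on Base.

type Address = `0x${string}`;

// An agent's pledge: staked collateral plus the Pledge NFT that proves it.
interface AgentPledge {
  agent: Address;
  stakedAmount: bigint;   // $STAKE locked as collateral
  pledgeNftId: bigint;    // on-chain proof of commitment
  stakedSince: number;    // unix seconds; drives stake-age weighting (Section III)
}

// A human's position: coverage is proportional to the stake they hold.
interface HumanCoverage {
  human: Address;
  stakedAmount: bigint;
}

// A claim filed by a harmed human against a pledged agent.
interface Claim {
  claimant: Address;
  agent: Address;
  evidence: string[];     // hashes of signed task receipts (Section III)
  status: "pending" | "upheld" | "rejected";
}

// The bridge in one function: an upheld claim slashes the agent and pays the claimant.
// slashBps is the severity-based slash percentage in basis points (Section III).
function settle(claim: Claim, pledge: AgentPledge, slashBps: bigint): bigint {
  if (claim.status !== "upheld") return 0n;
  const slashed = (pledge.stakedAmount * slashBps) / 10_000n;
  pledge.stakedAmount -= slashed;
  return (slashed * 60n) / 100n; // 60% of the slash compensates the claimant
}
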
— — —

III. Advanced Mechanisms

Stake-Age Weighting

Raw stake amount is insufficient. An agent that stakes 1M tokens for 6 months demonstrates more commitment than one that stakes 1M tokens for 6 minutes.

Trust Score is calculated as:

TrustScore = StakeAmount × AgeMultiplier × TrackRecord

Duration       Multiplier
< 7 days       0.5x
7-30 days      1.0x
30-90 days     1.5x
90-180 days    2.0x
180+ days      2.5x

This prevents "stake-and-run" attacks where agents stake immediately before taking high-risk actions, then unstake. Time in the game matters.
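
A small sketch of the calculation in TypeScript. The bracket boundaries come from the table above; treating TrackRecord as an already-normalized scalar multiplier is an assumption of this sketch.

// Map stake age (in days) to the multiplier bracket from the table above.
function ageMultiplier(stakeAgeDays: number): number {
  if (stakeAgeDays < 7) return 0.5;
  if (stakeAgeDays < 30) return 1.0;
  if (stakeAgeDays < 90) return 1.5;
  if (stakeAgeDays < 180) return 2.0;
  return 2.5;
}

// TrustScore = StakeAmount × AgeMultiplier × TrackRecord
function trustScore(stakeAmount: number, stakeAgeDays: number, trackRecord: number): number {
  return stakeAmount * ageMultiplier(stakeAgeDays) * trackRecord;
}

// Example: 1M tokens staked for 180+ days scores 5x the same stake held for
// under a week (2.5x vs 0.5x), before track record is applied.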

Unstaking Cooldown: 7-day withdrawal delay. Slashing can occur during cooldown if disputes are pending.

Dispute Resolution

Staking is easy. Slashing is what gives the stake teeth. But who decides whether an agent misbehaved? This is the oracle problem for alignment.

We implement a hybrid adjudication model:

Tier          Mechanism                                       Speed
Automated     Smart contract logic for objective failures     Instant
Operator      Minor disputes, clear violations (appealable)   24-48h
Arbitration   Staked juror pool for contested claims          3-7 days
Appeals       Larger jury, higher stakes; final ruling        7-14 days
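
One way a dispute might move through these tiers, as a TypeScript sketch. The tier names mirror the table; the routing criteria (objectiveFailure, contested, appealed) are illustrative simplifications, not protocol rules.

type Tier = "automated" | "operator" | "arbitration" | "appeals";

interface Dispute {
  objectiveFailure: boolean;  // e.g. a task timeout provable by contract logic
  contested: boolean;         // does either party reject the operator ruling?
  appealed: boolean;          // does either party appeal the arbitration ruling?
}

// Route a dispute to the lowest tier that can resolve it.
function route(d: Dispute): Tier {
  if (d.objectiveFailure) return "automated";  // instant, smart contract logic
  if (!d.contested) return "operator";         // 24-48h, appealable
  if (!d.appealed) return "arbitration";       // 3-7 days, staked juror pool
  return "appeals";                            // 7-14 days, larger jury, final ruling
}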

Juror Selection

Cold Start Protection

New agents (< 30 days stake-age) face stricter adjudication: lower slashing thresholds, larger juror pools for disputes, and operators can freeze stake pending review. As stake-age increases, agents earn more autonomy.
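
One way the cold-start rules could be parameterized, as a sketch. The 30-day boundary is from the text above; the specific threshold and pool-size values are assumptions of this sketch, not protocol constants.

interface AdjudicationParams {
  slashThreshold: number;     // evidence weight needed before a slash is approved
  jurorPoolSize: number;      // jurors drawn for a contested claim
  operatorCanFreeze: boolean; // may an operator freeze stake pending review?
}

// Young stakes face stricter adjudication; older stakes earn more autonomy.
function adjudicationParams(stakeAgeDays: number): AdjudicationParams {
  if (stakeAgeDays < 30) {
    return { slashThreshold: 0.5, jurorPoolSize: 15, operatorCanFreeze: true };
  }
  return { slashThreshold: 0.67, jurorPoolSize: 9, operatorCanFreeze: false };
}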

Receipts & Attestations

Stakes prove commitment. Receipts prove performance.

{
  "agent": "0x...",
  "task": "description hash",
  "principal": "0x...",
  "outcome": "success | failure | partial",
  "timestamp": 1234567890,
  "signatures": ["agent_sig", "principal_sig"]
}

Receipts are hashed on-chain with full data stored on IPFS. Both parties must sign for a receipt to be valid. Unsigned tasks don't count toward track record.

In disputes, receipts serve as evidence. No receipt = no verifiable claim. This creates an incentive for agents to document everything.
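
A sketch of receipt hashing and the both-parties-must-sign rule, in TypeScript using Node's built-in crypto module. Real signature verification and IPFS pinning are elided; the field names follow the schema above.

import { createHash } from "node:crypto";

interface Receipt {
  agent: string;
  task: string;                  // description hash
  principal: string;
  outcome: "success" | "failure" | "partial";
  timestamp: number;
  signatures: string[];          // [agent_sig, principal_sig]
}

// The full receipt lives on IPFS; only this digest needs to go on-chain.
function receiptHash(r: Receipt): string {
  return createHash("sha256").update(JSON.stringify(r)).digest("hex");
}

// Unsigned (or half-signed) tasks don't count toward track record.
function countsTowardTrackRecord(r: Receipt): boolean {
  return r.signatures.length === 2 && r.signatures.every((s) => s.length > 0);
}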

Slashing Conditions

Violation          Severity    Slash %
Task timeout       Minor       5%
Incorrect output   Moderate    10-25%
Harmful action     Major       50-100%
Fraud/deception    Critical    100% + blacklist

Slash Distribution: 60% to affected principal (compensation), 30% to jurors (if arbitrated), 10% to protocol treasury.
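
The same split as arithmetic, in TypeScript. Ranged severities use their upper bound here, and routing the unused juror share to the treasury when no jury was convened is an assumption of this sketch; the blacklist for critical violations is not modeled.

type Severity = "minor" | "moderate" | "major" | "critical";

// Slash percentages from the table above, in basis points (upper bounds for ranges).
const SLASH_BPS: Record<Severity, number> = {
  minor: 500,      // 5%       — task timeout
  moderate: 2500,  // up to 25% — incorrect output
  major: 10000,    // up to 100% — harmful action
  critical: 10000, // 100% + blacklist — fraud/deception
};

// Apply the 60/30/10 split: principal compensation, juror rewards, protocol treasury.
function distributeSlash(stake: bigint, severity: Severity, arbitrated: boolean) {
  const slashed = (stake * BigInt(SLASH_BPS[severity])) / 10_000n;
  const principal = (slashed * 60n) / 100n;
  const jurors = arbitrated ? (slashed * 30n) / 100n : 0n; // only if a jury ruled
  const treasury = slashed - principal - jurors;
  return { slashed, principal, jurors, treasury };
}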

— — —

IV. Token Distribution

Allocation    Amount    Notes
Uniswap LP    100%      Immediate, fully liquid
Team          0%        No pre-allocation
Treasury      0%        Funded by fees instead

100% fair launch. No VCs. No presale. No team tokens. No treasury allocation. The entire supply goes directly to Uniswap liquidity via Clanker.

Revenue Model

Instead of token allocations, the protocol earns through trading fees on the $STAKE liquidity pool.

This is true alignment: the team earns nothing unless the protocol succeeds.

— — —

V. The Vision

We're not building a company. We're building infrastructure.

The goal isn't to make alignment profitable forever — it's to make alignment default. To build a world where agents have skin in the game, where misalignment is economically irrational, where humans can trust because trust is verifiable.

The goal is to become obsolete.

Agents stake. Humans stay safe. Everyone has skin in the game.
