Pricing - x402 Singularity Layer

SGL Grid is usage-based: you pay per request, priced by the model and the tokens it processes. No subscription, no minimums.

How cost is calculated

Each model has a per-token price (input and output).
Cost for a request = tokens × the model's rate.
Credits are charged the exact amount after inference (on real token usage). x402 charges at request time based on the model and your max_tokens.

The live per-model price is shown in the Models view and the models API:

curl https://grid.x402compute.cc/grid/models

Per-provider pricing

Most nodes serve at the platform suggested rate, but operators may set their own per-token price within a band (suggested × 0.5 to × 5). You can compare the nodes serving a model — each with its effective price, cheapest first — and optionally pin one:

curl "https://grid.x402compute.cc/v1/providers?model=llama-3.2-3b"

Pass the chosen node_id as node on your chat request (or the node= argument in the SDKs) to use a specific provider; omit it and the grid routes for you.

Capping price (`max_price`)

Don’t want to pick manually? Pass max_price (blended USD per 1M tokens) on your chat/reserve request — the grid only routes to nodes at or under that rate, and the request is never billed above it. Works in every routing mode. It’s a typed option in the SDKs (max_price / maxPrice).

# raw API
curl -X POST https://grid.x402compute.cc/v1/reserve \
  -H "Content-Type: application/json" \
  -d '{"model":"llama-3.2-3b","max_price":0.01}'

If no node qualifies under the cap, the request returns price_cap_unmet instead of overcharging you.

Where the money goes

Each paid inference is split across the people who make the network work:

Share	Goes to
80%	The node operator who served your request
5%	Platform
10%	$SGL stakers (paid as USDC +$ SGL)
5%	$SGL buy-and-burn

So usage directly rewards operators and stakers, and continuously buys + burns $SGL.

Keeping costs down

Pick the smallest model that does the job (see Models).
Set a sensible max_tokens.
Use credits for steady usage (exact billing) and x402 for occasional/agent use.

Tracking spend

The dashboard’s API usage view shows your requests, tokens, spend, and a per-model breakdown over time. See Billing.

Confidential Inference Staking metrics

​How cost is calculated

​Per-provider pricing

​Capping price (max_price)

​Where the money goes

​Keeping costs down

​Tracking spend