SGL Grid is usage-based: you pay per request, priced by the model and the tokens it processes. No subscription, no minimums.

How cost is calculated

  • Each model has a per-token price (input and output).
  • Cost for a request = (input tokens × the model's input rate) + (output tokens × the model's output rate).
  • Credits are charged the exact amount after inference, based on real token usage. x402 charges at request time, based on the model and your max_tokens.
The live per-model price is shown in the Models view and the models API:
curl https://grid.x402compute.cc/grid/models
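As a rough illustration of the arithmetic above, the sketch below estimates a request's cost from separate input and output rates. The rates, token counts, and the function name request_cost are hypothetical; real prices come from the models API. The x402 figure assumes the up-front charge prices output at max_tokens, which is only a guess at how "based on the model and your max_tokens" is applied.

```python
from decimal import Decimal

def request_cost(input_tokens, output_tokens, input_rate_per_m, output_rate_per_m):
    """Cost in USD, with rates quoted in USD per 1M tokens (assumed unit)."""
    million = Decimal(1_000_000)
    total = (Decimal(input_tokens) * Decimal(input_rate_per_m)
             + Decimal(output_tokens) * Decimal(output_rate_per_m))
    return total / million

# Hypothetical rates, not real SGL Grid prices.
INPUT_RATE = "0.02"   # USD per 1M input tokens
OUTPUT_RATE = "0.04"  # USD per 1M output tokens

# Credits: billed after inference on the tokens actually used.
credits_cost = request_cost(1_000, 200, INPUT_RATE, OUTPUT_RATE)

# x402: charged at request time; assumed here to price output at max_tokens=512.
x402_cost = request_cost(1_000, 512, INPUT_RATE, OUTPUT_RATE)

print("credits:", credits_cost)
print("x402:", x402_cost)
```

Under these assumptions the x402 charge is never lower than the credits charge for the same request, because max_tokens is an upper bound on the output tokens actually generated.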

Per-provider pricing

Most nodes serve at the platform's suggested rate, but operators may set their own per-token price within a band (suggested × 0.5 to suggested × 5). You can compare the nodes serving a model (each listed with its effective price, cheapest first) and optionally pin one:
curl "https://grid.x402compute.cc/v1/providers?model=llama-3.2-3b"
Pass the chosen node_id as node on your chat request (or the node= argument in the SDKs) to use a specific provider; omit it and the grid routes for you.
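A minimal sketch of choosing a node to pin, run against local sample data rather than the live endpoint. The response shape (a list of objects with node_id and price fields), the sample prices, and the helper names in_band and cheapest_node are all assumptions for illustration; check the actual /v1/providers response before relying on any field name.

```python
def in_band(price, suggested):
    """True if an operator price lies within suggested x 0.5 to suggested x 5."""
    return 0.5 * suggested <= price <= 5 * suggested

def cheapest_node(providers, suggested):
    """Return the node_id of the cheapest in-band provider, or None if there is none."""
    valid = [p for p in providers if in_band(p["price"], suggested)]
    if not valid:
        return None
    return min(valid, key=lambda p: p["price"])["node_id"]

# Hypothetical sample data; the real /v1/providers response shape may differ.
SUGGESTED = 0.02  # USD per 1M tokens
providers = [
    {"node_id": "node-a", "price": 0.02},
    {"node_id": "node-b", "price": 0.012},
    {"node_id": "node-c", "price": 0.2},  # above 5 x suggested, so out of band
]

chosen = cheapest_node(providers, SUGGESTED)
print(chosen)
```

The returned node_id is what you would pass as node on the chat request.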

Capping price (max_price)

Don’t want to pick manually? Pass max_price (blended USD per 1M tokens) on your chat/reserve request. The grid then routes only to nodes at or under that rate, and the request is never billed above it. This works in every routing mode, and it’s a typed option in the SDKs (max_price / maxPrice).
# raw API
curl -X POST https://grid.x402compute.cc/v1/reserve \
  -H "Content-Type: application/json" \
  -d '{"model":"llama-3.2-3b","max_price":0.01}'
If no node qualifies under the cap, the request returns price_cap_unmet instead of overcharging you.
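The cap rule can be pictured as a simple filter. The sketch below simulates it locally: it keeps only nodes priced at or under max_price and raises an error when none qualify. The node records, the PriceCapUnmet exception, and the eligible_nodes helper are hypothetical stand-ins; the grid applies this rule server-side, and how it picks among eligible nodes depends on the routing mode.

```python
class PriceCapUnmet(Exception):
    """Local stand-in for the price_cap_unmet response (hypothetical type)."""

def eligible_nodes(nodes, max_price):
    """Return the nodes priced at or under max_price; raise if none qualify."""
    eligible = [n for n in nodes if n["price"] <= max_price]
    if not eligible:
        raise PriceCapUnmet(f"no node at or under {max_price} USD per 1M tokens")
    return eligible

# Hypothetical node prices in blended USD per 1M tokens.
nodes = [
    {"node_id": "node-a", "price": 0.02},
    {"node_id": "node-b", "price": 0.012},
    {"node_id": "node-c", "price": 0.05},
]

under_cap = [n["node_id"] for n in eligible_nodes(nodes, max_price=0.02)]
print(under_cap)

try:
    eligible_nodes(nodes, max_price=0.01)
except PriceCapUnmet as exc:
    print("price_cap_unmet:", exc)
```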

Where the money goes

Each paid inference is split across the people who make the network work:
Share   Goes to
80%     The node operator who served your request
5%      Platform
10%     $SGL stakers (paid as USDC + $SGL)
5%      $SGL buy-and-burn
So usage directly rewards operators and stakers, and continuously buys and burns $SGL.
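To make the split concrete, the sketch below applies the percentages above to a single payment. The key names and the split_payment helper are illustrative only, and the 10 USD amount is an arbitrary example, not a real charge.

```python
from decimal import Decimal

# Shares taken from the split described above; key names are illustrative.
SPLIT = {
    "node_operator": Decimal("0.80"),
    "platform": Decimal("0.05"),
    "stakers": Decimal("0.10"),
    "buy_and_burn": Decimal("0.05"),
}

def split_payment(amount_usd):
    """Divide a paid inference amount (in USD) across the four recipients."""
    amount = Decimal(str(amount_usd))
    return {name: amount * share for name, share in SPLIT.items()}

shares = split_payment("10")
for name, value in shares.items():
    print(name, value)
```

Decimal is used instead of float so that the four parts sum back to the original amount exactly.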

Keeping costs down

  • Pick the smallest model that does the job (see Models).
  • Set a sensible max_tokens.
  • Use credits for steady usage (exact billing) and x402 for occasional/agent use.

Tracking spend

The dashboard’s API usage view shows your requests, tokens, spend, and a per-model breakdown over time. See Billing.