Node CLI Reference - x402 Singularity Layer

`sgl` is the open-source node operator CLI. It generates your node’s keys, registers it on the grid, runs inference, manages the background service, and toggles maintenance mode. This is the complete command reference.

Install `sgl` from the official open-source release (see Node setup). You also need a local inference runtime (`llama-server`) and a GGUF model file to serve.

Typical first run

sgl login                                  # browser-approve with your staked wallet
sgl attest                                 # produce + submit the hardware attestation
# run as a background service, serving a model
sgl service install \
  --model-path ~/models/Llama-3.2-3B-Instruct-Q4_K_M.gguf \
  --model-name llama-3.2-3b \
  --resource-percent 50

Commands

`sgl login`

Log in through the browser flow: approve the node with your staked wallet.

Flag	Default	Description
`--tee-type`	`apple_se`	TEE type on this machine (e.g. `apple_se`)
`--models`	—	Comma-separated models to advertise

`sgl init`

Initialize the node without the browser flow: generate keys and register directly under a wallet.

Flag	Default	Description
`--wallet`	(required)	Staked Solana wallet address this node operates under
`--tee-type`	`apple_se`	TEE type on this machine
`--models`	—	Comma-separated models to advertise

`sgl attest`

Verify attestation — sign the orchestrator’s challenge and submit the hardware attestation proof. Only attestation-verified nodes receive jobs. Run after login/init and after any binary update.

`sgl start`

Run the node in the foreground: start the local inference server, begin heartbeating, and process jobs.

Flag	Default	Description
`--model-path`	—	Path to the GGUF model file
`--model-name`	—	Model name to advertise (e.g. `llama-3.2-3b`)
`--inference-port`	`8081`	Port for the local llama-server
`--resource-percent`	`100`	Quick preset (1–100): sets threads, GPU layers, and concurrency proportionally
`--threads`	auto	CPU threads for inference (overrides the preset)
`--gpu-layers`	auto	GPU layers to offload (0 = CPU only, 99 = all)
`--context-size`	`4096`	Context window in tokens
`--max-jobs`	`1`	Max concurrent jobs to accept
`--batch-size`	`512`	Prompt batch size
`--heartbeat-interval`	`5`	Heartbeat seconds (lower = faster pickup, more traffic)
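
To put the `--heartbeat-interval` trade-off in numbers, the sketch below counts how many heartbeats a node sends per day at a given interval. It is plain arithmetic on the flag value; `heartbeats_per_day` is a hypothetical helper, not part of `sgl`, and the 30-second setting is only an illustrative alternative.

```shell
# Hypothetical helper (not an sgl command): heartbeats sent per day
# for a given --heartbeat-interval in seconds.
heartbeats_per_day() {
  echo $(( 86400 / $1 ))
}

heartbeats_per_day 5    # default interval: prints 17280
heartbeats_per_day 30   # illustrative slower setting: prints 2880
```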

Use `sgl start` to test in the foreground; use `sgl service install` for production so the node survives reboots, logout, crashes, and idle sleep.
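
Before a foreground test it can help to confirm the prerequisites are in place. The sketch below checks that the model file exists and that a `llama-server` binary can be found, then starts the node using flags from the table above. The helper names `check_model`, `check_runtime`, and `start_foreground` are hypothetical, and looking the runtime up on `PATH` is an assumption; adjust it if your install locates the runtime differently.

```shell
# Hypothetical pre-flight helpers; none of these are sgl commands.

# Succeeds only if the given path is a readable regular file.
check_model() {
  [ -f "$1" ] && [ -r "$1" ]
}

# Succeeds only if llama-server is found on PATH (assumed lookup method).
check_runtime() {
  command -v llama-server >/dev/null 2>&1
}

# Start the node in the foreground at half capacity once both checks pass.
start_foreground() {
  model_path=$1
  model_name=$2
  check_model "$model_path" || { echo "model file not found: $model_path" >&2; return 1; }
  check_runtime || { echo "llama-server not found on PATH" >&2; return 1; }
  sgl start --model-path "$model_path" --model-name "$model_name" --resource-percent 50
}

# Usage:
#   start_foreground ~/models/Llama-3.2-3B-Instruct-Q4_K_M.gguf llama-3.2-3b
```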

`sgl status`

Show node status, hardware capabilities, and orchestrator connection info.

`sgl off-grid`

Go off-grid (maintenance): stop receiving new jobs without penalty, for planned downtime. In-flight jobs finish cleanly. Tamper slashing is unaffected — off-grid only pauses job routing, it can’t dodge a tamper penalty.

`sgl on-grid`

Come back on-grid: resume receiving jobs.

`sgl price`

View or set your per-token price. By default your node bills the platform suggested rate (and you keep 80%); you may optionally set your own price within an allowed band — floor = suggested × 0.5, ceiling = suggested × 5. Prices are USD per 1M tokens, split into input (prompt) and output (completion). A model with no custom price simply bills at the suggested rate.

sgl price show                                              # your prices + the suggested rate + the band
sgl price set --model llama-3.2-3b --input 0.004 --output 0.004   # undercut to win more jobs
sgl price reset --model llama-3.2-3b                        # back to the suggested price

Subcommand	Flags	Description
`show`	—	Show each served model’s price, the suggested rate, and the band
`set`	`--model`, `--input`, `--output`	Set a custom price (USD per 1M tokens; must be in-band and for a model you advertise)
`reset`	`--model`	Revert a model to the platform suggested price

The server enforces the band and a short cooldown between changes. You can also manage prices from the operator dashboard, and callers compare nodes at `GET /v1/providers?model=…`.
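
The band arithmetic can be checked locally before calling `sgl price set`. The sketch below applies the stated rule (floor = suggested × 0.5, ceiling = suggested × 5) to a suggested rate. The helpers `price_band` and `in_band` are hypothetical, and the suggested rate of 0.008 is a made-up figure for illustration; read the real one from `sgl price show`.

```shell
# Hypothetical helpers (not sgl commands). Prices are USD per 1M tokens.

# Print the allowed band for a given suggested rate.
price_band() {
  awk -v s="$1" 'BEGIN { printf "floor=%.4f ceiling=%.4f\n", s * 0.5, s * 5 }'
}

# Succeed if PRICE ($1) lies within the band for SUGGESTED ($2).
in_band() {
  awk -v p="$1" -v s="$2" 'BEGIN { exit !(p >= s * 0.5 && p <= s * 5) }'
}

price_band 0.008                                   # 0.008 is a made-up suggested rate
in_band 0.005 0.008 && echo "0.005 is in band"
in_band 0.001 0.008 || echo "0.001 is below the floor"
```

The server remains the authority on the band and the cooldown, so a local check like this only avoids an obviously rejected request.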

`sgl service` — background service

Runs the node as a managed OS service (launchd on macOS / systemd on Linux) so it keeps serving across reboots, logout, crashes, and idle sleep.

`sgl service install`

Install and start the background service.

Flag	Default	Description
`--model-path`	—	Path to the GGUF model file
`--model-name`	—	Model name to advertise
`--resource-percent`	`50`	Percentage of system resources to dedicate (1–100)
`--inference-port`	`8081`	Port for the local llama-server
`--max-jobs`	`1`	Max concurrent jobs
`--heartbeat-interval`	`5`	Heartbeat seconds
`--sandbox`	off	macOS only. Run the node under a Seatbelt sandbox that walls off SSH keys, wallets, keychains, and browser data from the inference process. On Linux, equivalent systemd hardening is applied automatically.
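
Because `--sandbox` applies to macOS only, an install script shared across machines may want to add it conditionally. The sketch below is one way to do that, assuming `--sandbox` is a plain switch that takes no value; `sandbox_flag` is a hypothetical helper, not part of `sgl`.

```shell
# Hypothetical helper (not an sgl command): print --sandbox on macOS,
# nothing elsewhere. Accepts an optional OS name; defaults to `uname -s`.
sandbox_flag() {
  os=${1:-$(uname -s)}
  if [ "$os" = "Darwin" ]; then
    echo "--sandbox"
  fi
}

# Usage (substitution left unquoted so an empty result adds no argument):
#   sgl service install \
#     --model-path ~/models/Llama-3.2-3B-Instruct-Q4_K_M.gguf \
#     --model-name llama-3.2-3b \
#     --resource-percent 50 \
#     $(sandbox_flag)
```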

`sgl service stop`

Stop and remove the background service.

`sgl service status`

Show whether the service is installed and running.

Maintenance workflow

sgl off-grid          # before planned downtime — stop new jobs
# ... do maintenance ...
sgl on-grid           # resume serving

After updating the node binary, re-run `sgl attest` so the network re-verifies your enclave. See Node setup and Earnings.
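
The off-grid and on-grid steps can be wrapped so the node is not left off-grid by accident. The sketch below runs an arbitrary maintenance command between the two calls and returns that command's exit status. `with_maintenance` is a hypothetical wrapper, and resuming on-grid even after a failed maintenance command is a design choice you may want to reverse.

```shell
# Hypothetical wrapper (not an sgl command): run a maintenance command
# off-grid, then come back on-grid whether or not the command succeeded.
with_maintenance() {
  sgl off-grid || return 1
  status=0
  "$@" || status=$?          # remember the maintenance command's exit status
  sgl on-grid || return 1
  return "$status"
}

# Usage (the script name is a placeholder for your own maintenance step):
#   with_maintenance ./my-maintenance.sh
```

If the maintenance step replaced the node binary, run `sgl attest` afterwards as described above.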
