Jobs are held in the backend queue until a worker is free; the backend then assigns one job at a time. Workers do not keep queues of their own.
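The dispatch rule above can be sketched as a small FIFO loop. This is an illustrative model, not the actual broker code; the class and method names are assumptions:

```javascript
// Minimal sketch of backend-side dispatch (hypothetical names): jobs queue
// centrally, and an idle worker is handed at most one job at a time.
class Dispatcher {
  constructor() {
    this.queue = [];        // pending jobs, FIFO
    this.busy = new Set();  // worker ids currently running a job
  }
  submit(job) {
    this.queue.push(job);   // jobs wait here, never on a worker
  }
  // Called when a worker reports idle: hand it exactly one job, or nothing.
  next(workerId) {
    if (this.busy.has(workerId) || this.queue.length === 0) return null;
    this.busy.add(workerId);
    return this.queue.shift();
  }
  done(workerId) {
    this.busy.delete(workerId); // worker becomes eligible for the next job
  }
}
```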
Live view of slice stages and stations. Set the mesh rig order (head → tail) to match how you assigned layers: early layers on the first rig, the tail on the last. The broker flags mismatches against this order.
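The mismatch check can be sketched as follows. This is an illustrative version of the idea, not the broker's actual logic, and the function name is an assumption:

```javascript
// Sketch of the rig-order check: walking rigs head → tail, each rig's lowest
// layer must come after the highest layer of every rig before it.
function findMismatches(rigOrder, layersByRig) {
  const issues = [];
  let prevLast = -1; // highest layer index seen so far
  for (const rig of rigOrder) {
    const layers = layersByRig[rig] || [];
    if (layers.length === 0) continue; // rig with no layers: nothing to check
    const first = Math.min(...layers);
    if (first <= prevLast) {
      issues.push(`${rig} holds layer ${first}, which belongs before an earlier rig's tail`);
    }
    prevLast = Math.max(prevLast, Math.max(...layers));
  }
  return issues;
}
```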
The Chat page reads these settings from localStorage when you send a message.
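A minimal sketch of that read, assuming the settings are stored as one JSON entry; the key name `chatSettings` is hypothetical (check the page source for the real one):

```javascript
// Pull chat settings from localStorage at send time. Falls back to an empty
// object when nothing is saved yet or the stored entry is corrupt.
function loadChatSettings(storage = globalThis.localStorage) {
  const raw = storage.getItem("chatSettings"); // hypothetical key
  if (!raw) return {};   // nothing saved: caller applies defaults
  try {
    return JSON.parse(raw);
  } catch {
    return {};           // unparseable entry: ignore rather than crash
  }
}
```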
Assign each GPU (from any system) to a configuration; a configuration can use GPUs from multiple workers. Save, then load a model onto each configuration (one load is issued per worker in that configuration).
Each row is one GPU. Assign it to a configuration or leave it unassigned. One configuration can span multiple workers.
| Worker | GPU | Assign to |
|---|---|---|
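The grouping behind this table can be sketched as below. The row fields and function name are assumptions, not the real schema; the point is that a configuration collects GPUs across workers, and each worker in it receives one model load:

```javascript
// Group GPU rows by configuration. Unassigned rows are skipped; each
// configuration tracks the distinct workers it spans, since one model load
// is sent per worker in the configuration.
function groupByConfig(gpuRows) {
  const configs = new Map(); // config name -> { gpus, workers }
  for (const row of gpuRows) {
    if (!row.config) continue; // unassigned GPU
    const entry = configs.get(row.config) || { gpus: [], workers: new Set() };
    entry.gpus.push(row.gpuId);
    entry.workers.add(row.worker);
    configs.set(row.config, entry);
  }
  return configs;
}
```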
Run a single prompt against the loaded model. Same as Chat but inline.
| Task | Type | Status | Duration | tok/s | Completed |
|---|---|---|---|---|---|
User = an optional grouping (set via BROKER_USER_ID on the worker). System = one rig (worker_id). Each system can have multiple GPUs; the table shows one row per GPU. The model and GPU lists update only on page load or via "Refresh workers".
| User | System | GPU | GPU ID | VRAM (used/total) | Loaded model | Chunk | Current work |
|---|---|---|---|---|---|---|---|
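The user → system → GPU hierarchy above can be sketched as a small grouping step. The field names and function name are assumptions, not the real schema:

```javascript
// Nest flat per-GPU rows as user → system → GPU ids: the user is optional
// (BROKER_USER_ID may be unset), each system is one rig, one entry per GPU.
function groupRows(rows) {
  const byUser = {};
  for (const r of rows) {
    const user = r.user || "(no user)";       // grouping is optional
    const systems = (byUser[user] ||= {});
    (systems[r.system] ||= []).push(r.gpuId); // worker_id -> its GPUs
  }
  return byUser;
}
```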
Test connectivity. Single ping: one worker responds to the broker. Roundtrip: one worker pings another (directly or via relay). Requires v2 workers and broker ping support.
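The two checks can be sketched as request builders. The routes and parameter names here are hypothetical, not the real broker API; they only illustrate the shape of each check:

```javascript
// Build the URL for a connectivity check (hypothetical routes).
function pingRequest(brokerUrl, kind, opts) {
  if (kind === "single") {
    // Single ping: one worker responds to the broker.
    return `${brokerUrl}/ping?worker=${encodeURIComponent(opts.worker)}`;
  }
  if (kind === "roundtrip") {
    // Roundtrip: one worker pings another, direct or relayed.
    return `${brokerUrl}/roundtrip?from=${encodeURIComponent(opts.from)}&to=${encodeURIComponent(opts.to)}`;
  }
  throw new Error(`unknown ping kind: ${kind}`);
}
```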