# Running LLMs on Chameleon GPUs from FABRIC via Stitch Ports

*Ganesh C. Sankaran · March 16, 2026*

What if you could combine Chameleon's bare-metal GPU servers with FABRIC's programmable network fabric, and access the GPU over a private network without ever assigning a public IP? That's exactly what Chameleon's **stitch port** feature enables, and we've published a [Trovi artifact that demonstrates the full workflow end to end](https://trovi.chameleoncloud.org/dashboard/artifacts/9b738237-f9ac-4a4b-9bc5-5f4bebbf9a04).
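To give a concrete sense of what the FABRIC side of a stitch looks like, here is a minimal fablib sketch. It is an illustration under assumptions, not the artifact's exact code: the facility-port name, site, and VLAN below are placeholders that would come from your Chameleon stitch-port reservation, and the artifact automates these steps for you.

```python
# Minimal sketch (fablib): attach a FABRIC slice to a Chameleon stitch port.
# The facility-port name, site, and VLAN are placeholders -- substitute the
# values reported by your Chameleon stitch-port reservation.
from fabrictestbed_extensions.fablib.fablib import FablibManager

fablib = FablibManager()
slice = fablib.new_slice(name="chi-stitch-demo")

# A FABRIC node that will talk to the Chameleon GPU server.
node = slice.add_node(name="client", site="STAR")
nic = node.add_component(model="NIC_Basic", name="nic1")

# The facility port is FABRIC's end of the Chameleon stitch.
port = slice.add_facility_port(name="Chameleon-StarLight", site="STAR", vlan="3300")

# Put both ends on a FABNetv4 (routed private IPv4) network, so the
# Chameleon server is reachable without any public IP.
slice.add_l3network(
    name="fabnetv4-net",
    interfaces=[nic.get_interfaces()[0], port.get_interfaces()[0]],
    type="IPv4",
)
slice.submit()
```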
The artifact provisions an RTX 6000 GPU server on Chameleon, connects it to a FABRIC slice over `fabnetv4`, installs Ollama with a DeepSeek-R1 model on the GPU, and queries the LLM from a FABRIC node, all through the private stitched network. You can use it as-is to run LLM inference, or adapt it as a starting point for your own cross-testbed experiments.
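Once the stitched network is up, querying the model from the FABRIC node is an ordinary HTTP call to Ollama's REST API. A minimal sketch follows; the `fabnetv4` address and model tag are placeholders for the values in your own deployment.

```python
# Sketch: query the Ollama server on the Chameleon GPU node from a FABRIC
# node, over the private stitched network. 10.0.0.10 is a placeholder for
# the GPU server's fabnetv4 address; the model tag must match what you pulled.
import requests

OLLAMA_URL = "http://10.0.0.10:11434/api/generate"  # placeholder address

resp = requests.post(
    OLLAMA_URL,
    json={
        "model": "deepseek-r1:7b",  # example tag; use the model you installed
        "prompt": "Explain network stitching in one sentence.",
        "stream": False,            # return one JSON object instead of a stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```

Note that the request never leaves the stitched path: the Chameleon server needs no public IP and no inbound firewall rules beyond the private network itself.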