MCP-enabled computer-use agents are proliferating faster than our ability to evaluate them — and most existing benchmarks depend on commercial APIs that get deprecated without notice. OpenMCP is an open-source, fully self-hosted benchmarking harness built by NYU researchers that lets anyone reproducibly evaluate MCP agents across diverse hardware, from H100 datacenter GPUs to a Raspberry Pi 5 at the edge.
Announcing NSDI 2026 Birds-of-a-Feather (BoF) Session on Reproducibility
Our Reproducibility Ambassador Is Heading to NSDI ’26 — Here’s What to Expect
A PhD student selected through the Reproducibility Ambassador program, funded by the NSF REPETO project, is heading to NSDI ’26 to share how researchers can package and reproduce their experiments using Chameleon and Trovi. Join the BoF sessions on May 4th and May 5th to see it in action.
Running Artifact Evaluations on Chameleon
A practical guide for AE organizers using shared research infrastructure
Chameleon has supported artifact evaluations at more than 30 events across 16 major HPC and systems conferences. This guide distills those lessons into practical advice for AE organizers: how to plan hardware access, structure author and reviewer workflows, and keep reproducible artifacts alive after the evaluation closes.
Baremetal H100 nodes on Chameleon
Chameleon Newsletter & Changelog March 2026
Welcome to the Chameleon March 2026 Newsletter!
This month we're highlighting the last chance to register for the Sixth Chameleon User Meeting, a new webinar recording from UTEP's MINCER team, platform updates including multi-instance GPU support, new cloud traces, switch performance improvements, and several testbed usability enhancements.