<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Agustin Leon on Blog | Chameleon</title><link>https://blog.chameleoncloud.org/authors/agustin-leon/</link><description>Recent content in Agustin Leon on Blog | Chameleon</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sun, 26 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.chameleoncloud.org/authors/agustin-leon/index.xml" rel="self" type="application/rss+xml"/><item><title>OpenMCP: A Reproducible Benchmarking Harness for Evaluating Computer-Use Agents on Chameleon</title><link>https://blog.chameleoncloud.org/posts/openmcp-reproducible-benchmarking-mcp-agents/</link><pubDate>Sun, 26 Apr 2026 00:00:00 +0000</pubDate><guid>https://blog.chameleoncloud.org/posts/openmcp-reproducible-benchmarking-mcp-agents/</guid><description>&lt;p&gt;&lt;em&gt;Agustin Leon and Anup Raj Niroula, New York University&lt;/em&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;What is the central challenge your experiment investigates?&lt;/h2&gt;
&lt;p&gt;Computer-use agents are language-model programs that operate a computer through the same channels a person does: moving a cursor, typing into applications, and calling tools. Think of an assistant that can organize your Notion notes, update a GitHub issue, or manage your calendar without step-by-step instructions. With the &lt;a href="https://www.anthropic.com/news/model-context-protocol"&gt;Model Context Protocol (MCP)&lt;/a&gt;, an open standard released in late 2024, agents can act on specific applications through one consistent interface. Adoption of MCP has been rapid: by the end of 2025, PulseMCP, a popular directory of MCP servers, had &lt;a href="https://www.pulsemcp.com/statistics"&gt;tracked over 15 million downloads&lt;/a&gt; and more than 6,000 MCP servers.&lt;/p&gt;</description></item></channel></rss>