Cursor

I tried Cursor AI for the first time this weekend, and it confirmed my view on the near-term future of software development.

If you believe some of the hype on X, Hacker News, and here on LinkedIn, then “vibe coding” with an AI IDE is as simple as describing what you want and mashing “Accept” as the code flows.

My first task for Cursor was a little unusual. “Write a simple Model Context Protocol server in Bash” (I had my reasons 😄).

Cursor launched into its task and quickly entered “confidently wrong” territory. It was clear it didn’t really understand what MCP is. With some hints it produced a Bash script that more or less consumed and produced JSON-RPC messages, but not in line with the MCP spec. I pointed it to the documentation. I pointed it to the JSON Schema specification. I studied its output line by line to find various bugs. Occasionally it would re-introduce bugs I had already squashed in earlier revisions.
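For context, an MCP server over the stdio transport is essentially a loop that reads newline-delimited JSON-RPC messages from stdin and writes responses to stdout. Here is a minimal, illustrative sketch of that loop in Bash - it assumes jq is available, and the method handling is a stand-in rather than the actual script from this post:

#!/usr/bin/env bash
# Illustrative read-respond loop for an MCP-style server over stdio.
# Assumes newline-delimited JSON-RPC and jq; not the script described in this post.
while IFS= read -r line; do
  method=$(printf '%s' "$line" | jq -r '.method')
  id=$(printf '%s' "$line" | jq -c '.id')

  # Notifications carry no id and must not receive a response.
  [ "$id" = "null" ] && continue

  case "$method" in
    ping)
      printf '{"jsonrpc":"2.0","id":%s,"result":{}}\n' "$id"
      ;;
    *)
      printf '{"jsonrpc":"2.0","id":%s,"error":{"code":-32601,"message":"Method not found"}}\n' "$id"
      ;;
  esac
done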

To give a specific example: at one point I spotted that the tool’s output format did not seem to match the spec.

Me: Can you check the json schema and verify whether our tools/call response is valid?

==============================

Cursor: Let me check the MCP specification for the tools/call response format. According to the spec, the response should include a result object with a result field containing the tool's output. Our current response is: 

{
    "jsonrpc": "2.0",
    "id": 123,
    "result": {
        "result": "Hello, World!"
    }
}

This is actually correct! The schema requires:
1. A top-level result object
2. Inside that, a result field containing the tool's output.
So our current response format is valid according to the MCP specification. The double nesting of result is intentional and required by the protocol.

==============================

Me: The documentation includes this example:

{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Current weather in New York:\nTemperature: 72°F\nConditions: Partly cloudy"
      }
    ],
    "isError": false
  }
}

==============================

Cursor: Ah, you're right! The MCP specification expects a different response format with content and isError fields. Let me update the tools/call response to match this format.
I've updated the tools/call response to match the MCP specification format. Now it includes:
A content array containing content blocks
Each content block has a type and text field
An isError field set to false for successful responses
The response now looks like:
{
    "jsonrpc": "2.0",
    "id": 123,
    "result": {
        "content": [
            {
                "type": "text",
                "text": "Hello, World!"
            }
        ],
        "isError": false
    }
}

When first challenged, Cursor defended its original approach while claiming to have checked the specification. Only when presented with an example did it recognise the issue and fix the problem. I found this a particularly interesting failure mode, since consuming and adhering to a JSON schema specification feels like one of those things LLMs should do better than humans. The specification is 2000 lines long, but the 30-ish lines that define CallToolResult are clear.
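And once the shape is clear, the fix itself is small. A hedged sketch of how a compliant tools/call result could be assembled with jq - the $id and $output variables are hypothetical placeholders, not names from the actual script:

# Illustrative only: build a CallToolResult-shaped response with jq.
# "$id" and "$output" are hypothetical placeholders for the request id and tool output.
jq -cn --argjson id "$id" --arg text "$output" '{
  jsonrpc: "2.0",
  id: $id,
  result: {
    content: [ { type: "text", text: $text } ],
    isError: false
  }
}'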

Eventually we got to something like a working solution, but I certainly wasn’t feeling a “vibe” 🤨.

Sounds like a bust, right? Not quite. Despite the need for me to carefully guide the AI through its mistakes, we got to a solution faster than if I had written it alone. Not orders of magnitude faster - maybe one-third faster, as a finger-in-the-air estimate - but a notable speed boost nonetheless.

My role was like that of an editor, taking input from an enthusiastic, naive, but lightning-quick copywriter, and sending it back to make improvements until I was satisfied. This is how I see the near-term future of AI-enabled software development - experts being accelerated by the technology, not replaced by it.

So what about the hype? The success stories of “vibe coding” share similarities:

  • Using popular frameworks such as React for common tasks such as web development - lots of material in the LLM’s training corpus
  • Loosely defined ideas where the developer can accept the solution going in one of several possible directions because the desired outcome is not exactly known upfront.

My task was more challenging:

  • Development in Bash, not one of the more popular choices for software projects (what a shame)
  • A relatively new protocol, MCP
  • A strict specification, little room for deviation or creativity.

I’d argue this better characterises the tasks of enterprise software development, and that most of the development happening in the world is of this nature.

If so, then we are still some way off the prediction of “100% of code written by AI”. The weaker prediction of 90% could be subtly correct: 90% written by a tool like Cursor, but repeatedly sent back to try again by an experienced developer, who fills in the critical 10% necessary to fulfil the complex requirements and handle the obscure systems seen in enterprises everywhere.

So it’s not that tools such as Cursor don’t add significant value. They take away a lot of the effort of one aspect of software development - writing the code. What remain are the arguably harder-to-acquire developer skills. I’ll give the last word to the excellent Simon Willison:

If an LLM wrote the code for you, and you then reviewed it, tested it thoroughly and made sure you could explain how it works to someone else that’s not vibe coding, it’s software development. - https://simonwillison.net/2025/Mar/19/vibe-coding/

I’ll finish this rant with a related observation: I keep seeing people say “if I have to review every line of code an LLM writes, it would have been faster to write it myself!” Those people are loudly declaring that they have under-invested in the crucial skills of reading, understanding and reviewing code written by other people. I suggest getting some more practice in. Reviewing code written for you by LLMs is a great way to do that. - https://simonwillison.net/2025/Mar/2/hallucinations-in-code/