<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="https://psschwei.com/xml/base.min.xml" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Posts on Paul Schweigert</title>
    <link>https://psschwei.com/post/</link>
    <description>Recent content in Posts on Paul Schweigert</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <lastBuildDate>Tue, 14 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://psschwei.com/post/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Typed PR Reviews on Autopilot: mellea &#43; Claude Code Routines</title>
      <link>https://psschwei.com/post/blog-claude-code-routines/</link>
      <pubDate>Tue, 14 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://psschwei.com/post/blog-claude-code-routines/</guid>
      <description>

&lt;center&gt;&lt;img src=&#34;https://psschwei.com/images/mellea-routines.png&#34; width=&#34;200&#34; alt=&#34;mellea + Claude Code Routines&#34;&gt;&lt;/center&gt;


&lt;p&gt;Every team that has tried AI-powered code review hits the same wall. The model returns a paragraph of unstructured text, or a JSON blob with missing fields, or a review that looks plausible but silently dropped half the checklist. You parse and re-parse, add retry logic, and eventually accept that 1 in 5 reviews will need manual cleanup. Then someone has to remember to trigger the thing in the first place.&lt;/p&gt;
&lt;p&gt;Anthropic just shipped &lt;a href=&#34;https://code.claude.com/docs/en/routines&#34;&gt;Claude Code Routines&lt;/a&gt;, saved configurations that run Claude Code autonomously on a schedule, an API call, or a GitHub event. Routines support MCP connectors, so any tool that speaks the Model Context Protocol can plug directly into the automation. mellea fits here naturally: a &lt;code&gt;@generative&lt;/code&gt; function that returns typed, validated Pydantic output, exposed as an MCP tool that a Routine calls on every new pull request. No more manual triggering. No more fragile JSON parsing.&lt;/p&gt;
&lt;h2 id=&#34;what-are-claude-code-routines&#34;&gt;What are Claude Code Routines?&lt;/h2&gt;
&lt;p&gt;A Routine is a prompt paired with one or more repositories, an environment, and a set of triggers. You configure it once, and it runs on Anthropic&amp;rsquo;s cloud infrastructure even when your laptop is closed. Triggers can be scheduled (hourly, daily, custom cron), API-driven (POST to an HTTP endpoint), or GitHub event-based (&lt;code&gt;pull_request.opened&lt;/code&gt;, pushes, releases, issue comments).&lt;/p&gt;
&lt;p&gt;Each Routine run creates a full Claude Code session. The session can call shell commands, use skills committed to the repo, and invoke any MCP connectors you attach. If you can serve a tool over MCP, a Routine can call it.&lt;/p&gt;
&lt;h2 id=&#34;the-demo-a-structured-pr-reviewer&#34;&gt;The demo: a structured PR reviewer&lt;/h2&gt;
&lt;p&gt;We will build a mellea-powered MCP tool that takes a PR diff and description and returns a typed &lt;code&gt;PRReview&lt;/code&gt; object: risk level, summary, affected modules, a review checklist, and suggested reviewers. Then we wire it into a Routine that fires on every new PR.&lt;/p&gt;
&lt;h3 id=&#34;step-1-define-the-output-schema&#34;&gt;Step 1: Define the output schema&lt;/h3&gt;
&lt;p&gt;Start with what you want back from the model. A Pydantic model makes the contract explicit:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;from&lt;/span&gt; enum &lt;span style=&#34;color:#f92672&#34;&gt;import&lt;/span&gt; Enum
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;from&lt;/span&gt; pydantic &lt;span style=&#34;color:#f92672&#34;&gt;import&lt;/span&gt; BaseModel
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;class&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;RiskLevel&lt;/span&gt;(str, Enum):
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    low &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;low&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    medium &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;medium&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    high &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;high&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    critical &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;critical&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;class&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;ChecklistItem&lt;/span&gt;(BaseModel):
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    area: str
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    concern: str
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    passed: bool
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;class&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;PRReview&lt;/span&gt;(BaseModel):
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    risk_level: RiskLevel
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    summary: str
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    affected_modules: list[str]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    review_checklist: list[ChecklistItem]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    suggested_reviewers: list[str]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Every field is typed and constrained. If the model returns &lt;code&gt;&amp;quot;risk_level&amp;quot;: &amp;quot;maybe&amp;quot;&lt;/code&gt;, Pydantic rejects it before your code ever sees it.&lt;/p&gt;
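&lt;p&gt;To see that rejection in action, here is a standalone check (the models are repeated from above so the snippet runs on its own):&lt;/p&gt;

```python
from enum import Enum
from pydantic import BaseModel, ValidationError


class RiskLevel(str, Enum):
    low = "low"
    medium = "medium"
    high = "high"
    critical = "critical"


class ChecklistItem(BaseModel):
    area: str
    concern: str
    passed: bool


class PRReview(BaseModel):
    risk_level: RiskLevel
    summary: str
    affected_modules: list[str]
    review_checklist: list[ChecklistItem]
    suggested_reviewers: list[str]


# A payload with an out-of-vocabulary risk level fails validation
bad = {
    "risk_level": "maybe",
    "summary": "Refactors the auth module.",
    "affected_modules": ["auth"],
    "review_checklist": [],
    "suggested_reviewers": [],
}
try:
    PRReview.model_validate(bad)
except ValidationError:
    print("rejected by schema validation")
```

&lt;p&gt;Swap &lt;code&gt;&amp;quot;maybe&amp;quot;&lt;/code&gt; for any of the four enum values and the same payload validates cleanly.&lt;/p&gt;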
&lt;h3 id=&#34;step-2-write-the-generative-function&#34;&gt;Step 2: Write the generative function&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;@generative&lt;/code&gt; decorator turns a typed Python function into an LLM call. The docstring becomes the prompt instruction, and the return type tells mellea what schema to enforce:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;from&lt;/span&gt; mellea &lt;span style=&#34;color:#f92672&#34;&gt;import&lt;/span&gt; generative, MelleaSession
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;from&lt;/span&gt; mellea.stdlib.requirements &lt;span style=&#34;color:#f92672&#34;&gt;import&lt;/span&gt; req
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;from&lt;/span&gt; mellea.stdlib.sampling &lt;span style=&#34;color:#f92672&#34;&gt;import&lt;/span&gt; RejectionSamplingStrategy
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;@generative&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;def&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;review_pr&lt;/span&gt;(diff: str, description: str) &lt;span style=&#34;color:#f92672&#34;&gt;-&amp;gt;&lt;/span&gt; PRReview:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&amp;#34;&amp;#34;Review a pull request.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;    Analyze the diff and description to produce a structured review.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;    Identify the risk level, summarize the change, list affected
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;    modules, generate a review checklist, and suggest reviewers
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;    based on the modules touched.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;    &amp;#34;&amp;#34;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;...&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The function body is &lt;code&gt;...&lt;/code&gt;. mellea replaces it with an LLM call that returns a &lt;code&gt;PRReview&lt;/code&gt;. But the model might cut corners, so we add requirements:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;def&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;run_review&lt;/span&gt;(session: MelleaSession, diff: str, desc: str) &lt;span style=&#34;color:#f92672&#34;&gt;-&amp;gt;&lt;/span&gt; PRReview:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; review_pr(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        session,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        diff&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;diff,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        description&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;desc,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        requirements&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;[
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            req(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;The review checklist must have at least 3 items.&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            req(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;The summary field must be under 100 words.&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            req(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Each checklist item must reference a specific &amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;file or function from the diff.&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        ],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        strategy&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;RejectionSamplingStrategy(loop_budget&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If the model returns a checklist with two items or a vague summary, mellea catches the violation, feeds the failure reason back to the model, and retries up to &lt;code&gt;loop_budget&lt;/code&gt; times. The caller gets a valid &lt;code&gt;PRReview&lt;/code&gt; or an exception. Never a half-baked result.&lt;/p&gt;
&lt;h3 id=&#34;step-3-expose-as-an-mcp-tool&#34;&gt;Step 3: Expose as an MCP tool&lt;/h3&gt;
&lt;p&gt;Wrap the generative function in a FastMCP server. This is the entire server file:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;# pr_review_server.py&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;from&lt;/span&gt; mcp.server.fastmcp &lt;span style=&#34;color:#f92672&#34;&gt;import&lt;/span&gt; FastMCP
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;from&lt;/span&gt; mellea &lt;span style=&#34;color:#f92672&#34;&gt;import&lt;/span&gt; MelleaSession
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;from&lt;/span&gt; mellea.backends.ollama &lt;span style=&#34;color:#f92672&#34;&gt;import&lt;/span&gt; OllamaModelBackend
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;from&lt;/span&gt; mellea.backends &lt;span style=&#34;color:#f92672&#34;&gt;import&lt;/span&gt; ModelOption, model_ids
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;# ... PRReview model and review_pr function defined above ...&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mcp &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; FastMCP(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;mellea-pr-review&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;@mcp&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;tool()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;def&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;structured_pr_review&lt;/span&gt;(diff: str, description: str) &lt;span style=&#34;color:#f92672&#34;&gt;-&amp;gt;&lt;/span&gt; str:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&amp;#34;&amp;#34;Produce a typed, validated review of a pull request.&amp;#34;&amp;#34;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    session &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; MelleaSession(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        OllamaModelBackend(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            model_ids&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;IBM_GRANITE_4_HYBRID_MICRO,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            model_options&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;{ModelOption&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;MAX_NEW_TOKENS: &lt;span style=&#34;color:#ae81ff&#34;&gt;2048&lt;/span&gt;},
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        )
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    result &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; run_review(session, diff, description)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; result&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;model_dump_json(indent&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Test it locally with the MCP inspector:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;uv run mcp dev pr_review_server.py
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This opens a debug UI at &lt;code&gt;http://localhost:5173&lt;/code&gt; where you can call &lt;code&gt;structured_pr_review&lt;/code&gt; with a sample diff and see the typed JSON output. Swap &lt;code&gt;OllamaModelBackend&lt;/code&gt; for OpenAI, vLLM, WatsonX, or LiteLLM without changing the function signature.&lt;/p&gt;
&lt;h3 id=&#34;step-4-wire-it-into-a-routine&#34;&gt;Step 4: Wire it into a Routine&lt;/h3&gt;
&lt;p&gt;With the MCP server deployed and reachable, create a Routine that calls it on every new PR.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Go to &lt;a href=&#34;https://claude.ai/code/routines&#34;&gt;claude.ai/code/routines&lt;/a&gt; and click &lt;strong&gt;New routine&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Prompt&lt;/strong&gt; (tell Claude to use the tool):&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;When a new pull request is opened, call the `structured_pr_review` MCP tool
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;with the PR&amp;#39;s diff and description. Post the result as a PR comment formatted
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;as a markdown table. If the risk_level is &amp;#34;critical&amp;#34;, also add the
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&amp;#34;needs-security-review&amp;#34; label.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ol start=&#34;3&#34;&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Repository&lt;/strong&gt; — select the repo you want reviewed.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Trigger&lt;/strong&gt; — add a GitHub event trigger: &lt;code&gt;pull_request.opened&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Connectors&lt;/strong&gt; — include the MCP connector pointing to your &lt;code&gt;pr_review_server.py&lt;/code&gt; deployment.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Click &lt;strong&gt;Create&lt;/strong&gt;. Every new PR now gets a typed, validated review posted as a comment. No manual step, no cron job on your laptop.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;before-and-after&#34;&gt;Before and after&lt;/h2&gt;
&lt;p&gt;Without mellea, a Routine prompt that asks for structured review output looks like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Review the PR. Return JSON with these fields: risk_level (low/medium/high/
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;critical), summary, affected_modules, review_checklist, suggested_reviewers.
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Make sure the JSON is valid.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;What actually happens:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The model sometimes wraps the JSON in a markdown code fence. Your parser breaks.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;risk_level&lt;/code&gt; comes back as &lt;code&gt;&amp;quot;moderate&amp;quot;&lt;/code&gt; instead of &lt;code&gt;&amp;quot;medium&amp;quot;&lt;/code&gt;. No validation catches it.&lt;/li&gt;
&lt;li&gt;The checklist has one item that says &amp;ldquo;looks good.&amp;rdquo; No specificity, no way to enforce it.&lt;/li&gt;
&lt;li&gt;You add regex-based cleanup, retry logic, and a fallback prompt. It works 80% of the time.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With mellea, the Pydantic schema rejects &lt;code&gt;&amp;quot;moderate&amp;quot;&lt;/code&gt; at parse time. The &lt;code&gt;RejectionSamplingStrategy&lt;/code&gt; retries with the failure reason in context. The requirements enforce at least 3 specific checklist items and a concise summary. The Routine caller gets a &lt;code&gt;PRReview&lt;/code&gt; object or an error, not malformed output that silently passes through.&lt;/p&gt;
&lt;h2 id=&#34;limitations&#34;&gt;Limitations&lt;/h2&gt;
&lt;p&gt;Routines are in research preview. Behavior, limits, and the API surface may change, and daily run caps apply per account.&lt;/p&gt;
&lt;p&gt;The MCP server must be reachable from Anthropic&amp;rsquo;s cloud. If you run the mellea server locally with Ollama, you will need a publicly accessible endpoint or a tunnel for the Routine to reach it.&lt;/p&gt;
&lt;p&gt;Large diffs challenge small models. Granite 4 Micro handles diffs up to a few hundred lines well. For large PRs, use a bigger model or truncate the diff to changed files only.&lt;/p&gt;
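&lt;p&gt;One way to do that truncation, sketched as an illustrative helper (not part of mellea), is to cap the lines kept per file while preserving every &lt;code&gt;diff --git&lt;/code&gt; header:&lt;/p&gt;

```python
def truncate_diff(diff: str, max_lines_per_file: int = 80) -> str:
    """Keep at most max_lines_per_file lines from each file's section.

    Splits on 'diff --git' headers so every changed file stays represented,
    even when its hunks are cut short.
    """
    out: list[str] = []
    current: list[str] = []

    def flush() -> None:
        if current:
            out.extend(current[:max_lines_per_file])
            if len(current) > max_lines_per_file:
                out.append(f"... ({len(current) - max_lines_per_file} lines truncated)")

    for line in diff.splitlines():
        if line.startswith("diff --git"):
            flush()
            current = [line]
        else:
            current.append(line)
    flush()
    return "\n".join(out)
```

&lt;p&gt;Because the headers survive, fields like &lt;code&gt;affected_modules&lt;/code&gt; can still cover the whole PR even when individual hunks are elided.&lt;/p&gt;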
&lt;p&gt;Repair is not free. Each &lt;code&gt;loop_budget&lt;/code&gt; retry is an additional LLM call. For latency-sensitive workflows, set &lt;code&gt;loop_budget=1&lt;/code&gt; and accept occasional validation failures rather than waiting for retries.&lt;/p&gt;
&lt;h2 id=&#34;try-it&#34;&gt;Try it&lt;/h2&gt;
&lt;p&gt;If you are already parsing LLM output with regex and hoping for the best, mellea gives you typed returns with automatic repair. If you are already running Claude Code but triggering reviews manually, Routines give you event-driven automation. Together, they replace the glue code.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;uv add mellea &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;mcp[cli]&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ul&gt;
&lt;li&gt;Full MCP server example: &lt;a href=&#34;https://github.com/generative-computing/mellea/tree/main/docs/examples/mcp&#34;&gt;mellea MCP integration docs&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Routines documentation: &lt;a href=&#34;https://code.claude.com/docs/en/routines&#34;&gt;Claude Code Routines&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
  </channel>
</rss>
