
MCP Tools as a DevOps Interface: Managing 8 Production Apps from Claude

Ryan C

What if you could debug a production outage by saying "the site is returning a 502" and having an AI agent check health status, read logs, identify the root cause, and fix it — all in 30 seconds? That's what I built with MCP tools, and it's changed how I think about DevOps.

My old debugging workflow: SSH into the VPS, cd to the project directory, tail -f the logs, grep for the error, check the systemd status, maybe restart the service, check the Nginx config, query the database with psql. Each step is simple. Chaining them together for eight different projects across multiple terminal windows is not.

Now I describe the problem in natural language and an AI agent runs the right commands.

What MCP Tools Are (The Short Version)

Model Context Protocol (MCP) tools are functions that an AI model can call during a conversation. Instead of the model generating text that tells you what to run, it actually runs the commands and processes the results.
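To make "functions that an AI model can call" concrete, here is a minimal sketch of the dispatch idea in plain Python. It deliberately avoids any MCP SDK: the TOOLS registry, the tool decorator, handle_tool_call, and the request shape are all hypothetical, and the status data is a placeholder rather than real HostKit output.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical in-process registry. A real MCP server exposes tools over
# the protocol; a plain dict is enough to show the dispatch idea.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name so it can be called by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def hostkit_status(project: str) -> dict:
    # Placeholder data; a real implementation would query the server.
    return {"project": project, "health": "failing", "service": "stopped"}


def handle_tool_call(request_json: str) -> str:
    """Dispatch a model-issued call shaped like {"name": ..., "arguments": {...}}."""
    request = json.loads(request_json)
    fn = TOOLS.get(request["name"])
    if fn is None:
        return json.dumps({"error": f"unknown tool: {request['name']}"})
    return json.dumps({"result": fn(**request["arguments"])})
```

The model never sees the function body. It sees a name and a parameter list, decides to call the tool, and gets structured data back to reason over.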

I built 17 MCP tools that cover the full lifecycle of managing production apps:

| Tool | Purpose |
| --- | --- |
| hostkit_deploy_local | Deploy pre-built files from local machine |
| hostkit_status | Health check, services, URL, recent logs |
| hostkit_wait_healthy | Poll health endpoint after deploy |
| hostkit_execute | Run arbitrary CLI commands (rollback, restart, etc.) |
| hostkit_fix | Auto-diagnose and suggest fixes for common errors |
| hostkit_solutions | Cross-project knowledge base of solved problems |
| hostkit_env_get / hostkit_env_set | Read/write environment variables |
| hostkit_db_schema | Inspect database table structure |
| hostkit_db_query | Execute SQL queries (read or write mode) |
| hostkit_validate | Check project configuration for common issues |
| hostkit_state | VPS resource usage (CPU, memory, disk) |
| hostkit_events | Deployment and service event history |
| hostkit_auth_guide | Auth integration examples for the project |

A Real Debugging Session

Here's what debugging a 502 error looks like now:

Me: "The gilded-tiers site is returning a 502."

The agent calls hostkit_status(project="gilded-tiers") and gets back:

  • Health: failing
  • Service: stopped
  • Last log lines: Error: Cannot find module '.prisma/client'

It then calls hostkit_fix(error="502 Bad Gateway", project="gilded-tiers"), which:

  • Identifies the root cause: Prisma client wasn't copied to the standalone build
  • Suggests the fix: copy node_modules/.prisma into .next/standalone/node_modules/
  • Optionally applies the fix and redeploys

The whole interaction takes 30 seconds. The equivalent SSH workflow takes 5 minutes if I remember where everything is, longer if I don't.

The hostkit_fix Pattern

This is the tool I'm most proud of. It takes an error description and a project name, then:

  1. Checks the project status (health, services, recent logs)
  2. Matches the error pattern against known solutions
  3. Suggests a fix with a confidence level
  4. Can apply the fix automatically if approved

The known solutions database (hostkit_solutions) is cross-project. If I solve a "502 due to missing Prisma client" on one project, the solution is available for every project. The knowledge compounds.

Common patterns it handles:

  • 502 Bad Gateway — process crashed, port mismatch, Nginx misconfigured
  • Auth failures — expired JWT key, missing AUTH_URL, cookie domain mismatch
  • Database errors — connection refused, pool exhaustion, missing migration
  • Build failures — missing dependencies in standalone output, wrong Node version
  • SSL issues — cert expired, wrong domain in Nginx config
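The match-and-suggest steps described in this section can be sketched roughly as follows. This is an illustration under assumptions, not the actual hostkit_fix code: the Solution fields, the regex patterns, the confidence numbers, and the second knowledge-base entry are all made up for the example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Solution:
    pattern: str       # regex matched against recent log output
    root_cause: str
    fix: str
    confidence: float  # 0.0 to 1.0, assigned when the solution is recorded


# Hypothetical cross-project knowledge base; entries are illustrative.
SOLUTIONS: List[Solution] = [
    Solution(
        pattern=r"Cannot find module '\.prisma/client'",
        root_cause="Prisma client wasn't copied to the standalone build",
        fix="copy node_modules/.prisma into .next/standalone/node_modules/",
        confidence=0.9,
    ),
    Solution(
        pattern=r"ECONNREFUSED",
        root_cause="database connection refused",
        fix="check that the database service is running",
        confidence=0.6,
    ),
]


def suggest_fix(log_lines: List[str]) -> Optional[Solution]:
    """Return the highest-confidence solution whose pattern appears in the logs."""
    text = "\n".join(log_lines)
    matches = [s for s in SOLUTIONS if re.search(s.pattern, text)]
    return max(matches, key=lambda s: s.confidence, default=None)
```

Because the knowledge base is keyed on error patterns rather than project names, a solution recorded while fixing one project matches the same log line on any other.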

Database Queries Without psql

One of the most common ops tasks is "look up a value in the production database." Before MCP tools, that meant SSH-ing into the server and opening a psql session:

ssh my-server
sudo -u myapp psql myapp_db
SELECT * FROM "User" WHERE email = 'user@example.com';

Now I just ask in plain English:

Me: "How many active subscriptions does gilded-tiers have?"

The agent calls hostkit_db_query(project="gilded-tiers", query="SELECT count(*) FROM \"ProjectService\" WHERE status = 'active'") and returns the answer.

For schema exploration, hostkit_db_schema returns table structures, so I don't have to memorize column names. For write operations, the query tool has an explicit write_mode parameter that must be set intentionally, so there are no accidental UPDATEs.
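Here is a minimal sketch of how an explicit write_mode guard might work. It uses Python's built-in sqlite3 only to stay self-contained (the psql examples suggest the real tool talks to PostgreSQL), and db_query with this signature is an assumption, not HostKit's actual implementation. The first-keyword check is deliberately coarse; a read-only database role or transaction would be a sturdier guard.

```python
import sqlite3
from typing import Any, List, Tuple


def db_query(
    conn: sqlite3.Connection, query: str, write_mode: bool = False
) -> List[Tuple[Any, ...]]:
    """Run a query, refusing anything but SELECT unless write_mode is set."""
    words = query.split(None, 1)
    first_word = words[0].lower() if words else ""
    if not write_mode and first_word != "select":
        raise PermissionError(
            "non-SELECT statement rejected; pass write_mode=True to allow writes"
        )
    cursor = conn.execute(query)
    rows = cursor.fetchall()
    if write_mode:
        conn.commit()  # persist changes only when writes were requested
    return rows
```

Defaulting write_mode to False means the agent has to opt in to every mutation, and that opt-in shows up as an argument in the tool call.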

Environment Variable Management

Env var debugging is the most common ops task that isn't technically "debugging." Did I set the Stripe webhook secret? Is the base URL correct? What's the S3 bucket name?

hostkit_env_get(project="emergent")
→ Returns all env vars (secrets redacted)

hostkit_env_set(project="emergent", key="NEXT_PUBLIC_BASE_URL", value="https://emergentaiagency.com")
→ Sets the var and optionally restarts the service

No SSH. No editing .env files. No forgetting to restart the service after changing a variable.
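The redact-on-read, restart-on-write behavior could look something like the sketch below. It works on an in-memory dict rather than a real .env file, and the key-name pattern used to decide what counts as a secret, the redaction marker, and the restart callback are all assumptions for illustration.

```python
import re
from typing import Callable, Dict, Optional

# Assumed heuristic: treat a variable as secret if its name contains one of these words.
SECRET_KEY_PATTERN = re.compile(r"SECRET|TOKEN|PASSWORD|KEY", re.IGNORECASE)


def env_get(env: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the environment with secret-looking values redacted."""
    return {
        key: "***redacted***" if SECRET_KEY_PATTERN.search(key) else value
        for key, value in env.items()
    }


def env_set(
    env: Dict[str, str],
    key: str,
    value: str,
    restart: Optional[Callable[[], None]] = None,
) -> None:
    """Set a variable and, if a restart callback is given, restart the service."""
    env[key] = value
    if restart is not None:
        restart()
```

Redacting by key name keeps secret values out of the conversation transcript while still answering the usual question, which is whether the variable is set at all.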

The Compound Effect

The value of MCP tools isn't any single tool. It's the conversation-level context that ties them together. When I'm debugging, the agent remembers what it already checked. It correlates the health check failure with the log output and the missing env var. It doesn't lose context between steps the way I do when I'm switching terminal windows.

And because every tool interaction is logged in the conversation, I have an automatic audit trail of what was checked, what was changed, and why. No more "I think I restarted that service but I'm not sure."
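The conversation itself is the audit trail described here. If you also wanted a copy on the server side, one possible approach is a decorator that records every tool call. This is a sketch of that idea, not something the setup above is known to do; AUDIT_LOG, audited, and restart_service are hypothetical names.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# Hypothetical in-memory log; a real setup might append to a file instead.
AUDIT_LOG: List[Dict[str, Any]] = []


def audited(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Record each call's tool name, arguments, and result or error."""

    @functools.wraps(fn)
    def wrapper(**kwargs: Any) -> Any:
        entry: Dict[str, Any] = {
            "tool": fn.__name__,
            "arguments": kwargs,
            "at": time.time(),
        }
        try:
            entry["result"] = fn(**kwargs)
            return entry["result"]
        except Exception as exc:
            entry["error"] = repr(exc)
            raise
        finally:
            AUDIT_LOG.append(entry)  # logged whether the call succeeded or failed

    return wrapper


@audited
def restart_service(project: str) -> str:
    # Placeholder action; a real tool would restart the service here.
    return f"restarted {project}"
```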

Building Your Own MCP Tools

If you're running any production infrastructure, the pattern is worth adopting. Start with three tools:

  1. Status check — health, logs, service state in one call
  2. Execute — run arbitrary commands with output capture
  3. Env management — read and write environment variables

In my experience, these three cover roughly 80% of ops tasks. Build specialized tools (database queries, auto-fix, deploy) as the patterns emerge from your actual debugging sessions.
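As a starting point for the second tool, here is a minimal sketch of "run a command with output capture" using Python's standard subprocess module. The execute name, the returned dict shape, and the 30-second default timeout are assumptions. A production version would also need to decide which commands the agent is allowed to run.

```python
import subprocess
import sys
from typing import Dict, List, Union


def execute(args: List[str], timeout: float = 30.0) -> Dict[str, Union[int, str]]:
    """Run a command without a shell and return its exit code and captured output."""
    completed = subprocess.run(
        args,
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode output to str
        timeout=timeout,      # raises subprocess.TimeoutExpired if exceeded
    )
    return {
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


# Example: run a trivial command with the current Python interpreter.
result = execute([sys.executable, "-c", "print('ok')"])
```

Passing the command as a list instead of a shell string avoids shell interpolation of whatever the model generated, and returning the exit code alongside the output lets the agent tell a failed command from an empty result.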

The best DevOps interface is the one that matches how you think about problems — in natural language, with context, without remembering which directory you need to be in.

