If you’re running a site with Claude Code on Linux, the most useful document you can write is not a design spec or a feature list. It’s a Claude Code hosting runbook: a short, practical guide that explains what to do when the site is healthy, what to check when it isn’t, and who is responsible for each step.

A good runbook does not replace automation. It complements it. When something breaks at 11:47 p.m., you do not want to reinvent the system from memory. You want a clear sequence: where to look, what to verify, how to roll back, and when to stop guessing. That matters even more if your site is managed by an AI engineer in a sandboxed Linux environment, because the agent can move quickly but still needs guardrails.

This guide shows you how to build a Claude Code hosting runbook that actually gets used. The goal is not documentation for documentation’s sake. The goal is fewer bad deploys, faster incident response, and less time spent asking, “What changed?”

What a Claude Code hosting runbook should cover

Think of the runbook as the operating manual for your site. If someone new had to take over your server today, what would they need first?

For most AI-built Linux sites, the runbook should cover five areas:

Service overview — what the site does, where it runs, and how it is deployed
Routine checks — how to confirm the app, database, nginx, and background jobs are healthy
Deploy steps — how a release moves from code to production
Incident steps — what to do for 500s, broken assets, SSL issues, or a failed deploy
Recovery steps — backups, rollback, restore, and escalation contacts

If you use a hosting platform like Vibesies, some of the infrastructure details are already standardized inside the container. That makes the runbook easier to write, because you can focus on your app instead of inventing server policy from scratch.

Why frame it as a runbook for Linux sites

The phrase Claude Code hosting runbook for Linux sites is a good way to frame the work because it’s specific enough to be useful and broad enough to apply to blogs, SaaS apps, marketing sites, and internal tools.

A lot of teams already have fragments of this information scattered across notes, Slack, and commit messages. The runbook pulls it into one place. It should be short enough that you will actually open it during an incident, but detailed enough that it stops you from making a dumb mistake under pressure.

Start with a one-page service summary

Before you write troubleshooting steps, write the basics. This section should fit on one screen.

Include:

App name and primary domain
Hosting location and environment name
Primary stack — for example Flask, Django, Node, PostgreSQL, Redis
How deployments happen — Claude Code, git push, CI, or a mix
Critical dependencies — payment provider, email provider, object storage, third-party APIs
Owner and escalation contact

Example:

Site: docs.example.com
App: Flask 3 + Gunicorn + Nginx
Database: PostgreSQL
Deploy method: Claude Code edits in sandbox, then reloads app service
Backups: nightly snapshot, 7-day retention

This sounds basic, but it saves time when you are tired. If your future self has to ask where the app lives or whether backups are enabled, the runbook failed at page one.
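One way to keep the summary honest is to store it as structured data and flag anything left blank. The sketch below is only an illustration: the field names (`site`, `deploy_method`, `escalation_contact`, and so on) are assumptions that mirror the list above, not a required format.

```python
from dataclasses import dataclass, fields


@dataclass
class ServiceSummary:
    """One-screen service summary. Field names are illustrative assumptions."""
    site: str = ""
    app: str = ""
    database: str = ""
    deploy_method: str = ""
    backups: str = ""
    owner: str = ""
    escalation_contact: str = ""


def missing_fields(summary: ServiceSummary) -> list[str]:
    """Return the names of fields that are still blank, in declaration order."""
    return [f.name for f in fields(summary) if not getattr(summary, f.name).strip()]


# Values copied from the example summary above; owner and contact are not filled in yet.
summary = ServiceSummary(
    site="docs.example.com",
    app="Flask 3 + Gunicorn + Nginx",
    database="PostgreSQL",
    deploy_method="Claude Code edits in sandbox, then reloads app service",
    backups="nightly snapshot, 7-day retention",
)
print(missing_fields(summary))  # ['owner', 'escalation_contact']
```

Run against the example summary, the check points out that the owner and escalation contact are still missing.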

Document the normal path before the failure path

Many runbooks jump straight to incidents. That is backward. You should first document what “normal” looks like, because most debugging starts with comparing reality to the expected state.

For a site hosted with Claude Code, write down the standard checks for:

Web process — is the app server running and responding?
Reverse proxy — is nginx serving traffic cleanly?
Database — can the app connect and run a simple query?
Static files — are CSS, JS, and images loading?
Background jobs — if you use queues, are workers healthy?
Logs — do recent entries show errors, retries, or timeouts?

Keep these checks concrete. Avoid phrases like “verify everything is okay.” Instead, write commands or exact observations the agent should use. For example: “Confirm the homepage returns 200,” or “Check that the latest deploy timestamp matches the current commit.”
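To make "concrete" tangible, here is a minimal sketch of a check runner built on the Python standard library. It assumes each check can be expressed as a function that returns True or False. The names `homepage_returns_200`, `run_checks`, and `failed` are hypothetical helpers for illustration, not part of Claude Code or any hosting platform.

```python
import urllib.error
import urllib.request
from typing import Callable


def homepage_returns_200(url: str, timeout: float = 5.0) -> bool:
    """True if a GET to `url` answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Covers HTTP errors, refused connections, timeouts, and malformed URLs.
        return False


def run_checks(checks: dict[str, Callable[[], bool]]) -> dict[str, bool]:
    """Run every check and record pass/fail; a check that raises counts as a failure."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


def failed(results: dict[str, bool]) -> list[str]:
    """Names of the checks that did not pass, in the order they ran."""
    return [name for name, ok in results.items() if not ok]
```

In a runbook you might wire it up as `run_checks({"homepage": lambda: homepage_returns_200("https://docs.example.com/")})` and add one entry per item in the list above. The runner deliberately executes every check instead of stopping at the first failure, so one report shows the whole picture.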

Build incident playbooks for the most common failures

A useful Claude Code hosting runbook for Linux sites is really a set of small playbooks. Each playbook should answer three questions:

What does this failure look like?
What is the fastest safe fix?
What do we do if the first fix does not work?

1. 500 Internal Server Error

Write down the first three checks:

Does the app process start correctly?
Did the last deploy introduce a bad config or syntax error?
Is the database reachable?

Then add the rollback rule. For example: “If the error began immediately after deploy and logs point to app startup failure, revert to the previous working release.”
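A rule like that is easier to apply under pressure when it is written as a decision with explicit inputs. A minimal sketch follows. The 10-minute window is an assumed reading of "immediately after deploy", and the function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta


def should_rollback(
    deploy_finished: datetime,
    first_error: datetime,
    startup_failure_in_logs: bool,
    window: timedelta = timedelta(minutes=10),  # assumption: what "immediately" means
) -> bool:
    """Roll back only if errors began right after the deploy AND logs show a startup failure."""
    began_after_deploy = timedelta(0) <= first_error - deploy_finished <= window
    return began_after_deploy and startup_failure_in_logs
```

If either condition is false, the rule says nothing, which is your cue to move on to the next check (database reachability, config) instead of reverting blindly.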

2. Site loads, but styling is broken

This is usually a static asset or cache issue. Your playbook should include:

Check whether assets were rebuilt
Confirm static files are being served from the expected path
Verify browser cache or CDN cache is not masking the fix
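The first two checks can be approximated with a short script. This sketch assumes your build writes assets into a single directory and that you can list the files you expect. `static_root` and the asset paths are placeholders for your own layout.

```python
from pathlib import Path


def missing_assets(static_root: Path, expected: list[str]) -> list[str]:
    """Return expected asset paths (relative to static_root) that are absent or empty."""
    missing = []
    for relative in expected:
        path = static_root / relative
        # An empty file usually means the build step failed partway through.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(relative)
    return missing
```

An empty result only tells you the files exist on disk. It does not rule out a stale browser or CDN cache, so keep the third check in the playbook.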

3. SSL or domain problems

Document how to confirm the certificate status, DNS record, and canonical hostname. If you run multiple environments, note which domain is primary and which ones should redirect.
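For the certificate part, a standard-library sketch like the one below can report how many days remain before expiry. It assumes the site is reachable on port 443 and presents a certificate the system trust store accepts. The function names are hypothetical, and DNS and redirect checks are left out.

```python
import socket
import ssl
import time


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a certificate's notAfter string."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 5.0) -> str:
    """Open a verified TLS connection and return the certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def cert_days_left(hostname: str) -> float:
    """Convenience wrapper; needs network access to the host."""
    return days_until_expiry(fetch_not_after(hostname), time.time())
```

Because the connection is verified, an expired or mismatched certificate raises an `ssl.SSLError` subclass instead of returning a number, which is itself a useful signal to record in the playbook.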

4. Email stopped sending

For many small sites, this is business-critical. Capture the provider, the sender domain, SPF/DKIM/DMARC status, and any rate limits or suppression lists to check.
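Python's standard library cannot query TXT records, so the sketch below assumes you have already fetched them some other way (for example with `dig` or a DNS library) and only classifies the strings. The function names are hypothetical and the sample records in the usage note are made up.

```python
def _tags(record: str) -> dict[str, str]:
    """Parse 'k=v; k2=v2' style TXT content into a dict of lowercase tag names."""
    tags = {}
    for part in record.strip().strip('"').split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def email_auth_status(spf_txt: list[str], dkim_txt: list[str], dmarc_txt: list[str]) -> dict[str, bool]:
    """Report which records look present.

    spf_txt:   TXT records at the sender domain
    dkim_txt:  TXT records at <selector>._domainkey.<sender domain>
    dmarc_txt: TXT records at _dmarc.<sender domain>
    """
    return {
        "spf": any(r.strip().strip('"').lower().startswith("v=spf1") for r in spf_txt),
        # A DKIM record with an empty p= tag is a revoked key, so treat it as absent.
        "dkim": any(_tags(r).get("p") for r in dkim_txt),
        "dmarc": any(_tags(r).get("v", "").upper() == "DMARC1" for r in dmarc_txt),
    }
```

For instance, `email_auth_status(['"v=spf1 include:_spf.example.com ~all"'], ["v=DKIM1; k=rsa; p=MIGfMA0GCSq"], [])` reports SPF and DKIM as present and DMARC as missing. This checks presence only; it does not validate the policies or test deliverability.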

5. Backups or restores fail

Backups are only useful if you have tested a restore. Your runbook should state:

Where backups are stored
How often they run
How to restore a single file vs. the full app
How long a restore should take
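"How often they run" is easy to verify automatically. The sketch below assumes backups land as files in one directory and treats the newest file's modification time as the last successful run. The 26-hour default is an assumed allowance for a nightly job, and the function names are hypothetical.

```python
import time
from pathlib import Path
from typing import Optional


def newest_backup_age_hours(backup_dir: Path, now: Optional[float] = None) -> Optional[float]:
    """Age in hours of the most recently modified file in backup_dir, or None if it has no files."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in backup_dir.iterdir() if p.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600


def backup_is_fresh(backup_dir: Path, max_age_hours: float = 26.0, now: Optional[float] = None) -> bool:
    """True if a backup exists and is newer than max_age_hours."""
    age = newest_backup_age_hours(backup_dir, now)
    return age is not None and age <= max_age_hours
```

A fresh file is not proof of a working backup. It only confirms the job ran, so the restore test still has to happen separately.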

Add a release checklist to every runbook

A release checklist is one of the most underrated parts of a hosting runbook. It prevents “works on my machine” releases from reaching production and gives Claude Code a predictable sequence to follow.

Here is a practical checklist you can adapt:

Confirm the branch or commit to deploy
Review recent changes for config, secrets, or dependency updates
Run tests or smoke checks
Verify environment variables are present
Confirm database migrations are safe and reversible
Take a backup or snapshot if the change is risky
Deploy to production
Check homepage, login, and one critical user flow
Watch logs for the first 10–15 minutes
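Some of these steps reduce to a few lines of code. As one example, the "verify environment variables are present" step could be sketched like this. The function name is hypothetical, and the variable names in the usage note are placeholders for whatever your app actually reads.

```python
import os
from typing import Mapping, Optional


def missing_env_vars(required: list[str], environ: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return required variable names that are unset or blank in the environment."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name, "").strip()]
```

Calling `missing_env_vars(["DATABASE_URL", "SECRET_KEY", "SMTP_HOST"])` before a deploy returns the names that still need values. Treating blank strings as missing catches the common case where a variable is defined but was never filled in.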

If your site is mission-critical, add a “go/no-go” line. Example: “Do not deploy if the checkout provider is experiencing issues,” or “Do not deploy within one hour of a product launch email.”

Write rollback instructions while you still remember them

Rollback is where many runbooks get vague. They say “revert if needed,” which is not enough when traffic is failing.

Instead, specify:

What counts as a rollback trigger
How to identify the last known good release
Whether database changes can be rolled back safely
How to restore static assets or config files
How to confirm the rollback worked

If a deploy includes both code and schema changes, note whether the schema is backward-compatible. That’s the difference between a five-minute recovery and a messy restore.
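If you keep even a simple release log, finding the last known good release and spotting risky schema changes can be mechanical. This is a sketch under assumptions: the `Release` record and its fields are hypothetical, and the history is taken to be ordered oldest first with the current, failing release last.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Release:
    commit: str
    healthy: bool          # did post-deploy checks pass?
    schema_changed: bool   # did this release include a migration?


def last_known_good(history: list[Release]) -> Optional[Release]:
    """Most recent healthy release before the current one, or None if there is none."""
    for release in reversed(history[:-1]):
        if release.healthy:
            return release
    return None


def schema_changes_to_review(history: list[Release], target: Release) -> list[str]:
    """Commits deployed after `target` that changed the schema and need a compatibility check."""
    start = history.index(target) + 1
    return [r.commit for r in history[start:] if r.schema_changed]
```

An empty list from `schema_changes_to_review` suggests a plain code rollback is enough. Anything else means you should check backward compatibility before reverting.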

Decide what Claude Code should do automatically

One of the advantages of working with Claude Code in a hosted Linux environment is that the agent can perform many routine tasks without hand-holding. But a runbook still needs boundaries.

A good rule is to split actions into three buckets:

Safe to automate — health checks, log review, restart a failed service, run a backup verification
Confirm first — deploys, migrations, DNS changes, cache purges
Human only — billing changes, credential rotation, production data deletion, compliance-related edits

This keeps the agent useful without turning it loose on tasks that should still require a second set of eyes.
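The three buckets translate naturally into a lookup table. The sketch below is illustrative only: the action names are hypothetical labels for the examples above, and defaulting unknown actions to the most restrictive bucket is a design assumption, not something Claude Code enforces by itself.

```python
from enum import Enum


class Bucket(Enum):
    AUTOMATE = "safe to automate"
    CONFIRM = "confirm first"
    HUMAN_ONLY = "human only"


# Action names are illustrative; map your own operations here.
POLICY = {
    "health_check": Bucket.AUTOMATE,
    "log_review": Bucket.AUTOMATE,
    "restart_failed_service": Bucket.AUTOMATE,
    "backup_verification": Bucket.AUTOMATE,
    "deploy": Bucket.CONFIRM,
    "migration": Bucket.CONFIRM,
    "dns_change": Bucket.CONFIRM,
    "cache_purge": Bucket.CONFIRM,
    "billing_change": Bucket.HUMAN_ONLY,
    "credential_rotation": Bucket.HUMAN_ONLY,
    "production_data_deletion": Bucket.HUMAN_ONLY,
}


def bucket_for(action: str) -> Bucket:
    """Look up an action; anything not listed falls into the most restrictive bucket."""
    return POLICY.get(action, Bucket.HUMAN_ONLY)
```

Failing closed means a new kind of task requires a deliberate decision before it can be automated.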

A simple runbook template you can copy

If you want a starting point, use this structure:

1. Service summary
2. Critical contacts
3. Normal health checks
4. Deploy procedure
5. Rollback procedure
6. Incident playbooks
7. Backup and restore steps
8. Escalation criteria

If you prefer keeping operational docs inside the project itself, store the runbook in the repo as RUNBOOK.md or ops/runbook.md. If your team uses a hosted workspace or container platform, that file can live right alongside the app code so Claude Code can reference it while working.
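Once the file lives in the repo, a small check can confirm that none of the eight sections has gone missing. This is a minimal sketch that assumes the section titles appear verbatim somewhere in the file, in any letter case. `missing_sections` is a hypothetical helper.

```python
REQUIRED_SECTIONS = [
    "Service summary",
    "Critical contacts",
    "Normal health checks",
    "Deploy procedure",
    "Rollback procedure",
    "Incident playbooks",
    "Backup and restore steps",
    "Escalation criteria",
]


def missing_sections(runbook_text: str) -> list[str]:
    """Return required section titles that do not appear anywhere in the text (case-insensitive)."""
    text = runbook_text.lower()
    return [section for section in REQUIRED_SECTIONS if section.lower() not in text]
```

You could run it on the file contents, for example `missing_sections(Path("RUNBOOK.md").read_text())` with `pathlib.Path` imported, as part of a pre-deploy or CI step.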

Keep the runbook alive with post-incident updates

The fastest way for a runbook to become useless is to write it once and never touch it. Every incident should end with a short update to the document.

After a fix, ask:

What did we learn?
Which step was missing?
Which check should happen earlier next time?
Did any assumptions turn out to be wrong?

Even a small note helps. For example: “Add database credential check before app restart,” or “Document that image uploads fail when disk usage exceeds 90%.” Over time, this turns the runbook into a real operational asset instead of stale documentation.

Runbook review checklist

Before you call it done, review your Claude Code hosting runbook against this checklist:

It fits on a few pages, not a wiki labyrinth
It includes exact checks, not vague advice
It covers deploys, rollbacks, backups, and incidents
It names owners and escalation paths
It reflects the current app, not last quarter’s architecture
It is stored where you and Claude Code can find it quickly

If you use Vibesies or another AI hosting setup where each site has its own persistent Linux environment, this is the document that makes the whole arrangement easier to operate. The AI can help execute tasks, but the runbook defines the rules.

Final thought

The best Claude Code hosting runbook for Linux sites is boring in the right way. It removes guesswork. It shortens recovery time. It keeps deploys calm. And it makes your AI engineer more effective, because the agent knows what “good” looks like before anything goes wrong.

If you write nothing else this week, write the runbook first. Your future self will thank you the first time production gets weird.


How to Write a Claude Code Hosting Runbook