If you run an AI-built Linux website, the hardest part is usually not the initial build. It is the moment something breaks after a deploy and you need to figure out why. A page won’t load, a form stops submitting, or the app works locally but fails in production. The good news is that most problems can be traced with a repeatable process instead of random edits.
This guide walks through how to debug a broken AI-built website on Linux with a practical, low-drama workflow. It is written for people using tools like Claude Code in a real server environment, where you have logs, services, permissions, and configuration files to inspect. If you are hosting on a platform like Vibesies, you still want the same fundamentals: isolate the error, confirm the runtime, and change one thing at a time.
How to debug a broken AI-built website on Linux
When an AI agent writes or modifies your site, the failure is often caused by one of a few common layers:
- Application code — a route, template, dependency, or environment variable is wrong.
- Web server config — nginx, gunicorn, or a reverse proxy is misrouted.
- File permissions — the service user cannot read or write what it needs.
- Process state — the app did not restart cleanly, or an old process is still running.
- Infrastructure — disk full, memory pressure, SSL, DNS, or firewall issues.
The mistake many people make is jumping straight into code edits. Start by identifying which layer failed. That will save you a lot of time and prevent accidental regressions.
Step 1: Reproduce the problem exactly
Before you touch anything, answer three questions:
- What is broken?
- When did it start?
- Does it fail for everyone or only certain requests?
Open the site in a browser and note the exact behavior. A blank page is not the same as a 500 error. A 500 error is not the same as a timeout. A form that submits but does nothing is a different class of problem again.
Try to reproduce the issue with the same path, query string, and user state. If the failure happens only after login, only on mobile, or only for a certain upload size, that is an important clue.
Step 2: Check the app logs first
For most AI-built sites, the fastest answer is in the logs. If you are using gunicorn with nginx, inspect both application logs and web server logs.
Look for:
- Python tracebacks
- Import errors
- Missing environment variables
- Database connection failures
- Template errors
- Permission denied messages
If the app starts but crashes on a request, you will usually see the traceback right where the request hit the problem. If nothing is logged, the request may never reach the app, which shifts the focus to nginx, DNS, TLS, or a firewall.
Helpful commands on a Linux box often include:
- journalctl -u your-service -n 100
- tail -f /var/log/nginx/error.log
- tail -f /var/log/nginx/access.log
If you are working inside a managed environment, the exact paths may differ, but the idea is the same: inspect the place where the error would naturally appear.
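When a log file is long, it helps to jump straight to the first traceback instead of scrolling past the cascade of secondary errors. A minimal sketch in shell; the log path in the usage comment is hypothetical:

```shell
# Print the first Python traceback found in a log file, plus the lines
# that follow it, so you see the root error rather than the fallout.
# Usage: first_traceback /var/log/myapp/error.log   (path is a placeholder)
first_traceback() {
  awk '/Traceback \(most recent call last\)/ { found = 1 }
       found { print; n++ }
       n > 40 { exit }' "$1"
}
```

The same idea works with `journalctl -u your-service | first_traceback /dev/stdin` when the service logs to the journal.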
Step 3: Separate server failure from application failure
A site can look “broken” for several reasons. You need to know whether the server is down or the code is down.
Use this quick split:
- If the site is unreachable: check nginx, the upstream app process, DNS, SSL, and networking.
- If the site loads but one feature fails: check the app code, database, or external API integration.
- If the site is slow: check resource usage, database queries, blocking I/O, and memory limits.
A classic example: the homepage loads, but the contact form returns an error. That is usually not an nginx problem. It is more likely a missing SMTP setting, a broken API key, a validation bug, or a permissions issue when writing form submissions to disk.
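One quick way to make this split is to probe specific URLs with curl and compare status codes. A sketch; example.com stands in for your own site:

```shell
# Print the HTTP status code and total time for a URL.
# 000 usually means the request never completed (DNS, TLS, firewall, or
# server down); 502/504 point at the upstream app; 500 points at the code.
probe() {
  curl -sS -o /dev/null --max-time 10 \
    -w '%{http_code} %{time_total}s %{url_effective}\n' "$1"
}

# Example (hypothetical URLs): compare the homepage with the failing path.
# probe https://example.com/
# probe https://example.com/contact
```

If the homepage returns 200 and the contact path returns 500, you have already ruled out DNS, TLS, and most of nginx.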
Step 4: Verify environment variables and secrets
AI-generated code often assumes environment variables are present. In development, you may have had them in a .env file. In production, they may be missing, renamed, or set incorrectly.
Check the basics:
- Are all required variables present?
- Do values match production, not staging?
- Are secrets loaded in the service unit or process manager?
- Has a key expired, been rotated, or been revoked?
Common examples include database URLs, API keys, OAuth credentials, mail settings, and secret keys. A single typo can make a site behave as if the whole app is down.
If the issue appeared right after a deploy, compare the current environment with the last known good deployment. This often reveals the problem immediately.
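A small helper makes this check repeatable instead of eyeballing the environment. A sketch, assuming a bash shell; the variable names in the example are stand-ins for your app's real ones:

```shell
# Fail loudly if any required environment variable is unset or empty.
check_env() {
  local status=0 var
  for var in "$@"; do
    # ${!var:-} is bash indirect expansion: the value of the variable
    # whose name is stored in $var, or empty if it is unset.
    if [ -z "${!var:-}" ]; then
      echo "MISSING: $var"
      status=1
    fi
  done
  return $status
}

# Example (placeholder names): check_env DATABASE_URL SECRET_KEY SMTP_HOST
```

Run it inside the same service unit or process manager that starts the app, not in your login shell, or you will test the wrong environment.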
Step 5: Check permissions and ownership
Linux permissions are a frequent source of confusion, especially when an AI agent has created files as one user and the service runs as another. If the app cannot read its templates, write to storage, or access a socket, it will fail in ways that look unrelated.
Look at:
- File ownership for app directories
- Write permissions for uploads, cache, and logs
- Socket permissions for gunicorn or similar app servers
- SELinux or AppArmor restrictions, if applicable
A good rule: if the app writes files anywhere, make sure the runtime user can write there consistently. If it only reads files, verify that deploy steps did not accidentally lock them down.
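A quick way to audit this is to walk the directories the app must write to and report any failures. A sketch that checks as the current user; the example paths and the www-data account name are assumptions, not your actual layout:

```shell
# Report whether each given directory exists and is writable by the
# current user. To test as the real service user instead, run the whole
# check via something like: sudo -u www-data bash -c '...'
check_writable() {
  local d
  for d in "$@"; do
    if [ -d "$d" ] && [ -w "$d" ]; then
      echo "ok: $d"
    else
      echo "NOT WRITABLE (or missing): $d"
    fi
  done
}

# Example (hypothetical paths):
# check_writable /var/www/app/uploads /var/www/app/cache /var/www/app/logs
```

Note that running this as root proves nothing, since root bypasses permission bits; run it as the service user.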
Step 6: Confirm the service is actually running
Sometimes the app is not “broken” so much as stopped.
Check whether the relevant process is active:
- Is gunicorn running?
- Is the systemd service healthy?
- Did the process crash during startup?
- Is there a stale PID file or old socket?
Startup failures are often caused by one of these:
- Syntax error in a newly edited file
- Missing dependency after pip install
- Bad environment variable
- Port already in use
- Database migration that was not applied
If the service fails at boot, the system logs usually say why. Focus on the first error, not the cascade of secondary errors that follow it.
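These liveness checks can be wrapped in one helper. A sketch, assuming systemd and the standard procps pgrep; gunicorn is just an example process name:

```shell
# Quick liveness check: ask systemd if it is present, then fall back to
# scanning the process table for a matching command line.
svc_status() {
  if command -v systemctl >/dev/null 2>&1; then
    echo "systemd says: $(systemctl is-active "$1" 2>/dev/null || true)"
  fi
  # -a prints the full command line; -f matches against it.
  pgrep -af "$1" || echo "no process whose command line matches: $1"
}

# Example: svc_status gunicorn
```

If systemd says active but pgrep finds nothing matching your app, you are probably looking at the wrong unit name or a stale wrapper process.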
Step 7: Roll back the last change before rewriting code
If the site broke right after a deploy, the fastest fix is often to revert the last known change. That is especially true when an AI agent made a broad edit across several files.
Before rewriting the logic, ask:
- What changed in the last deploy?
- Did the change touch config, dependencies, or routing?
- Can I temporarily disable the new code path?
This matters because AI-generated fixes can sometimes mask the real issue. A rollback restores a known baseline and makes debugging much simpler.
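If the site is deployed from a git checkout, answering those questions takes a couple of commands. A sketch; the /var/www/app path is hypothetical:

```shell
# Show the recent history and exactly what the last commit touched,
# assuming the deploy directory is a git checkout.
last_change() {
  git -C "$1" log --oneline -n 5
  git -C "$1" diff --stat HEAD~1..HEAD
}

# Example: last_change /var/www/app
# To restore the previous known-good state without losing history:
#   git -C /var/www/app revert --no-edit HEAD
```

`git revert` is usually safer than `git reset` on a server, because it records the rollback as a new commit instead of rewriting history.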
Step 8: Use a minimal test case
When a feature fails, isolate it. Remove all the extra variables.
For example:
- If an API call fails, test it with curl first.
- If a template breaks, render the smallest possible version.
- If a form fails, submit a single field with no attachments.
- If a database query times out, run the query directly and measure it.
This helps separate “the full app is broken” from “one path is broken.” That distinction is crucial when an AI agent has touched multiple layers at once.
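The form case, for instance, can be reduced to a single curl call. A sketch; the endpoint URL and field name are placeholders for your own failing form:

```shell
# Submit one form field to an endpoint with no browser, JavaScript, or
# attachments in the way, and print the full response including headers.
post_one_field() {
  curl -sS -i --max-time 10 -X POST "$1" --data-urlencode "$2"
}

# Example (hypothetical endpoint):
# post_one_field https://example.com/contact "email=test@example.com"
```

If this minimal request succeeds while the browser form fails, the bug is in the front end or the request it builds, not in the server-side handler.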
Step 9: Watch for the usual AI-built-site mistakes
AI can build working sites quickly, but it also tends to make predictable mistakes. Keep an eye out for:
- Hard-coded paths that do not exist in production
- Assumptions about default environment variables
- Using development-only settings on a public server
- Missing imports after refactors
- Dependency versions that differ from local dev
- Database migrations forgotten during deploy
None of these are exotic. They are common and fixable. The trick is to recognize the pattern instead of treating each failure as a mystery.
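The dependency-version pattern is easy to check mechanically. A sketch, assuming a Python app with pip-style pins in requirements.txt; the file path in the example is a placeholder:

```shell
# Compare installed package versions against requirements.txt to spot
# drift between local dev and the server. Lowercases both sides because
# pip package names are case-insensitive.
check_deps() {
  diff <(pip freeze 2>/dev/null | tr '[:upper:]' '[:lower:]' | sort) \
       <(tr '[:upper:]' '[:lower:]' < "$1" | sort) \
    || echo "dependency drift detected"
}

# Example: check_deps /var/www/app/requirements.txt
```

Any line in the diff is a package that is missing, extra, or pinned to a different version than the one actually installed.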
Step 10: Build a simple incident checklist
If you own more than one site, write down a short checklist and reuse it every time. Here is a practical version:
- Confirm the exact symptom
- Check app logs
- Check web server logs
- Verify the service is running
- Check recent deploys
- Confirm environment variables
- Test permissions
- Run a minimal reproduction
- Roll back if needed
- Apply the smallest safe fix
You do not need a giant runbook. You need a consistent sequence that removes guesswork.
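The first few checklist items can even be partly automated so every incident starts the same way. A sketch, assuming systemd; the default service name and log path are placeholders you would override per site:

```shell
# A minimal incident runbook: run the same first checks in the same
# order every time. Usage: runbook <service-name> <app-log-path>
runbook() {
  local service=${1:-myapp} log=${2:-/var/log/myapp/error.log}
  echo "== service status =="
  if systemctl is-active --quiet "$service" 2>/dev/null; then
    echo "service is active: $service"
  else
    echo "service not active (or no systemd here): $service"
  fi
  echo "== last 20 app log lines =="
  tail -n 20 "$log" 2>/dev/null || echo "log not found: $log"
  echo "== disk space on / =="
  df -h / | tail -n 1
}
```

Running this before touching any code gives you the symptom, the process state, the latest errors, and the disk picture in one pass.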
Example: fixing a 500 error after deploy
Suppose your site throws a 500 error after the latest deploy. A disciplined workflow might look like this:
1. Open the page and confirm the error.
2. Check nginx logs to see if the request reached the upstream app.
3. Inspect the application log for the first traceback.
4. Find that a newly added environment variable is missing.
5. Add the missing value and restart the service.
6. Retry the page and confirm the response is normal.
That entire process can take minutes if you go in order. If you start by rewriting code, it can take hours.
Why this matters for AI-hosted sites
When an AI agent is part of your deployment workflow, the speed advantage is real. But so is the chance of introducing a subtle bug quickly. Debugging is what keeps that speed from turning into chaos.
Managed environments can help by keeping your app isolated and persistent. Tools like Vibesies are built around that model: an AI agent in a Linux container, with the same real-world failure modes you see on other servers. That means the debugging habits you build here transfer anywhere you host serious projects.
If you are using a system like this, the biggest advantage is not that problems disappear. It is that the stack is visible enough to diagnose them cleanly.
Conclusion: debug the layer, not the symptom
The best way to debug a broken AI-built website on Linux is to stay methodical. Reproduce the issue, check the logs, identify the layer that failed, and change one thing at a time. That approach works whether you are fixing a 500 error, a failed form submission, a missing asset, or a service that never came back after deploy.
AI can help you move faster, but only if you keep a good debugging process. The more real your Linux environment is, the more valuable that process becomes. Once you have it, you will spend less time guessing and more time shipping.