โ† Back to Overview
โš™๏ธ
DevOps / Infrastructure
Active

Auto-deploy pipeline, AI-powered bug triage, health monitoring, test suite, security layer, and 3 AI agent skills for status checks, incident response, and deploy management. The systems that keep production running without a dedicated ops team.

Active Systems
๐Ÿš€ Auto-Deploy Pipeline Auto
Push to main auto-deploys frontend (Render static CDN) and backend (Render web service). Feature branches deploy to staging environments. Database migrations run automatically on deploy. Zero manual intervention.
Render Git push Auto-migrate Auto-seed
๐Ÿ” Triage Agent AI Agent
Autonomous Node.js agent polls GitHub issues hourly via macOS launchd. Spawns Claude Code instances (Agent SDK) to investigate bug reports โ€” reads codebase, traces errors, creates fix branches + draft PRs for valid bugs, comments and closes invalid ones. POSTs results to server for email notifications.
Claude Agent SDK launchd GitHub API DRY_RUN mode
๐Ÿ›ก๏ธ Maintenance Detection Auto
4-state health machine: healthy โ†’ suspicious โ†’ down โ†’ recovering. Heartbeat polling (60s healthy, 5s suspicious), exponential backoff (2sโ†’30s when down), recovery requires 3 consecutive successes. Admin-toggled (default OFF). Fixed overlay preserves app tree state.
State Machine Admin Toggle Exponential Backoff
๐Ÿงช Test Suite CI
789+ tests across frontend (Vitest/jsdom) and backend (supertest). Covers lesson JSON validation, interactive components, palette utilities, hooks, forum endpoints, admin API, auth flows. Run before every deploy.
Vitest supertest jsdom 789+ tests
๐Ÿ” Security Layer Auto
Multi-layer security: prompt injection guards (sanitizeForPrompt), HTML sanitization, content moderation (profanity + link spam via moderateContent), CSS color validation, input size limits via Zod, XML delimiter wrapping for AI inputs.
Zod sanitizeForPrompt moderateContent HTML sanitizer
AI Agent Skills
๐Ÿฉบ Status Check AI Skill
Quick 10-second production health probe. Curls API, frontend, and DB endpoints in parallel. Checks last deploy status and recent error count. Traffic light verdict: green/yellow/red.
/devops-status --full Render MCP
๐Ÿš‘ Incident Response AI Skill
Traces production errors from Render logs through the codebase to root cause. Classifies severity (critical/high/medium), identifies the breaking commit, proposes or implements fixes on a feature branch.
/devops-incident --fix --dry-run
๐Ÿ“ฆ Deploy Manager AI Skill
Pre-deploy safety checklist (tests, migrations, env vars, breaking changes), post-deploy health verification, deploy history, and rollback guidance. The safety net before pushing to main.
/devops-deploy check | verify | rollback history
Missing โ€” To Build
๐Ÿšจ Error Monitoring Missing
No Sentry or equivalent. Frontend crashes in production go undetected unless a student manually reports via forum or GitHub issue. Need real-time error tracking with source maps and session replay.
Sentry? LogRocket?
๐Ÿ“ก Uptime Monitoring Missing
No external uptime monitor. Maintenance page detects outages reactively from the client side. Need proactive alerts (Slack/email) and a public status page for trust.
UptimeRobot? Checkly?
Roadmap
3 AI skills active. Next: Sentry error monitoring integration, external uptime probes with status page, automated incident runbooks.