Cloned from production on 2026-05-07 via AMI snapshot. It exists to test features end-to-end (code + DB + nginx + agents) under conditions identical to prod before promoting them. Zero risk to real customers.
How traffic is split between production and stage: each side has its own EC2, its own local Postgres, its own subdomains, and zero shared state.
```mermaid
flowchart TD
Internet([Internet]) --> CF[Cloudflare DNS
zone sociovirtual.ai]
CF -->|share / control / backoffice
commandcenter / live / meta
ext / webhook| ProdEC2[sv-production
i-09b50a47b30aa4893
3.138.80.125
t3.xlarge Ubuntu 24.04]
CF -->|share-stage1 / control-stage1
backoffice-stage1 / commandcenter-stage1
live-stage1 / meta-stage1
ext-stage1 / webhook-stage1| StageEC2[sv-production-stage
i-0d53ae90be25b1bcd
3.12.134.57
t3.large Ubuntu 24.04]
ProdEC2 --> ProdState[26 active agents
72 real WA bindings
Meta WA Cloud active
Stripe live + sandbox
Notion / Tokko / Odoo writes
main SV Anthropic key]
StageEC2 --> StageState[12 core agents
0 WA bindings
Meta disabled
Stripe sandbox
9 STAGE-BLANK keys
dedicated stage Anthropic key]
ProdEC2 -.->|"AMI snapshot
(2026-05-07 18:17, --no-reboot)"| StageEC2
Legacy[i-0775f92ac229dbfca
previous sv-production-stage
Amazon Linux 2023
backed up to 16GB tar]:::dead
Legacy -.->|terminated 18:38| StageEC2
classDef dead fill:#220,stroke:#666,color:#888,stroke-dasharray:5
```
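The eight -stage1 hostnames in the diagram come down to one Cloudflare API call each. A minimal sketch, assuming a CF_TOKEN env var holding a token with DNS edit permission and a CF_ZONE_ID for the sociovirtual.ai zone; both variable names are ours, not values from this doc:

```bash
# Create the eight -stage1 A records pointing at the stage EIP.
# CF_TOKEN and CF_ZONE_ID are assumed env vars; proxied=false is an assumption.
for sub in share control backoffice commandcenter live meta ext webhook; do
  curl -sX POST "https://api.cloudflare.com/client/v4/zones/$CF_ZONE_ID/dns_records" \
    -H "Authorization: Bearer $CF_TOKEN" \
    -H "Content-Type: application/json" \
    --data "{\"type\":\"A\",\"name\":\"${sub}-stage1.sociovirtual.ai\",\"content\":\"3.12.134.57\",\"proxied\":false}"
done
```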
The EIP 3.12.134.57 was reassigned from the terminated legacy instance to the new one: same SSH alias, identical IP, different OS (Ubuntu instead of AL2023).
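The reassignment itself is two AWS CLI calls. A sketch, assuming operator credentials are already configured (the eipalloc ID is a placeholder):

```bash
# Look up the allocation ID behind the existing EIP
aws ec2 describe-addresses --public-ips 3.12.134.57 \
  --query 'Addresses[0].AllocationId' --output text

# Point the EIP at the new stage instance (eipalloc-xxxx is hypothetical)
aws ec2 associate-address --allocation-id eipalloc-xxxx \
  --instance-id i-0d53ae90be25b1bcd
```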
The actual timeline, with zero downtime on prod. The AMI was created with --no-reboot; every step after that ran on the new stage instance.
```mermaid
gantt
title prod → stage1 migration · UTC
dateFormat YYYY-MM-DD HH:mm
axisFormat %H:%M
section Backup
16GB legacy backup :done, b1, 2026-05-07 17:06, 25m
pg_dump openclaw_mt 21MB :done, b2, 2026-05-07 17:15, 5m
stage branch on GitHub :done, b3, 2026-05-07 17:15, 2m
section AMI + Launch
Create AMI ami-0689e1aa :done, c1, 2026-05-07 18:17, 20m
Terminate legacy 3.12.134.57 :done, c2, 2026-05-07 18:38, 1m
Launch new t3.large :done, c3, 2026-05-07 18:38, 2m
Reassign EIP :done, c4, 2026-05-07 18:40, 1m
section Sanitize
user-data firstboot :done, d1, 2026-05-07 18:41, 2m
SQL sanitize Postgres :done, d2, 2026-05-07 18:43, 6m
nginx + LE certs cleanup :done, d3, 2026-05-07 18:48, 2m
section Configuration
Restore 35 critical creds :done, e1, 2026-05-07 21:55, 5m
Start 12 core services :done, e2, 2026-05-07 22:00, 5m
section DNS + TLS
Cloudflare token with perms :done, f1, 2026-05-08 15:35, 5m
8 records *-stage1.sociovirtual.ai :done, f2, 2026-05-08 15:38, 1m
Certbot SAN for 8 domains :done, f3, 2026-05-08 15:42, 4m
Landing on share-stage1 :active, f4, 2026-05-08 15:45, 15m
```
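The AMI + Launch section condenses to a few AWS CLI calls. A sketch under assumptions (the subnet, security group, and key name are placeholders; the real run may have gone through the console):

```bash
# Snapshot prod without rebooting it (zero downtime on the running box)
aws ec2 create-image --instance-id i-09b50a47b30aa4893 \
  --name "sv-production-2026-05-07" --no-reboot

# Launch the stage instance from the resulting AMI
# (subnet-xxxx, sg-xxxx and the key name are placeholders)
aws ec2 run-instances --image-id ami-0689e1aa \
  --instance-type t3.large --key-name SOCIOVIRTUAL_KEY \
  --subnet-id subnet-xxxx --security-group-ids sg-xxxx \
  --user-data file://firstboot.sh

# Retire the legacy stage box once its 16GB tar backup is safe
aws ec2 terminate-instances --instance-ids i-0775f92ac229dbfca
```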
A long-lived stage branch in the sociovirtualai/openclaw-multitenant repo. Each server stays checked out on its own branch; features are tested on stage, then promoted to prod via squash PR.
```mermaid
gitGraph
commit id: "182b816 main"
commit id: "promote: PL rule"
branch stage
checkout stage
commit id: "stage branch creada"
branch "feature/nuevo-skill"
checkout "feature/nuevo-skill"
commit id: "skill nueva"
commit id: "tests"
checkout stage
merge "feature/nuevo-skill"
commit id: "validar en stage"
commit id: "fix issue"
checkout main
merge stage tag: "promote"
commit id: "deploy prod"
Typical commands in this flow:

```bash
cd ~/ocmt
git checkout stage
git pull origin stage
git checkout -b feature/x
# work on the stage server, validate
git push origin feature/x
gh pr create --base stage --head feature/x

# after validating on stage:
git checkout stage
git merge feature/x
git push origin stage
gh pr create --base main --head stage --title "promote: feature/x"

# after the merge:
ssh sv-production
cd ~/ocmt && git pull origin main && node mt/db/migrate.ts
systemctl --user restart <service>
```
The sanitize pass neutralized every path through which stage could touch real production. The stage Postgres DB is local to stage (a copy, not shared with prod).
```mermaid
flowchart LR
subgraph PROD["Producción · sv-production"]
PGProd[("Postgres prod
localhost:5432
15 accesses
7 ext_channels active
4 provider_keys
12 tenant_balance
528 hha_tasks")]
WAProd["~/.wacli
Alfred Baileys
~/.wacli-personal
Pedro Baileys"]
MetaProd["META_APP_SECRET
token activo"]
NotionProd["NOTION_API_KEY
CLOUDFLARE_API_TOKEN
TOKKO/ODOO/GHL"]
end
subgraph STAGE["Stage-1 · sv-production-stage"]
PGStage[("Postgres stage
localhost:5432
0 accesses
0 ext_active
0 provider_keys
0 balance
0 hha_tasks")]
Archive[("accesses_archive_stage
15 bindings preservados
solo lectura")]
WAStage["~/.wacli movido
a backup
cabina-wa-bridge
MASKED"]
MetaStage["META_APP_SECRET
= STAGE-BLANK"]
NotionStage["NOTION_API_KEY
CLOUDFLARE_API_TOKEN
TOKKO / ODOO / GHL
= STAGE-BLANK"]
end
PROD -.->|"AMI snapshot
zero downtime"| STAGE
STAGE -.->|"SQL sanitize
UPDATE ext_channels SET is_active=false
TRUNCATE accesses, provider_keys, hha_tasks
DELETE FROM tenant_balance"| Archive
classDef prod fill:#1a1f2e,stroke:#1a73e8,color:#e8e8e8
classDef stage fill:#1f1a0d,stroke:#f5a623,color:#e8e8e8
classDef arch fill:#0d1f0d,stroke:#2ecc71,color:#e8e8e8
class PGProd,WAProd,MetaProd,NotionProd prod
class PGStage,WAStage,MetaStage,NotionStage stage
class Archive arch
```
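The sanitize edge above fits in one psql session. A sketch of those statements, assuming the archive table is built column-for-column from accesses before the TRUNCATE (read-only enforcement via grants is left out):

```bash
psql "$DATABASE_URL" <<'SQL'
-- Preserve the 15 real bindings before wiping
CREATE TABLE accesses_archive_stage AS SELECT * FROM accesses;
-- Disable every external channel so nothing dials out
UPDATE ext_channels SET is_active = false;
-- Wipe live credentials, bindings and task queues (add CASCADE if FKs require it)
TRUNCATE accesses, provider_keys, hha_tasks;
-- No money can move without balances
DELETE FROM tenant_balance;
SQL
```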
Sanitize measures in short: ext_channels.is_active=false across the board; hha-auto-topup masked and pointed at a sandbox key; external credentials replaced with STAGE-BLANK; the stage flag set (is_stage=true).
Of the 30+ systemd services inherited from the prod AMI, only 18 are active on stage. Those that touch live external systems are masked via symlinks to /dev/null.
```mermaid
flowchart TD
AMI[AMI booted
30+ inherited unit files] --> Decide{Service type}
Decide -->|core infra| Active[18 ACTIVE on stage]
Decide -->|touches live external
systems of clients| Masked[17 MASKED on stage]
Decide -->|charges money or
kills WA sessions| Masked
Active --> ActiveList["substrate-server
substrate-feedback-worker
substrate-mcp-http
substrate-incremental
substrate-watcher
openclaw-mt
openclaw-gateway
openclaw-centinela
openclaw-media-relay
ocmt-control-api
control-ui
ocmt-backoffice
ocmt-live
webhook-server
informante-admin-endpoint
constelacion-server
contact-attributes-endpoint
visitas-endpoint"]
Masked --> MaskedList["cabina-wa-bridge
cabina-wa-operante-bridge
cabina-linkedin-bridge
wacli-personal-sync
external-channels
tokko-sync.timer
hha-auto-topup.timer
hha-detector
informante-metrics-poll.timer
informante-channel-refresh.timer
informante-hashtag-pack.timer
sv-claude-config-autocommit.timer
reporte-semanal-tenants.timer
soc81-reminder.timer
whatsapp-mcp-bridge
openclaw-rule-review.timer
session-migration"]
classDef active fill:#0d1f0d,stroke:#2ecc71,color:#e8e8e8
classDef masked fill:#1f0d0d,stroke:#e74c3c,color:#e8e8e8
class Active,ActiveList active
class Masked,MaskedList masked
```
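Masking is what keeps these units inert across reboots: systemctl mask symlinks the unit to /dev/null, so nothing can start it until it is unmasked. A sketch for the user-level units, showing a few names from the list above (system-level units would take sudo systemctl mask instead):

```bash
# Symlink each dangerous unit to /dev/null; --now also stops it immediately
for unit in cabina-wa-bridge external-channels hha-auto-topup.timer \
            tokko-sync.timer whatsapp-mcp-bridge; do
  systemctl --user mask --now "$unit"
done

# Verify: masked units report their state as "masked"
systemctl --user list-unit-files | grep masked
```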
| Stage URL | Prod equivalent | Serves | Status |
|---|---|---|---|
| share-stage1 | share | Statics + landings + papers | 200 |
| control-stage1 | control | Cabina (control-api + control-ui + WS) | 401 Basic Auth |
| backoffice-stage1 | backoffice | OCMT admin panel | 200 |
| commandcenter-stage1 | commandcenter | Command Center | 401 Basic Auth |
| live-stage1 | live | OCMT Live (Next.js) | 200 |
| meta-stage1 | meta | Meta webhooks (disabled) | 404 expected |
| ext-stage1 | ext | External channels (intentionally masked) | 502 intentional |
| webhook-stage1 | webhook | Generic webhook server | 404 expected |
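The DNS + TLS steps in the timeline line up with this table: one scoped Cloudflare token, then a single SAN certificate covering all eight names. A sketch assuming the certbot dns-cloudflare plugin was used (the credentials path is a placeholder):

```bash
# One SAN certificate for all eight -stage1 hosts via DNS-01
# (~/.secrets/cloudflare.ini is a hypothetical path holding the scoped token)
certbot certonly --dns-cloudflare \
  --dns-cloudflare-credentials ~/.secrets/cloudflare.ini \
  -d share-stage1.sociovirtual.ai -d control-stage1.sociovirtual.ai \
  -d backoffice-stage1.sociovirtual.ai -d commandcenter-stage1.sociovirtual.ai \
  -d live-stage1.sociovirtual.ai -d meta-stage1.sociovirtual.ai \
  -d ext-stage1.sociovirtual.ai -d webhook-stage1.sociovirtual.ai

sudo systemctl reload nginx
```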
Every public surface carries the -stage1 suffix with TLS, and the is_stage flag marks the environment. SSH access:

```bash
ssh -i ~/.ssh/SOCIOVIRTUAL_KEY.pem ubuntu@3.12.134.57
# or with the alias: ssh sv-production-stage
```
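The alias would live in the operator's ~/.ssh/config. A sketch of a matching entry (the exact stanza is not confirmed by this doc):

```bash
cat >> ~/.ssh/config <<'EOF'
Host sv-production-stage
    HostName 3.12.134.57
    User ubuntu
    IdentityFile ~/.ssh/SOCIOVIRTUAL_KEY.pem
EOF
```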
Mandatory verification before any action:
```bash
cat /etc/ocmt-stage-marker   # must say "stage"
hostname                     # ocmt-stage
echo $OCMT_ENV               # stage
psql "$DATABASE_URL" -c "SELECT inet_server_addr()"   # 127.0.0.1
```
If the four don't all match, abort immediately: the command could be pointing at production.
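Those four checks are easy to wire into a preflight guard so the abort is automatic rather than manual. A minimal sketch (the script name and messages are ours, not from the server):

```bash
#!/usr/bin/env bash
# stage-guard.sh (hypothetical name): exits non-zero unless ALL four markers match
set -euo pipefail

fail() { echo "ABORT: $1 (this could be production)" >&2; exit 1; }

[ "$(cat /etc/ocmt-stage-marker 2>/dev/null)" = "stage" ] || fail "marker file"
[ "$(hostname)" = "ocmt-stage" ]                          || fail "hostname"
[ "${OCMT_ENV:-}" = "stage" ]                             || fail "OCMT_ENV"
[ "$(psql "$DATABASE_URL" -tAc 'SELECT inet_server_addr()')" = "127.0.0.1" ] \
                                                          || fail "DB address"
echo "OK: all four stage markers match"
```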
The original plan, with its rationale, the risks considered, and the prompt for review by external AIs, lives on the stage filesystem:
~/.claude/plans/quiero-empezar-a-migrar-delegated-treasure.md
The stage's ~/CLAUDE.md opens with a block at the very top (489 lines) holding every isolation rule that any Claude agent landing on this server must respect.