DevOps Operations
This document covers infrastructure, deployment, and operational procedures for the Maricusco trading system.
Infrastructure Overview
Infrastructure Architecture
graph TB
subgraph "External Access"
USER[Users/Developers]
LB[Load Balancer<br/>Optional]
end
subgraph "Application Layer"
APP[Maricusco Application<br/>FastAPI + Uvicorn<br/>Port: 8000]
end
subgraph "Data Services"
PG[(PostgreSQL<br/>TimescaleDB<br/>Port: 5432)]
REDIS[(Redis Cache<br/>Port: 6379)]
CHR[(ChromaDB<br/>Vector Store<br/>Port: 8000)]
end
subgraph "Monitoring Stack"
PROM[Prometheus<br/>Metrics Collection<br/>Port: 9090]
GRAF[Grafana<br/>Dashboards<br/>Port: 3000]
end
subgraph "Storage"
VOL1[(postgres_data)]
VOL2[(redis_data)]
VOL3[(chromadb_data)]
VOL4[(prometheus_data)]
VOL5[(grafana_data)]
end
USER --> LB
LB --> APP
USER --> APP
APP --> PG
APP --> REDIS
APP --> CHR
APP -->|/metrics| PROM
APP -->|/health| PROM
PROM --> GRAF
PG --> VOL1
REDIS --> VOL2
CHR --> VOL3
PROM --> VOL4
GRAF --> VOL5
style APP fill:#7A9FB3,stroke:#6B8FA3,color:#fff
style PG fill:#9B8AAB,stroke:#8B7A9B,color:#fff
style REDIS fill:#9B8AAB,stroke:#8B7A9B,color:#fff
style CHR fill:#9B8AAB,stroke:#8B7A9B,color:#fff
style PROM fill:#7A9A7A,stroke:#6B8E6B,color:#fff
style GRAF fill:#7A9A7A,stroke:#6B8E6B,color:#fff
style LB fill:#C4A484,stroke:#B49474,color:#fff
style VOL1 fill:#C4A484,stroke:#B49474,color:#fff
style VOL2 fill:#C4A484,stroke:#B49474,color:#fff
style VOL3 fill:#C4A484,stroke:#B49474,color:#fff
style VOL4 fill:#C4A484,stroke:#B49474,color:#fff
style VOL5 fill:#C4A484,stroke:#B49474,color:#fff
Service Architecture
The system runs as a containerized application with the following services:
Note: All services are automatically deployed and monitored through the CI/CD pipeline.
| Service | Container | Port | Purpose |
|---|---|---|---|
| Application | maricusco-app | 8000 | Main FastAPI application |
| PostgreSQL | maricusco-postgres | 5432 | Time-series database (TimescaleDB) |
| Redis | maricusco-redis | 6379 | Caching layer |
| ChromaDB | maricusco-chromadb | 8000 | Vector memory storage |
| Prometheus | maricusco-prometheus | 9090 | Metrics collection |
| Grafana | maricusco-grafana | 3000 | Metrics visualization |
Network Configuration
All services communicate via the maricusco-network bridge network. Services use internal DNS names (e.g., postgres, redis, chromadb) for inter-service communication.
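To confirm that the internal DNS names resolve, a quick check can be run from inside the application container. This is a minimal sketch, not part of the project: the helper name `check_service_dns` is hypothetical, and it assumes `getent` is present in the application image (it normally ships with Debian-based images).

```shell
# Hypothetical helper: resolve each internal service name from inside
# the application container and report any name that does not resolve.
# Usage: check_service_dns postgres redis chromadb
check_service_dns() {
  failed=0
  for name in "$@"; do
    if docker exec maricusco-app getent hosts "$name" >/dev/null 2>&1; then
      echo "ok: $name"
    else
      echo "unresolved: $name"
      failed=1
    fi
  done
  return "$failed"
}
```

A non-zero return value means at least one name failed to resolve, which usually points at a service that is not attached to maricusco-network.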
Resource Limits
Default resource constraints per service:
app:
  limits: { cpus: '2.0', memory: 4G }
  reservations: { cpus: '0.5', memory: 512M }
postgres:
  limits: { cpus: '2.0', memory: 2G }
  reservations: { cpus: '0.5', memory: 512M }
redis:
  limits: { cpus: '1.0', memory: 1G }
  reservations: { cpus: '0.25', memory: 256M }
chromadb:
  limits: { cpus: '2.0', memory: 2G }
  reservations: { cpus: '0.5', memory: 512M }
grafana:
  limits: { cpus: '1.0', memory: 512M }
  reservations: { cpus: '0.25', memory: 128M }
Adjust these values in docker-compose.yml based on workload requirements.
Deployment
Prerequisites
- Docker 20.10+ and Docker Compose 2.0+
- Minimum 8GB RAM, 4 CPU cores
- 20GB free disk space for volumes
Initial Deployment
- Clone repository:
- Configure environment:
- Set required environment variables:
- Start services:
- Verify deployment:
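The last two steps can be sketched with commands that appear elsewhere in this document. This is a hedged sketch rather than the project's own script: the function names and the `COMPOSE` and `HEALTH_URL` variables are illustrative assumptions, and the clone and environment steps are left out because the repository URL and required variables are not given here.

```shell
# Assumption: the Compose command is overridable, so "docker compose"
# (plugin form) can be substituted for "docker-compose".
COMPOSE="${COMPOSE:-docker-compose}"
# Assumption: the application port is published on localhost:8000.
HEALTH_URL="${HEALTH_URL:-http://localhost:8000/health}"

# Start all services in the background.
start_services() {
  $COMPOSE up -d
}

# List container states, then fail if the health endpoint returns an error.
verify_deployment() {
  $COMPOSE ps && curl -fsS "$HEALTH_URL"
}
```

Running `start_services && verify_deployment` directly after startup may fail while containers are still initializing, so retrying the verification after a short wait is reasonable.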
Application Configuration
The application entrypoint supports environment variable overrides:
# Uvicorn configuration
export APP_MODULE=maricusco.api.app:app
export APP_HOST=0.0.0.0
export APP_PORT=8000
export UVICORN_WORKERS=4 # Scale workers
export UVICORN_RELOAD=false
export EXTRA_UVICORN_ARGS="--log-level info"
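The actual entrypoint script is not shown here, so the following is only a sketch of how such overrides could map onto a Uvicorn command line. The helper name `build_uvicorn_cmd` is hypothetical, and using the example values above as fallbacks is an assumption, not a statement about the real defaults.

```shell
# Hypothetical helper: print the uvicorn command implied by the
# environment variables, falling back to the example values above.
build_uvicorn_cmd() {
  cmd="uvicorn ${APP_MODULE:-maricusco.api.app:app}"
  cmd="$cmd --host ${APP_HOST:-0.0.0.0} --port ${APP_PORT:-8000}"
  if [ "${UVICORN_RELOAD:-false}" = "true" ]; then
    # Reload mode is a development feature and runs a single process,
    # so the worker count is not passed in this branch.
    cmd="$cmd --reload"
  else
    cmd="$cmd --workers ${UVICORN_WORKERS:-4}"
  fi
  if [ -n "${EXTRA_UVICORN_ARGS:-}" ]; then
    cmd="$cmd $EXTRA_UVICORN_ARGS"
  fi
  echo "$cmd"
}
```

Printing the command instead of executing it makes it easy to check what a given set of overrides would produce before restarting the container.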
Scaling
Horizontal scaling (multiple app instances):
- Update docker-compose.yml:
- Use a load balancer (nginx, traefik) in front of multiple instances.
Vertical scaling (resource limits):
Update resource limits in docker-compose.yml based on monitoring metrics.
Health Checks
sequenceDiagram
participant DC as Docker Compose
participant APP as Application
participant PG as PostgreSQL
participant RD as Redis
participant CH as ChromaDB
Note over DC: Health check interval: 30s
DC->>APP: Health check request<br/>(/usr/local/bin/healthcheck)
APP->>APP: Check /health endpoint
APP->>PG: Test connection<br/>(pg_isready)
PG-->>APP: Connection status
APP->>RD: Test connection<br/>(redis-cli ping)
RD-->>APP: PONG
APP->>CH: HTTP heartbeat<br/>(/api/v1/heartbeat)
CH-->>APP: 200 OK
APP-->>DC: Health status<br/>(healthy/unhealthy)
Note over DC,CH: All dependencies must be healthy<br/>for container to be marked healthy
Health checks run every 30 seconds with a 2-second timeout:
- Application: /health endpoint (required)
- PostgreSQL: pg_isready command
- Redis: redis-cli ping
- ChromaDB: HTTP heartbeat endpoint
Verify health status:
docker-compose ps
# Check individual service
docker exec maricusco-app curl -f http://localhost:8000/health
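Because health checks only run every 30 seconds, a container can report `starting` for a while after launch. The sketch below polls Docker's recorded health status until it becomes `healthy` or a timeout expires. The function name `wait_healthy` and the 5-second polling interval are illustrative choices, not project conventions.

```shell
# Hypothetical helper: wait until a container's health status is "healthy".
# Usage: wait_healthy maricusco-app 120
wait_healthy() {
  container="$1"
  timeout="${2:-120}"   # seconds to wait before giving up
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # Read the status recorded by the container's HEALTHCHECK.
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)
    if [ "$status" = "healthy" ]; then
      echo "$container is healthy"
      return 0
    fi
    sleep 5
    elapsed=$((elapsed + 5))
  done
  echo "$container not healthy after ${timeout}s (last status: ${status:-unknown})" >&2
  return 1
}
```

This is useful in deployment scripts that should not proceed until the application container has passed its checks.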
Environment Management
Environment Variables
Configuration priority (highest to lowest):
1. Environment variables
2. .env file
3. Default values in code
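The precedence above can be illustrated with a small shell helper. This is only a sketch of the lookup order, not how the application itself loads configuration: the name `resolve_setting` is hypothetical, and the .env parsing is deliberately simple (plain `NAME=value` lines, no quoting or comments).

```shell
# Hypothetical helper: print the effective value of a setting using the
# documented order: environment variable, then .env file, then default.
# Usage: resolve_setting APP_PORT 8000
resolve_setting() {
  name="$1"
  default="$2"
  env_file="${3:-.env}"

  # 1. An exported environment variable wins (printenv fails if unset).
  if value=$(printenv "$name"); then
    echo "$value"
    return 0
  fi

  # 2. Otherwise use the last NAME=value line in the .env file, if any.
  if [ -f "$env_file" ]; then
    value=$(grep "^${name}=" "$env_file" | tail -n 1 | cut -d= -f2-) || true
    if [ -n "$value" ]; then
      echo "$value"
      return 0
    fi
  fi

  # 3. Fall back to the default supplied by the caller.
  echo "$default"
}
```

A helper like this can help explain why a change to .env appears to have no effect: an exported variable with the same name takes priority.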
Security:
- Never commit .env files to version control
- Use secrets management (AWS Secrets Manager, HashiCorp Vault) in production
- Rotate credentials regularly
See Configuration Reference for complete environment variable documentation.
CI/CD Pipeline
The project uses GitHub Actions for continuous integration and deployment. See CI/CD Pipeline for complete documentation.
Key stages:
1. Lock file validation
2. Code quality (lint, type check, security)
3. Test suite execution
4. Docker image build and scan
5. Deployment (manual or automated)
Local CI simulation:
Monitoring and Observability
Monitoring Flow
flowchart LR
subgraph "Application"
APP[Maricusco App]
METRICS[/metrics endpoint]
HEALTH[/health endpoint]
end
subgraph "Collection"
PROM[Prometheus<br/>Scrapes every 15s]
end
subgraph "Visualization"
GRAF[Grafana<br/>Dashboards]
end
subgraph "Storage"
TSDB[(Time Series DB<br/>15 day retention)]
end
APP --> METRICS
APP --> HEALTH
METRICS -->|HTTP GET| PROM
HEALTH -->|HTTP GET| PROM
PROM --> TSDB
PROM -->|Query| GRAF
TSDB -->|Query| GRAF
GRAF -->|Display| USER[Users/Operators]
style APP fill:#7A9FB3,stroke:#6B8FA3,color:#fff
style PROM fill:#7A9A7A,stroke:#6B8E6B,color:#fff
style GRAF fill:#7A9A7A,stroke:#6B8E6B,color:#fff
style TSDB fill:#9B8AAB,stroke:#8B7A9B,color:#fff
style METRICS fill:#C4A484,stroke:#B49474,color:#fff
style HEALTH fill:#C4A484,stroke:#B49474,color:#fff
See Monitoring and Metrics for complete documentation on metrics, logging, and Grafana dashboards.
Security
Container Security
- Non-root user (appuser, UID 1000) in application container
- Minimal base images (Python slim)
- Multi-stage builds to reduce image size
- Regular security scans via Trivy in CI/CD
Network Security
- Services communicate via internal Docker network
- Expose only necessary ports (8000 for app, 3000 for Grafana)
- Use reverse proxy (nginx, traefik) for TLS termination
Secrets Management
Production recommendations:
- Use a secrets management service (AWS Secrets Manager, HashiCorp Vault)
- Never hardcode credentials
- Rotate API keys and passwords regularly
- Use least-privilege access principles
Security Scanning
Docker image scanning:
docker run --rm \
-v /var/run/docker.sock:/var/run/docker.sock \
aquasec/trivy image maricusco:latest
Dependency scanning:
Maintenance
Log Management
View logs:
# All services
docker-compose logs -f
# Specific service
docker-compose logs -f app
# Last 100 lines
docker-compose logs --tail=100 app
Log cleanup:
Database Maintenance
PostgreSQL maintenance:
# Vacuum database
docker exec maricusco-postgres psql -U maricusco maricusco -c "VACUUM ANALYZE;"
# Check database size
docker exec maricusco-postgres psql -U maricusco maricusco -c "SELECT pg_size_pretty(pg_database_size('maricusco'));"
Redis maintenance:
# Check memory usage
docker exec maricusco-redis redis-cli INFO memory
# Clear cache (use with caution)
docker exec maricusco-redis redis-cli FLUSHDB
Updates and Upgrades
Application update:
# Pull latest code
git pull origin main
# Rebuild and restart
docker-compose build app
docker-compose up -d app
Dependency updates:
- Dependabot automatically creates PRs for dependency updates
- Review and merge PRs after CI passes
- See CI/CD Pipeline for details
Troubleshooting
Service Won't Start
Check logs:
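A minimal sketch for this step, built from the log commands shown under Log Management. The function name `show_recent_logs` and the `COMPOSE` variable are illustrative assumptions.

```shell
# Assumption: the Compose command is overridable, so "docker compose"
# (plugin form) can be substituted for "docker-compose".
COMPOSE="${COMPOSE:-docker-compose}"

# Hypothetical helper: show the most recent log lines of one service.
# Usage: show_recent_logs app        (last 100 lines)
#        show_recent_logs postgres 500
show_recent_logs() {
  service="$1"
  lines="${2:-100}"
  $COMPOSE logs --tail="$lines" "$service"
}
```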
Common issues:
- Missing environment variables → Check .env file
- Port conflicts → Change port mappings in docker-compose.yml
- Insufficient resources → Increase resource limits
- Database connection failure → Verify POSTGRES_* environment variables
Health Check Failures
Application unhealthy:
# Check health endpoint directly
docker exec maricusco-app curl -v http://localhost:8000/health
# Check dependency connectivity
docker exec maricusco-app ping -c 3 postgres
docker exec maricusco-app ping -c 3 redis
Database connection issues:
# Test PostgreSQL connection
docker exec maricusco-postgres pg_isready -U maricusco
# Check PostgreSQL logs
docker-compose logs postgres
Performance Issues
High memory usage:
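One way to investigate is a one-off snapshot of per-container resource usage, which can be compared against the limits listed under Resource Limits. This is a sketch; the function name `memory_snapshot` is hypothetical.

```shell
# Hypothetical helper: print a single snapshot of memory usage for all
# running containers instead of the continuously refreshing view.
memory_snapshot() {
  docker stats --no-stream \
    --format 'table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}'
}
```

A container that sits close to 100% in the memory percentage column is a candidate for a higher limit in docker-compose.yml or for closer inspection of its workload.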
Slow queries:
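# Enable slow-query logging in PostgreSQL (1000 ms is an example threshold; requires superuser)
docker exec maricusco-postgres psql -U maricusco maricusco -c "ALTER SYSTEM SET log_min_duration_statement = '1000';"
docker exec maricusco-postgres psql -U maricusco maricusco -c "SELECT pg_reload_conf();"
# Check slow query logs (PostgreSQL reports them as "duration: ... ms  statement: ...")
docker-compose logs postgres | grep "duration:"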
Data Issues
Corrupted or missing data:
1. Check volume mounts: docker volume inspect maricusco_postgres_data
2. Check service logs: docker-compose logs <service>
3. Verify database connectivity and permissions
Operational Procedures
Service Restart
Restart all services:
Restart specific service:
Graceful restart (zero downtime):
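The three restart variants above can be sketched as follows. The function names and the `COMPOSE` variable are illustrative assumptions. Recreating a single container still causes a brief interruption for that container, so actual zero downtime depends on having more than one instance behind a load balancer.

```shell
# Assumption: the Compose command is overridable, so "docker compose"
# (plugin form) can be substituted for "docker-compose".
COMPOSE="${COMPOSE:-docker-compose}"

# Restart every service defined in docker-compose.yml.
restart_all() {
  $COMPOSE restart
}

# Restart a single service, e.g. restart_service app
restart_service() {
  $COMPOSE restart "$1"
}

# Rebuild and recreate one service without touching its dependencies,
# which keeps the database and cache containers running.
recreate_service() {
  $COMPOSE up -d --no-deps --build "$1"
}
```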
Service Scaling
Scale application:
Note: Use a load balancer in front of multiple instances. Docker Compose scaling is limited for production use.
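A hedged sketch of scaling with Compose's `--scale` option follows. The function name `scale_app` and the `COMPOSE` variable are hypothetical. It assumes the app service in docker-compose.yml has no fixed `container_name` and no fixed host port mapping, since Compose cannot start several replicas that share either.

```shell
# Assumption: the Compose command is overridable, so "docker compose"
# (plugin form) can be substituted for "docker-compose".
COMPOSE="${COMPOSE:-docker-compose}"

# Hypothetical helper: run N replicas of the app service.
# Usage: scale_app 3
scale_app() {
  case "$1" in
    ''|*[!0-9]*)
      echo "usage: scale_app <replica-count>" >&2
      return 2
      ;;
  esac
  $COMPOSE up -d --scale "app=$1"
}
```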
Maintenance Windows
Scheduled maintenance:
1. Notify users of maintenance window
2. Stop services: docker-compose down
3. Perform maintenance (updates, backups, etc.)
4. Start services: docker-compose up -d
5. Verify health: curl http://localhost:8000/health
References
- Docker Setup - Detailed Docker configuration
- CI/CD Pipeline - Continuous integration and deployment
- Monitoring and Metrics - Observability setup
- Configuration Reference - Complete configuration options