Release Engineering and Runtime Ops
Operate Elixir releases in production with strong runtime practices. Covers release config, migrations, rollbacks, health checks, and incident-ready runbooks.
Shipping a release is only the start. Reliable operations require clear procedures for configuration, migrations, rollback, and diagnostics.
Release Checklist
- reproducible build artifacts,
- immutable release bundles,
- runtime config from environment/providers,
- pre-deploy and post-deploy health checks,
- rollback procedure validated in staging.
Runtime Configuration
Use runtime config for environment-specific values:
# config/runtime.exs
import Config

config :my_app, MyApp.Repo,
  url: System.fetch_env!("DATABASE_URL"),
  pool_size: String.to_integer(System.get_env("POOL_SIZE", "10"))
Avoid baking secrets into compile-time config.
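`String.to_integer/1` raises a generic `ArgumentError` when `POOL_SIZE` is set to something non-numeric, which can be hard to trace at boot. One possible way to fail with a clearer message is a small helper like the sketch below. `EnvConfig` and `integer!/2` are hypothetical names used for illustration, not part of Elixir or any library.

```elixir
defmodule EnvConfig do
  # Hypothetical helper: reads an integer environment variable.
  # Returns `default` when the variable is unset, and raises with the
  # variable name in the message when the value is not a clean integer.
  def integer!(name, default) when is_integer(default) do
    case System.get_env(name) do
      nil ->
        default

      raw ->
        case Integer.parse(String.trim(raw)) do
          {value, ""} -> value
          _ -> raise ArgumentError, "#{name} must be an integer, got: #{inspect(raw)}"
        end
    end
  end
end
```

With `POOL_SIZE` unset, `EnvConfig.integer!("POOL_SIZE", 10)` returns the default; with `POOL_SIZE=abc` it raises an error that names the offending variable.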
Migration Strategy
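Deploy schema and code changes in this order:
- apply backward-compatible (additive) schema changes,
- deploy the app code that uses them,
- run cleanup migrations in a later release.
For zero-downtime systems, avoid destructive schema changes in the first rollout, because the previous release may still be running against the same database while the new one starts.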
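Releases do not ship Mix, so `mix ecto.migrate` is not available on the production host. A common approach, sketched here after the pattern shown in the Phoenix and Ecto documentation, is a small module that the release calls through `eval`. The sketch assumes the app is named `:my_app` (as in the config above), depends on `ecto_sql`, and lists its repos under `:ecto_repos`; the module name `MyApp.Release` is a convention, not a requirement.

```elixir
defmodule MyApp.Release do
  # Assumed OTP application name; adjust to the real app.
  @app :my_app

  # Runs all pending migrations for every configured repo.
  def migrate do
    load_app()

    for repo <- repos() do
      {:ok, _, _} = Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, :up, all: true))
    end
  end

  # Rolls a single repo back down to (and including) the given migration version.
  def rollback(repo, version) do
    load_app()
    {:ok, _, _} = Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, :down, to: version))
  end

  defp repos, do: Application.fetch_env!(@app, :ecto_repos)

  # Loads the app's configuration without starting its supervision tree.
  defp load_app, do: Application.load(@app)
end
```

From the unpacked release, this would typically be invoked as `bin/my_app eval "MyApp.Release.migrate()"` before the new version starts taking traffic.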
Health and Readiness
Track:
- startup success,
- DB connectivity,
- queue/subscription readiness,
- critical dependency reachability.
Expose readiness checks that reflect true service ability, not only process liveness.
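As a rough illustration of a readiness check that goes beyond process liveness, the sketch below runs a set of named dependency checks concurrently and reports which ones failed. It uses only the standard library. `Readiness`, its `run/2` function, and the check names are hypothetical; a real service would plug in its own DB, queue, and dependency probes and expose the result over an HTTP endpoint.

```elixir
defmodule Readiness do
  @moduledoc """
  Hypothetical readiness aggregator. Each check is `{name, fun}` where
  `fun` is a zero-arity function returning `:ok` or `{:error, reason}`.
  """

  # Starts all checks concurrently, then awaits each for up to `timeout_ms`.
  def run(checks, timeout_ms \\ 1_000) do
    failures =
      checks
      |> Enum.map(fn {name, fun} -> {name, Task.async(fn -> safe(fun) end)} end)
      |> Enum.map(fn {name, task} -> {name, await(task, timeout_ms)} end)
      |> Enum.reject(fn {_name, result} -> result == :ok end)

    if failures == [], do: :ready, else: {:not_ready, failures}
  end

  # Waits for one check; a check that overruns the timeout is killed.
  defp await(task, timeout_ms) do
    case Task.yield(task, timeout_ms) || Task.shutdown(task, :brutal_kill) do
      {:ok, :ok} -> :ok
      {:ok, {:error, reason}} -> {:error, reason}
      {:ok, other} -> {:error, {:unexpected, other}}
      {:exit, reason} -> {:error, {:exit, reason}}
      nil -> {:error, :timeout}
    end
  end

  # Converts raises, throws, and exits into error tuples so a linked
  # task cannot take the caller down with it.
  defp safe(fun) do
    fun.()
  rescue
    error -> {:error, {:raised, Exception.message(error)}}
  catch
    kind, value -> {:error, {kind, value}}
  end
end
```

A readiness endpoint could then return a success status only when `Readiness.run/2` returns `:ready`, and include the failure list in the response body otherwise.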
Rollback Planning
A rollback plan should include:
- artifact version selection,
- configuration compatibility check,
- migration compatibility constraints,
- communication and validation steps.
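The migration compatibility constraint can be made mechanical. The sketch below assumes a team annotates each migration with a `destructive` flag and records the newest migration each release knows about; both are illustrative conventions, not something Ecto tracks. It reports which migrations would block rolling the app code back without also reverting the schema.

```elixir
defmodule RollbackCheck do
  @moduledoc """
  Hypothetical pre-rollback check. `migrations` is a list of maps such as
  `%{version: 20240102, destructive: false}`; the `destructive` flag is an
  assumed, team-maintained annotation.
  """

  # `target_schema_version` is the newest migration the older release knows about.
  # Additive migrations applied after it are tolerated; destructive ones block.
  def check(migrations, target_schema_version) do
    blocking =
      migrations
      |> Enum.filter(&(&1.version > target_schema_version and &1.destructive))
      |> Enum.map(& &1.version)

    case blocking do
      [] -> :ok
      versions -> {:blocked, versions}
    end
  end
end
```

A runbook step could call `RollbackCheck.check/2` before selecting the artifact version, and escalate instead of rolling back when it returns `{:blocked, versions}`.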
For comparison, typical release operations across ecosystems:
- Python: virtualenv or container artifact, migration orchestration, health checks.
- Node: container image deploy, env injection, rollback to the previous image.
- Elixir: OTP release artifacts, runtime config, operational scripts.
Exercise
Create a Production Runbook
Write and test a deployment runbook:
- Build and version a release artifact.
- Define pre-deploy checks and rollback trigger thresholds.
- Add migration execution and verification steps.
- Define post-deploy health validation.
- Rehearse rollback in staging with a known-bad release.
FAQ and Troubleshooting
Why did the release boot locally but fail in production?
Most often due to missing runtime env vars, network differences, or secret/config provider issues.
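To surface missing variables in one pass rather than one crash at a time, a pre-deploy or boot-time check along these lines may help. This is a minimal sketch; `BootCheck` and `missing_env/1` are hypothetical names, and the list of required variables is whatever the service actually needs.

```elixir
defmodule BootCheck do
  # Returns the names in `required` that are unset or empty in the environment.
  def missing_env(required) do
    Enum.filter(required, fn name -> System.get_env(name) in [nil, ""] end)
  end
end
```

For example, `BootCheck.missing_env(["DATABASE_URL", "POOL_SIZE"])` returns an empty list only when both variables are set to non-empty values.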
Can I always roll back quickly?
Not if migrations are incompatible. Design schema changes for rollback safety.
What is the most valuable ops artifact?
A tested, concise runbook that a different engineer can execute under pressure.
Key Takeaways
- Release engineering is about repeatability, safety, and operability under failure
- Runtime configuration and migration strategy must be explicit and tested
- Runbooks and health checks reduce incident recovery time