Performance Profiling and Benchmarking
Measure and optimize Elixir performance with Benchee, profiling tools, and telemetry. Covers CPU/memory hotspots, process-level analysis, and safe optimization workflow.
Performance work in Elixir should follow a repeatable loop:
- measure,
- identify bottleneck,
- change one variable,
- measure again.
Without this loop, optimization often creates complexity without meaningful gains.
Benchmarking with Benchee
Use Benchee for controlled micro/mid-level comparisons:
Benchee.run(%{
  "enum_map" => fn -> Enum.map(1..100_000, &(&1 * 2)) end,
  "for_comp" => fn -> for x <- 1..100_000, do: x * 2 end
})
Benchmark tips:
- warm-up sufficiently,
- test realistic input sizes,
- isolate noisy external dependencies.
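These tips map directly onto Benchee options. A sketch, with illustrative warm-up, measurement, and input-size values; when inputs: is given, each benchmark function receives the input as an argument:

```elixir
# Compare implementations across realistic input sizes.
# The warmup/time/memory_time values below are illustrative; tune for your workload.
Benchee.run(
  %{
    "enum_map" => fn input -> Enum.map(input, &(&1 * 2)) end,
    "for_comp" => fn input -> for x <- input, do: x * 2 end
  },
  warmup: 2,        # seconds of warm-up before measurement starts
  time: 5,          # seconds of measurement per scenario
  memory_time: 2,   # also collect memory-usage statistics
  inputs: %{
    "small (1k)" => 1..1_000,
    "large (1M)" => 1..1_000_000
  }
)
```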
Profiling for Hotspots
Profilers answer where time is spent, not just which function is faster in isolation.
Useful tools:
- :fprof for detailed call-time analysis,
- :eprof for function-level time profiling,
- tracing and telemetry for production paths.
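A minimal :eprof session from IEx might look like this; the profiled function is just a stand-in for your own code:

```elixir
# Function-level time profiling with :eprof (ships with Erlang/OTP's tools app).
:eprof.start()

# Profile a single anonymous function; replace the body with the code under test.
:eprof.profile(fn ->
  1..10_000
  |> Enum.map(&Integer.to_string/1)
  |> Enum.join(",")
end)

# Print the per-function time breakdown to stdout, then shut the profiler down.
:eprof.analyze()
:eprof.stop()
```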
BEAM-Specific Performance Signals
- scheduler utilization imbalance,
- mailbox growth and message queue pressure,
- excessive process churn,
- binary memory retention.
Inspect these signals before reaching for algorithm rewrites; they often explain latency that no microbenchmark will reveal.
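All of these signals can be checked with built-in VM introspection; a sketch using Process.info/2, :erlang.memory/1, and :erlang.statistics/1 (self() stands in for the process you actually suspect):

```elixir
# Mailbox size and heap memory of a specific process.
pid = self()
[message_queue_len: queue_len, memory: bytes] =
  Process.info(pid, [:message_queue_len, :memory])

# Total binary memory across the VM; a growing value can indicate
# retained refc binaries.
binary_bytes = :erlang.memory(:binary)

# Run-queue length: work waiting on schedulers (a pressure/imbalance signal).
run_queue = :erlang.statistics(:run_queue)

IO.puts("queue=#{queue_len} mem=#{bytes}B binary=#{binary_bytes}B run_queue=#{run_queue}")
```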
Optimization Priorities
- reduce unnecessary allocations/copies,
- batch expensive IO,
- move repeated computation out of hot loops,
- choose appropriate concurrency boundaries.
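As one concrete example of moving repeated work out of a hot loop, list-membership checks can be hoisted into a set built once (the filter functions here are illustrative):

```elixir
# Slow: `x in blocked` scans the list on every iteration (O(n) per check).
slow_filter = fn items, blocked ->
  Enum.reject(items, fn x -> x in blocked end)
end

# Faster: build a MapSet once outside the loop; each check is then
# effectively constant-time.
fast_filter = fn items, blocked ->
  blocked_set = MapSet.new(blocked)
  Enum.reject(items, fn x -> MapSet.member?(blocked_set, x) end)
end

items = Enum.to_list(1..10)
blocked = [2, 4, 6]
# Both return the same result; only the cost profile differs.
slow_filter.(items, blocked) == fast_filter.(items, blocked)
```

The same hoisting pattern applies to anything recomputed per iteration: compiled patterns, lookups, and derived configuration.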
Comparable stacks in other ecosystems follow the same measurement-first methodology:
- Python: cProfile + pytest-benchmark + tracing stacks,
- Node.js: perf hooks + clinic/flame tools for hotspot and throughput analysis,
- Elixir: Benchee + BEAM profilers + Telemetry, combining offline and runtime profiling for reliable optimization.
Exercise
Profile and Optimize a Real Endpoint
Choose one slow endpoint/job and run an optimization cycle:
- Capture baseline latency and throughput.
- Add telemetry around key operations.
- Profile to identify top hotspot.
- Apply one focused optimization.
- Re-measure and document before/after metrics.
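For the baseline step, per-call latency can be captured with the built-in :timer.tc/1; the workload function below is a stand-in for your real endpoint or job:

```elixir
# Capture wall-clock latency samples with :timer.tc/1 (returns microseconds).
workload = fn -> Enum.sort(Enum.shuffle(1..50_000)) end  # stand-in for the real work

samples =
  for _ <- 1..20 do
    {micros, _result} = :timer.tc(workload)
    micros
  end

sorted = Enum.sort(samples)
p50 = Enum.at(sorted, div(length(sorted), 2))
worst = List.last(sorted)

IO.puts("p50=#{p50}us worst=#{worst}us")
```

Record these numbers before and after each change so the optimization cycle produces evidence, not impressions.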
FAQ and Troubleshooting
Why do microbenchmarks improve but endpoint latency does not?
The real bottleneck may be IO, contention, or downstream services not represented in the microbenchmark.
How do I avoid over-optimizing?
Set target SLOs first. Stop optimizing when targets are met and complexity costs outweigh gains.
Should I optimize memory or CPU first?
Optimize the limiting resource observed in production metrics for your actual workload.
Key Takeaways
- Optimization should follow measurement, never intuition alone
- Benchmarking and profiling answer different questions and should be used together
- Process-level visibility is essential for diagnosing BEAM performance issues