Error Handling
Jitter
Add randomised jitter to retry delays to avoid a thundering herd.
When to use
Use jitter when multiple pipeline instances may retry simultaneously against the same external resource, risking a thundering herd.
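To make the risk concrete, here is a minimal sketch in plain Python, independent of `dagy`. It compares the retry delays chosen by several instances that all fail at the same moment. The constants `BASE_DELAY`, `FACTOR`, and `INSTANCES` are illustrative assumptions, and the jitter formula is the one described on this page, not a reproduction of the library's internals.

```python
import random

BASE_DELAY = 0.02  # seconds; assumed base delay for illustration
FACTOR = 0.5       # assumed jitter factor
INSTANCES = 8      # hypothetical number of pipeline instances failing together

rng = random.Random(0)  # seeded only so the sketch is repeatable

# Without jitter every instance waits exactly the same time,
# so all retries land on the resource at the same instant.
fixed = [BASE_DELAY for _ in range(INSTANCES)]

# With jitter each instance scales the base delay by its own random multiplier.
jittered = [
    BASE_DELAY * rng.uniform(1 - FACTOR, 1 + FACTOR) for _ in range(INSTANCES)
]

print("distinct retry times without jitter:", len(set(fixed)))
print("distinct retry times with jitter:", len(set(jittered)))
```

With a fixed delay there is a single retry instant shared by all instances. With jitter the retries are spread across the interval around the base delay.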
error-handling/jitter.py
"""
03_jitter.py: Retry delay with jitter to avoid thundering herd.

Demonstrates:
- retry_jitter_factor=0.5: actual delay = delay * uniform(1-factor, 1+factor)
- Combined with a fixed retry_delay_seconds base
"""
from dagy import flow, task

_hits: dict[str, int] = {}


@task(retries=4, retry_delay_seconds=0.02, retry_jitter_factor=0.5)
def rate_limited_call(resource: str) -> str:
    """Fails twice to show jitter in action."""
    _hits[resource] = _hits.get(resource, 0) + 1
    n = _hits[resource]
    print(f" rate_limited_call({resource!r}) attempt #{n}")
    if n < 3:
        raise IOError(f"Rate limited (attempt #{n})")
    return f"Success on attempt #{n}"


@task
def log_result(message: str) -> None:
    print(f" Result: {message}")


@flow(name="jitter_flow")
def jitter_flow(resource: str = "shared-db") -> None:
    result = rate_limited_call(resource)
    log_result(result)


if __name__ == "__main__":
    result = jitter_flow.run_local(resource="shared-db")
    print(f"Run ID : {result.run_id}")
    print(f"Status : {result.status}")

How it works
- `retry_jitter_factor=0.5` randomises the actual delay: `delay * uniform(1 - factor, 1 + factor)`.
- Combined with the base `retry_delay_seconds=0.02`, each retry waits between 0.01 and 0.03 seconds.
- `rate_limited_call` simulates a rate-limited service that rejects the first two attempts.
- The jitter ensures retries from multiple instances don't all hit at the same instant.
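The delay formula in the list above can be sketched in plain Python without `dagy`. The helper `jittered_delay` is a hypothetical name introduced for illustration, not part of the library's API, and the range check on `factor` is an assumption made here to keep delays non-negative.

```python
import random


def jittered_delay(base: float, factor: float, rng: random.Random) -> float:
    """Scale base by a random multiplier drawn from [1 - factor, 1 + factor]."""
    # Assumption for this sketch: a factor above 1 could yield a negative delay.
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0 and 1")
    return base * rng.uniform(1.0 - factor, 1.0 + factor)


rng = random.Random(42)  # seeded only so the sketch is repeatable
delays = [jittered_delay(0.02, 0.5, rng) for _ in range(5)]

for attempt, delay in enumerate(delays, start=1):
    print(f"retry #{attempt}: sleep {delay * 1000:.1f} ms")
```

Passing the random generator in explicitly keeps the helper deterministic under test. A real scheduler would more likely use an unseeded source so that separate instances do not draw identical delays.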
Inputs
- `resource` (str, default `"shared-db"`): resource name being accessed.
Outputs
- A success message string, printed by `log_result`. The flow itself returns `None`.
Dependencies
`dagy`