// Celery — Separate Container Architecture

ECS Fargate · Django producer · Redis broker · Worker consumer · Beat scheduler

EC2 / Fargate

ECS Task — api
Django Container
Django App (Gunicorn/Daphne)
Handles HTTP/SSE/GraphQL. Calls .delay() or .apply_async() to enqueue work.
:8000
task_producer.py
send_message.delay(payload)
Serializes args to JSON and pushes the message envelope onto the broker queue. Returns a task ID immediately; the request never waits for the task to run.
celery.py (app init)
Shared Celery() instance. Both this container and the worker container import the same app object. Config comes from Django settings via app.config_from_object().
KEY: The Django container never runs celery worker. It only calls tasks. The Celery app object is imported for task registration, not execution.
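The shared app object described above can be sketched roughly as follows. This is a minimal, hedged example: the module path (myapp/celery.py), project name, and env-var value are placeholders for your project.

```python
# myapp/celery.py -- the single Celery() instance imported by BOTH the
# Django container (for task registration) and the worker container
# (for execution). Names are illustrative.
import os

from celery import Celery

# Point Celery at Django settings before the app object is created.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myapp.settings")

app = Celery("myapp")

# Pull CELERY_*-prefixed keys (broker URL, serializers, routes) from
# Django settings, as described above.
app.config_from_object("django.conf:settings", namespace="CELERY")

# Register @shared_task functions found in each installed app's tasks.py.
app.autodiscover_tasks()
```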
ElastiCache / EC2
Broker — Redis (or SQS)
Default Queue
celery (default)
FIFO list in Redis.
LPUSH / BRPOP
LIST key
Priority Queues
high_priority
low_priority
Workers pull by queue arg.
routing_key
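Routing tasks to the named queues can be sketched as Django settings, assuming the CELERY_ settings namespace used by config_from_object above; the task paths are placeholders.

```python
# settings.py -- hedged sketch: pin specific tasks to the priority queues
# from the diagram. Anything unrouted lands on the default "celery" queue.
CELERY_TASK_ROUTES = {
    "chat.tasks.send_message": {"queue": "high_priority"},
    "reports.tasks.nightly_export": {"queue": "low_priority"},
}

# Ack after the task body finishes rather than on receipt, for
# at-least-once delivery (see the TRANSPORT note below).
CELERY_TASK_ACKS_LATE = True
```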
TRANSPORT: Redis uses a LIST per queue. Each task is a JSON blob with headers (task name, id, retries, eta). The worker issues BRPOP and blocks until an item is available. By default the message is acknowledged on receipt, before the task body runs, so delivery is at-most-once: a worker crash mid-task loses the message. Set acks_late=True to ack on completion and get at-least-once delivery.
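The JSON blob mentioned above can be sketched in simplified form. This is a hedged illustration, not the actual Kombu wire format: the real envelope (message protocol v2) carries more fields (content-type, delivery_info, properties), and make_envelope is a hypothetical helper.

```python
# Simplified sketch of the task message pushed onto the Redis LIST.
import json
import uuid

def make_envelope(task_name, args, kwargs):
    """Build a simplified JSON task message like the one pushed to Redis."""
    headers = {
        "id": str(uuid.uuid4()),  # the task ID .delay() hands back to the caller
        "task": task_name,        # dotted path the worker uses to dispatch
        "retries": 0,
        "eta": None,
    }
    return json.dumps({"headers": headers, "body": [args, kwargs]})

msg = make_envelope("chat.tasks.send_message", [42], {"text": "hi"})
decoded = json.loads(msg)
```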
Task Lifecycle
01
PUBLISH
Django calls .delay() → Kombu serializes the task to JSON → LPUSH onto the Redis LIST. A task ID (UUID) is returned immediately.
02
FETCH
Worker's MainProcess runs BRPOP on its queues. Blocks (timeout configurable). The task message is deserialized from the Kombu envelope.
03
EXECUTE
Task dispatched to a subprocess (prefork) or greenlet (gevent). @app.task function runs. Django ORM works — each subprocess has its own DB connection pool.
04
RESULT
If a result_backend is set: the outcome is stored in Redis/Postgres under the task ID. AsyncResult(task_id).get() blocks until completion, so in request handlers poll .ready() or .state instead of calling .get().
05
RETRY / DLQ
On exception: self.retry(exc=e, countdown=60) re-enqueues with incremented retry counter. After max_retries, task fails — no native DLQ in Redis, implement manually or use SQS.
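The retry pattern from step 05 can be sketched as a task definition. Hedged assumptions: send_message, deliver, and TransientError are hypothetical placeholders; bind=True, max_retries, acks_late, and self.retry(countdown=...) are real Celery API.

```python
# Hedged sketch of step 05: re-enqueue a failed task with a delay.
from celery import shared_task

class TransientError(Exception):
    """Hypothetical error class for recoverable failures."""

@shared_task(bind=True, max_retries=3, acks_late=True)
def send_message(self, payload):
    try:
        deliver(payload)  # hypothetical helper that may raise TransientError
    except TransientError as exc:
        # Re-enqueues with retries incremented; after max_retries this
        # raises MaxRetriesExceededError -- handle the DLQ path yourself
        # on Redis, since there is no native dead-letter queue.
        raise self.retry(exc=exc, countdown=60)
```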
Worker Concurrency Model
Pool     | Mechanism                  | Best For               | Django ORM?
---------|----------------------------|------------------------|-----------------------------------
prefork  | os.fork(), N subprocesses  | CPU-bound (default)    | Yes: each worker has its own conn
gevent   | greenlets (cooperative)    | I/O-bound (HTTP calls) | Careful: shared conn pool
eventlet | greenlets                  | I/O-bound              | Same caveat as gevent
threads  | threading.Thread           | GIL-aware I/O          | Yes: thread-local conns
solo     | single process             | Debug only             | Yes
PREFORK INTERNALS: MainProcess forks N children (concurrency=N). Each child is a full Python interpreter — no GIL sharing. Django connection pool is per-process. Set CONN_MAX_AGE=0 or handle worker_process_init signal to close inherited connections before children open their own.
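The worker_process_init cleanup mentioned above can be sketched as a signal handler. The signal and django.db.connections are real APIs; placing this in the Celery app module is a project convention, not a requirement.

```python
# Hedged sketch: close DB connections inherited across fork() so each
# prefork child opens its own socket on first ORM use.
from celery.signals import worker_process_init
from django.db import connections

@worker_process_init.connect
def reset_db_connections(**kwargs):
    # Drop sockets inherited from the MainProcess; Django reconnects
    # lazily on the next query inside this child process.
    for conn in connections.all():
        conn.close()
```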
EC2 / Fargate

ECS Task — worker
Worker Container
celery worker
Entrypoint: celery -A myapp worker -c 4 -Q celery,high_priority
Runs the MainProcess + N child workers. Imports all @app.task decorated functions.
subprocess
Same codebase, different CMD
Worker container uses the identical Docker image as the Django container. Only the CMD differs in the ECS task definition.

Django: CMD gunicorn ...
Worker: CMD celery worker ...
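The "same image, different CMD" split can be sketched as the two ECS containerDefinitions, e.g. as dicts passed to boto3's register_task_definition. All names, the image tag, and the command arguments are illustrative assumptions.

```python
# Hedged sketch: both containers share the image; only "command" differs.
COMMON = {"image": "myapp:latest", "essential": True}

api_container = {
    **COMMON,
    "name": "api",
    "command": ["gunicorn", "myapp.wsgi:application", "--bind", "0.0.0.0:8000"],
    "portMappings": [{"containerPort": 8000}],
}

worker_container = {
    **COMMON,
    "name": "worker",
    "command": ["celery", "-A", "myapp", "worker", "-c", "4", "-Q", "celery"],
    # No portMappings: the worker only makes outbound calls to the broker.
}
```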
Health Check
ECS has no built-in worker health check. Use:
celery -A myapp inspect ping
as the container's healthCheck command or a small sidecar probe script. Note that inspect ping round-trips through the broker, so it also verifies broker connectivity.
ECS healthCheck
ECS Task — beat (1 replica)
Beat Scheduler Container
celery beat
celery -A myapp beat -s /tmp/celerybeat-schedule
Reads CELERY_BEAT_SCHEDULE, enqueues tasks on cron/interval. Runs as a single process — only ever 1 replica.
cron engine
django-celery-beat (alt)
Stores schedule in Postgres instead of a local file. Allows runtime editing via Django admin. Beat process polls DB for schedule changes. Safer for ECS (no ephemeral file state).
DB-backed
CRITICAL: Beat must be exactly 1 replica. If you scale to 2, every task runs twice. Use ECS desired count = 1 and do NOT put beat in the same task definition as the worker.
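The CELERY_BEAT_SCHEDULE that beat reads can be sketched as Django settings. Task paths and schedule values are placeholder assumptions; with django-celery-beat this lives in Postgres instead of settings.

```python
# settings.py -- hedged sketch of a static beat schedule.
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    "nightly-export": {
        "task": "reports.tasks.nightly_export",  # placeholder task path
        "schedule": crontab(hour=3, minute=0),   # every day at 03:00
    },
    "heartbeat-every-30s": {
        "task": "ops.tasks.heartbeat",           # placeholder task path
        "schedule": 30.0,                        # plain interval, in seconds
    },
}
```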
Result Backend (optional)
Redis / PostgreSQL
Task results stored keyed by UUID. TTL configurable (result_expires). If you don't poll results, skip this — it adds overhead. For fire-and-forget tasks, disable entirely.
SETEX / upsert
End-to-End Message Flow
Django
.delay(args)
→ Kombu
→ serialize
api container
LPUSH
JSON blob
Redis
LIST: celery
queue name
FIFO order
ElastiCache
BRPOP
(blocking)
Worker
deserialize
→ dispatch
→ execute fn
worker container
ECS Task: api
image: myapp:latest
command: gunicorn ...
cpu: 512
memory: 1024
portMappings: 8000
desiredCount: 2+ (scale)
env: CELERY_BROKER_URL
env: DATABASE_URL
ECS Task: worker
image: myapp:latest ← SAME
command: celery -A app worker
-c 4 -Q celery
cpu: 1024
memory: 2048
desiredCount: 2+ (scale)
No port mappings needed
No ALB target group
ECS Task: beat
image: myapp:latest ← SAME
command: celery -A app beat
--scheduler django_celery_beat.schedulers:DatabaseScheduler
cpu: 256
memory: 512
desiredCount: 1 — NEVER scale
Separate service from worker
Ephemeral schedule file risk
Django Producer
Broker (Redis/SQS)
Celery Worker
Beat Scheduler
Result Backend
EC2 / ECS boundary