WORKERS ACTIVE
SCHEDULER RUNNING
REAPER WATCHING
DISTRIBUTED JOB QUEUE · GO 1.21+

Jobs that survive.
Queues that scale.

Your app writes a job. Postgres records it. Redis schedules it. When Redis goes down — and it will — Postgres still has everything. No lost jobs. No silent failures.

go get github.com/codetesla51/kyu
View on GitHub
Your App write full record push job ID PostgreSQL Redis Worker Pool fetch record pop ID Handler fn() run handler update status completed

Three engines. One queue.

Call Start(). Three goroutines spin up and stay running. They don't talk to each other — they talk to Postgres and Redis. That's the entire coordination layer.

Worker Pool Scheduler Stale Reaper continuous every 5s every 60s pop ID fetch record run handler update status query Postgres find due jobs push IDs to Redis find stuck jobs past timeout? reset to pending Redis PostgreSQL shared foundation
01 · goroutine
Worker Pool

Pops a job ID from Redis. Fetches the full record from Postgres. Runs your handler. Writes the result back. If it fails and retries remain, it goes back in the queue with a longer wait. If retries are exhausted, it's marked dead and stays in Postgres forever — queryable, not silently dropped.

02 · goroutine
Scheduler

Wakes every 5 seconds. Asks Postgres which scheduled jobs are due. Pushes their IDs into Redis. Postgres owns the schedule. Redis just gets told when to run things.

03 · goroutine
Stale Reaper

Wakes every 60 seconds. Finds jobs stuck in running longer than their timeout — usually because the worker that claimed them crashed. Resets them to pending. You don't have to think about this.

Every job. Full history.

Every time a job changes state, Postgres gets a write. pending, running, failed, dead — all of it is in your database, in a normal table, queryable with normal SQL. No vendor API. No dashboard you have to pay for. Just SELECT.

pending cancelled CancelJob() running dequeue completed success failed error re-queued retries remain backoff elapsed dead no retries
Column Type Description
idSTRINGUnique job identifier
job_typeSTRINGRegistered handler name, used to find the handler
priorityINTHigher = processed first. Default 0
created_atTIMEWhen the job was created
updated_atTIMEWhen the job record was last updated
deleted_atTIMESoft delete timestamp. Null if not deleted
payloadSTRINGJSON string passed to the handler
statusSTRINGpending / running / failed / completed / dead / cancelled
completed_atTIMEWhen the job finished. Null if not completed
scheduled_atTIMEWhen to run the job. Null means run immediately
max_retriesINTMaximum number of retry attempts allowed
retry_countINTHow many times this job has been retried
error_messageSTRINGLast error returned by the handler
locked_atTIMEWhen a worker claimed this job
locked_bySTRINGID of the worker that claimed this job

Normal Postgres table. Query it however you want. No proprietary API required.

Up in minutes.

middleware

Production-grade from day one.

dp
Durable by default

Redis is a cache. Postgres is a database. Kyu treats them accordingly. Clear Redis entirely — your jobs are still there.

write PostgreSQL Redis goes down jobs safe
eb
Exponential backoff

First retry waits 1s. Second waits 2s. Third waits 4s. Your downstream service gets a chance to recover instead of getting hammered by a retry loop.

retry delay 1s 2s 4s 8s
pq
Priority queues

Pass a Priority value at enqueue time. Higher score runs first. Redis sorted sets handle the ordering — O(log N) inserts, no scanning.

process_payment 10 send_email 5 generate_report 1 first
sj
Scheduled jobs

Pass a *time.Time to ScheduledAt. nil means run immediately. The Scheduler picks it up when the time comes. No separate cron process needed.

now ScheduledAt send_email running
sr
Stale job reaper

A worker claims a job then the process dies. Without a reaper, that job is stuck in running forever. Kyu resets it automatically based on StaleJobTimeout.

running stuck StaleJobTimeout pending
mw
Middleware system

Register middleware once. Every job runs through it in order — in on the way down, out on the way back. Panics in handlers are caught, not propagated.

Logger → in out ← Timer Panic Recovery Handler fn()
pg
Prometheus + Grafana

/metrics is live the moment you set MetricsPort. Drop in the included docker-compose.yml and Grafana is reading it. Each Kyu instance gets its own registry — run multiple instances, no metric collisions.

/metrics kyu_jobs_total kyu_queue_depth Grafana
ro
RunOnce / Cron mode

Set RunOnce: true. Kyu drains the queue and exits with code 0. Wire it to a Kubernetes CronJob or a plain crontab. No long-running process needed for batch workloads.

queue exit 0 0 * * * *

See everything.

Set MetricsPort in your config. /metrics comes up automatically. The repo includes a docker-compose.yml that runs Postgres, Redis, Prometheus, and Grafana together. One command, full observability stack.

Kyu Grafana dashboard
[dashboard]
Grafana dashboard — queue depth, throughput, failure rates by job type
github.com/codetesla51/kyu
Metric Type Description
kyu_jobs_totalcounterTotal jobs ever submitted
kyu_jobs_processed_totalcounter_vecCompleted jobs, labelled by status
kyu_job_failures_totalcounter_vecFailures, labelled by job_type
kyu_jobs_dead_totalcounterJobs that exhausted all retries
kyu_queue_depthgaugeJobs currently waiting in Redis

Fast by design.

0
register
0 allocations
0
execute
5 allocs/op
0
execute + middleware
7 allocs/op
0
execute parallel
5 allocs/op

These numbers measure dispatch overhead only. In practice your bottleneck is the Postgres write and the Redis round-trip — which is fine, because that's where the durability comes from.

How Kyu fits the landscape.

Redis-only Asynq · BullMQ Fast · fragile Postgres-only River Durable · slower Redis + Postgres Kyu - you are here Fast AND durable

Asynq is fast. It's also Redis-only, which means your job history lives in a key-value store with TTLs. River puts everything in Postgres — durable, but no sorted set for priority dispatch. Kyu doesn't pick one. Postgres is the ledger. Redis is the dispatcher. You get the durability of a relational database and the scheduling performance of a sorted set.

Feature Kyu Asynq River Machinery BullMQ (Node.js)
Language Go Go Go Go Node.js
Storage backend PostgreSQL + Redis Redis only PostgreSQL only Redis / AMQP / MongoDB Redis only
Priority scheduling Redis sorted set Redis sorted set Weighted queues Basic Redis sorted set
Job durability Survives Redis restart Lost if Redis clears Full durability Depends on backend Lost if Redis clears
Transactional enqueue Yes No Yes No No
Full job history Yes (SQL) Limited (Redis TTL) Yes (SQL) Limited Limited (Redis TTL)
Stale job reaper Yes Limited Yes No Limited
Prometheus native Yes Separate No No Separate
Scheduled jobs Yes Yes Yes Yes Yes
Middleware system Yes Yes No Yes Yes
Retries + backoff Exponential Exponential Exponential Basic Exponential
Open source MIT MIT MIT MIT MIT

These numbers measure dispatch overhead only. In practice your bottleneck is the Postgres write and the Redis round-trip — which is fine, because that's where the durability comes from.

Config reference.

All fields have defaults. kyu.New(kyu.Config{}) connects to local Postgres and Redis with 5 workers. If you're running multiple apps against the same Redis instance, set a unique QueueName per app — workers compete for any job in their queue.

FieldTypeDefaultDescription
DSNSTRINGlocalhost:5432Postgres connection string
RedisAddrSTRINGlocalhost:6379Redis address
WorkersINT5Number of concurrent goroutines processing jobs
QueueNameSTRINGkyu:defaultRedis sorted set key. Use different names to isolate queues between apps
MetricsPortINT0 (disabled)Port for Prometheus /metrics endpoint. Set to 0 to disable
StaleJobTimeoutDURATION10mJobs stuck in running beyond this are reset to pending. Handles crashed workers
MaxOpenConnsINT25Postgres connection pool max open connections
MaxIdleConnsINT25Postgres connection pool max idle connections
ConnMaxLifetimeDURATION5mPostgres connection max lifetime
Logger*log.Loggerlog.Default()Logger instance
RunOnceBOOLfalseDrain the queue and exit instead of running a persistent loop

Inspect. Cancel. Recover.

Every job is a row in Postgres. You can query it with normal SQL, or use the built-in methods. Dead jobs stay in the table forever — queryable, not silently dropped.

CancelJob works on pending, scheduled, and failed jobs. It has no effect once a job is running — the worker already holds the lock.

Full stack in one command.

A docker-compose.yml is included. It starts Postgres, Redis, Prometheus, and Grafana together. A pre-built dashboard covers queue depth, job throughput, failure rates by job type, goroutine count, and memory usage.

docker compose up --build
ServiceAddress
Metricslocalhost:9090
Prometheuslocalhost:9090
Grafanalocalhost:3000
Postgreslocalhost:5432
Redislocalhost:6380

Prometheus scrape config.

Running tests.

go test -short ./...
go test ./...
go test -bench=. -benchmem -count=3

Integration tests require Postgres on 5432 and Redis on 6380. Update the DSN and Redis address in kyu_integration_test.go to match your local setup before running.