> FOR UPDATE SKIP LOCKED
Learned something new today. I knew what FOR UPDATE did, but somehow I've never RTFM'd hard enough to know about the SKIP LOCKED directive. That's pretty cool.
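For anyone else seeing it for the first time: the usual trick is to claim the next row in a single statement, and SKIP LOCKED makes concurrent workers hop over rows another worker already holds instead of blocking on them. A sketch, with made-up table and column names:

    -- claim the next pending task; concurrent workers skip rows that
    -- are already locked (tasks and its columns are hypothetical)
    UPDATE tasks
       SET status = 'running'
     WHERE id = (
            SELECT id
              FROM tasks
             WHERE status = 'pending'
             ORDER BY created_at
             LIMIT 1
               FOR UPDATE SKIP LOCKED
           )
    RETURNING id, payload;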
Biggest thing to watch out for with this approach is that you will inevitably have some failure or bug that will 10x, 100x, or 1000x the rate of dead messages, and that will overload your DLQ database. You need a circuit breaker or rate limit on it.
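A real circuit breaker belongs in the application, but one crude way to cap the write rate is inside the insert itself (the dead_letters table and its columns are hypothetical):

    -- refuse the insert once the last minute already saw 1000 dead
    -- letters; the count subquery wants an index on created_at to
    -- stay cheap
    INSERT INTO dead_letters (payload, error)
    SELECT $1, $2
     WHERE (SELECT count(*)
              FROM dead_letters
             WHERE created_at > now() - interval '1 minute') < 1000;

If the statement reports zero rows inserted, the worker knows the breaker tripped and can back off instead of hammering the database.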
This! The only thing worse than your main queue backing up is dropping items on their way into the DLQ because it can't stay up.
If you can’t deliver to the DLQ, then what? Then you’re missing messages either way. Who cares if it’s down this way or the other?
Not necessarily. If you can't deliver the message somewhere, you don't ACK it, and the sender can choose what to do (retry, back off, etc.).
Sure, it's unavailability of course, but it's not data loss.
If you are reading from Kafka (for example) and you can't do anything with a message (broken JSON, say) and you can't put it into a DLQ, you have no other option but to skip it or stop on it, no?
Sure, but you still need to design around this problem. It’s going to be a happy accident that everything turns out fine if you don’t.
The point is to not take the whole server down with it. Keeps the other applications working.
Segment uses MySQL as a full queue, not just as a DLQ, and it works at their scale. So there are many (though not all) systems that can tolerate a database as the queue.
I have a simple flow: tasks on the order of thousands an hour. I just use PostgreSQL. High visibility, easy requeue, durable store. With an appropriate index, it's perfectly fine. An LLM will write the SKIP LOCKED code right the first time. Easy local dev. I always reach for Postgres as the event bus in a low-volume system.
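By "appropriate index" I mean something like a partial index over the pending rows, so the dequeue query never has to scan completed work. A sketch, with illustrative names:

    -- a minimal queue table; the partial index keeps the dequeue scan
    -- fast even after finished rows pile up (names are illustrative)
    CREATE TABLE tasks (
        id         bigserial   PRIMARY KEY,
        payload    jsonb       NOT NULL,
        status     text        NOT NULL DEFAULT 'pending',
        created_at timestamptz NOT NULL DEFAULT now()
    );
    CREATE INDEX tasks_pending_idx
        ON tasks (created_at)
     WHERE status = 'pending';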
Another day, another “Using PostgreSQL for…” thing it wasn’t designed for. This isn’t a good idea. What happens when the queue goes down and all messages are dead lettered? What happens when you end up with competing messages? This is not the way.
You wouldn't ACK the message if you're not able to process it.