5. Distributed Systems & Reliability

Design for failure: idempotency, outbox, retries with jitter, circuit breakers, DLQs, and sagas.

Q1 What is the Transactional Outbox pattern and why is it useful?

Answer: The Transactional Outbox pattern ensures that events are reliably published in response to a database change. It works by saving the business state and the event to be published into an "outbox" table within the same local database transaction. A separate relay process then reads from this table, delivers each event to a message broker at least once, and marks the row as sent only after the broker has accepted it.

Explanation: This pattern solves the "dual write" problem. If you first write to your database and then make a separate call to a message broker, the second call might fail, leaving your system in an inconsistent state: the data is saved but the event is lost. By writing the event to an outbox table in the same transaction as the business data, you guarantee that the event will be captured if and only if the business data is successfully saved. The relay then handles "at-least-once" delivery to the message broker. Because the relay can crash after publishing but before marking a row as sent, an event may be published more than once, so consumers should be idempotent.
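The flow above can be sketched with an in-memory model. This is a minimal illustration, not a production implementation: the `Store` type stands in for a real database, a mutex stands in for a local transaction, and the names `Store`, `OutboxEvent`, `CreateOrder`, and `Relay` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// OutboxEvent is one row of the (hypothetical) outbox table.
type OutboxEvent struct {
	ID      int
	Topic   string
	Payload string
	Sent    bool
}

// Store is an in-memory stand-in for a database that holds both tables.
type Store struct {
	mu     sync.Mutex
	orders map[string]int // order ID -> amount (the business state)
	outbox []OutboxEvent
}

func NewStore() *Store { return &Store{orders: map[string]int{}} }

// CreateOrder writes the order and its event as one atomic unit, mimicking a
// single local transaction: either both are stored or neither is.
func (s *Store) CreateOrder(id string, amount int) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, exists := s.orders[id]; exists {
		return errors.New("order already exists") // "rollback": nothing is written
	}
	s.orders[id] = amount
	s.outbox = append(s.outbox, OutboxEvent{
		ID:      len(s.outbox) + 1,
		Topic:   "order.created",
		Payload: fmt.Sprintf(`{"id":%q,"amount":%d}`, id, amount),
	})
	return nil
}

// Relay publishes every unsent event and marks it sent only after publish
// succeeds. A crash between publishing and marking would cause a re-publish
// on the next pass, which is why delivery is at-least-once.
func (s *Store) Relay(publish func(OutboxEvent) error) (sent int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for i := range s.outbox {
		if s.outbox[i].Sent {
			continue
		}
		if err := publish(s.outbox[i]); err != nil {
			break // preserve ordering: retry this event on the next pass
		}
		s.outbox[i].Sent = true
		sent++
	}
	return sent
}

func main() {
	s := NewStore()
	_ = s.CreateOrder("o-1", 42)
	n := s.Relay(func(e OutboxEvent) error {
		fmt.Println("publish", e.Topic, e.Payload)
		return nil
	})
	fmt.Println("relayed", n)
}
```

In a real service, `CreateOrder` would issue two INSERT statements inside one database transaction, and `Relay` would run as a background poller or be replaced by change data capture on the outbox table.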

Q2 How do you handle idempotency for write operations in an API?

Answer: Idempotency is typically handled by requiring the client to generate and send a unique key (e.g., Idempotency-Key header) for each state-changing request. The server tracks these keys for a period of time. If a request comes in with a key that has already been processed, the server can safely skip the operation and return the previously generated result.

Explanation: This pattern is crucial for building reliable systems, as it makes retries safe. A client can safely retry a request that timed out without fear of creating duplicate transactions or objects. The idempotency key store is often a fast key-value store like Redis with a TTL on the keys, ensuring they are kept long enough to handle duplicate requests but not forever. The key lookup and the recording of the result should be atomic; otherwise two concurrent retries with the same key can both execute the operation.
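A minimal sketch of the server-side bookkeeping, assuming an in-memory map in place of Redis. The names `IdempotencyStore` and `Do` are hypothetical, and the single mutex is a simplification that serializes all requests; a real store would lock or reserve per key.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cachedResult is the response stored for one idempotency key.
type cachedResult struct {
	body    string
	expires time.Time
}

// IdempotencyStore remembers the result produced for each key until its TTL
// lapses. Expired entries are ignored here; Redis would evict them for us.
type IdempotencyStore struct {
	mu      sync.Mutex
	ttl     time.Duration
	results map[string]cachedResult
}

func NewIdempotencyStore(ttl time.Duration) *IdempotencyStore {
	return &IdempotencyStore{ttl: ttl, results: map[string]cachedResult{}}
}

// Do runs op at most once per live key. For a repeated key it skips op and
// replays the stored result. The bool reports whether the result was replayed.
func (s *IdempotencyStore) Do(key string, op func() (string, error)) (string, bool, error) {
	s.mu.Lock()
	defer s.mu.Unlock() // lookup, execution, and recording happen atomically
	if r, ok := s.results[key]; ok && time.Now().Before(r.expires) {
		return r.body, true, nil
	}
	body, err := op()
	if err != nil {
		return "", false, err // failures are not cached, so the client may retry
	}
	s.results[key] = cachedResult{body: body, expires: time.Now().Add(s.ttl)}
	return body, false, nil
}

func main() {
	store := NewIdempotencyStore(24 * time.Hour)
	charges := 0
	charge := func() (string, error) {
		charges++
		return fmt.Sprintf("charge #%d created", charges), nil
	}
	first, _, _ := store.Do("key-123", charge)
	second, replayed, _ := store.Do("key-123", charge) // simulated client retry
	fmt.Println(first, "|", second, "| replayed:", replayed, "| charges:", charges)
}
```

An HTTP handler would read the key from the Idempotency-Key header and pass it to `Do` together with the state-changing operation.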

Q3 How do you implement resilient retries and circuit breaking?

Answer: Use bounded exponential backoff with jitter for retriable errors, and a circuit breaker to shed load when a dependency is failing persistently.

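Explanation: Exponential backoff reduces the load that retries place on a struggling dependency, and jitter spreads those retries out so that many clients do not retry in lockstep (the "thundering herd"). A circuit breaker stops calling a dependency that keeps failing and fails fast instead, which prevents cascading failures and gives the dependency time to recover.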

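// retryCall runs call up to maxAttempts times, backing off between retriable failures.
// Requires imports "math/rand" and "time"; isRetryable is assumed to be defined elsewhere.
func retryCall(call func() error) error {
    const base, maxAttempts = 100 * time.Millisecond, 5
    var err error
    for attempt := 0; attempt < maxAttempts; attempt++ {
        if err = call(); err == nil || !isRetryable(err) {
            return err // success, or a permanent error that retrying cannot fix
        }
        if attempt == maxAttempts-1 {
            break // attempts exhausted: skip the pointless final sleep
        }
        sleep := base << attempt // 100ms, 200ms, 400ms, 800ms
        // Equal jitter: wait a random duration in [sleep/2, sleep).
        time.Sleep(sleep/2 + time.Duration(rand.Int63n(int64(sleep/2))))
    }
    return err // last retriable error after all attempts failed
}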
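The circuit breaker half of the answer can be sketched as follows. This is a minimal count-based breaker under stated assumptions: the names `Breaker`, `ErrOpen`, and `errDown` are hypothetical, the thresholds are arbitrary, and the half-open state is simplified (after the cooldown, every caller is let through until one of them fails again).

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrOpen is returned when the breaker rejects a call without trying it.
var ErrOpen = errors.New("circuit breaker is open")

// errDown simulates a failing dependency in the usage example.
var errDown = errors.New("dependency down")

// Breaker opens after maxFailures consecutive failures and stays open for
// cooldown. After the cooldown it lets trial calls through (half-open).
type Breaker struct {
	mu          sync.Mutex
	maxFailures int
	cooldown    time.Duration
	failures    int       // consecutive failures seen so far
	openedAt    time.Time // when the breaker last opened
}

func NewBreaker(maxFailures int, cooldown time.Duration) *Breaker {
	return &Breaker{maxFailures: maxFailures, cooldown: cooldown}
}

// Call runs fn unless the breaker is open, in which case it fails fast.
func (b *Breaker) Call(fn func() error) error {
	b.mu.Lock()
	if b.failures >= b.maxFailures && time.Since(b.openedAt) < b.cooldown {
		b.mu.Unlock()
		return ErrOpen // shed load: do not touch the dependency
	}
	b.mu.Unlock()

	err := fn() // closed state, or a half-open trial after the cooldown

	b.mu.Lock()
	defer b.mu.Unlock()
	if err == nil {
		b.failures = 0 // a success closes the breaker
		return nil
	}
	b.failures++
	if b.failures >= b.maxFailures {
		b.openedAt = time.Now() // open (or re-open) and restart the cooldown
	}
	return err
}

func main() {
	b := NewBreaker(3, 30*time.Second)
	for i := 1; i <= 5; i++ {
		err := b.Call(func() error { return errDown })
		fmt.Printf("call %d: %v\n", i, err)
	}
}
```

In the usage example the first three calls reach the dependency and fail, and the remaining two are rejected with `ErrOpen`. A breaker like this typically wraps the retry loop's target call, and `ErrOpen` should be classified as non-retriable for the duration of the cooldown.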

Q4 How do you classify errors for retries?

Q5 How do you design Dead Letter Queues (DLQs) and retries for message processing?

Q6 What is the Saga pattern, and when do you use it?

Q7 Can you achieve exactly-once processing?