7. Robustness & Error Handling

Fail fast with context, keep exceptions specific, and design retries to be safe and bounded.

Q1 What are your principles for robust exception handling?

Answer: Fail fast, be specific in except clauses, and add context to exceptions. Never swallow exceptions silently. A good strategy is to wrap low-level exceptions in custom, domain-specific exceptions to create a clear boundary and prevent implementation details from leaking.

Explanation: Using raise NewException from old_exception sets the original exception as the __cause__ of the new one, so the traceback shows both the low-level failure and the domain-level error. This causal chain makes debugging much easier. Logging actionable information is also essential for production systems: not just the exception name, but context such as relevant IDs or parameters.

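class ExternalError(Exception):
    """Stand-in for a low-level error raised by a dependency."""

class DomainError(Exception):
    """Error exposed at the domain boundary."""

def risky_call(item_id):
    # Placeholder for the real low-level operation
    raise ExternalError("connection reset")

def process_item(item_id):
    try:
        risky_call(item_id)
    except ExternalError as e:
        # Add context and preserve the original cause (stored in __cause__)
        raise DomainError(f"Failed to process item {item_id}") from e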
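
To make the logging point concrete, here is a minimal sketch of logging once at the boundary with the relevant ID and the chained cause. The names StorageError, OrderError, load_order, handle_request, and the "orders" logger are hypothetical and exist only for illustration.

```python
import logging

logger = logging.getLogger("orders")  # hypothetical logger name


class StorageError(Exception):
    """Hypothetical low-level error."""


class OrderError(Exception):
    """Hypothetical domain-level error."""


def load_order(order_id):
    try:
        # Stand-in for a low-level call that fails
        raise StorageError("row not found")
    except StorageError as e:
        # Wrap with context; the original error is kept in __cause__
        raise OrderError(f"Could not load order {order_id}") from e


def handle_request(order_id):
    try:
        return load_order(order_id)
    except OrderError as e:
        # Log once, with the ID and the underlying cause;
        # exc_info=True adds the full chained traceback to the record
        logger.error(
            "order lookup failed: order_id=%s cause=%r",
            order_id,
            e.__cause__,
            exc_info=True,
        )
        return None
```

Logging at a single boundary, rather than at every level that re-raises, keeps one failure from producing several near-identical log entries.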

Q2 What is an `ExceptionGroup` and how do you handle it?

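Answer: ExceptionGroup (Python 3.11+) bundles multiple unrelated exceptions into one, which is common with parallel or async tasks. Use except* to handle the members of the group by type.

Explanation: Each except* clause receives a sub-group containing only the matching exceptions, and anything left unmatched is re-raised as a new ExceptionGroup. This avoids losing errors from sibling tasks.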

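try:
    raise ExceptionGroup("many", [ValueError("bad value"), KeyError("missing")])
except* ValueError as eg:
    # eg holds only the ValueErrors; the handle_* functions are placeholders
    handle_value_errors(eg.exceptions)
except* KeyError as eg:
    # Without this clause the KeyError would propagate in a new ExceptionGroup
    handle_key_errors(eg.exceptions)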

Q3 How do you implement resilient retries?

Answer: Use bounded retries with exponential backoff and jitter; respect idempotency and timeouts.

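Explanation: Backoff with jitter spreads retries out so that clients do not all retry in lockstep (a thundering herd), and bounding the attempts keeps a persistent failure from cascading. Retry only operations that are safe to repeat.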

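import random
import time

class TransientError(Exception):
    """Placeholder for errors that are worth retrying."""

def retry(op, attempts=5, base=0.1):
    for n in range(attempts):
        try:
            return op()
        except TransientError:
            if n == attempts - 1:
                raise  # Out of attempts: re-raise the last error
            # Exponential backoff plus jitter
            delay = base * (2 ** n) + random.uniform(0, base)
            time.sleep(delay)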
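
The answer also mentions timeouts. One way to respect them, sketched here under assumptions, is to cap each delay and stop retrying once an overall time budget would be exceeded. The names retry_with_deadline, cap, deadline, and the injectable sleep parameter are hypothetical, and this variant uses "full jitter" (a random delay between zero and the exponential bound) rather than the additive jitter above.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for retryable failures."""


def retry_with_deadline(op, attempts=5, base=0.1, cap=2.0, deadline=10.0,
                        sleep=time.sleep):
    start = time.monotonic()
    for n in range(attempts):
        try:
            return op()
        except TransientError:
            # Full jitter: anywhere up to the capped exponential bound
            delay = random.uniform(0, min(cap, base * (2 ** n)))
            out_of_time = time.monotonic() - start + delay > deadline
            if n == attempts - 1 or out_of_time:
                raise  # Give up and surface the last error
            sleep(delay)


# Usage: an operation that fails twice, then succeeds
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("temporary failure")
    return "ok"


delays = []  # Record delays instead of actually sleeping
result = retry_with_deadline(flaky, sleep=delays.append)
```

Passing sleep as a parameter keeps the helper testable without real waiting. The operation itself still needs to be idempotent for any of this to be safe.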
