6. Datastores & Query Design
Tune pools, avoid hotspots, design retries for serialization conflicts, and plan online schema changes.
Question: What are the most important settings to tune in database/sql's connection pool?
Answer: The key settings are SetMaxOpenConns, SetMaxIdleConns, SetConnMaxIdleTime, and SetConnMaxLifetime. MaxOpenConns limits total connections, MaxIdleConns bounds idle ones, ConnMaxIdleTime retires long-idle connections, and ConnMaxLifetime refreshes connections periodically to handle restarts and topology changes.
Explanation: The defaults are often unsuitable for production: MaxOpenConns is unlimited and MaxIdleConns is only 2. MaxOpenConns is your main source of backpressure against overwhelming the database. Keep MaxIdleConns below MaxOpenConns so the pool can shrink back down after bursts, use ConnMaxIdleTime to age out stale connections, and set ConnMaxLifetime (typically a few minutes) so connections rotate regularly and survive transient network issues or failovers.
db.SetMaxOpenConns(50)                  // hard cap: backpressure on the database
db.SetMaxIdleConns(25)                  // keep some warm, but let bursts drain
db.SetConnMaxIdleTime(2 * time.Minute)  // retire connections idle too long
db.SetConnMaxLifetime(10 * time.Minute) // rotate connections to track failovers/topology
Question: How do you avoid row-level contention in work queues?
Answer: Use SELECT ... FOR UPDATE SKIP LOCKED to let workers skip locked rows; combine with small batches.
Explanation: This prevents thundering herds on the same row, improving throughput and reducing deadlocks.
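A minimal sketch of a worker claiming a batch this way, assuming a hypothetical jobs(id, status) table in Postgres:

// claimJobs locks up to n pending jobs for this worker; rows already
// locked by other workers are skipped rather than waited on.
func claimJobs(ctx context.Context, tx *sql.Tx, n int) ([]int64, error) {
	rows, err := tx.QueryContext(ctx, `
		SELECT id FROM jobs
		WHERE status = 'pending'
		ORDER BY id
		LIMIT $1
		FOR UPDATE SKIP LOCKED`, n)
	if err != nil {
		return nil, err
	}
	defer rows.Close()
	var ids []int64
	for rows.Next() {
		var id int64
		if err := rows.Scan(&id); err != nil {
			return nil, err
		}
		ids = append(ids, id)
	}
	return ids, rows.Err()
}

The locks are released when the claiming transaction commits or rolls back, so keep the batch small enough that the transaction stays short.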
Question: How can you mitigate "hot row" contention in a database?
Answer: "Hot row" contention occurs when many transactions try to update the same row(s) simultaneously. Mitigation strategies include sharding the data to distribute writes, using randomized keys instead of sequential ones to avoid writing to the same page, and redesigning the access pattern to use queueing or batching at the application layer instead of direct database updates.
Explanation: This is a classic scalability problem. For example, a global counter stored in a single row will become a bottleneck. Instead of UPDATE counters SET value = value + 1, you could write increments to a log or queue and have a background process aggregate them periodically, as sketched below. For NoSQL databases, the choice of partition key is critical to avoid hot partitions, which is the same problem under a different name.
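A minimal sketch of application-side batching for a hot counter: increments accumulate in memory and a background goroutine flushes the delta in one UPDATE. The counters table, key column, and one-second flush interval are assumptions for illustration.

// batchedCounter absorbs many Inc calls in memory so the database
// sees one aggregated UPDATE per interval instead of one per call.
type batchedCounter struct {
	delta atomic.Int64
}

func (c *batchedCounter) Inc() { c.delta.Add(1) }

func (c *batchedCounter) flushLoop(ctx context.Context, db *sql.DB, key string) {
	t := time.NewTicker(time.Second)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-t.C:
			d := c.delta.Swap(0) // take the accumulated delta
			if d == 0 {
				continue
			}
			if _, err := db.ExecContext(ctx,
				`UPDATE counters SET value = value + $1 WHERE key = $2`, d, key); err != nil {
				c.delta.Add(d) // restore on failure; retried next tick
			}
		}
	}
}

The trade-off is a small window of unflushed increments on crash, which is usually acceptable for metrics-style counters.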
Question: How do you design transactions and handle retries on serialization conflicts?
Answer: Execute business logic inside a retry loop; start a transaction with an appropriate isolation level, roll back on serialization/deadlock errors, and retry with backoff.
Explanation: Under high contention, optimistic retries are required; PostgreSQL signals a serialization failure with SQLSTATE 40001 (and a deadlock with 40P01). Keep transactions short and idempotent so a retry is safe.
for attempt := 0; attempt < 5; attempt++ {
	tx, err := db.BeginTx(ctx, &sql.TxOptions{Isolation: sql.LevelSerializable})
	if err != nil { return err }
	if err := doWork(ctx, tx); err != nil {
		_ = tx.Rollback()
		if isSerialization(err) {
			time.Sleep(time.Duration(attempt+1) * 50 * time.Millisecond) // backoff before retry
			continue
		}
		return err
	}
	if err := tx.Commit(); err != nil {
		if isSerialization(err) {
			time.Sleep(time.Duration(attempt+1) * 50 * time.Millisecond)
			continue
		}
		return err
	}
	return nil // committed
}
return errors.New("transaction retries exhausted")
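One way to implement the isSerialization helper, assuming the lib/pq driver (github.com/lib/pq); with pgx you would match on *pgconn.PgError instead:

// SQLSTATE 40001 is serialization_failure; 40P01 is deadlock_detected.
// Both are safe to retry when the transaction body is idempotent.
func isSerialization(err error) bool {
	var pqErr *pq.Error
	if errors.As(err, &pqErr) {
		return pqErr.Code == "40001" || pqErr.Code == "40P01"
	}
	return false
}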
Question: How do you run online schema changes?
Answer: Use online migration tools (gh-ost, pt-online-schema-change) for MySQL or CREATE INDEX CONCURRENTLY in Postgres; deploy in small steps with rollbacks.
Explanation: These approaches build the change in the background and swap it in at the end, avoiding long table locks and runaway replication lag. Roll out in small expand/contract steps and gate new code paths with feature flags so each step can be reverted.
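For instance, building an index in Postgres without blocking writes (the table and index names here are illustrative):

// Must run outside any explicit transaction: CREATE INDEX CONCURRENTLY
// fails inside a transaction block, so execute it directly on *sql.DB.
if _, err := db.ExecContext(ctx,
	`CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_user_id ON orders (user_id)`); err != nil {
	return err
}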
Question: How do you read from replicas safely?
Answer: Use read-after-write strategies (session stickiness, WAIT_FOR_LAST_LOCALLY_APPLIED/READ YOUR WRITES) or only serve stale-tolerant queries from replicas.
Explanation: Replication lag breaks consistency; prefer leader reads for critical paths.
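A minimal sketch of session stickiness, assuming a hypothetical router type and a fixed pin window chosen to exceed typical replication lag (both are illustrative, not a library API):

// router pins a session's reads to the primary for a short window
// after that session writes, giving it read-your-writes semantics.
type router struct {
	primary, replica *sql.DB
	mu               sync.Mutex
	stickyUntil      map[string]time.Time // session ID -> pin deadline
}

func (r *router) AfterWrite(session string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.stickyUntil[session] = time.Now().Add(2 * time.Second) // assumed lag bound
}

func (r *router) ForRead(session string) *sql.DB {
	r.mu.Lock()
	defer r.mu.Unlock()
	if time.Now().Before(r.stickyUntil[session]) {
		return r.primary // recent writer: serve from the leader
	}
	return r.replica // stale-tolerant reads go to the replica
}

A fixed window is a heuristic; systems that need a hard guarantee track a replication position (LSN/GTID) per session and wait for the replica to catch up instead.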