3. Performance & Memory

Measure before optimizing; choose the right structures and minimize allocations and copies.

Question: What are some key Python data structures, and what are their performance characteristics?

Answer: The choice of data structure is critical for performance. Key structures include list (dynamic array), deque (double-ended queue), and dict/set (hash tables).

Explanation:

  • list: Amortized O(1) for append and pop at the end; O(n) for insertion or deletion at the start or in the middle, since elements must be shifted.

  • collections.deque: O(1) for append/pop at both ends, making it the right choice for queues; random access in the middle is O(n).

  • dict/set: Average O(1) for lookups, insertions, and deletions due to their hash table implementation.

  • heapq: Provides min-heap operations on a plain list; heappush and heappop are O(log n), which makes it an efficient priority queue.
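
A quick sketch illustrating these trade-offs (complexities are asymptotic, not measured):

from collections import deque
import heapq

queue = deque([1, 2, 3])
queue.appendleft(0)               # O(1); list.insert(0, x) would be O(n)
queue.pop()                       # O(1) at the right end as well

heap = [5, 1, 4]
heapq.heapify(heap)               # O(n) to build the heap in place
smallest = heapq.heappop(heap)    # O(log n); always removes the minimum
heapq.heappush(heap, 2)           # O(log n)

seen = set()
seen.add("key")                   # average O(1) insertion and membership test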

Question: What is the difference between a list comprehension and a generator expression?

Answer: A list comprehension creates and returns a new list containing all the elements at once, consuming memory proportional to the list's size. A generator expression returns a generator object, which produces elements one by one (lazily) and consumes constant, minimal memory.

Explanation: Use a list comprehension when you need the full list in memory, for example, to iterate over it multiple times. Use a generator expression for large datasets or when you are iterating over the results only once, as it is far more memory-efficient.

# List comprehension (builds a list in memory)
squares_list = [x*x for x in range(1000)]

# Generator expression (produces values on demand)
squares_gen = (x*x for x in range(1000))

# The generator is more efficient if you just need to sum the results
total = sum(x*x for x in range(1000))

Question: When should you use array.array or struct over list of numbers?

Answer: Use array('d') (from array.array) for compact, homogeneous numeric storage, and struct for packing/unpacking binary data with fixed layouts.

Explanation: A list of floats stores a pointer to a boxed Python object per element, whereas array('d') stores raw C doubles contiguously, using far less memory and improving cache locality. struct converts between Python values and fixed-size byte layouts, which is what binary file formats and network protocols require.

from array import array
import struct

buf = array('d', [1.0, 2.0])          # dense doubles
packed = struct.pack('!I', 123)       # network-order uint32
value, = struct.unpack('!I', packed)

Question: How is a Python dict implemented, and why are lookups O(1) on average?

Answer: Python's dictionaries are implemented using a hash table. This structure allows for average O(1) time complexity for insertions, deletions, and lookups.

Explanation: When you add a key-value pair, Python uses the key's hash value (hash(key)) to determine which "bucket" or slot in the hash table to store the pair. Lookups work the same way. The hash is used to instantly find the correct bucket, making the process very fast. In the case of a hash collision (two keys hashing to the same bucket), Python uses a technique called open addressing to find the next available slot. As the dictionary grows, it is dynamically resized to keep it sparse, which maintains the O(1) average time complexity.

Additionally, dictionaries preserve insertion order (guaranteed since Python 3.7), which is frequently leveraged in APIs and serialization.
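
A minimal sketch of the points above: a key's __hash__ picks the bucket, __eq__ resolves collisions, and insertion order is preserved.

class Point:
    def __init__(self, x: int, y: int):
        self.x, self.y = x, y

    def __hash__(self) -> int:                # determines the bucket
        return hash((self.x, self.y))

    def __eq__(self, other: object) -> bool:  # resolves collisions
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

d = {Point(1, 2): "home"}
print(d[Point(1, 2)])              # average O(1) lookup: 'home'
print(list({"b": 1, "a": 2}))      # ['b', 'a'] -- insertion order kept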

Question: Your Python application is slow. How would you profile it to find the bottleneck?

Answer: I would use a combination of tools depending on the problem. For micro-benchmarks, timeit. For overall CPU usage, cProfile. For line-by-line CPU analysis, line_profiler. For memory issues, tracemalloc.

Explanation:

  • timeit is for timing small, isolated snippets of code accurately.

  • cProfile and pstats give a high-level overview of which functions take the most time.

  • line_profiler (a third-party package) gives a line-by-line breakdown of time spent within a specific function.

  • tracemalloc helps identify where memory is being allocated, which is crucial for diagnosing memory leaks.
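
A short sketch of how these tools are typically invoked; slow() is just a stand-in workload:

import cProfile
import pstats
import timeit
import tracemalloc

def slow() -> int:
    # Stand-in for the code under investigation
    return sum(i * i for i in range(100_000))

# timeit: accurate micro-benchmark of an isolated callable
print(timeit.timeit(slow, number=10))

# cProfile + pstats: which functions dominate CPU time?
profiler = cProfile.Profile()
profiler.enable()
slow()
profiler.disable()
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)

# tracemalloc: where is memory being allocated?
tracemalloc.start()
data = [str(i) for i in range(10_000)]
for stat in tracemalloc.take_snapshot().statistics("lineno")[:3]:
    print(stat)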

Question: How can you implement caching in Python?

Answer: Caching can be done in-memory using decorators like @functools.lru_cache, or externally using a dedicated service like Redis for distributed systems.

Explanation: @lru_cache is a simple and effective way to cache the results of expensive, deterministic function calls in memory, using a Least Recently Used (LRU) eviction policy; all arguments must be hashable, since they form the cache key. For distributed systems, an external cache like Redis is a better fit, providing features like Time-To-Live (TTL), persistence, and sharing across processes.

from functools import lru_cache

@lru_cache(maxsize=1024)
def get_config_from_db(key: str) -> str | None:
    # Expensive database call
    return load_from_db(key)
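
lru_cache also attaches introspection helpers to the decorated function, handy for tuning maxsize and invalidating stale entries:

print(get_config_from_db.cache_info())   # hits, misses, maxsize, currsize
get_config_from_db.cache_clear()         # drop all cached entries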

Question: What is the difference between copy.copy and copy.deepcopy?

Answer: copy.copy makes a shallow copy: a new outer container whose elements are still references to the same inner objects. copy.deepcopy recursively copies nested objects, producing a fully independent structure.

Explanation: Use shallow copy when inner objects are immutable or intentionally shared; use deep copy to isolate state.

import copy

a = {"nums": [1, 2]}
b = copy.copy(a)
c = copy.deepcopy(a)
b["nums"].append(3)  # also changes a["nums"]
c["nums"].append(4)  # does not change a

Question: When should you use bisect or heapq instead of sorting repeatedly?

Answer: Maintain a sorted sequence with bisect, which finds the insertion point in O(log n) (the list insert itself is O(n) due to element shifting, but still far cheaper than an O(n log n) re-sort), and use heapq for efficient top-k/min extraction.

Explanation: For streaming data and sliding windows, incremental updates beat full recomputation.

import bisect

def stream() -> list[int]:          # stand-in for an incoming data source
    return [5, 1, 4, 2, 3]

arr: list[int] = []
for x in stream():
    bisect.insort(arr, x)           # O(log n) search + O(n) shift
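
For the top-k case mentioned in the answer, heapq avoids sorting everything; a small sketch:

import heapq

data = [9, 1, 7, 3, 8, 2]
top3 = heapq.nlargest(3, data)      # [9, 8, 7] in O(n log k)
low2 = heapq.nsmallest(2, data)     # [1, 2]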

Question: How can memoryview and bytearray improve performance?

Answer: memoryview enables zero-copy slicing of bytes-like objects; bytearray allows in-place edits, reducing allocations.

Explanation: Useful for parsers and binary protocols. Avoid repeated bytes concatenation: because bytes are immutable, each += copies the entire accumulated buffer, making the pattern O(n²); bytearray grows in place instead.
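
A small sketch of both techniques, assuming a simple two-byte-header packet layout:

packet = bytearray(b"\x00\x2ahello")   # hypothetical header + payload
view = memoryview(packet)

header = view[:2]             # zero-copy slice: no bytes duplicated
body = view[2:]
packet[0] = 0xFF              # in-place edit; both views see the change

buf = bytearray()
for chunk in (b"a", b"b", b"c"):
    buf += chunk              # grows in place, unlike bytes concatenation
print(bytes(body), bytes(buf))   # b'hello' b'abc'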