1. Python Internals & Data Model

Master the runtime: execution model, object layout, hashing, attribute lookup, and the implications for correctness and performance.

Question: Can you describe the CPython execution model and how it manages memory?

Answer: CPython compiles Python source code to bytecode, which its virtual machine then executes. For memory management it relies primarily on reference counting, which frees an object as soon as its count drops to zero, supplemented by a cyclic garbage collector that reclaims reference cycles that counting alone can never free.

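Explanation: A critical component of this model is the Global Interpreter Lock (GIL), a mutex that ensures only one thread executes Python bytecode at a time. This simplifies memory management by preventing race conditions on object reference counts, but it stops CPU-bound threads from running in parallel, so it is a known bottleneck for CPU-bound multi-threaded applications. I/O-bound threads are less affected because the GIL is released while waiting on I/O. CPython 3.13 added an experimental free-threaded build that removes the GIL.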
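
A minimal sketch of both halves of this model, using only the standard library: dis lists the bytecode the compiler produced, while sys.getrefcount and gc.collect show reference counting and the cycle collector at work. The names add, Node, payload, and probe are illustrative only. Exact opcode names and absolute refcount values vary between CPython versions, so the sketch looks only at relative changes.

```python
import dis
import gc
import sys
import weakref


def add(a, b):
    return a + b


# Bytecode instruction names the compiler produced for add().
opnames = [ins.opname for ins in dis.get_instructions(add)]

# Reference counting: binding a second name adds one reference.
payload = []
before = sys.getrefcount(payload)
alias = payload
after = sys.getrefcount(payload)


class Node:
    """Hypothetical class used only to build a reference cycle."""


# A reference cycle: the refcount never reaches zero on its own.
gc.disable()                 # keep automatic collection out of the demo
n = Node()
n.self_ref = n               # the object now refers to itself
probe = weakref.ref(n)       # lets us observe the object without owning it
del n
alive_before_gc = probe() is not None   # still alive: the cycle holds it
gc.collect()                 # the cycle collector finds and frees it
alive_after_gc = probe() is not None
gc.enable()
```

Calling dis.dis(add) prints the same instructions in a readable table, which is usually the quickest way to see what the virtual machine will execute.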

Question: What is the difference between is and ==? What is the relationship between an object's equality and its hash?

Answer: is compares object identity (whether two names refer to the very same object), while == compares equality by calling the __eq__ method. The hash/equality contract states that if two objects are considered equal (a == b), then their hash values must also be equal (hash(a) == hash(b)).

Explanation: The reverse is not true: two unequal objects may share the same hash, which is known as a hash collision. This contract is essential for objects to work correctly as keys in dictionaries or as elements in sets. A class that overrides __eq__ without defining __hash__ has its __hash__ set to None, which makes its instances unhashable.
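
The contract can be sketched with two small classes. Point and EqOnly are hypothetical names introduced for illustration: the first hashes exactly the fields that __eq__ compares, the second defines __eq__ alone and so becomes unhashable.

```python
class Point:
    """Hypothetical value type that keeps __eq__ and __hash__ consistent."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash exactly the fields that __eq__ compares.
        return hash((self.x, self.y))


class EqOnly:
    """Defines __eq__ but not __hash__, so Python sets __hash__ to None."""

    def __eq__(self, other):
        return isinstance(other, EqOnly)


p, q = Point(1, 2), Point(1, 2)
same_identity = p is q          # two distinct objects
equal = p == q                  # __eq__ compares the fields
found = q in {p: "stored"}      # equal objects share a hash, so lookup works

try:
    hash(EqOnly())
    hashable = True
except TypeError:
    hashable = False            # cannot be used as a dict key or set element
```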

Question: Explain the difference between __getattr__ and __getattribute__.

Answer: __getattribute__ is called for every attribute access on an object, regardless of whether the attribute exists. __getattr__ is a fallback that is only called if the requested attribute is not found through normal mechanisms.

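Explanation: You must be extremely careful when implementing __getattribute__: any attribute access on self inside it calls the method again, so the only safe way to read an attribute there is to delegate to the base class's __getattribute__ (for example super().__getattribute__(name)). Otherwise the method recurses infinitely. __getattr__ is safer and is commonly used to implement proxy objects or to compute attributes on the fly.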
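
A small sketch of the two hooks side by side. Proxy and Logged are hypothetical classes written for this example: Proxy uses __getattr__ as a fallback to forward unknown attributes, and Logged overrides __getattribute__ while delegating to the base implementation to avoid recursion.

```python
class Proxy:
    """Hypothetical proxy that forwards unknown attributes to a target."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Reached only after normal lookup has failed.
        # self._target is found normally, so this does not recurse.
        return getattr(self._target, name)


class Logged:
    """Hypothetical class that records every attribute read."""

    def __init__(self):
        self.value = 42
        self.log = []

    def __getattribute__(self, name):
        # Always go through the base implementation; writing self.log
        # here would call this method again and recurse forever.
        log = super().__getattribute__("log")
        log.append(name)
        return super().__getattribute__(name)


upper = Proxy("abc").upper()          # str.upper reached via __getattr__

obj = Logged()
value = obj.value                     # intercepted by __getattribute__
reads = list(object.__getattribute__(obj, "log"))  # bypass the logging hook
```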

Question: What is the purpose of __slots__ and when is it appropriate to use it?

Answer: __slots__ is a class variable that pre-declares instance attributes. By defining __slots__, you prevent the creation of a __dict__ and __weakref__ for each instance, leading to significant memory savings and slightly faster attribute access.

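Explanation: This is an optimization best used when you expect to create a very large number of small objects where memory is a concern. The main trade-off is inflexibility: you cannot add new attributes to instances that are not declared in __slots__, and instances cannot be weakly referenced unless "__weakref__" is listed in __slots__. A subclass that does not define __slots__ itself gets a __dict__ again.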

class Money:
    __slots__ = ("amount", "currency")
    def __init__(self, amount: int, currency: str) -> None:
        self.amount = amount
        self.currency = currency
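
The trade-offs can be observed directly. This is a sketch only: SlottedMoney is a hypothetical copy of the class above, renamed so the block stands alone.

```python
import weakref


class SlottedMoney:
    """Hypothetical slotted class mirroring the Money example."""

    __slots__ = ("amount", "currency")

    def __init__(self, amount: int, currency: str) -> None:
        self.amount = amount
        self.currency = currency


m = SlottedMoney(100, "USD")

has_dict = hasattr(m, "__dict__")     # no per-instance dict is created

try:
    m.note = "tip"                    # not declared in __slots__
    new_attr_rejected = False
except AttributeError:
    new_attr_rejected = True

try:
    weakref.ref(m)                    # "__weakref__" is not in __slots__
    weakref_rejected = False
except TypeError:
    weakref_rejected = True
```

Adding "__weakref__" to the __slots__ tuple restores weak-reference support at the cost of one extra pointer per instance.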

Question: How does a Python dict or set work internally to achieve O(1) lookups on average?

Answer: Python's dictionaries and sets are implemented using a hash table. When an object is added, its hash is used to determine which "bucket" to place it in. This allows for average O(1) time complexity for lookups, insertions, and deletions.

Explanation: The hash table is a sparse array. To find an element, Python takes the hash of the key and maps it to a slot index, so it can jump straight to the right bucket instead of scanning. If multiple keys map to the same bucket (a collision), Python uses a technique called open addressing to probe for the next available slot. The table is automatically resized as it grows to maintain sparsity. Lookups are therefore O(1) on average because collisions stay rare, and insertions are amortized O(1) because the cost of an occasional resize is spread over many cheap insertions.
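
The mechanism described above can be sketched as a toy map. This is not CPython's implementation: ToyHashMap is a hypothetical class that uses plain linear probing and a simple load-factor rule for clarity, whereas the real dict uses a different probe sequence and memory layout, and this sketch does not support deletion.

```python
class ToyHashMap:
    """Simplified open-addressing map with linear probing (no deletion)."""

    def __init__(self, capacity: int = 8) -> None:
        self._slots = [None] * capacity   # each slot: None or (key, value)
        self._used = 0

    def _probe(self, key) -> int:
        """Return the index holding key, or the first empty slot for it."""
        mask = len(self._slots) - 1       # capacity is always a power of two
        i = hash(key) & mask
        while self._slots[i] is not None and self._slots[i][0] != key:
            i = (i + 1) & mask            # collision: try the next slot
        return i

    def __setitem__(self, key, value) -> None:
        i = self._probe(key)
        if self._slots[i] is None:
            self._used += 1
        self._slots[i] = (key, value)
        # Grow once the table is two-thirds full so probes stay short.
        if self._used * 3 >= len(self._slots) * 2:
            self._resize()

    def __getitem__(self, key):
        entry = self._slots[self._probe(key)]
        if entry is None:
            raise KeyError(key)
        return entry[1]

    def _resize(self) -> None:
        entries = [e for e in self._slots if e is not None]
        self._slots = [None] * (len(self._slots) * 2)
        self._used = 0
        for k, v in entries:              # re-insert into the larger table
            self[k] = v


table = ToyHashMap()
for n in range(20):
    table[n] = n * n                      # forces collisions and resizes
table["name"] = "toy"
```

Because the table is never allowed to fill up, every probe loop is guaranteed to reach either the key or an empty slot.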

Question: What is the Method Resolution Order (MRO) and how does super() work?

Answer: MRO is the order in which Python looks up attributes on a class and its bases. super() uses the class’s MRO to delegate to the next method in the resolution chain, enabling cooperative multiple inheritance.

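Explanation: The C3 linearization algorithm defines the MRO; you can inspect it through a class's __mro__ attribute. In the diamond pattern, every class that has a successor in the MRO must call super() so that each base is initialized exactly once. Note that super() delegates to the next class in the MRO of the instance's type, which is not necessarily the parent of the class where the call appears.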

class A:
    def __init__(self):
        self.trace = ["A"]

class B(A):
    def __init__(self):
        super().__init__()
        self.trace.append("B")

class C(A):
    def __init__(self):
        super().__init__()
        self.trace.append("C")

class D(B, C):
    def __init__(self):
        super().__init__()  # calls follow the MRO (D, B, C, A): B, then C, then A
        self.trace.append("D")
# The appends run as the calls return, so D().trace == ['A', 'C', 'B', 'D']
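
To see the linearization itself, the following sketch prints the MRO of a diamond and records the order in which super() dispatches. Base, Left, Right, and Child are hypothetical names chosen for this example.

```python
class Base:
    def who(self):
        return ["Base"]


class Left(Base):
    def who(self):
        # In a Child instance, super() here resolves to Right, not Base.
        return ["Left"] + super().who()


class Right(Base):
    def who(self):
        return ["Right"] + super().who()


class Child(Left, Right):
    def who(self):
        return ["Child"] + super().who()


mro_names = [cls.__name__ for cls in Child.__mro__]
call_chain = Child().who()
left_alone = Left().who()   # without the diamond, Left goes straight to Base
```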

Question: What is the difference between __new__ and __init__?

Answer: __new__ creates and returns a new instance (it’s a static method on the class), while __init__ initializes the already created instance.

Explanation: Override __new__ to control instance creation (e.g., for immutables like tuple, singletons, or caching). Use __init__ for normal post-construction initialization.

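class Singleton:
    _inst = None  # the single shared instance, created on first use
    def __new__(cls, *a, **kw):
        if cls._inst is None:
            # object.__new__ accepts no extra arguments, so pass only cls
            cls._inst = super().__new__(cls)
        return cls._inst  # __init__ (if defined) still runs on every call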
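
Two further points are worth sketching under assumed names. Point2D is a hypothetical immutable type whose contents must be fixed in __new__ because a tuple cannot be changed afterwards, and Config is a hypothetical singleton that counts how often __init__ runs, since Python calls __init__ whenever __new__ returns an instance of the class.

```python
class Point2D(tuple):
    """Hypothetical immutable pair; tuple contents are fixed in __new__."""

    def __new__(cls, x, y):
        return super().__new__(cls, (x, y))

    @property
    def x(self):
        return self[0]

    @property
    def y(self):
        return self[1]


class Config:
    """Hypothetical singleton that counts its __init__ calls."""

    _inst = None
    init_calls = 0

    def __new__(cls):
        if cls._inst is None:
            cls._inst = super().__new__(cls)
        return cls._inst

    def __init__(self):
        # Runs on every Config() call, even when __new__ reuses the instance.
        type(self).init_calls += 1


pt = Point2D(3, 4)
first, second = Config(), Config()
same_object = first is second
```

If repeated initialization would reset state, a common workaround is to guard the body of __init__ with a flag or to do the one-time setup inside __new__.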

Question: What is the buffer protocol and why use memoryview?

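Answer: The buffer protocol lets objects such as bytes, bytearray, and NumPy arrays expose their underlying memory to other code without copying. memoryview wraps such a buffer so you can slice large binary data without copying it and, when the underlying object is writable, modify it in place.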

Explanation: It avoids allocations and copies, which is critical in high-throughput I/O and binary processing.

data = bytearray(b"\x00" * 10)
mv = memoryview(data)
mv[2:5] = b"abc"   # in-place, no copy
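
The difference between a view and a copy can be sketched as follows. The variable names are illustrative; only built-in types are used.

```python
data = bytearray(range(10))
view = memoryview(data)

window = view[2:6]        # a view onto data[2:6]; no bytes are copied
window[0] = 255           # writes through to the underlying bytearray

copied = data[2:6]        # slicing the bytearray itself makes a copy
copied[1] = 0             # changes only the copy

read_only = memoryview(b"abc").readonly   # bytes are immutable

try:
    memoryview(b"abc")[0] = 120           # writing to a read-only view fails
    write_rejected = False
except TypeError:
    write_rejected = True
```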

Question: When and how would you use weak references?

Answer: Use weakref to reference objects without increasing their reference count, allowing garbage collection when no strong refs remain.

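Explanation: Useful for caches or cross-references that shouldn’t keep objects alive. Not every object can be weakly referenced: instances of built-in types such as int, str, and tuple cannot, and classes that define __slots__ need "__weakref__" among the slots.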

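import weakref

class SomeHeavyObject:  # hypothetical stand-in; any weak-referenceable class works
    def __init__(self, id: int) -> None:
        self.id = id

cache = weakref.WeakValueDictionary()
obj = SomeHeavyObject(1)
cache[obj.id] = obj  # entry is removed automatically once obj is garbage collected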
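
The eviction behaviour can be observed in a short sketch. Heavy is a hypothetical class and "report" an arbitrary key. The explicit gc.collect() call is there so the example does not depend on CPython's immediate refcount-based deallocation.

```python
import gc
import weakref


class Heavy:
    """Hypothetical expensive object; plain classes support weak references."""

    def __init__(self, key):
        self.key = key


cache = weakref.WeakValueDictionary()
obj = Heavy("report")
cache[obj.key] = obj

hit = cache.get("report") is obj      # present while a strong reference exists

del obj                               # drop the last strong reference
gc.collect()                          # not needed on CPython, harmless elsewhere
miss = cache.get("report") is None    # the entry disappeared with the object
```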