# Performance & Optimization

Write fast code by choosing the right tools and avoiding common mistakes.
## Measure First
### Do not guess — measure

Never optimize without measuring. Find the slow part first, then fix it.
### Using `time.perf_counter()`

```python
import time

start = time.perf_counter()
result = process_data(large_dataset)
elapsed = time.perf_counter() - start
print(f"Took {elapsed:.3f} seconds")
```
### Using `cProfile`

```bash
uv run python -m cProfile -s cumtime my_script.py
```

The `-s cumtime` flag sorts the report by cumulative time, so the functions that account for the most runtime appear first.
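`cProfile` can also be driven from code when you want to profile a single call rather than a whole script. A minimal sketch (`slow_sum` is a placeholder workload, not part of this guide's codebase):

```python
import cProfile
import io
import pstats

def slow_sum(n: int) -> int:
    # Placeholder workload; stands in for the code you actually profile
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

# Print the top 5 entries, sorted by cumulative time
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue())
```

Capturing the report in a `StringIO` buffer, as shown, lets you log it or assert on it in tests instead of printing to stdout.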
## Efficient Data Structures

### Choose the Right Structure
| Operation | `list` | `set` | `dict` |
|---|---|---|---|
| Access by index | O(1) | N/A | N/A |
| Search (membership) | O(n) | O(1) | O(1) |
| Insert at end / add | O(1) | O(1) | O(1) |
| Insert at start | O(n) | N/A | N/A |
| Delete by value | O(n) | O(1) | O(1) |

Set and dict times are average case; hash collisions can degrade them in rare worst cases.
### Membership Check

```python
# Slow — O(n) for each check
names_list = ["Alice", "Bob", "Charlie", ...]
if "Alice" in names_list: ...

# Fast — O(1) for each check
names_set = {"Alice", "Bob", "Charlie", ...}
if "Alice" in names_set: ...
```
### Use sets for membership checks

If you only need to check "is this item in the collection?", use a set.
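To see the gap in practice, here is a quick sketch with `timeit` (the 10,000-name list and the `userN` names are invented for illustration):

```python
import timeit

names_list = [f"user{i}" for i in range(10_000)]
names_set = set(names_list)
target = "user9999"  # worst case for the list: the last element

# Each list check scans up to 10,000 items; each set check is one hash lookup
list_time = timeit.timeit(lambda: target in names_list, number=1_000)
set_time = timeit.timeit(lambda: target in names_set, number=1_000)
print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```

The gap widens as the collection grows, since the list scan is linear in its length while the set lookup stays constant.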
## String Performance

```python
# Slow — creates a new string on each iteration
result = ""
for item in large_list:
    result += str(item) + ", "

# Fast — joins all at once
result = ", ".join(str(item) for item in large_list)
```
## List Performance

### List Comprehensions Are Faster

```python
# Slower
result = []
for x in range(10000):
    result.append(x ** 2)

# Faster
result = [x ** 2 for x in range(10000)]
```
### Use Generators for Large Sequences

```python
# If you don't need all the values at once, a generator
# avoids allocating the whole list in memory
squares = (x ** 2 for x in range(10000))
```
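The memory difference is easy to see with `sys.getsizeof` (the exact byte counts vary by Python version, so the comments only give rough magnitudes):

```python
import sys

squares_list = [x ** 2 for x in range(10_000)]
squares_gen = (x ** 2 for x in range(10_000))

# The list stores all 10,000 results up front;
# the generator stores only its current state.
print(sys.getsizeof(squares_list))  # roughly tens of kilobytes
print(sys.getsizeof(squares_gen))   # roughly a couple hundred bytes
```

The trade-off: a generator can only be iterated once, so use a list when you need to revisit or index the results.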
## Caching

### `functools.lru_cache`

Cache function results to avoid repeated computation:

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def get_user_permissions(user_id: int) -> list[str]:
    # Expensive database query
    return fetch_permissions_from_db(user_id)
```

Note that cached results are shared between callers, so avoid mutating a returned list in place.
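You can verify the cache is working with the `cache_info()` method that `lru_cache` adds to the decorated function. A small sketch (`expensive_square` and its call counter are illustrative, not from the example above):

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=128)
def expensive_square(n: int) -> int:
    global calls
    calls += 1  # counts how often the body actually runs
    return n * n

for _ in range(5):
    expensive_square(7)

# The body ran once; the other four calls were served from the cache
print(expensive_square.cache_info())
```

`cache_clear()` is also available when you need to invalidate the cache, e.g. between tests.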
### When to Cache
| Good for Caching | Bad for Caching |
|---|---|
| Same input, same output | Random or changing results |
| Expensive computation | Quick operations |
| Frequently called | Rarely called |
| Small result set | Large result set |
## Common Performance Mistakes

### 1. Searching in Lists
```python
# Bad — O(n) per search
if item in large_list: ...

# Good — O(1) per search
large_set = set(large_list)
if item in large_set: ...
```
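One caveat worth keeping in mind: building the set costs O(n) up front, so the conversion only pays off when you search more than once. A small sketch (the numeric data is invented for illustration):

```python
large_list = list(range(100_000))
large_set = set(large_list)  # one-time O(n) conversion

# Each of these checks is now O(1) instead of a full list scan
hits = [n for n in (5, 99_999, -1) if n in large_set]
print(hits)  # [5, 99999]
```

For a single lookup, searching the list directly is usually cheaper than converting first.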
### 2. Repeated Attribute Access in Loops

```python
# Slower
for item in items:
    result.append(item.value.strip().lower())

# Faster — local reference
append = result.append
for item in items:
    append(item.value.strip().lower())
```
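The two forms are behaviorally identical, which you can check with a quick `timeit` comparison (the `items` data is made up for illustration; the speedup is real but modest, so reserve this trick for hot loops):

```python
import timeit

items = [" Value "] * 10_000

def plain() -> list[str]:
    result = []
    for item in items:
        result.append(item.strip().lower())
    return result

def bound() -> list[str]:
    result = []
    append = result.append  # look up the method once, outside the loop
    for item in items:
        append(item.strip().lower())
    return result

print(timeit.timeit(plain, number=100))
print(timeit.timeit(bound, number=100))
```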
### 3. Loading Everything into Memory

```python
from pathlib import Path

# Bad — loads the entire file into memory
data = Path("huge_file.txt").read_text().splitlines()

# Good — process line by line
with Path("huge_file.txt").open() as f:
    for line in f:
        process(line)
```
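The line-by-line pattern composes well with generator functions, which let callers iterate a large file without ever holding it whole in memory. A sketch (`stripped_lines` and the temp file are illustrative, not part of the guide):

```python
import tempfile
from pathlib import Path
from typing import Iterator

def stripped_lines(path: Path) -> Iterator[str]:
    # Yields one line at a time, so only a single line is ever in memory
    with path.open() as f:
        for line in f:
            yield line.rstrip("\n")

# Hypothetical demo file; replace with your real path
demo = Path(tempfile.mkdtemp()) / "demo.txt"
demo.write_text("alpha\nbeta\ngamma\n")

lines = list(stripped_lines(demo))
print(lines)  # ['alpha', 'beta', 'gamma']
```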
## Optimization Checklist
| Step | Action |
|---|---|
| 1 | Measure with profiler |
| 2 | Find the bottleneck |
| 3 | Choose the right data structure |
| 4 | Reduce unnecessary work |
| 5 | Cache repeated computations |
| 6 | Measure again to confirm |
## Best Practices

- Measure before optimizing — do not guess
- Choose the right data structure (set for lookup, dict for mapping)
- Use list comprehensions and generator expressions
- Use `str.join()` for building strings
- Use `lru_cache` for expensive, pure functions
- Avoid premature optimization — clean code first, optimize later
- Process large files line by line, not all at once
- Use async for I/O-bound tasks, not CPU-bound tasks
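On the last point, the benefit of async comes from overlapping waits, not from faster computation. A minimal sketch (`fake_fetch` simulates I/O with a sleep; a real task would await a network or disk call):

```python
import asyncio
import time

async def fake_fetch(name: str) -> str:
    # Simulated I/O wait; a real task would await a network or disk call
    await asyncio.sleep(0.1)
    return name

async def main() -> list[str]:
    # All three waits overlap, so the total is ~0.1s, not ~0.3s
    return await asyncio.gather(fake_fetch("a"), fake_fetch("b"), fake_fetch("c"))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"in {elapsed:.2f}s")
```

CPU-bound work gains nothing here, because a coroutine doing pure computation never yields control to the event loop.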