Each concept starts with a real data-pipeline problem, moves to an everyday analogy, and ends with runnable Python you can edit. Read the idea, run the code, watch it move.
Every structure earns its place by solving a real pipeline problem.
| Structure or technique | In a data pipeline | Why it fits |
|---|---|---|
| list | Rows read from a CSV | Ordered, index-addressable batch |
| linear search | Finding a customer in a flat file | No index? Scan every row — O(n) |
| binary search | Lookup over sorted partitions | Sorted data collapses cost to O(log n) |
| dict | Dimension lookup / join key map | O(1) average key access |
| set | Deduplication of records | Uniqueness + fast membership |
| queue | Event / message pipeline | FIFO ordering of arrivals |
| stack | Transformation history / undo | LIFO — last applied, first reverted |
| recursion | Walking nested catalogs / trees | Self-similar structure, self-similar code |
| itertools | Streaming / lazy ETL over large files | Process records without loading all of them |
| functools | Reusable transforms, cached lookups | Compose and memoize pure functions |
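
As a rough illustration of the table, the sketch below touches each row once using only the standard library. The sample rows, the nested `catalog`, and the helper names (`linear_search`, `binary_search`, `count_tables`, `region_for`) are made up for this example, and the region rule is arbitrary; treat it as a minimal sketch, not a recommended pipeline design.

```python
from bisect import bisect_left
from collections import deque
from functools import lru_cache
from itertools import islice

# list: hypothetical rows, as if read from a CSV (ordered, index-addressable).
rows = [
    {"customer_id": 3, "name": "Cho"},
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
    {"customer_id": 1, "name": "Ana"},  # duplicate record
]


def linear_search(rows, customer_id):
    """Scan every row until the id matches: O(n)."""
    for row in rows:
        if row["customer_id"] == customer_id:
            return row
    return None


def binary_search(sorted_ids, customer_id):
    """Return the index of customer_id in a sorted list, or -1: O(log n)."""
    i = bisect_left(sorted_ids, customer_id)
    if i < len(sorted_ids) and sorted_ids[i] == customer_id:
        return i
    return -1


def count_tables(catalog):
    """recursion: count leaf entries in a nested dict catalog."""
    return sum(
        count_tables(value) if isinstance(value, dict) else 1
        for value in catalog.values()
    )


@lru_cache(maxsize=None)
def region_for(customer_id):
    """functools: memoize a pure lookup (the rule itself is invented)."""
    return "EU" if customer_id % 2 else "US"


# set: deduplicate ids. dict: id -> name map with O(1) average access.
unique_ids = sorted({row["customer_id"] for row in rows})
name_by_id = {row["customer_id"]: row["name"] for row in rows}

# queue: events leave in arrival order (FIFO).
events = deque(["load", "clean", "publish"])
first_event = events.popleft()

# stack: the last applied transform is the first reverted (LIFO).
history = ["strip_whitespace", "lowercase"]
last_applied = history.pop()

# itertools: pull only the first two names from a lazy generator.
first_two = list(islice((row["name"] for row in rows), 2))

# Hypothetical nested catalog for the recursive walk.
catalog = {"sales": {"orders": "tbl", "eu": {"returns": "tbl"}}, "hr": "tbl"}
table_count = count_tables(catalog)
```

Note that `binary_search` runs on `unique_ids`, which is sorted first; on unsorted input `bisect_left` gives meaningless positions, which is why the table pairs binary search with sorted partitions.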