← Home

Locality

2026-04-08

When a program operates on some data, the CPU typically has to fetch that data from RAM before it can work on it. Data moves in fixed-size chunks called cache lines (commonly 64 bytes). So, when a program requests a little bit of data, the data immediately next to it also gets swept up and lands in the CPU’s cache. This means that programs operating on bits of data that are all right next to each other will generally be faster than programs with more random access patterns, because the stuff they need next is more likely to have already been fetched and be ready to go.
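A rough way to see this is to walk the same 2D grid in two orders: row by row (memory in order) versus column by column (jumping between rows on every step). This is a hedged sketch in Python, where interpreter overhead muffles the cache effect that would be dramatic in a systems language, but the access-pattern idea is the same; the grid and function names are mine, not from any particular project.

```python
import time

N = 2000
# N x N grid as nested lists; each inner list is a contiguous
# array of pointers, so walking a row is sequential access.
grid = [[1] * N for _ in range(N)]

def sum_rows(g):
    # Row-major walk: consecutive elements, good locality.
    total = 0
    for row in g:
        for x in row:
            total += x
    return total

def sum_cols(g):
    # Column-major walk: hops to a different row each step,
    # poor locality (plus extra indexing work in Python).
    total = 0
    for j in range(N):
        for i in range(N):
            total += g[i][j]
    return total

t0 = time.perf_counter()
a = sum_rows(grid)
t1 = time.perf_counter()
b = sum_cols(grid)
t2 = time.perf_counter()

assert a == b == N * N  # same work either way; only the order differs
print(f"row-wise: {t1 - t0:.3f}s, column-wise: {t2 - t1:.3f}s")
```

Both loops do identical arithmetic; any gap in the printed times comes from how the walk order interacts with memory (and, in Python, with indexing overhead).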

That’s probably a terrible explanation of locality, but the gist is that if you design your data structures with good locality, your programs will be faster. There are so many tradeoffs to make when it comes to designing data structures, and I kind of feel like I’m going in circles. I think I might just do the naive thing after all and come back to optimize it later.
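One of the classic layout tradeoffs here is array-of-structs versus struct-of-arrays. Here's a hedged sketch with a hypothetical particle example (the names and fields are mine): if a hot loop only reads one field, the struct-of-arrays layout keeps those values contiguous, while the array-of-structs layout drags every record's other fields along for the ride. In Python the lists hold pointers, so the real cache win shows up in lower-level languages; this just illustrates the shape of the decision.

```python
# Array-of-structs: one record per particle. A loop that only
# needs `x` still pulls y and mass into the cache alongside it.
particles_aos = [{"x": float(i), "y": 0.0, "mass": 1.0} for i in range(1000)]

# Struct-of-arrays: fields that are scanned together live together,
# so summing all x values touches one contiguous sequence.
particles_soa = {
    "x": [float(i) for i in range(1000)],
    "y": [0.0] * 1000,
    "mass": [1.0] * 1000,
}

total_x_aos = sum(p["x"] for p in particles_aos)
total_x_soa = sum(particles_soa["x"])
assert total_x_aos == total_x_soa  # same answer, different memory layout
```

The catch, of course, is that struct-of-arrays makes whole-record operations clumsier, which is exactly the kind of tradeoff that sends you in circles.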

I know that’s the common wisdom, but I also know ahead of time that certain data is going to be accessed in very hot loops, so optimizing for locality up front didn’t feel premature.