Optimize C++ loops with std::memory_order_relaxed

I added one word to my C++ and my loop got faster. The word it replaced? Nothing I had typed. It was already there, silently, by default.

Here's something I didn't know until recently: every time you write flag.store(true), C++ quietly applies the strongest possible memory ordering under the hood. No warning. No footnote. Just the default.

It's called memory_order_seq_cst.

And on x86-64, that one store doesn't compile to a plain memory write. Depending on the compiler, it becomes an XCHG (which is implicitly locked) or a MOV followed by MFENCE. Either way, the core has to drain its store buffer, making every pending write globally visible, before later loads can proceed.

That typically costs tens of cycles, depending on the microarchitecture, during which the core makes little progress. Once? Noise. In a loop running millions of times per second? That's your latency budget.

The fix is literally one word:

// before: sequentially consistent by default
counter.fetch_add(1);

// after: still atomic, but with no ordering guarantees
counter.fetch_add(1, std::memory_order_relaxed);

One caveat: on x86-64 a read-modify-write like fetch_add is a LOCK-prefixed instruction under either ordering, so the measurable win there comes from stores. On weakly ordered CPUs such as ARM, relaxing fetch_add itself lets the compiler emit cheaper instructions.

But you shouldn't copy-paste that without understanding why it's safe here and catastrophic somewhere else. The difference between relaxed, acquire, release, acq_rel, and seq_cst isn't just performance; it's correctness.

You write std::atomic, it works, you move on, and the whole time the compiler is making decisions for you that have real hardware consequences you never see.

So I wrote the post I wish existed when I first touched atomics. It covers:

* What all 5 memory orderings actually mean, with mental models, not just definitions
* Why your CPU has a store buffer, and why that changes everything
* What LOCK XCHG and MFENCE physically do to your pipeline
* How to use godbolt.org to see the assembly your compiler silently generates
* The one question to ask yourself when picking an ordering

Link in comments 👇

If you've ever written std::atomic and moved on without a second thought, this one's for you.

#cpp #systemsprogramming #concurrency #performance #programming
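
To make the "safe here, catastrophic somewhere else" point concrete, here is a minimal sketch. It is not taken from the linked post, and the helper names count_events and publish_and_read are made up for illustration. The first helper is the case where relaxed is enough: only the final total matters, and join() orders the last read. The second is the case where it is not: a flag publishes plain data, so the store and load need a release/acquire pair.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: several threads bump one shared counter.
// Relaxed is enough here because nothing else is published through the
// counter; thread::join() makes all increments visible before the final read.
std::uint64_t count_events(unsigned num_threads, std::uint64_t per_thread) {
    std::atomic<std::uint64_t> counter{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (std::uint64_t i = 0; i < per_thread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter.load(std::memory_order_relaxed);
}

// Hypothetical helper: one thread hands a plain int to another via a flag.
// With relaxed on the flag, reading payload would be a data race.
// The release store and acquire load order the payload write before the read.
int publish_and_read(int value) {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = value;                               // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {  // 3. wait for publish
            std::this_thread::yield();
        }
        seen = payload;                                // 4. safe to read now
    });

    producer.join();
    consumer.join();
    assert(seen == value);  // guaranteed by the release/acquire pairing
    return seen;
}
```

Calling count_events(4, 100000) should return 400000 and publish_and_read(42) should return 42. Pasting either function into godbolt.org and switching the ordering argument is a quick way to see what changes in the generated assembly for your target.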


Thanks, good article on the basics of memory coherency operation in C++.
