Neat article about avoiding a memcpy in a circular buffer.
This is a neat trick but it works best on Linux and maybe macOS.
Implementing it on Windows requires some luck: you can’t simply map adjacent pages. You have to request two mappings and hope the OS places them contiguously. For example, see this abandoned Rust crate: https://github.com/gnzlbg/slice_deque/blob/045fb28701d3b674b5da413266ca84b3e5a70190/src/mirrored/winapi.rs#L57
I feel like there could be a decent intermediate option here. The article quickly glosses over page sizes and then talks about the modulus operator, but misses the fact that a bitwise AND can emulate modulus when the divisor is a power of two, which page sizes generally are, and a bitwise AND is generally much faster than the division that modulus performs.
In short,
  x % z
is equivalent to
  x & (z - 1)
when z is a power of two and x is non-negative (e.g. an unsigned index), and the bitwise form is often much faster.

Now, I get that the compiler might be clever enough to turn a modulus by a constant power of two into a bitwise operation, but it’s still necessary that the programmer specify a power-of-two size for their circular buffer in the first place.
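
To make the idea concrete, here is a rough sketch in C of a byte ring buffer that wraps with a mask instead of a modulus. It is not taken from the article; the names (ring_t, ring_index, ring_push, ring_pop) and the 4096-byte capacity are my own assumptions for illustration, and it does not do the page-mapping trick at all.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed capacity for illustration; it must be a power of two. */
#define RING_CAPACITY 4096u

_Static_assert((RING_CAPACITY & (RING_CAPACITY - 1u)) == 0u,
               "RING_CAPACITY must be a power of two");

/* Hypothetical ring buffer type; head and tail are free-running counters. */
typedef struct {
    uint8_t data[RING_CAPACITY];
    size_t head; /* total bytes written so far */
    size_t tail; /* total bytes read so far */
} ring_t;

/* Wrap a position into the buffer: same result as pos % RING_CAPACITY. */
static size_t ring_index(size_t pos) {
    return pos & (RING_CAPACITY - 1u);
}

/* Append one byte; returns false if the buffer is full. */
static bool ring_push(ring_t *r, uint8_t byte) {
    if (r->head - r->tail == RING_CAPACITY) {
        return false;
    }
    r->data[ring_index(r->head)] = byte;
    r->head++;
    return true;
}

/* Remove the oldest byte; returns false if the buffer is empty. */
static bool ring_pop(ring_t *r, uint8_t *out) {
    if (r->head == r->tail) {
        return false;
    }
    *out = r->data[ring_index(r->tail)];
    r->tail++;
    return true;
}
```

Because the capacity is a power of two, the counters can be left to overflow naturally and the mask still lands on the right slot. A copy that crosses the end of the buffer would still need to be split in two here, which is exactly the part the page table hack avoids.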
I would be curious as to whether this “greyer” magic has any benefit when not performing the page table hack.