Understanding the visibility guarantee of volatile

On current CPU architectures, the visibility guarantee of volatile is implemented mainly by inserting memory barriers around reads and writes of variables marked volatile.
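To make the guarantee concrete, here is a minimal Java sketch of the usual "publish through a volatile flag" pattern. The class and field names (VolatileFlagDemo, payload, ready) are hypothetical and chosen only for illustration. The writer stores a plain field and then sets a volatile flag; the reader spins on the flag and then reads the plain field.

```java
public class VolatileFlagDemo {
    // Hypothetical names, for illustration only.
    private static int payload = 0;                 // plain (non-volatile) field
    private static volatile boolean ready = false;  // volatile flag

    // Runs one writer/reader round and returns the payload the reader observed.
    static int runOnce() {
        payload = 0;
        ready = false;
        final int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            // Volatile read: the loop is guaranteed to observe the writer's update.
            while (!ready) {
                Thread.onSpinWait();
            }
            // Ordered after the volatile read, so the earlier plain write is visible.
            seen[0] = payload;
        });
        reader.start();

        payload = 42;   // plain write
        ready = true;   // volatile write publishes everything written before it

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```

If ready were not volatile, the Java memory model would not guarantee that the reader ever leaves the loop, nor that it would see payload == 42 if it did.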

invalidQueue, storeBuffer, and Visibility

At the hardware level, multi-core CPUs keep their caches coherent mainly through the MESI protocol (the details are covered in related material). To keep cores from stalling while coherence messages are in flight, MESI implementations introduce two data structures: the storeBuffer and the invalidQueue.

Visibility problems arise only for shared variables, that is, variables whose cache line is held by more than one CPU (say CPU0 and CPU1). Suppose CPU0 and CPU1 both hold variable x in their caches and CPU0 modifies x. CPU0 must tell CPU1 to invalidate its copy of the cache line. To avoid waiting, CPU0 places the new value in its storeBuffer and sends the invalidation message to CPU1 at the same time. CPU1 puts the message in its invalidQueue and immediately acknowledges it to CPU0, without having invalidated the line yet. Once CPU0 receives the acknowledgement, it moves the new value from the storeBuffer into its cache.

This is part of how MESI maintains cache coherence between cores, but it leaves a gap. On CPU0's side, the new value may still be sitting in the storeBuffer, where no other CPU can see it, and it has not been written to main memory. On CPU1's side, the invalidation is queued but not yet applied. Until it is applied, CPU1 keeps reading the stale value from its own cache; only after the line has been invalidated does a read of x miss in the cache and fetch the current value.

The barriers that accompany volatile close this gap. When CPU0 writes a volatile variable, a barrier forces the storeBuffer to drain, so the new value is published rather than left pending (this is commonly described as flushing the value to main memory as well as to the cache). When CPU1 reads a volatile variable, a barrier first processes every message in the invalidQueue, so the stale line is discarded before x is read. Together, these two steps ensure visibility between cores.
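The sequence above can be replayed in a toy, single-threaded Java model. This is only a sketch under strong simplifying assumptions: it tracks one variable x, represents each core's cache as a single field, and lets a cache miss read the value directly from the other core. None of the names (StoreBufferModel, Core, simulate) correspond to a real API, and real hardware is considerably more involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StoreBufferModel {
    /** One simulated core: a one-line cache for x plus the two queues. */
    static final class Core {
        Integer cachedX;                                        // null = line invalid
        final Deque<Integer> storeBuffer = new ArrayDeque<>();  // pending writes to x
        final Deque<String> invalidQueue = new ArrayDeque<>();  // pending invalidations

        Core(int initial) {
            cachedX = initial;
        }

        // Write-side barrier in this model: move buffered stores into the cache.
        void drainStoreBuffer() {
            while (!storeBuffer.isEmpty()) {
                cachedX = storeBuffer.poll();
            }
        }

        // Read-side barrier in this model: apply every queued invalidation.
        void drainInvalidQueue() {
            while (!invalidQueue.isEmpty()) {
                invalidQueue.poll();
                cachedX = null;
            }
        }

        // Read x: use the local line if it is valid, otherwise fetch the current value.
        int read(Core other) {
            if (cachedX == null) {
                cachedX = other.cachedX;
            }
            return cachedX;
        }
    }

    /** Replays the scenario and returns the value of x that CPU1 reads. */
    static int simulate(boolean readBarrier) {
        Core cpu0 = new Core(0);
        Core cpu1 = new Core(0);                  // both cores start with x == 0 cached

        cpu0.storeBuffer.add(1);                  // CPU0 writes x = 1 into its storeBuffer
        cpu1.invalidQueue.add("invalidate x");    // CPU1 queues the message and acks at once
        cpu0.drainStoreBuffer();                  // after the ack, CPU0 updates its cache

        if (readBarrier) {
            cpu1.drainInvalidQueue();             // what a volatile read adds in this model
        }
        return cpu1.read(cpu0);
    }

    public static void main(String[] args) {
        System.out.println("without read barrier: " + simulate(false)); // prints 0 (stale)
        System.out.println("with read barrier:    " + simulate(true));  // prints 1
    }
}
```

Without the read-side barrier, CPU1 still finds a valid-looking line in its cache and returns the old value; with it, the queued invalidation is applied first and the read picks up the new value.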