C++: volatile and memory barriers

Before C++11, the C and C++ standards did not address multiple threads (or multiple processors), so the usefulness of volatile depended on the compiler and hardware. Although volatile guarantees that volatile reads and writes happen in the exact order specified in the source code, the compiler may still reorder a volatile read or write relative to non-volatile reads or writes, which limits its usefulness as an inter-thread flag or mutex. Moreover, because of caching, there is no guarantee that other processors see volatile reads and writes in the same order, so volatile variables may not work as inter-thread flags or mutexes at all.

Some languages and compilers provide facilities that address both the compiler reordering and the machine reordering issues. Since Java version 1.5 (also known as version 5), the volatile keyword is guaranteed to prevent certain hardware and compiler reorderings, as part of the new Java Memory Model. The C++ memory model proposed for C++0x (published as C++11) does not use volatile for this purpose; instead, it provides special atomic types and operations with semantics similar to those of volatile in the Java Memory Model.
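To make the atomic types mentioned above concrete, here is a minimal sketch of an inter-thread flag written with std::atomic and release/acquire ordering, assuming a C++11 or later compiler. The names payload_value, data_ready, publish, wait_and_read and run_handoff are made up for this illustration and do not come from the quoted sources.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, chosen only for this sketch.
int payload_value = 0;                  // ordinary, non-atomic data
std::atomic<bool> data_ready{false};    // the inter-thread flag

void publish() {
    payload_value = 42;                                  // plain write
    // Release store: the write above cannot be reordered after this store.
    data_ready.store(true, std::memory_order_release);
}

int wait_and_read() {
    // Acquire load: once it returns true, the write to payload_value is visible.
    while (!data_ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    assert(payload_value == 42);    // holds because of the release/acquire pair
    return payload_value;
}

// Runs one writer and one reader thread and returns what the reader saw.
int run_handoff() {
    payload_value = 0;
    data_ready.store(false);
    int seen = -1;
    std::thread reader([&seen] { seen = wait_and_read(); });
    std::thread writer(publish);
    writer.join();
    reader.join();
    return seen;
}
```

Calling run_handoff() from main should return 42. If data_ready were a plain volatile bool instead, the standard would give no such guarantee about payload_value, which is exactly the limitation described above.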


The code you write is not necessarily executed in the order in which the instructions appear in the source.

Optimizing compilers, such as the Microsoft C compiler, sometimes eliminate or reorder read and write instructions if the optimizations do not break the logic of the routine being compiled. In addition, certain hardware architectures sometimes reorder read and write instructions to improve performance. Furthermore, on multiprocessor architectures, the sequence in which read and write operations are executed can appear different from the perspective of different processors.

Most of the time, reordering by the compiler or the hardware is completely invisible and has no effect on results other than generating them more efficiently. However, in a few situations, you must prevent or control reordering. The volatile keyword in C and the Windows synchronization mechanisms can ensure program order of execution in nearly all situations. In some rare instances, the executable code must contain memory barriers to prevent hardware reordering.
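As a rough illustration of an explicit memory barrier, the sketch below uses std::atomic_thread_fence from standard C++11. This is a portable stand-in chosen for the example, not the Windows kernel-mode API the quoted text refers to, and the names message_slot, message_posted, post_message, receive_message and run_fence_demo are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, chosen only for this sketch.
int message_slot = 0;                       // ordinary, non-atomic data
std::atomic<bool> message_posted{false};    // flag accessed with relaxed ordering

void post_message(int value) {
    message_slot = value;
    // Release fence: the write above may not be reordered past the store below.
    std::atomic_thread_fence(std::memory_order_release);
    message_posted.store(true, std::memory_order_relaxed);
}

int receive_message() {
    while (!message_posted.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    // Acquire fence: the read below may not be reordered before the load above.
    std::atomic_thread_fence(std::memory_order_acquire);
    return message_slot;
}

// Posts a value on one thread, receives it on another, and returns what was received.
int run_fence_demo(int value) {
    assert(value != 0);             // 0 is the "nothing posted yet" state here
    message_slot = 0;
    message_posted.store(false);
    int received = 0;
    std::thread receiver([&received] { received = receive_message(); });
    std::thread sender(post_message, value);
    sender.join();
    receiver.join();
    return received;
}
```

The flag itself is accessed with relaxed ordering; the two fences are what establish the ordering between the threads. In most code the release/acquire operations on the atomic itself are simpler, and fences are kept for the rare cases the text mentions.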

Complete information about compiler and hardware reordering and the use of memory barriers is now available in Multiprocessor Considerations for Kernel-Mode Drivers. This information expands on the information previously available in the paper "Memory Barriers in Kernel-Mode Drivers."


If you look at the sample drivers shipped with the Windows DDK, you will see that volatile appears infrequently. In general, volatile is of limited use in driver code for the following reasons:
•    Using volatile prevents optimization only of the volatile variables themselves. It does not prevent optimizations of nonvolatile variables relative to volatile variables. For example, a write to a nonvolatile variable that precedes a read from a volatile variable in the source code might be moved to execute after the read.
•    Using volatile does not prevent the reordering of instructions by the processor hardware.
•    Using volatile correctly is not enough on a multiprocessor system to guarantee that all CPUs see memory accesses in the same order. 
Windows synchronization mechanisms are more useful in preventing all these potential problems.
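The same idea can be sketched in portable C++ with std::mutex standing in for a Windows synchronization mechanism (this is an assumption made for the example; driver code would use the kernel's own primitives). The names SharedCounter and count_with_two_threads are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical type for this sketch: a counter protected by a lock.
struct SharedCounter {
    std::mutex lock;
    long value = 0;     // plain, nonvolatile data; the lock provides the ordering

    void add(long n) {
        std::lock_guard<std::mutex> guard(lock);    // acquire on entry, release on exit
        value += n;
    }

    long read() {
        std::lock_guard<std::mutex> guard(lock);
        return value;
    }
};

// Two threads each add 1 to the counter per_thread times; returns the final total.
long count_with_two_threads(int per_thread) {
    assert(per_thread >= 0);
    SharedCounter counter;
    auto work = [&counter, per_thread] {
        for (int i = 0; i < per_thread; ++i) {
            counter.add(1);
        }
    };
    std::thread first(work);
    std::thread second(work);
    first.join();
    second.join();
    return counter.read();
}
```

Because every access to value happens while the lock is held, neither the compiler nor the processor can reorder those accesses out of the locked region, and both threads see a consistent value. No volatile qualifier is needed.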


Tomasz Kulig