annotate docs/Atomics.rst @ 33:e4204d083e25 LLVM3.5

LLVM 3.5
author Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
date Thu, 12 Dec 2013 14:32:10 +0900
parents 95c75e76d11b
children 54457678186b
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
1 ==============================================
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
2 LLVM Atomic Instructions and Concurrency Guide
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
3 ==============================================
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
4
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
5 .. contents::
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
6 :local:
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
7
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
8 Introduction
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
9 ============
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
10
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
11 Historically, LLVM has not had very strong support for concurrency; some minimal
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
12 intrinsics were provided, and ``volatile`` was used in some cases to achieve
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
13 rough semantics in the presence of concurrency. However, this is changing;
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
14 there are now new instructions which are well-defined in the presence of threads
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
15 and asynchronous signals, and the model for existing instructions has been
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
16 clarified in the IR.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
17
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
18 The atomic instructions are designed specifically to provide readable IR and
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
19 optimized code generation for the following:
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
20
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
21 * The new C++0x ``<atomic>`` header. (`C++0x draft available here
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
22 <http://www.open-std.org/jtc1/sc22/wg21/>`_.) (`C1x draft available here
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
23 <http://www.open-std.org/jtc1/sc22/wg14/>`_.)
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
24
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
25 * Proper semantics for Java-style memory, for both ``volatile`` and regular
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
26 shared variables. (`Java Specification
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
27 <http://java.sun.com/docs/books/jls/third_edition/html/memory.html>`_)
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
28
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
29 * gcc-compatible ``__sync_*`` builtins. (`Description
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
30 <http://gcc.gnu.org/onlinedocs/gcc/Atomic-Builtins.html>`_)
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
31
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
32 * Other scenarios with atomic semantics, including ``static`` variables with
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
33 non-trivial constructors in C++.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
34
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
35 Atomic and volatile in the IR are orthogonal; "volatile" is the C/C++ volatile,
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
36 which ensures that every volatile load and store happens and is performed in the
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
37 stated order. A couple examples: if a SequentiallyConsistent store is
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
38 immediately followed by another SequentiallyConsistent store to the same
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
39 address, the first store can be erased. This transformation is not allowed for a
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
40 pair of volatile stores. On the other hand, a non-volatile non-atomic load can
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
41 be moved across a volatile load freely, but not an Acquire load.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
42
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
43 This document is intended to provide a guide to anyone either writing a frontend
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
44 for LLVM or working on optimization passes for LLVM with a guide for how to deal
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
45 with instructions with special semantics in the presence of concurrency. This
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
46 is not intended to be a precise guide to the semantics; the details can get
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
47 extremely complicated and unreadable, and are not usually necessary.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
48
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
49 .. _Optimization outside atomic:
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
50
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
51 Optimization outside atomic
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
52 ===========================
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
53
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
54 The basic ``'load'`` and ``'store'`` allow a variety of optimizations, but can
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
55 lead to undefined results in a concurrent environment; see `NotAtomic`_. This
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
56 section specifically goes into the one optimizer restriction which applies in
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
57 concurrent environments, which gets a bit more of an extended description
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
58 because any optimization dealing with stores needs to be aware of it.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
59
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
60 From the optimizer's point of view, the rule is that if there are not any
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
61 instructions with atomic ordering involved, concurrency does not matter, with
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
62 one exception: if a variable might be visible to another thread or signal
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
63 handler, a store cannot be inserted along a path where it might not execute
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
64 otherwise. Take the following example:
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
65
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
66 .. code-block:: c
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
67
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
68 /* C code, for readability; run through clang -O2 -S -emit-llvm to get
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
69 equivalent IR */
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
70 int x;
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
71 void f(int* a) {
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
72 for (int i = 0; i < 100; i++) {
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
73 if (a[i])
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
74 x += 1;
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
75 }
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
76 }
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
77
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
78 The following is equivalent in non-concurrent situations:
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
79
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
80 .. code-block:: c
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
81
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
82 int x;
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
83 void f(int* a) {
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
84 int xtemp = x;
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
85 for (int i = 0; i < 100; i++) {
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
86 if (a[i])
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
87 xtemp += 1;
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
88 }
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
89 x = xtemp;
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
90 }
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
91
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
92 However, LLVM is not allowed to transform the former to the latter: it could
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
93 indirectly introduce undefined behavior if another thread can access ``x`` at
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
94 the same time. (This example is particularly of interest because before the
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
95 concurrency model was implemented, LLVM would perform this transformation.)
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
96
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
97 Note that speculative loads are allowed; a load which is part of a race returns
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
98 ``undef``, but does not have undefined behavior.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
99
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
100 Atomic instructions
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
101 ===================
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
102
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
103 For cases where simple loads and stores are not sufficient, LLVM provides
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
104 various atomic instructions. The exact guarantees provided depend on the
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
105 ordering; see `Atomic orderings`_.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
106
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
107 ``load atomic`` and ``store atomic`` provide the same basic functionality as
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
108 non-atomic loads and stores, but provide additional guarantees in situations
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
109 where threads and signals are involved.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
110
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
111 ``cmpxchg`` and ``atomicrmw`` are essentially like an atomic load followed by an
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
112 atomic store (where the store is conditional for ``cmpxchg``), but no other
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
113 memory operation can happen on any thread between the load and store. Note that
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
114 LLVM's cmpxchg does not provide quite as many options as the C++0x version.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
115
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
116 A ``fence`` provides Acquire and/or Release ordering which is not part of
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
117 another operation; it is normally used along with Monotonic memory operations.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
118 A Monotonic load followed by an Acquire fence is roughly equivalent to an
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
119 Acquire load.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
120
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
121 Frontends generating atomic instructions generally need to be aware of the
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
122 target to some degree; atomic instructions are guaranteed to be lock-free, and
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
123 therefore an instruction which is wider than the target natively supports can be
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
124 impossible to generate.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
125
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
126 .. _Atomic orderings:
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
127
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
128 Atomic orderings
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
129 ================
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
130
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
131 In order to achieve a balance between performance and necessary guarantees,
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
132 there are six levels of atomicity. They are listed in order of strength; each
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
133 level includes all the guarantees of the previous level except for
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
134 Acquire/Release. (See also `LangRef Ordering <LangRef.html#ordering>`_.)
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
135
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
136 .. _NotAtomic:
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
137
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
138 NotAtomic
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
139 ---------
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
140
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
141 NotAtomic is the obvious, a load or store which is not atomic. (This isn't
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
142 really a level of atomicity, but is listed here for comparison.) This is
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
143 essentially a regular load or store. If there is a race on a given memory
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
144 location, loads from that location return undef.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
145
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
146 Relevant standard
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
147 This is intended to match shared variables in C/C++, and to be used in any
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
148 other context where memory access is necessary, and a race is impossible. (The
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
149 precise definition is in `LangRef Memory Model <LangRef.html#memmodel>`_.)
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
150
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
151 Notes for frontends
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
152 The rule is essentially that all memory accessed with basic loads and stores
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
153 by multiple threads should be protected by a lock or other synchronization;
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
154 otherwise, you are likely to run into undefined behavior. If your frontend is
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
155 for a "safe" language like Java, use Unordered to load and store any shared
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
156 variable. Note that NotAtomic volatile loads and stores are not properly
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
157 atomic; do not try to use them as a substitute. (Per the C/C++ standards,
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
158 volatile does provide some limited guarantees around asynchronous signals, but
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
159 atomics are generally a better solution.)
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
160
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
161 Notes for optimizers
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
162 Introducing loads to shared variables along a codepath where they would not
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
163 otherwise exist is allowed; introducing stores to shared variables is not. See
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
164 `Optimization outside atomic`_.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
165
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
166 Notes for code generation
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
167 The one interesting restriction here is that it is not allowed to write to
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
168 bytes outside of the bytes relevant to a store. This is mostly relevant to
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
169 unaligned stores: it is not allowed in general to convert an unaligned store
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
170 into two aligned stores of the same width as the unaligned store. Backends are
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
171 also expected to generate an i8 store as an i8 store, and not an instruction
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
172 which writes to surrounding bytes. (If you are writing a backend for an
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
173 architecture which cannot satisfy these restrictions and cares about
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
174 concurrency, please send an email to llvmdev.)
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
175
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
176 Unordered
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
177 ---------
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
178
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
179 Unordered is the lowest level of atomicity. It essentially guarantees that races
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
180 produce somewhat sane results instead of having undefined behavior. It also
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
181 guarantees the operation to be lock-free, so it do not depend on the data being
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
182 part of a special atomic structure or depend on a separate per-process global
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
183 lock. Note that code generation will fail for unsupported atomic operations; if
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
184 you need such an operation, use explicit locking.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
185
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
186 Relevant standard
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
187 This is intended to match the Java memory model for shared variables.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
188
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
189 Notes for frontends
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
190 This cannot be used for synchronization, but is useful for Java and other
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
191 "safe" languages which need to guarantee that the generated code never
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
192 exhibits undefined behavior. Note that this guarantee is cheap on common
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
193 platforms for loads of a native width, but can be expensive or unavailable for
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
194 wider loads, like a 64-bit store on ARM. (A frontend for Java or other "safe"
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
195 languages would normally split a 64-bit store on ARM into two 32-bit unordered
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
196 stores.)
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
197
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
198 Notes for optimizers
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
199 In terms of the optimizer, this prohibits any transformation that transforms a
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
200 single load into multiple loads, transforms a store into multiple stores,
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
201 narrows a store, or stores a value which would not be stored otherwise. Some
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
202 examples of unsafe optimizations are narrowing an assignment into a bitfield,
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
203 rematerializing a load, and turning loads and stores into a memcpy
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
204 call. Reordering unordered operations is safe, though, and optimizers should
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
205 take advantage of that because unordered operations are common in languages
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
206 that need them.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
207
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
208 Notes for code generation
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
209 These operations are required to be atomic in the sense that if you use
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
210 unordered loads and unordered stores, a load cannot see a value which was
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
211 never stored. A normal load or store instruction is usually sufficient, but
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
212 note that an unordered load or store cannot be split into multiple
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
213 instructions (or an instruction which does multiple memory operations, like
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
214 ``LDRD`` on ARM without LPAE, or not naturally-aligned ``LDRD`` on LPAE ARM).
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
215
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
216 Monotonic
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
217 ---------
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
218
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
219 Monotonic is the weakest level of atomicity that can be used in synchronization
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
220 primitives, although it does not provide any general synchronization. It
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
221 essentially guarantees that if you take all the operations affecting a specific
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
222 address, a consistent ordering exists.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
223
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
224 Relevant standard
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
225 This corresponds to the C++0x/C1x ``memory_order_relaxed``; see those
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
226 standards for the exact definition.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
227
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
228 Notes for frontends
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
229 If you are writing a frontend which uses this directly, use with caution. The
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
230 guarantees in terms of synchronization are very weak, so make sure these are
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
231 only used in a pattern which you know is correct. Generally, these would
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
232 either be used for atomic operations which do not protect other memory (like
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
233 an atomic counter), or along with a ``fence``.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
234
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
235 Notes for optimizers
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
236 In terms of the optimizer, this can be treated as a read+write on the relevant
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
237 memory location (and alias analysis will take advantage of that). In addition,
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
238 it is legal to reorder non-atomic and Unordered loads around Monotonic
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
239 loads. CSE/DSE and a few other optimizations are allowed, but Monotonic
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
240 operations are unlikely to be used in ways which would make those
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
241 optimizations useful.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
242
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
243 Notes for code generation
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
244 Code generation is essentially the same as that for unordered for loads and
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
245 stores. No fences are required. ``cmpxchg`` and ``atomicrmw`` are required
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
246 to appear as a single operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
247
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
248 Acquire
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
249 -------
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
250
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
251 Acquire provides a barrier of the sort necessary to acquire a lock to access
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
252 other memory with normal loads and stores.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
253
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
254 Relevant standard
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
255 This corresponds to the C++0x/C1x ``memory_order_acquire``. It should also be
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
256 used for C++0x/C1x ``memory_order_consume``.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
257
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
258 Notes for frontends
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
259 If you are writing a frontend which uses this directly, use with caution.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
260 Acquire only provides a semantic guarantee when paired with a Release
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
261 operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
262
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
263 Notes for optimizers
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
264 Optimizers not aware of atomics can treat this like a nothrow call. It is
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
265 also possible to move stores from before an Acquire load or read-modify-write
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
266 operation to after it, and move non-Acquire loads from before an Acquire
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
267 operation to after it.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
268
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
269 Notes for code generation
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
270 Architectures with weak memory ordering (essentially everything relevant today
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
271 except x86 and SPARC) require some sort of fence to maintain the Acquire
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
272 semantics. The precise fences required varies widely by architecture, but for
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
273 a simple implementation, most architectures provide a barrier which is strong
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
274 enough for everything (``dmb`` on ARM, ``sync`` on PowerPC, etc.). Putting
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
275 such a fence after the equivalent Monotonic operation is sufficient to
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
276 maintain Acquire semantics for a memory operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
277
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
278 Release
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
279 -------
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
280
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
281 Release is similar to Acquire, but with a barrier of the sort necessary to
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
282 release a lock.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
283
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
284 Relevant standard
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
285 This corresponds to the C++0x/C1x ``memory_order_release``.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
286
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
287 Notes for frontends
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
288 If you are writing a frontend which uses this directly, use with caution.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
289 Release only provides a semantic guarantee when paired with a Acquire
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
290 operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
291
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
292 Notes for optimizers
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
293 Optimizers not aware of atomics can treat this like a nothrow call. It is
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
294 also possible to move loads from after a Release store or read-modify-write
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
295 operation to before it, and move non-Release stores from after an Release
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
296 operation to before it.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
297
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
298 Notes for code generation
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
299 See the section on Acquire; a fence before the relevant operation is usually
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
300 sufficient for Release. Note that a store-store fence is not sufficient to
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
301 implement Release semantics; store-store fences are generally not exposed to
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
302 IR because they are extremely difficult to use correctly.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
303
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
304 AcquireRelease
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
305 --------------
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
306
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
307 AcquireRelease (``acq_rel`` in IR) provides both an Acquire and a Release
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
308 barrier (for fences and operations which both read and write memory).
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
309
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
310 Relevant standard
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
311 This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
312
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
313 Notes for frontends
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
314 If you are writing a frontend which uses this directly, use with caution.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
315 Acquire only provides a semantic guarantee when paired with a Release
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
316 operation, and vice versa.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
317
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
318 Notes for optimizers
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
319 In general, optimizers should treat this like a nothrow call; the possible
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
320 optimizations are usually not interesting.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
321
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
322 Notes for code generation
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
323 This operation has Acquire and Release semantics; see the sections on Acquire
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
324 and Release.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
325
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
326 SequentiallyConsistent
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
327 ----------------------
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
328
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
329 SequentiallyConsistent (``seq_cst`` in IR) provides Acquire semantics for loads
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
330 and Release semantics for stores. Additionally, it guarantees that a total
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
331 ordering exists between all SequentiallyConsistent operations.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
332
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
333 Relevant standard
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
334 This corresponds to the C++0x/C1x ``memory_order_seq_cst``, Java volatile, and
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
335 the gcc-compatible ``__sync_*`` builtins which do not specify otherwise.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
336
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
337 Notes for frontends
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
338 If a frontend is exposing atomic operations, these are much easier to reason
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
339 about for the programmer than other kinds of operations, and using them is
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
340 generally a practical performance tradeoff.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
341
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
342 Notes for optimizers
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
343 Optimizers not aware of atomics can treat this like a nothrow call. For
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
344 SequentiallyConsistent loads and stores, the same reorderings are allowed as
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
345 for Acquire loads and Release stores, except that SequentiallyConsistent
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
346 operations may not be reordered.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
347
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
348 Notes for code generation
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
349 SequentiallyConsistent loads minimally require the same barriers as Acquire
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
350 operations and SequentiallyConsistent stores require Release
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
351 barriers. Additionally, the code generator must enforce ordering between
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
352 SequentiallyConsistent stores followed by SequentiallyConsistent loads. This
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
353 is usually done by emitting either a full fence before the loads or a full
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
354 fence after the stores; which is preferred varies by architecture.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
355
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
356 Atomics and IR optimization
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
357 ===========================
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
358
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
359 Predicates for optimizer writers to query:
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
360
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
361 * ``isSimple()``: A load or store which is not volatile or atomic. This is
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
362 what, for example, memcpyopt would check for operations it might transform.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
363
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
364 * ``isUnordered()``: A load or store which is not volatile and at most
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
365 Unordered. This would be checked, for example, by LICM before hoisting an
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
366 operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
367
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
368 * ``mayReadFromMemory()``/``mayWriteToMemory()``: Existing predicate, but note
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
369 that they return true for any operation which is volatile or at least
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
370 Monotonic.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
371
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
372 * Alias analysis: Note that AA will return ModRef for anything Acquire or
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
373 Release, and for the address accessed by any Monotonic operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
374
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
375 To support optimizing around atomic operations, make sure you are using the
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
376 right predicates; everything should work if that is done. If your pass should
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
377 optimize some atomic operations (Unordered operations in particular), make sure
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
378 it doesn't replace an atomic load or store with a non-atomic operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
379
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
380 Some examples of how optimizations interact with various kinds of atomic
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
381 operations:
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
382
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
383 * ``memcpyopt``: An atomic operation cannot be optimized into part of a
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
384 memcpy/memset, including unordered loads/stores. It can pull operations
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
385 across some atomic operations.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
386
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
387 * LICM: Unordered loads/stores can be moved out of a loop. It just treats
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
388 monotonic operations like a read+write to a memory location, and anything
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
389 stricter than that like a nothrow call.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
390
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
391 * DSE: Unordered stores can be DSE'ed like normal stores. Monotonic stores can
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
392 be DSE'ed in some cases, but it's tricky to reason about, and not especially
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
393 important.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
394
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
395 * Folding a load: Any atomic load from a constant global can be constant-folded,
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
396 because it cannot be observed. Similar reasoning allows scalarrepl with
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
397 atomic loads and stores.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
398
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
399 Atomics and Codegen
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
400 ===================
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
401
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
402 Atomic operations are represented in the SelectionDAG with ``ATOMIC_*`` opcodes.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
403 On architectures which use barrier instructions for all atomic ordering (like
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
404 ARM), appropriate fences are split out as the DAG is built.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
405
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
406 The MachineMemOperand for all atomic operations is currently marked as volatile;
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
407 this is not correct in the IR sense of volatile, but CodeGen handles anything
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
408 marked volatile very conservatively. This should get fixed at some point.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
409
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
410 Common architectures have some way of representing at least a pointer-sized
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
411 lock-free ``cmpxchg``; such an operation can be used to implement all the other
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
412 atomic operations which can be represented in IR up to that size. Backends are
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
413 expected to implement all those operations, but not operations which cannot be
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
414 implemented in a lock-free manner. It is expected that backends will give an
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
415 error when given an operation which cannot be implemented. (The LLVM code
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
416 generator is not very helpful here at the moment, but hopefully that will
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
417 change.)
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
418
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
419 The implementation of atomics on LL/SC architectures (like ARM) is currently a
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
420 bit of a mess; there is a lot of copy-pasted code across targets, and the
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
421 representation is relatively unsuited to optimization (it would be nice to be
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
422 able to optimize loops involving cmpxchg etc.).
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
423
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
424 On x86, all atomic loads generate a ``MOV``. SequentiallyConsistent stores
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
425 generate an ``XCHG``, other stores generate a ``MOV``. SequentiallyConsistent
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
426 fences generate an ``MFENCE``, other fences do not cause any code to be
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
427 generated. cmpxchg uses the ``LOCK CMPXCHG`` instruction. ``atomicrmw xchg``
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
428 uses ``XCHG``, ``atomicrmw add`` and ``atomicrmw sub`` use ``XADD``, and all
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
429 other ``atomicrmw`` operations generate a loop with ``LOCK CMPXCHG``. Depending
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
430 on the users of the result, some ``atomicrmw`` operations can be translated into
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
431 operations like ``LOCK AND``, but that does not work in general.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
432
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
433 On ARM, MIPS, and many other RISC architectures, Acquire, Release, and
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
434 SequentiallyConsistent semantics require barrier instructions for every such
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
435 operation. Loads and stores generate normal instructions. ``cmpxchg`` and
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
436 ``atomicrmw`` can be represented using a loop with LL/SC-style instructions
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
437 which take some sort of exclusive lock on a cache line (``LDREX`` and ``STREX``
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
438 on ARM, etc.). At the moment, the IR does not provide any way to represent a
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
439 weak ``cmpxchg`` which would not require a loop.