annotate docs/Atomics.rst @ 121:803732b1fca8

LLVM 5.0
author kono
date Fri, 27 Oct 2017 17:07:41 +0900
parents 1172e4bd9c6f
children c2174574ed3a

==============================================
LLVM Atomic Instructions and Concurrency Guide
==============================================

.. contents::
   :local:

Introduction
============

LLVM supports instructions which are well-defined in the presence of threads and
asynchronous signals.

The atomic instructions are designed specifically to provide readable IR and
optimized code generation for the following:

* The C++11 ``<atomic>`` header. (`C++11 draft available here
  <http://www.open-std.org/jtc1/sc22/wg21/>`_.) (`C11 draft available here
  <http://www.open-std.org/jtc1/sc22/wg14/>`_.)

* Proper semantics for Java-style memory, for both ``volatile`` and regular
  shared variables. (`Java Specification
  <http://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html>`_)

* gcc-compatible ``__sync_*`` builtins. (`Description
  <https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html>`_)

* Other scenarios with atomic semantics, including ``static`` variables with
  non-trivial constructors in C++.

Atomic and volatile in the IR are orthogonal; "volatile" is the C/C++ volatile,
which ensures that every volatile load and store happens and is performed in the
stated order. A couple of examples: if a SequentiallyConsistent store is
immediately followed by another SequentiallyConsistent store to the same
address, the first store can be erased. This transformation is not allowed for a
pair of volatile stores. On the other hand, a non-volatile non-atomic load can
be moved across a volatile load freely, but not an Acquire load.
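
The difference between the two store pairs can be sketched in C11. This is only
an illustration: the names ``flag``, ``device_reg``, ``store_twice_atomic`` and
``store_twice_volatile`` are hypothetical, and whether a particular compiler
actually removes the first atomic store is not guaranteed.

.. code-block:: c

  #include <stdatomic.h>

  atomic_int flag;          /* accessed with SequentiallyConsistent ordering */
  volatile int device_reg;  /* volatile, but not atomic */

  /* The store of 1 may legally be erased: no other thread is guaranteed
     to observe it before the store of 2. */
  int store_twice_atomic(void) {
    atomic_store(&flag, 1);
    atomic_store(&flag, 2);
    return atomic_load(&flag);
  }

  /* Both stores must be performed, in this order. */
  int store_twice_volatile(void) {
    device_reg = 1;
    device_reg = 2;
    return device_reg;
  }

Both functions return 2; they differ only in which intermediate memory
operations the optimizer must preserve.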

This document is intended as a guide for anyone writing a frontend for LLVM or
working on optimization passes for LLVM, explaining how to deal with
instructions with special semantics in the presence of concurrency. This
is not intended to be a precise guide to the semantics; the details can get
extremely complicated and unreadable, and are not usually necessary.

.. _Optimization outside atomic:

Optimization outside atomic
===========================

The basic ``'load'`` and ``'store'`` allow a variety of optimizations, but can
lead to undefined results in a concurrent environment; see `NotAtomic`_. This
section covers the one optimizer restriction which applies in concurrent
environments; it gets an extended description because any optimization dealing
with stores needs to be aware of it.

From the optimizer's point of view, the rule is that if there are not any
instructions with atomic ordering involved, concurrency does not matter, with
one exception: if a variable might be visible to another thread or signal
handler, a store cannot be inserted along a path where it might not execute
otherwise. Take the following example:

.. code-block:: c

  /* C code, for readability; run through clang -O2 -S -emit-llvm to get
     equivalent IR */
  int x;
  void f(int* a) {
    for (int i = 0; i < 100; i++) {
      if (a[i])
        x += 1;
    }
  }

The following is equivalent in non-concurrent situations:

.. code-block:: c

  int x;
  void f(int* a) {
    int xtemp = x;
    for (int i = 0; i < 100; i++) {
      if (a[i])
        xtemp += 1;
    }
    x = xtemp;
  }

However, LLVM is not allowed to transform the former to the latter: it could
indirectly introduce undefined behavior if another thread can access ``x`` at
the same time. (This example is particularly of interest because before the
concurrency model was implemented, LLVM would perform this transformation.)

Note that speculative loads are allowed; a load which is part of a race returns
``undef``, but does not have undefined behavior.
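
For contrast, one way to keep the running value in a local without inserting a
store on a path that had none is to guard the final write. The sketch below is
illustrative only: ``f_guarded`` and ``stored`` are hypothetical names, and
nothing here claims that LLVM performs this particular rewrite.

.. code-block:: c

  int x;

  /* Keeps the running value in a local, but writes x back only if the
     original loop would have stored to x at least once. */
  void f_guarded(const int *a) {
    int xtemp = x;   /* speculative load: allowed */
    int stored = 0;  /* did the original loop store to x? */
    for (int i = 0; i < 100; i++) {
      if (a[i]) {
        xtemp += 1;
        stored = 1;
      }
    }
    if (stored)
      x = xtemp;     /* no store on paths that had none */
  }

When every ``a[i]`` is zero, this version never writes ``x``, just like the
original loop.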

Atomic instructions
===================

For cases where simple loads and stores are not sufficient, LLVM provides
various atomic instructions. The exact guarantees provided depend on the
ordering; see `Atomic orderings`_.

``load atomic`` and ``store atomic`` provide the same basic functionality as
non-atomic loads and stores, but provide additional guarantees in situations
where threads and signals are involved.

``cmpxchg`` and ``atomicrmw`` are essentially like an atomic load followed by an
atomic store (where the store is conditional for ``cmpxchg``), but no other
memory operation can happen on any thread between the load and store.
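
At the C11 level, these two instructions correspond to the read-modify-write
and compare-exchange operations in ``<stdatomic.h>``. The sketch below uses
hypothetical names (``counter``, ``add_one``, ``set_if_equal``); a compiler
such as clang would typically lower the first to ``atomicrmw add`` and the
second to ``cmpxchg``, though the exact IR depends on the target.

.. code-block:: c

  #include <stdatomic.h>
  #include <stdbool.h>

  atomic_int counter;

  /* Read-modify-write as one indivisible step; returns the previous value. */
  int add_one(void) {
    return atomic_fetch_add(&counter, 1);
  }

  /* Compare-and-swap: stores desired only if the current value equals
     expected, and reports whether the store happened. */
  bool set_if_equal(int expected, int desired) {
    return atomic_compare_exchange_strong(&counter, &expected, desired);
  }
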

A ``fence`` provides Acquire and/or Release ordering which is not part of
another operation; it is normally used along with Monotonic memory operations.
A Monotonic load followed by an Acquire fence is roughly equivalent to an
Acquire load, and a Monotonic store following a Release fence is roughly
equivalent to a Release store. SequentiallyConsistent fences behave as both
an Acquire and a Release fence, and offer some additional complicated
guarantees; see the C++11 standard for details.
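
The fence pairing can be sketched with C11 fences and relaxed (Monotonic)
operations. The names ``payload``, ``ready``, ``publish`` and ``try_consume``
are hypothetical, and the pattern is a minimal message-passing example rather
than a complete synchronization primitive.

.. code-block:: c

  #include <stdatomic.h>

  int payload;       /* ordinary, non-atomic data */
  atomic_int ready;  /* flag accessed only with Monotonic operations */

  /* Release fence followed by a Monotonic store: roughly a Release store. */
  void publish(int value) {
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
  }

  /* Monotonic load followed by an Acquire fence: roughly an Acquire load.
     Returns -1 if nothing has been published yet. */
  int try_consume(void) {
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
      return -1;
    atomic_thread_fence(memory_order_acquire);
    return payload;
  }

A thread that sees ``ready`` set and then executes the Acquire fence is
guaranteed to read the ``payload`` value written before the Release fence.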

Frontends generating atomic instructions generally need to be aware of the
target to some degree; atomic instructions are guaranteed to be lock-free, and
therefore an instruction which is wider than the target natively supports can be
impossible to generate.
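
C11 exposes the same concern to programmers through the
``ATOMIC_*_LOCK_FREE`` macros. As a rough analogy only (a frontend would
consult its own target information, and the function names here are
hypothetical), the check looks like this:

.. code-block:: c

  #include <stdatomic.h>

  /* Each macro expands to 0 (never lock-free), 1 (sometimes lock-free)
     or 2 (always lock-free) for the corresponding atomic type. */
  int int_lock_free_level(void) {
    return ATOMIC_INT_LOCK_FREE;
  }

  int llong_lock_free_level(void) {
    return ATOMIC_LLONG_LOCK_FREE;
  }

A result below 2 for a given width suggests that a lock-based fallback may be
needed on that target.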

.. _Atomic orderings:

Atomic orderings
================

In order to achieve a balance between performance and necessary guarantees,
there are six levels of atomicity. They are listed in order of strength; each
level includes all the guarantees of the previous level except for
Acquire/Release. (See also `LangRef Ordering <LangRef.html#ordering>`_.)

.. _NotAtomic:

NotAtomic
---------

NotAtomic is the obvious case: a load or store which is not atomic. (This isn't
really a level of atomicity, but is listed here for comparison.) This is
essentially a regular load or store. If there is a race on a given memory
location, loads from that location return undef.

Relevant standard
  This is intended to match shared variables in C/C++, and to be used in any
  other context where memory access is necessary, and a race is impossible. (The
  precise definition is in `LangRef Memory Model <LangRef.html#memmodel>`_.)

Notes for frontends
  The rule is essentially that all memory accessed with basic loads and stores
  by multiple threads should be protected by a lock or other synchronization;
  otherwise, you are likely to run into undefined behavior. If your frontend is
  for a "safe" language like Java, use Unordered to load and store any shared
  variable. Note that NotAtomic volatile loads and stores are not properly
  atomic; do not try to use them as a substitute. (Per the C/C++ standards,
  volatile does provide some limited guarantees around asynchronous signals, but
  atomics are generally a better solution.)

Notes for optimizers
  Introducing loads to shared variables along a codepath where they would not
  otherwise exist is allowed; introducing stores to shared variables is not. See
  `Optimization outside atomic`_.

Notes for code generation
  The one interesting restriction here is that it is not allowed to write to
  bytes outside of the bytes relevant to a store. This is mostly relevant to
  unaligned stores: it is not allowed in general to convert an unaligned store
  into two aligned stores of the same width as the unaligned store. Backends are
  also expected to generate an i8 store as an i8 store, and not an instruction
  which writes to surrounding bytes. (If you are writing a backend for an
  architecture which cannot satisfy these restrictions and cares about
  concurrency, please send an email to llvm-dev.)

Unordered
---------

Unordered is the lowest level of atomicity. It essentially guarantees that races
produce somewhat sane results instead of having undefined behavior. It also
guarantees the operation to be lock-free, so it does not depend on the data
being part of a special atomic structure or depend on a separate per-process
global lock. Note that code generation will fail for unsupported atomic
operations; if you need such an operation, use explicit locking.

Relevant standard
  This is intended to match the Java memory model for shared variables.

Notes for frontends
  This cannot be used for synchronization, but is useful for Java and other
  "safe" languages which need to guarantee that the generated code never
  exhibits undefined behavior. Note that this guarantee is cheap on common
  platforms for loads and stores of a native width, but can be expensive or
  unavailable for wider accesses, like a 64-bit store on ARM. (A frontend for
  Java or other "safe" languages would normally split a 64-bit store on ARM into
  two 32-bit unordered stores.)
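
  The splitting described above can be sketched in C11. C has no direct
  equivalent of Unordered, so ``memory_order_relaxed`` (Monotonic) stands in for
  it here, and the names ``split64``, ``split_store`` and ``split_load`` are
  hypothetical.

  .. code-block:: c

    #include <stdatomic.h>
    #include <stdint.h>

    /* A 64-bit field kept as two independently atomic 32-bit halves. */
    struct split64 {
      _Atomic uint32_t lo;
      _Atomic uint32_t hi;
    };

    /* Each half is written atomically, but a concurrent reader may see a
       new low half paired with an old high half. */
    void split_store(struct split64 *p, uint64_t v) {
      atomic_store_explicit(&p->lo, (uint32_t)v, memory_order_relaxed);
      atomic_store_explicit(&p->hi, (uint32_t)(v >> 32), memory_order_relaxed);
    }

    uint64_t split_load(struct split64 *p) {
      uint64_t lo = atomic_load_explicit(&p->lo, memory_order_relaxed);
      uint64_t hi = atomic_load_explicit(&p->hi, memory_order_relaxed);
      return (hi << 32) | lo;
    }

  Each half behaves sanely on its own, which is the guarantee a "safe" language
  needs; the 64-bit value as a whole is not atomic.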

Notes for optimizers
  In terms of the optimizer, this prohibits any transformation that transforms a
  single load into multiple loads, transforms a store into multiple stores,
  narrows a store, or stores a value which would not be stored otherwise. Some
  examples of unsafe optimizations are narrowing an assignment into a bitfield,
  rematerializing a load, and turning loads and stores into a memcpy
  call. Reordering unordered operations is safe, though, and optimizers should
  take advantage of that because unordered operations are common in languages
  that need them.

Notes for code generation
  These operations are required to be atomic in the sense that if you use
  unordered loads and unordered stores, a load cannot see a value which was
  never stored. A normal load or store instruction is usually sufficient, but
  note that an unordered load or store cannot be split into multiple
  instructions (or an instruction which does multiple memory operations, like
  ``LDRD`` on ARM without LPAE, or not naturally-aligned ``LDRD`` on LPAE ARM).

Monotonic
---------

Monotonic is the weakest level of atomicity that can be used in synchronization
primitives, although it does not provide any general synchronization. It
essentially guarantees that if you take all the operations affecting a specific
address, a consistent ordering exists.

Relevant standard
  This corresponds to the C++11/C11 ``memory_order_relaxed``; see those
  standards for the exact definition.

Notes for frontends
  If you are writing a frontend which uses this directly, use with caution. The
  guarantees in terms of synchronization are very weak, so make sure these are
  only used in a pattern which you know is correct. Generally, these would
  either be used for atomic operations which do not protect other memory (like
  an atomic counter), or along with a ``fence``.
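
  The atomic counter case can be sketched in C11 as follows; the names ``hits``,
  ``record_hit`` and ``read_hits`` are hypothetical.

  .. code-block:: c

    #include <stdatomic.h>

    atomic_ulong hits;

    /* The counter protects no other memory, so Monotonic ordering suffices. */
    void record_hit(void) {
      atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
    }

    unsigned long read_hits(void) {
      return atomic_load_explicit(&hits, memory_order_relaxed);
    }

  No increment is lost, but a reader learns nothing about other memory written
  by the threads that incremented the counter.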
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
232
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
233 Notes for optimizers
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
234 In terms of the optimizer, this can be treated as a read+write on the relevant
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
235 memory location (and alias analysis will take advantage of that). In addition,
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
236 it is legal to reorder non-atomic and Unordered loads around Monotonic
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
237 loads. CSE/DSE and a few other optimizations are allowed, but Monotonic
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
238 operations are unlikely to be used in ways which would make those
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
239 optimizations useful.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
240
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
241 Notes for code generation
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
242 Code generation is essentially the same as that for Unordered loads and
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
243 stores. No fences are required. ``cmpxchg`` and ``atomicrmw`` are required
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
244 to appear as a single operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
245
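As an illustration of the "atomic counter" pattern mentioned in the notes for frontends, the following is a minimal C++ sketch using ``memory_order_relaxed``, which the text above maps to Monotonic. The function name ``count_events`` and its parameters are hypothetical, chosen only for this example. The counter does not protect any other memory, which is why the weak ordering is enough here.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: several threads bump one shared counter using
// relaxed (Monotonic-like) read-modify-write operations.
long count_events(int num_threads, int increments_per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Each increment is a single indivisible operation, but it
                // imposes no ordering on accesses to other memory locations.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto &w : workers) {
        w.join();
    }
    // Joining the threads orders their increments before this final read.
    return counter.load(std::memory_order_relaxed);
}
```

No increment is lost because each ``fetch_add`` is indivisible; the relaxed ordering only gives up guarantees about *other* memory.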
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
246 Acquire
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
247 -------
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
248
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
249 Acquire provides a barrier of the sort necessary to acquire a lock to access
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
250 other memory with normal loads and stores.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
251
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
252 Relevant standard
83
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
253 This corresponds to the C++11/C11 ``memory_order_acquire``. It should also be
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
254 used for C++11/C11 ``memory_order_consume``.
0
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
255
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
256 Notes for frontends
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
257 If you are writing a frontend which uses this directly, use with caution.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
258 Acquire only provides a semantic guarantee when paired with a Release
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
259 operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
260
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
261 Notes for optimizers
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
262 Optimizers not aware of atomics can treat this like a nothrow call. It is
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
263 also possible to move stores from before an Acquire load or read-modify-write
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
264 operation to after it, and move non-Acquire loads from before an Acquire
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
265 operation to after it.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
266
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
267 Notes for code generation
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
268 Architectures with weak memory ordering (essentially everything relevant today
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
269 except x86 and SPARC) require some sort of fence to maintain the Acquire
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
270 semantics. The precise fences required vary widely by architecture, but for
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
271 a simple implementation, most architectures provide a barrier which is strong
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
272 enough for everything (``dmb`` on ARM, ``sync`` on PowerPC, etc.). Putting
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
273 such a fence after the equivalent Monotonic operation is sufficient to
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
274 maintain Acquire semantics for a memory operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
275
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
276 Release
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
277 -------
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
278
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
279 Release is similar to Acquire, but with a barrier of the sort necessary to
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
280 release a lock.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
281
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
282 Relevant standard
83
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
283 This corresponds to the C++11/C11 ``memory_order_release``.
0
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
284
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
285 Notes for frontends
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
286 If you are writing a frontend which uses this directly, use with caution.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
287 Release only provides a semantic guarantee when paired with an Acquire
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
288 operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
289
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
290 Notes for optimizers
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
291 Optimizers not aware of atomics can treat this like a nothrow call. It is
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
292 also possible to move loads from after a Release store or read-modify-write
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
293 operation to before it, and move non-Release stores from after a Release
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
294 operation to before it.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
295
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
296 Notes for code generation
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
297 See the section on Acquire; a fence before the relevant operation is usually
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
298 sufficient for Release. Note that a store-store fence is not sufficient to
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
299 implement Release semantics; store-store fences are generally not exposed to
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
300 IR because they are extremely difficult to use correctly.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
301
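The pairing of a Release store with an Acquire load can be sketched in C++ as a simple message-passing pattern. This is an illustrative example only; the function ``publish_and_consume`` and its local names are assumptions made for this sketch. The payload is an ordinary non-atomic variable, and only the flag is atomic.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example: hand a plain int from one thread to another
// through a flag written with release and read with acquire.
int publish_and_consume() {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};   // the synchronizing flag
    int observed = -1;

    std::thread producer([&] {
        payload = 42;                                   // normal store
        ready.store(true, std::memory_order_release);   // publish it
    });
    std::thread consumer([&] {
        // Spin until the release store becomes visible.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = payload;   // ordered after the producer's write to payload
    });

    producer.join();
    consumer.join();
    return observed;
}
```

If either side used a weaker ordering such as ``memory_order_relaxed``, the read of ``payload`` would be a data race, which is the sense in which each half only provides a guarantee when paired with the other.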
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
302 AcquireRelease
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
303 --------------
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
304
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
305 AcquireRelease (``acq_rel`` in IR) provides both an Acquire and a Release
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
306 barrier (for fences and operations which both read and write memory).
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
307
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
308 Relevant standard
83
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
309 This corresponds to the C++11/C11 ``memory_order_acq_rel``.
0
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
310
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
311 Notes for frontends
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
312 If you are writing a frontend which uses this directly, use with caution.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
313 Acquire only provides a semantic guarantee when paired with a Release
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
314 operation, and vice versa.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
315
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
316 Notes for optimizers
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
317 In general, optimizers should treat this like a nothrow call; the possible
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
318 optimizations are usually not interesting.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
319
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
320 Notes for code generation
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
321 This operation has Acquire and Release semantics; see the sections on Acquire
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
322 and Release.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
323
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
324 SequentiallyConsistent
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
325 ----------------------
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
326
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
327 SequentiallyConsistent (``seq_cst`` in IR) provides Acquire semantics for loads
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
328 and Release semantics for stores. Additionally, it guarantees that a total
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
329 ordering exists between all SequentiallyConsistent operations.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
330
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
331 Relevant standard
83
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
332 This corresponds to the C++11/C11 ``memory_order_seq_cst``, Java volatile, and
0
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
333 the gcc-compatible ``__sync_*`` builtins which do not specify otherwise.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
334
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
335 Notes for frontends
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
336 If a frontend is exposing atomic operations, these are much easier to reason
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
337 about for the programmer than other kinds of operations, and using them is
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
338 generally a practical performance tradeoff.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
339
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
340 Notes for optimizers
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
341 Optimizers not aware of atomics can treat this like a nothrow call. For
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
342 SequentiallyConsistent loads and stores, the same reorderings are allowed as
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
343 for Acquire loads and Release stores, except that SequentiallyConsistent
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
344 operations may not be reordered.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
345
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
346 Notes for code generation
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
347 SequentiallyConsistent loads minimally require the same barriers as Acquire
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
348 operations and SequentiallyConsistent stores require Release
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
349 barriers. Additionally, the code generator must enforce ordering between
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
350 SequentiallyConsistent stores followed by SequentiallyConsistent loads. This
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
351 is usually done by emitting either a full fence before the loads or a full
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
352 fence after the stores; which is preferred varies by architecture.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
353
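The store-followed-by-load ordering that distinguishes SequentiallyConsistent from Acquire/Release can be illustrated with the classic store-buffering litmus test. The sketch below is a C++ example under the assumption that ``memory_order_seq_cst`` maps to SequentiallyConsistent as described above; the function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// One round of the store-buffering test. Each thread stores to its own
// variable and then loads the other one, all with seq_cst ordering.
// Returns true if both loads saw 0, an outcome that seq_cst forbids.
bool both_read_zero_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r1 = -1;
    int r2 = -1;

    std::thread a([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread b([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    a.join();
    b.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the test and counts how often the forbidden outcome appears.
int count_forbidden_outcomes(int rounds) {
    int forbidden = 0;
    for (int i = 0; i < rounds; ++i) {
        if (both_read_zero_once()) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

Because all four operations take part in a single total order, at least one of the loads must observe the other thread's store. With only release stores and acquire loads, both threads reading 0 would be a permitted outcome.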
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
354 Atomics and IR optimization
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
355 ===========================
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
356
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
357 Predicates for optimizer writers to query:
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
358
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
359 * ``isSimple()``: A load or store which is not volatile or atomic. This is
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
360 what, for example, memcpyopt would check for operations it might transform.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
361
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
362 * ``isUnordered()``: A load or store which is not volatile and at most
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
363 Unordered. This would be checked, for example, by LICM before hoisting an
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
364 operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
365
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
366 * ``mayReadFromMemory()``/``mayWriteToMemory()``: Existing predicates, but note
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
367 that they return true for any operation which is volatile or at least
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
368 Monotonic.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
369
120
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
370 * ``isStrongerThan`` / ``isAtLeastOrStrongerThan``: These are predicates on
83
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
371 orderings. They can be useful for passes that are aware of atomics, for
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
372 example to do DSE across a single atomic access, but not across a
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
373 release-acquire pair (see MemoryDependencyAnalysis for an example of this).
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
374
0
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
375 * Alias analysis: Note that AA will return ModRef for anything Acquire or
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
376 Release, and for the address accessed by any Monotonic operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
377
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
378 To support optimizing around atomic operations, make sure you are using the
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
379 right predicates; everything should work if that is done. If your pass should
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
380 optimize some atomic operations (Unordered operations in particular), make sure
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
381 it doesn't replace an atomic load or store with a non-atomic operation.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
382
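To make the ordering predicates more concrete, here is a self-contained toy model in C++. It is not LLVM's actual declaration or implementation: the ``Ordering`` enum, the ``lattice_rank`` helper, and the assumption that Acquire and Release are mutually incomparable are all choices made for this sketch.

```cpp
#include <cassert>

// Toy model of the atomic orderings discussed in this document.
// This is an illustration, not the LLVM header.
enum class Ordering {
    NotAtomic,
    Unordered,
    Monotonic,
    Acquire,
    Release,
    AcquireRelease,
    SequentiallyConsistent
};

// Position in the strength lattice. Acquire and Release share a rank
// because, in this model, neither is stronger than the other.
static int lattice_rank(Ordering o) {
    switch (o) {
    case Ordering::NotAtomic:              return 0;
    case Ordering::Unordered:              return 1;
    case Ordering::Monotonic:              return 2;
    case Ordering::Acquire:                return 3;
    case Ordering::Release:                return 3;
    case Ordering::AcquireRelease:         return 4;
    case Ordering::SequentiallyConsistent: return 5;
    }
    return 0;  // unreachable for valid enum values
}

// True only if 'a' is strictly stronger than 'b'.
bool isStrongerThan(Ordering a, Ordering b) {
    return lattice_rank(a) > lattice_rank(b);
}

// True if 'a' is the same ordering as 'b' or strictly stronger.
bool isAtLeastOrStrongerThan(Ordering a, Ordering b) {
    return a == b || isStrongerThan(a, b);
}
```

A pass could use a check such as ``isStrongerThan(ordering, Ordering::Monotonic)`` to decide whether an access must be treated as a barrier rather than as a plain read or write of one location.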
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
383 Some examples of how optimizations interact with various kinds of atomic
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
384 operations:
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
385
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
386 * ``memcpyopt``: An atomic operation cannot be optimized into part of a
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
387 memcpy/memset, including unordered loads/stores. It can pull operations
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
388 across some atomic operations.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
389
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
390 * LICM: Unordered loads/stores can be moved out of a loop. It just treats
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
391 monotonic operations like a read+write to a memory location, and anything
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
392 stricter than that like a nothrow call.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
393
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
394 * DSE: Unordered stores can be DSE'ed like normal stores. Monotonic stores can
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
395 be DSE'ed in some cases, but it's tricky to reason about, and not especially
83
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
396 important. It is possible in some cases for DSE to operate across a stronger
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
397 atomic operation, but it is fairly tricky. DSE delegates this reasoning to
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
398 MemoryDependencyAnalysis (which is also used by other passes like GVN).
0
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
399
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
400 * Folding a load: Any atomic load from a constant global can be constant-folded,
120
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
401 because it cannot be observed. Similar reasoning allows sroa with
0
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
402 atomic loads and stores.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
403
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
404 Atomics and Codegen
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
405 ===================
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
406
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
407 Atomic operations are represented in the SelectionDAG with ``ATOMIC_*`` opcodes.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
408 On architectures which use barrier instructions for all atomic ordering (like
83
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
409 ARM), appropriate fences can be emitted by the AtomicExpand Codegen pass if
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
410 ``setInsertFencesForAtomic()`` was used.
0
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
411
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
412 The MachineMemOperand for all atomic operations is currently marked as volatile;
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
413 this is not correct in the IR sense of volatile, but CodeGen handles anything
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
414 marked volatile very conservatively. This should get fixed at some point.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
415
120
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
416 One very important property of the atomic operations is that if your backend
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
417 supports any inline lock-free atomic operations of a given size, you should
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
418 support *ALL* operations of that size in a lock-free manner.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
419
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
420 When the target implements atomic ``cmpxchg`` or LL/SC instructions (as most do)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
421 this is trivial: all the other operations can be implemented on top of those
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
422 primitives. However, on many older CPUs (e.g. ARMv5, SparcV8, Intel 80386) there
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
423 are atomic load and store instructions, but no ``cmpxchg`` or LL/SC. As it is
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
424 invalid to implement ``atomic load`` using the native instruction, but
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
425 ``cmpxchg`` using a library call to a function that uses a mutex, ``atomic
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
426 load`` must *also* expand to a library call on such architectures, so that it
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
427 can remain atomic with regard to a simultaneous ``cmpxchg``, by using the same
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
428 mutex.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
429
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
430 AtomicExpandPass can help with that: it will expand all atomic operations to the
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
431 proper ``__atomic_*`` libcalls for any size above the maximum set by
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
432 ``setMaxAtomicSizeInBitsSupported`` (which defaults to 0).
0
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
433
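The requirement that a lock-based ``cmpxchg`` and the matching atomic load go through the same mutex can be sketched as follows. This is a hypothetical C++ emulation written for illustration; ``LockedCell``, ``locked_load``, and ``locked_cmpxchg`` are invented names and do not correspond to the real ``__atomic_*`` library entry points.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical lock-based emulation of an atomic 32-bit cell for a target
// without cmpxchg or LL/SC. Every operation takes the same mutex.
struct LockedCell {
    std::mutex lock;
    std::uint32_t value = 0;
};

// The load takes the lock too, so it can never run in the middle of a
// compare-and-exchange on the same cell.
std::uint32_t locked_load(LockedCell &cell) {
    std::lock_guard<std::mutex> guard(cell.lock);
    return cell.value;
}

// Compare-and-exchange: on success stores 'desired' and returns true;
// on failure writes the current value back into 'expected'.
bool locked_cmpxchg(LockedCell &cell, std::uint32_t &expected,
                    std::uint32_t desired) {
    std::lock_guard<std::mutex> guard(cell.lock);
    if (cell.value == expected) {
        cell.value = desired;
        return true;
    }
    expected = cell.value;
    return false;
}
```

Mixing a native instruction for one operation with a locked implementation for another would bypass the mutex, which is the situation the text above rules out.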
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
434 On x86, all atomic loads generate a ``MOV``. SequentiallyConsistent stores
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
435 generate an ``XCHG``, other stores generate a ``MOV``. SequentiallyConsistent
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
436 fences generate an ``MFENCE``, other fences do not cause any code to be
120
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
437 generated. ``cmpxchg`` uses the ``LOCK CMPXCHG`` instruction. ``atomicrmw xchg``
0
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
438 uses ``XCHG``, ``atomicrmw add`` and ``atomicrmw sub`` use ``XADD``, and all
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
439 other ``atomicrmw`` operations generate a loop with ``LOCK CMPXCHG``. Depending
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
440 on the users of the result, some ``atomicrmw`` operations can be translated into
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
441 operations like ``LOCK AND``, but that does not work in general.
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
442
77
54457678186b LLVM 3.6
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents: 0
diff changeset
443 On ARM (before v8), MIPS, and many other RISC architectures, Acquire, Release,
54457678186b LLVM 3.6
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents: 0
diff changeset
444 and SequentiallyConsistent semantics require barrier instructions for every such
0
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
445 operation. Loads and stores generate normal instructions. ``cmpxchg`` and
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
446 ``atomicrmw`` can be represented using a loop with LL/SC-style instructions
95c75e76d11b LLVM 3.4
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
447 which take some sort of exclusive lock on a cache line (``LDREX`` and ``STREX``
77
54457678186b LLVM 3.6
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents: 0
diff changeset
448 on ARM, etc.).
83
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
449
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
450 It is often easiest for backends to use AtomicExpandPass to lower some of the
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
451 atomic constructs. Here are some lowerings it can do:
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
452
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
453 * cmpxchg -> loop with load-linked/store-conditional
95
afa8332a0e37 LLVM 3.8
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents: 83
diff changeset
454 by overriding ``shouldExpandAtomicCmpXchgInIR()``, ``emitLoadLinked()``,
83
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
455 ``emitStoreConditional()``
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
456 * large loads/stores -> ll-sc/cmpxchg
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
457 by overriding ``shouldExpandAtomicStoreInIR()``/``shouldExpandAtomicLoadInIR()``
120
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
458 * strong atomic accesses -> monotonic accesses + fences by overriding
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
459 ``shouldInsertFencesForAtomic()``, ``emitLeadingFence()``, and
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
460 ``emitTrailingFence()``
83
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
461 * atomic rmw -> loop with cmpxchg or load-linked/store-conditional
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
462 by overriding ``expandAtomicRMWInIR()``
120
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
463 * expansion to ``__atomic_*`` libcalls for unsupported sizes.
83
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
464
60c9769439b8 LLVM 3.7
Tatsuki IHA <e125716@ie.u-ryukyu.ac.jp>
parents: 77
diff changeset
465 For an example of all of these, look at the ARM backend.
120
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
466
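The "atomic rmw -> loop with cmpxchg" lowering listed above has a direct source-level analogue. The following C++ sketch builds an atomic "max" out of a compare-exchange loop; ``fetch_max`` is a hypothetical helper written for this example and is not what AtomicExpandPass emits.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: an atomic read-modify-write ("max") expressed as a
// compare-exchange loop. Returns the value held before the update.
int fetch_max(std::atomic<int> &target, int operand) {
    int old_value = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak refreshes old_value with the current
    // contents, so each retry recomputes the new value from fresh data.
    while (!target.compare_exchange_weak(
        old_value, old_value < operand ? operand : old_value,
        std::memory_order_seq_cst, std::memory_order_relaxed)) {
        // Retry until no other thread modified target in between.
    }
    return old_value;
}
```

The weak form of compare-exchange is used because spurious failures are harmless inside a retry loop, which suits targets that implement it with LL/SC instructions.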
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
467 Libcalls: __atomic_*
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
468 ====================
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
469
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
470 There are two kinds of atomic library calls that are generated by LLVM. Please
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
471 note that both sets of library functions somewhat confusingly share the names of
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
472 builtin functions defined by clang. Despite this, the library functions are
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
473 not directly related to the builtins: it is *not* the case that ``__atomic_*``
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
474 builtins lower to ``__atomic_*`` library calls and ``__sync_*`` builtins lower
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
475 to ``__sync_*`` library calls.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
476
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
477 The first set of library functions is named ``__atomic_*``. This set has been
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
478 "standardized" by GCC, and is described below. (See also `GCC's documentation
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
479 <https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary>`_)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
480
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
481 LLVM's AtomicExpandPass will translate atomic operations on data sizes above
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
482 ``MaxAtomicSizeInBitsSupported`` into calls to these functions.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
483
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
484 There are four generic functions, which can be called with data of any size or
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
485 alignment::
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
486
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
487 void __atomic_load(size_t size, void *ptr, void *ret, int ordering)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
488 void __atomic_store(size_t size, void *ptr, void *val, int ordering)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
489 void __atomic_exchange(size_t size, void *ptr, void *val, void *ret, int ordering)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
490 bool __atomic_compare_exchange(size_t size, void *ptr, void *expected, void *desired, int success_order, int failure_order)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
491
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
492 There are also size-specialized versions of the above functions, which can only
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
493 be used with *naturally-aligned* pointers of the appropriate size. In the
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
494 signatures below, "N" is one of 1, 2, 4, 8, and 16, and "iN" is the appropriate
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
495 integer type of that size; if no such integer type exists, the specialization
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
496 cannot be used::
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
497
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
498 iN __atomic_load_N(iN *ptr, int ordering)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
499 void __atomic_store_N(iN *ptr, iN val, int ordering)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
500 iN __atomic_exchange_N(iN *ptr, iN val, int ordering)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
501 bool __atomic_compare_exchange_N(iN *ptr, iN *expected, iN desired, int success_order, int failure_order)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
502
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
503 Finally, there are some read-modify-write functions, which are only available in
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
504 the size-specific variants (any other sizes use a ``__atomic_compare_exchange``
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
505 loop)::
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
506
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
507 iN __atomic_fetch_add_N(iN *ptr, iN val, int ordering)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
508 iN __atomic_fetch_sub_N(iN *ptr, iN val, int ordering)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
509 iN __atomic_fetch_and_N(iN *ptr, iN val, int ordering)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
510 iN __atomic_fetch_or_N(iN *ptr, iN val, int ordering)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
511 iN __atomic_fetch_xor_N(iN *ptr, iN val, int ordering)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
512 iN __atomic_fetch_nand_N(iN *ptr, iN val, int ordering)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
513
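The compare-exchange loop mentioned above can be sketched in C as follows. This is only an illustration under stated assumptions: ``demo_fetch_nand_4`` is a hypothetical name, and the loop uses the compiler's ``__atomic_compare_exchange_n`` builtin (which takes an extra ``weak`` argument) on a 4-byte value instead of the generic library call, so that it builds without an atomic runtime library. The memory orderings are fixed for brevity.

```c
#include <stdint.h>

/* Hypothetical helper, not part of any runtime library: an atomic NAND built
   from a compare-exchange loop. Returns the value held before the update.
   Assumption: sequentially consistent on success, relaxed on failure. */
uint32_t demo_fetch_nand_4(uint32_t *ptr, uint32_t val)
{
    uint32_t old = __atomic_load_n(ptr, __ATOMIC_RELAXED);
    uint32_t desired;
    do {
        desired = ~(old & val);
        /* On failure the builtin writes the current value into 'old',
           so the next iteration recomputes 'desired' from fresh data. */
    } while (!__atomic_compare_exchange_n(ptr, &old, desired, 1 /* weak */,
                                          __ATOMIC_SEQ_CST, __ATOMIC_RELAXED));
    return old;
}
```

The same shape works for any read-modify-write operation: only the line that computes ``desired`` changes.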
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
514 This set of library functions has some interesting implementation requirements
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
515 to take note of:
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
516
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
517 - They support all sizes and alignments -- including those which cannot be
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
518 implemented natively on any existing hardware. Therefore, they will certainly
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
519 use mutexes for some sizes/alignments.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
520
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
521 - As a consequence, they cannot be shipped in a statically linked
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
522 compiler-support library, as they have state which must be shared amongst all
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
523 DSOs loaded in the program. They must be provided in a shared library used by
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
524 all objects.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
525
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
526 - The set of atomic sizes supported lock-free must be a superset of the sizes
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
527 any compiler can emit. That is: if a new compiler introduces support for
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
528 inline-lock-free atomics of size N, the ``__atomic_*`` functions must also have a
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
529 lock-free implementation for size N. This is a requirement so that code
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
530 produced by an old compiler (which will have called the ``__atomic_*`` function)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
531 interoperates with code produced by the new compiler (which will use the
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
532 native atomic instruction).
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
533
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
534 Note that it's possible to write an entirely target-independent implementation
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
535 of these library functions by using the compiler atomic builtins themselves to
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
536 implement the operations on naturally-aligned pointers of supported sizes, and a
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
537 generic mutex implementation otherwise.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
538
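A minimal sketch of such a target-independent implementation is shown here, assuming a GCC- or Clang-compatible compiler. All ``demo_*`` names, the lock-table size, and the address hash are hypothetical choices made for illustration; only the 4-byte case takes the native path, and a complete version would treat the other supported sizes the same way.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All demo_* names are hypothetical; this is not the real runtime library. */
#define DEMO_NLOCKS 64
static unsigned char demo_locks[DEMO_NLOCKS];   /* zero means "unlocked" */

/* Pick a lock by hashing the address. A real library must also cope with
   overlapping objects that hash to different locks; that is ignored here. */
static unsigned char *demo_lock_for(const void *ptr)
{
    return &demo_locks[((uintptr_t)ptr >> 4) % DEMO_NLOCKS];
}

static void demo_lock(unsigned char *l)
{
    while (__atomic_test_and_set(l, __ATOMIC_ACQUIRE)) { /* spin */ }
}

static void demo_unlock(unsigned char *l)
{
    __atomic_clear(l, __ATOMIC_RELEASE);
}

/* Nonzero if a 4-byte access at ptr can use the native builtin. */
static int demo_native_4(const void *ptr)
{
    return __atomic_always_lock_free(4, 0) && ((uintptr_t)ptr % 4) == 0;
}

void demo_atomic_load(size_t size, void *ptr, void *ret, int ordering)
{
    if (size == 4 && demo_native_4(ptr)) {
        uint32_t v = __atomic_load_n((uint32_t *)ptr, ordering);
        memcpy(ret, &v, sizeof v);
        return;
    }
    /* Unsupported size or alignment: fall back to the lock table. */
    unsigned char *l = demo_lock_for(ptr);
    demo_lock(l);
    memcpy(ret, ptr, size);
    demo_unlock(l);
}

void demo_atomic_store(size_t size, void *ptr, void *val, int ordering)
{
    if (size == 4 && demo_native_4(ptr)) {
        uint32_t v;
        memcpy(&v, val, sizeof v);
        __atomic_store_n((uint32_t *)ptr, v, ordering);
        return;
    }
    unsigned char *l = demo_lock_for(ptr);
    demo_lock(l);
    memcpy(ptr, val, size);
    demo_unlock(l);
}
```

The static lock table is the kind of shared state described in the requirements above: if each DSO carried its own copy, two DSOs could lock the same object through different tables.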
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
539 Libcalls: __sync_*
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
540 ==================
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
541
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
542 Some targets or OS/target combinations can support lock-free atomics, but for
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
543 various reasons, it is not practical to emit the instructions inline.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
544
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
545 There are two typical examples of this.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
546
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
547 Some CPUs support multiple instruction sets which can be switched back and forth
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
548 on function-call boundaries. For example, MIPS supports the MIPS16 ISA, which
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
549 has a smaller instruction encoding than the usual MIPS32 ISA. ARM, similarly,
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
550 has the Thumb ISA. In MIPS16 and earlier versions of Thumb, the atomic
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
551 instructions are not encodable. However, those instructions are available via a
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
552 function call to a function with the longer encoding.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
553
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
554 Additionally, a few OS/target pairs provide kernel-supported lock-free
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
555 atomics. ARM/Linux is an example of this: the kernel `provides
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
556 <https://www.kernel.org/doc/Documentation/arm/kernel_user_helpers.txt>`_ a
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
557 function which on older CPUs contains a "magically-restartable" atomic sequence
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
558 (which looks atomic so long as there's only one CPU), and contains actual atomic
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
559 instructions on newer multicore models. This sort of functionality can typically
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
560 be provided on any architecture, if all CPUs which are missing atomic
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
561 compare-and-swap support are uniprocessor (no SMP). This is almost always the
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
562 case. The only common architecture without that property is SPARC -- SPARCV8 SMP
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
563 systems were common, yet it doesn't support any sort of compare-and-swap
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
564 operation.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
565
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
566 In either of these cases, the Target in LLVM can claim support for atomics of an
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
567 appropriate size, and then implement some subset of the operations via libcalls
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
568 to a ``__sync_*`` function. Such functions *must* not use locks in their
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
569 implementation, because unlike the ``__atomic_*`` routines used by
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
570 AtomicExpandPass, these may be mixed-and-matched with native instructions by the
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
571 target lowering.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
572
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
573 Further, these routines do not need to be shared, as they are stateless. So,
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
574 there is no issue with having multiple copies included in one binary. Thus,
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
575 typically these routines are implemented by the statically-linked compiler
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
576 runtime support library.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
577
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
578 LLVM will emit a call to an appropriate ``__sync_*`` routine if the target
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
579 ISelLowering code has set the corresponding ``ATOMIC_CMPXCHG``, ``ATOMIC_SWAP``,
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
580 or ``ATOMIC_LOAD_*`` operation to "Expand", and if it has opted into the
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
581 availability of those library functions via a call to ``initSyncLibcalls()``.
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
582
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
583 The full set of functions that may be called by LLVM is (for ``N`` being 1, 2,
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
584 4, 8, or 16)::
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
585
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
586 iN __sync_val_compare_and_swap_N(iN *ptr, iN expected, iN desired)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
587 iN __sync_lock_test_and_set_N(iN *ptr, iN val)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
588 iN __sync_fetch_and_add_N(iN *ptr, iN val)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
589 iN __sync_fetch_and_sub_N(iN *ptr, iN val)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
590 iN __sync_fetch_and_and_N(iN *ptr, iN val)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
591 iN __sync_fetch_and_or_N(iN *ptr, iN val)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
592 iN __sync_fetch_and_xor_N(iN *ptr, iN val)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
593 iN __sync_fetch_and_nand_N(iN *ptr, iN val)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
594 iN __sync_fetch_and_max_N(iN *ptr, iN val)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
595 iN __sync_fetch_and_umax_N(iN *ptr, iN val)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
596 iN __sync_fetch_and_min_N(iN *ptr, iN val)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
597 iN __sync_fetch_and_umin_N(iN *ptr, iN val)
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
598
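As an illustration of what a lock-free, stateless routine of this kind might look like, the following sketch layers an unsigned-max operation on a compare-and-swap primitive. The name ``demo_sync_fetch_and_umax_4`` is hypothetical, and the ``__sync_val_compare_and_swap`` used here is the compiler builtin, assumed to be lowered inline on the host; a real runtime would use whatever primitive the target or kernel provides.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 4-byte unsigned-max routine. It keeps no
   state of its own and takes no locks; it returns the previous value. */
uint32_t demo_sync_fetch_and_umax_4(uint32_t *ptr, uint32_t val)
{
    uint32_t old = *(volatile uint32_t *)ptr;   /* initial guess only */
    for (;;) {
        uint32_t desired = old > val ? old : val;
        uint32_t seen = __sync_val_compare_and_swap(ptr, old, desired);
        if (seen == old)
            return old;     /* the swap happened */
        old = seen;         /* lost a race: retry with the value we saw */
    }
}
```

Because the routine holds no state, several copies of it can coexist in one binary without interfering with each other.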
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
599 This list doesn't include any function for atomic load or store; all known
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
600 architectures support atomic loads and stores directly (possibly by emitting a
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
601 fence on either side of a normal load or store).
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
602
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
603 There's also, somewhat separately, the possibility to lower ``ATOMIC_FENCE`` to
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
604 ``__sync_synchronize()``. This may happen or not happen independent of all the
1172e4bd9c6f update 4.0.0
mir3636
parents: 95
diff changeset
605 above, controlled purely by ``setOperationAction(ISD::ATOMIC_FENCE, ...)``.