==============================================
LLVM Atomic Instructions and Concurrency Guide
==============================================

.. contents::
   :local:

Introduction
============

LLVM supports instructions which are well-defined in the presence of threads and
asynchronous signals.

The atomic instructions are designed specifically to provide readable IR and
optimized code generation for the following:

* The C++11 ``<atomic>`` header. (`C++11 draft available here
  <http://www.open-std.org/jtc1/sc22/wg21/>`_.) (`C11 draft available here
  <http://www.open-std.org/jtc1/sc22/wg14/>`_.)

* Proper semantics for Java-style memory, for both ``volatile`` and regular
  shared variables. (`Java Specification
  <http://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html>`_)

* gcc-compatible ``__sync_*`` builtins. (`Description
  <https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html>`_)

* Other scenarios with atomic semantics, including ``static`` variables with
  non-trivial constructors in C++.

Atomic and volatile in the IR are orthogonal; "volatile" is the C/C++ volatile,
which ensures that every volatile load and store happens and is performed in the
stated order. A couple of examples: if a SequentiallyConsistent store is
immediately followed by another SequentiallyConsistent store to the same
address, the first store can be erased. This transformation is not allowed for a
pair of volatile stores. On the other hand, a non-volatile non-atomic load can
be moved across a volatile load freely, but not an Acquire load.

This document is intended to provide a guide for anyone either writing a
frontend for LLVM or working on optimization passes for LLVM on how to deal
with instructions with special semantics in the presence of concurrency. This
is not intended to be a precise guide to the semantics; the details can get
extremely complicated and unreadable, and are not usually necessary.

.. _Optimization outside atomic:

Optimization outside atomic
===========================

The basic ``'load'`` and ``'store'`` allow a variety of optimizations, but can
lead to undefined results in a concurrent environment; see `NotAtomic`_. This
section specifically goes into the one optimizer restriction which applies in
concurrent environments, which gets a bit more of an extended description
because any optimization dealing with stores needs to be aware of it.

From the optimizer's point of view, the rule is that if there are not any
instructions with atomic ordering involved, concurrency does not matter, with
one exception: if a variable might be visible to another thread or signal
handler, a store cannot be inserted along a path where it would not otherwise
execute. Take the following example:

.. code-block:: c

  /* C code, for readability; run through clang -O2 -S -emit-llvm to get
     equivalent IR */
  int x;
  void f(int* a) {
    for (int i = 0; i < 100; i++) {
      if (a[i])
        x += 1;
    }
  }

The following is equivalent in non-concurrent situations:

.. code-block:: c

  int x;
  void f(int* a) {
    int xtemp = x;
    for (int i = 0; i < 100; i++) {
      if (a[i])
        xtemp += 1;
    }
    x = xtemp;
  }

However, LLVM is not allowed to transform the former to the latter: it could
indirectly introduce undefined behavior if another thread can access ``x`` at
the same time. (This example is particularly of interest because before the
concurrency model was implemented, LLVM would perform this transformation.)

Note that speculative loads are allowed; a load which is part of a race returns
``undef``, but does not have undefined behavior.

Atomic instructions
===================

For cases where simple loads and stores are not sufficient, LLVM provides
various atomic instructions. The exact guarantees provided depend on the
ordering; see `Atomic orderings`_.

``load atomic`` and ``store atomic`` provide the same basic functionality as
non-atomic loads and stores, but provide additional guarantees in situations
where threads and signals are involved.

``cmpxchg`` and ``atomicrmw`` are essentially like an atomic load followed by an
atomic store (where the store is conditional for ``cmpxchg``), but no other
memory operation can happen on any thread between the load and store.
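
As a rough illustration, the C11 operations that Clang lowers to ``atomicrmw``
and ``cmpxchg`` can be sketched as follows; the function names here are
illustrative, not part of any API:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Lowered to an 'atomicrmw add' instruction: the load of the old value,
   the addition, and the store happen as one indivisible operation. */
int increment(atomic_int *counter) {
  return atomic_fetch_add(counter, 1);
}

/* Lowered to a 'cmpxchg' instruction: the store of 1 happens only if
   *flag still holds the expected value 0 at the moment of the exchange. */
bool try_claim(atomic_int *flag) {
  int expected = 0;
  return atomic_compare_exchange_strong(flag, &expected, 1);
}
```

The "no other memory operation in between" property is exactly what makes
``try_claim`` usable as a mutual-exclusion primitive: at most one thread can
ever see the exchange succeed.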

A ``fence`` provides Acquire and/or Release ordering which is not part of
another operation; it is normally used along with Monotonic memory operations.
A Monotonic load followed by an Acquire fence is roughly equivalent to an
Acquire load, and a Monotonic store following a Release fence is roughly
equivalent to a Release store. SequentiallyConsistent fences behave as both
an Acquire and a Release fence, and also offer some additional complicated
guarantees; see the C++11 standard for details.
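
In C11 terms, which map directly onto the IR constructs above, the rough
equivalence can be sketched like this; ``flag`` is a hypothetical shared
variable used only for illustration:

```c
#include <stdatomic.h>

atomic_int flag;

/* Roughly equivalent to an Acquire load of 'flag'. */
int load_acquire_via_fence(void) {
  int v = atomic_load_explicit(&flag, memory_order_relaxed); /* Monotonic */
  atomic_thread_fence(memory_order_acquire);                 /* Acquire fence */
  return v;
}

/* Roughly equivalent to a Release store to 'flag'. */
void store_release_via_fence(int v) {
  atomic_thread_fence(memory_order_release);                 /* Release fence */
  atomic_store_explicit(&flag, v, memory_order_relaxed);     /* Monotonic */
}
```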

Frontends generating atomic instructions generally need to be aware of the
target to some degree; atomic instructions are guaranteed to be lock-free, and
therefore an instruction which is wider than the target natively supports can be
impossible to generate.
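
At the C11 level, the same width constraint can be probed with the standard
``atomic_is_lock_free`` predicate; this sketch assumes nothing beyond the
standard library, and its result is target-dependent:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns whether 64-bit atomics are natively lock-free on this target.
   A frontend consults analogous target information before emitting an
   atomic instruction of a given width. */
bool has_lock_free_64(void) {
  _Atomic uint64_t probe;
  return atomic_is_lock_free(&probe);
}
```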

.. _Atomic orderings:

Atomic orderings
================

In order to achieve a balance between performance and necessary guarantees,
there are six levels of atomicity. They are listed in order of strength; each
level includes all the guarantees of the previous level except for
Acquire/Release. (See also `LangRef Ordering <LangRef.html#ordering>`_.)

.. _NotAtomic:

NotAtomic
---------

NotAtomic is the obvious: a load or store which is not atomic. (This isn't
really a level of atomicity, but is listed here for comparison.) This is
essentially a regular load or store. If there is a race on a given memory
location, loads from that location return ``undef``.

Relevant standard
  This is intended to match shared variables in C/C++, and to be used in any
  other context where memory access is necessary, and a race is impossible. (The
  precise definition is in `LangRef Memory Model <LangRef.html#memmodel>`_.)

Notes for frontends
  The rule is essentially that all memory accessed with basic loads and stores
  by multiple threads should be protected by a lock or other synchronization;
  otherwise, you are likely to run into undefined behavior. If your frontend is
  for a "safe" language like Java, use Unordered to load and store any shared
  variable. Note that NotAtomic volatile loads and stores are not properly
  atomic; do not try to use them as a substitute. (Per the C/C++ standards,
  volatile does provide some limited guarantees around asynchronous signals, but
  atomics are generally a better solution.)

Notes for optimizers
  Introducing loads to shared variables along a codepath where they would not
  otherwise exist is allowed; introducing stores to shared variables is not. See
  `Optimization outside atomic`_.

Notes for code generation
  The one interesting restriction here is that it is not allowed to write to
  bytes outside of the bytes relevant to a store. This is mostly relevant to
  unaligned stores: it is not allowed in general to convert an unaligned store
  into two aligned stores of the same width as the unaligned store. Backends are
  also expected to generate an i8 store as an i8 store, and not an instruction
  which writes to surrounding bytes. (If you are writing a backend for an
  architecture which cannot satisfy these restrictions and cares about
  concurrency, please send an email to llvm-dev.)

Unordered
---------

Unordered is the lowest level of atomicity. It essentially guarantees that races
produce somewhat sane results instead of having undefined behavior. It also
guarantees the operation to be lock-free, so it does not depend on the data
being part of a special atomic structure or depend on a separate per-process
global lock. Note that code generation will fail for unsupported atomic
operations; if you need such an operation, use explicit locking.

Relevant standard
  This is intended to match the Java memory model for shared variables.

Notes for frontends
  This cannot be used for synchronization, but is useful for Java and other
  "safe" languages which need to guarantee that the generated code never
  exhibits undefined behavior. Note that this guarantee is cheap on common
  platforms for loads of a native width, but can be expensive or unavailable for
  wider loads, like a 64-bit store on ARM. (A frontend for Java or other "safe"
  languages would normally split a 64-bit store on ARM into two 32-bit unordered
  stores.)

Notes for optimizers
  In terms of the optimizer, this prohibits any transformation that transforms a
  single load into multiple loads, transforms a store into multiple stores,
  narrows a store, or stores a value which would not be stored otherwise. Some
  examples of unsafe optimizations are narrowing an assignment into a bitfield,
  rematerializing a load, and turning loads and stores into a memcpy
  call. Reordering unordered operations is safe, though, and optimizers should
  take advantage of that because unordered operations are common in languages
  that need them.

Notes for code generation
  These operations are required to be atomic in the sense that if you use
  unordered loads and unordered stores, a load cannot see a value which was
  never stored. A normal load or store instruction is usually sufficient, but
  note that an unordered load or store cannot be split into multiple
  instructions (or an instruction which does multiple memory operations, like
  ``LDRD`` on ARM without LPAE, or not naturally-aligned ``LDRD`` on LPAE ARM).

Monotonic
---------

Monotonic is the weakest level of atomicity that can be used in synchronization
primitives, although it does not provide any general synchronization. It
essentially guarantees that if you take all the operations affecting a specific
address, a consistent ordering exists.

Relevant standard
  This corresponds to the C++11/C11 ``memory_order_relaxed``; see those
  standards for the exact definition.

Notes for frontends
  If you are writing a frontend which uses this directly, use with caution. The
  guarantees in terms of synchronization are very weak, so make sure these are
  only used in a pattern which you know is correct. Generally, these would
  either be used for atomic operations which do not protect other memory (like
  an atomic counter), or along with a ``fence``.
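
The atomic-counter pattern mentioned above can be sketched in C11, whose
``memory_order_relaxed`` Clang lowers to Monotonic operations; the names are
illustrative:

```c
#include <stdatomic.h>

/* A statistics counter: it protects no other memory, so Monotonic
   (memory_order_relaxed) ordering is sufficient and cheap. */
static atomic_long hits;

void record_hit(void) {
  atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
}

long read_hits(void) {
  return atomic_load_explicit(&hits, memory_order_relaxed);
}
```

No increment is ever lost even under contention; what Monotonic does not
promise is any ordering between the counter and other memory.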

Notes for optimizers
  In terms of the optimizer, this can be treated as a read+write on the relevant
  memory location (and alias analysis will take advantage of that). In addition,
  it is legal to reorder non-atomic and Unordered loads around Monotonic
  loads. CSE/DSE and a few other optimizations are allowed, but Monotonic
  operations are unlikely to be used in ways which would make those
  optimizations useful.

Notes for code generation
  Code generation for loads and stores is essentially the same as that for
  Unordered. No fences are required. ``cmpxchg`` and ``atomicrmw`` are required
  to appear as a single operation.

Acquire
-------

Acquire provides a barrier of the sort necessary to acquire a lock to access
other memory with normal loads and stores.

Relevant standard
  This corresponds to the C++11/C11 ``memory_order_acquire``. It should also be
  used for C++11/C11 ``memory_order_consume``.

Notes for frontends
  If you are writing a frontend which uses this directly, use with caution.
  Acquire only provides a semantic guarantee when paired with a Release
  operation.
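
The Acquire/Release pairing is the classic message-passing idiom, sketched here
in C11 (whose acquire/release orderings Clang lowers to the Acquire and Release
IR orderings); ``data`` and ``ready`` are illustrative names:

```c
#include <stdatomic.h>
#include <stdbool.h>

int data;            /* plain, non-atomic payload */
atomic_bool ready;

/* Writer thread: the Release store makes the earlier write to 'data'
   visible to any thread whose Acquire load observes ready == true. */
void publish(int value) {
  data = value;
  atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader thread: reading 'data' is safe only after the Acquire load
   has observed the flag. */
bool try_consume(int *out) {
  if (!atomic_load_explicit(&ready, memory_order_acquire))
    return false;
  *out = data;
  return true;
}
```

Neither side synchronizes anything on its own; it is the paired
Release-store/Acquire-load on the same location that orders the payload access.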

Notes for optimizers
  Optimizers not aware of atomics can treat this like a nothrow call. It is
  also possible to move stores from before an Acquire load or read-modify-write
  operation to after it, and move non-Acquire loads from before an Acquire
  operation to after it.

Notes for code generation
  Architectures with weak memory ordering (essentially everything relevant today
  except x86 and SPARC) require some sort of fence to maintain the Acquire
  semantics. The precise fences required vary widely by architecture, but for
  a simple implementation, most architectures provide a barrier which is strong
  enough for everything (``dmb`` on ARM, ``sync`` on PowerPC, etc.). Putting
  such a fence after the equivalent Monotonic operation is sufficient to
  maintain Acquire semantics for a memory operation.


Release
-------

Release is similar to Acquire, but with a barrier of the sort necessary to
release a lock.

Relevant standard
  This corresponds to the C++11/C11 ``memory_order_release``.

Notes for frontends
  If you are writing a frontend which uses this directly, use with caution.
  Release only provides a semantic guarantee when paired with an Acquire
  operation.
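
The pairing requirement can be seen in a C++11 analogue (a sketch; the names
here are illustrative): the consumer only gets a guarantee about ``payload``
because its Acquire load pairs with the producer's Release store.

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> ready{false};
int payload = 0;

// The Release store "publishes" payload; the Acquire load that observes
// ready == true synchronizes-with it. A relaxed load here would provide no
// such guarantee.
int handoff() {
  std::thread producer([] {
    payload = 7;
    ready.store(true, std::memory_order_release);
  });
  int result = 0;
  std::thread consumer([&] {
    while (!ready.load(std::memory_order_acquire)) {
      // spin until the flag is observed
    }
    result = payload;  // guaranteed to be 7
  });
  producer.join();
  consumer.join();
  return result;
}
```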

Notes for optimizers
  Optimizers not aware of atomics can treat this like a nothrow call. It is
  also possible to move loads from after a Release store or read-modify-write
  operation to before it, and move non-Release stores from after a Release
  operation to before it.

Notes for code generation
  See the section on Acquire; a fence before the relevant operation is usually
  sufficient for Release. Note that a store-store fence is not sufficient to
  implement Release semantics; store-store fences are generally not exposed to
  IR because they are extremely difficult to use correctly.

AcquireRelease
--------------

AcquireRelease (``acq_rel`` in IR) provides both an Acquire and a Release
barrier (for fences and operations which both read and write memory).

Relevant standard
  This corresponds to the C++11/C11 ``memory_order_acq_rel``.

Notes for frontends
  If you are writing a frontend which uses this directly, use with caution.
  Acquire only provides a semantic guarantee when paired with a Release
  operation, and vice versa.

Notes for optimizers
  In general, optimizers should treat this like a nothrow call; the possible
  optimizations are usually not interesting.

Notes for code generation
  This operation has Acquire and Release semantics; see the sections on Acquire
  and Release.

SequentiallyConsistent
----------------------

SequentiallyConsistent (``seq_cst`` in IR) provides Acquire semantics for loads
and Release semantics for stores. Additionally, it guarantees that a total
ordering exists between all SequentiallyConsistent operations.

Relevant standard
  This corresponds to the C++11/C11 ``memory_order_seq_cst``, Java volatile,
  and the gcc-compatible ``__sync_*`` builtins which do not specify otherwise.

Notes for frontends
  If a frontend is exposing atomic operations, these are much easier to reason
  about for the programmer than other kinds of operations, and using them is
  generally a practical performance tradeoff.

Notes for optimizers
  Optimizers not aware of atomics can treat this like a nothrow call. For
  SequentiallyConsistent loads and stores, the same reorderings are allowed as
  for Acquire loads and Release stores, except that SequentiallyConsistent
  operations may not be reordered with each other.

Notes for code generation
  SequentiallyConsistent loads minimally require the same barriers as Acquire
  operations and SequentiallyConsistent stores require Release barriers.
  Additionally, the code generator must enforce ordering between
  SequentiallyConsistent stores followed by SequentiallyConsistent loads. This
  is usually done by emitting either a full fence before the loads or a full
  fence after the stores; which is preferred varies by architecture.
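
The extra guarantee over Acquire/Release -- the single total order -- is what
the classic store/load test observes. A C++ sketch (illustrative names):

```cpp
#include <atomic>
#include <thread>

std::atomic<int> x{0}, y{0};

// Each thread stores to one flag and then loads the other. Under seq_cst the
// single total order forbids both threads from reading 0; with only
// acquire/release ordering, the "both read 0" outcome would be allowed.
bool at_least_one_observed() {
  x.store(0);
  y.store(0);
  int r1 = -1, r2 = -1;
  std::thread t1([&] {
    x.store(1, std::memory_order_seq_cst);
    r1 = y.load(std::memory_order_seq_cst);
  });
  std::thread t2([&] {
    y.store(1, std::memory_order_seq_cst);
    r2 = x.load(std::memory_order_seq_cst);
  });
  t1.join();
  t2.join();
  return r1 == 1 || r2 == 1;  // never both 0 under seq_cst
}
```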

Atomics and IR optimization
===========================

Predicates for optimizer writers to query:

* ``isSimple()``: A load or store which is not volatile or atomic. This is
  what, for example, memcpyopt would check for operations it might transform.

* ``isUnordered()``: A load or store which is not volatile and at most
  Unordered. This would be checked, for example, by LICM before hoisting an
  operation.

* ``mayReadFromMemory()``/``mayWriteToMemory()``: Existing predicates, but note
  that they return true for any operation which is volatile or at least
  Monotonic.

* ``isStrongerThan`` / ``isAtLeastOrStrongerThan``: These are predicates on
  orderings. They can be useful for passes that are aware of atomics, for
  example to do DSE across a single atomic access, but not across a
  release-acquire pair (see MemoryDependencyAnalysis for an example of this).

* Alias analysis: Note that AA will return ModRef for anything Acquire or
  Release, and for the address accessed by any Monotonic operation.

To support optimizing around atomic operations, make sure you are using the
right predicates; everything should work if that is done. If your pass should
optimize some atomic operations (Unordered operations in particular), make sure
it doesn't replace an atomic load or store with a non-atomic operation.

Some examples of how optimizations interact with various kinds of atomic
operations:

* ``memcpyopt``: An atomic operation cannot be optimized into part of a
  memcpy/memset, including unordered loads/stores. It can pull operations
  across some atomic operations.

* LICM: Unordered loads/stores can be moved out of a loop. It just treats
  monotonic operations like a read+write to a memory location, and anything
  stricter than that like a nothrow call.

* DSE: Unordered stores can be DSE'ed like normal stores. Monotonic stores can
  be DSE'ed in some cases, but it's tricky to reason about, and not especially
  important. It is possible in some cases for DSE to operate across a stronger
  atomic operation, but it is fairly tricky. DSE delegates this reasoning to
  MemoryDependencyAnalysis (which is also used by other passes like GVN).

* Folding a load: Any atomic load from a constant global can be constant-folded,
  because it cannot be observed. Similar reasoning allows SROA with
  atomic loads and stores.

Atomics and Codegen
===================

Atomic operations are represented in the SelectionDAG with ``ATOMIC_*`` opcodes.
On architectures which use barrier instructions for all atomic ordering (like
ARM), appropriate fences can be emitted by the AtomicExpand Codegen pass if
``setInsertFencesForAtomic()`` was used.

The MachineMemOperand for all atomic operations is currently marked as volatile;
this is not correct in the IR sense of volatile, but CodeGen handles anything
marked volatile very conservatively. This should get fixed at some point.

One very important property of the atomic operations is that if your backend
supports any inline lock-free atomic operations of a given size, you should
support *ALL* operations of that size in a lock-free manner.

When the target implements atomic ``cmpxchg`` or LL/SC instructions (as most do)
this is trivial: all the other operations can be implemented on top of those
primitives. However, on many older CPUs (e.g. ARMv5, SparcV8, Intel 80386)
there are atomic load and store instructions, but no ``cmpxchg`` or LL/SC. As
it is invalid to implement ``atomic load`` using the native instruction, but
``cmpxchg`` using a library call to a function that uses a mutex, ``atomic
load`` must *also* expand to a library call on such architectures, so that it
can remain atomic with regards to a simultaneous ``cmpxchg``, by using the same
mutex.

AtomicExpandPass can help with that: it will expand all atomic operations to the
proper ``__atomic_*`` libcalls for any size above the maximum set by
``setMaxAtomicSizeInBitsSupported`` (which defaults to 0).

On x86, all atomic loads generate a ``MOV``. SequentiallyConsistent stores
generate an ``XCHG``, other stores generate a ``MOV``. SequentiallyConsistent
fences generate an ``MFENCE``, other fences do not cause any code to be
generated. ``cmpxchg`` uses the ``LOCK CMPXCHG`` instruction. ``atomicrmw xchg``
uses ``XCHG``, ``atomicrmw add`` and ``atomicrmw sub`` use ``XADD``, and all
other ``atomicrmw`` operations generate a loop with ``LOCK CMPXCHG``. Depending
on the users of the result, some ``atomicrmw`` operations can be translated into
operations like ``LOCK AND``, but that does not work in general.

On ARM (before v8), MIPS, and many other RISC architectures, Acquire, Release,
and SequentiallyConsistent semantics require barrier instructions for every such
operation. Loads and stores generate normal instructions. ``cmpxchg`` and
``atomicrmw`` can be represented using a loop with LL/SC-style instructions
which take some sort of exclusive lock on a cache line (``LDREX`` and ``STREX``
on ARM, etc.).

It is often easiest for backends to use AtomicExpandPass to lower some of the
atomic constructs. Here are some lowerings it can do:

* cmpxchg -> loop with load-linked/store-conditional
  by overriding ``shouldExpandAtomicCmpXchgInIR()``, ``emitLoadLinked()``, and
  ``emitStoreConditional()``
* large loads/stores -> ll-sc/cmpxchg
  by overriding ``shouldExpandAtomicStoreInIR()``/``shouldExpandAtomicLoadInIR()``
* strong atomic accesses -> monotonic accesses + fences by overriding
  ``shouldInsertFencesForAtomic()``, ``emitLeadingFence()``, and
  ``emitTrailingFence()``
* atomic rmw -> loop with cmpxchg or load-linked/store-conditional
  by overriding ``expandAtomicRMWInIR()``
* expansion to ``__atomic_*`` libcalls for unsupported sizes

For an example of all of these, look at the ARM backend.
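
The "atomic rmw -> loop with cmpxchg" lowering has a simple source-level
analogue: any read-modify-write can be built from a compare-exchange loop. A
sketch (illustrative, not the pass's actual output), here for the ``nand``
operation, which x86 also has to expand this way:

```cpp
#include <atomic>

// Implements the equivalent of "atomicrmw nand" with a compare-exchange loop,
// mirroring how AtomicExpandPass expands RMW operations the target cannot do
// natively. compare_exchange_weak reloads `expected` on failure, so the loop
// always retries with a freshly observed value.
unsigned atomic_nand(std::atomic<unsigned> &a, unsigned operand) {
  unsigned expected = a.load(std::memory_order_relaxed);
  while (!a.compare_exchange_weak(expected, ~(expected & operand),
                                  std::memory_order_seq_cst,
                                  std::memory_order_relaxed)) {
    // `expected` now holds the value that caused the failure; retry.
  }
  return expected;  // the old value, as atomicrmw returns
}
```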

Libcalls: __atomic_*
====================

There are two kinds of atomic library calls that are generated by LLVM. Please
note that both sets of library functions somewhat confusingly share the names of
builtin functions defined by clang. Despite this, the library functions are
not directly related to the builtins: it is *not* the case that ``__atomic_*``
builtins lower to ``__atomic_*`` library calls and ``__sync_*`` builtins lower
to ``__sync_*`` library calls.

The first set of library functions are named ``__atomic_*``. This set has been
"standardized" by GCC, and is described below. (See also `GCC's documentation
<https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary>`_.)

LLVM's AtomicExpandPass will translate atomic operations on data sizes above
``MaxAtomicSizeInBitsSupported`` into calls to these functions.

There are four generic functions, which can be called with data of any size or
alignment::

  void __atomic_load(size_t size, void *ptr, void *ret, int ordering)
  void __atomic_store(size_t size, void *ptr, void *val, int ordering)
  void __atomic_exchange(size_t size, void *ptr, void *val, void *ret, int ordering)
  bool __atomic_compare_exchange(size_t size, void *ptr, void *expected, void *desired, int success_order, int failure_order)

There are also size-specialized versions of the above functions, which can only
be used with *naturally-aligned* pointers of the appropriate size. In the
signatures below, "N" is one of 1, 2, 4, 8, and 16, and "iN" is the appropriate
integer type of that size; if no such integer type exists, the specialization
cannot be used::

  iN __atomic_load_N(iN *ptr, int ordering)
  void __atomic_store_N(iN *ptr, iN val, int ordering)
  iN __atomic_exchange_N(iN *ptr, iN val, int ordering)
  bool __atomic_compare_exchange_N(iN *ptr, iN *expected, iN desired, int success_order, int failure_order)

Finally there are some read-modify-write functions, which are only available in
the size-specific variants (any other sizes use a ``__atomic_compare_exchange``
loop)::

  iN __atomic_fetch_add_N(iN *ptr, iN val, int ordering)
  iN __atomic_fetch_sub_N(iN *ptr, iN val, int ordering)
  iN __atomic_fetch_and_N(iN *ptr, iN val, int ordering)
  iN __atomic_fetch_or_N(iN *ptr, iN val, int ordering)
  iN __atomic_fetch_xor_N(iN *ptr, iN val, int ordering)
  iN __atomic_fetch_nand_N(iN *ptr, iN val, int ordering)

This set of library functions has some interesting implementation requirements
to take note of:

- They support all sizes and alignments -- including those which cannot be
  implemented natively on any existing hardware. Therefore, they will certainly
  use mutexes for some sizes/alignments.

- As a consequence, they cannot be shipped in a statically linked
  compiler-support library, as they have state which must be shared amongst all
  DSOs loaded in the program. They must be provided in a shared library used by
  all objects.

- The set of atomic sizes supported lock-free must be a superset of the sizes
  any compiler can emit. That is: if a new compiler introduces support for
  inline-lock-free atomics of size N, the ``__atomic_*`` functions must also
  have a lock-free implementation for size N. This is a requirement so that
  code produced by an old compiler (which will have called the ``__atomic_*``
  function) interoperates with code produced by the new compiler (which will
  use the native atomic instruction).

Note that it's possible to write an entirely target-independent implementation
of these library functions by using the compiler atomic builtins themselves to
implement the operations on naturally-aligned pointers of supported sizes, and a
generic mutex implementation otherwise.
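
A minimal sketch of the mutex-based fallback path for the generic calls (the
function names and the single global lock are hypothetical; a real libatomic
typically hashes the address into a table of locks, and uses lock-free builtins
for supported sizes):

```cpp
#include <cstddef>
#include <cstring>
#include <mutex>

// Hypothetical fallback for the generic __atomic_load/__atomic_store libcalls:
// one global mutex guards every access of an unsupported size, so a
// concurrent load and store of the same object serialize on the same lock.
static std::mutex g_atomic_lock;

void fallback_atomic_load(std::size_t size, const void *ptr, void *ret) {
  std::lock_guard<std::mutex> guard(g_atomic_lock);
  std::memcpy(ret, ptr, size);  // copy the current value out under the lock
}

void fallback_atomic_store(std::size_t size, void *ptr, const void *val) {
  std::lock_guard<std::mutex> guard(g_atomic_lock);
  std::memcpy(ptr, val, size);  // publish the new value under the lock
}
```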

Libcalls: __sync_*
==================

Some targets or OS/target combinations can support lock-free atomics, but for
various reasons, it is not practical to emit the instructions inline.

There are two typical examples of this.

Some CPUs support multiple instruction sets which can be switched back and forth
on function-call boundaries. For example, MIPS supports the MIPS16 ISA, which
has a smaller instruction encoding than the usual MIPS32 ISA. ARM, similarly,
has the Thumb ISA. In MIPS16 and earlier versions of Thumb, the atomic
instructions are not encodable. However, those instructions are available via a
function call to a function with the longer encoding.

Additionally, a few OS/target pairs provide kernel-supported lock-free
atomics. ARM/Linux is an example of this: the kernel `provides
<https://www.kernel.org/doc/Documentation/arm/kernel_user_helpers.txt>`_ a
function which on older CPUs contains a "magically-restartable" atomic sequence
(which looks atomic so long as there's only one CPU), and contains actual atomic
instructions on newer multicore models. This sort of functionality can
typically be provided on any architecture, if all CPUs which are missing atomic
compare-and-swap support are uniprocessor (no SMP). This is almost always the
case. The only common architecture without that property is SPARC -- SPARCV8
SMP systems were common, yet it doesn't support any sort of compare-and-swap
operation.

In either of these cases, the Target in LLVM can claim support for atomics of an
appropriate size, and then implement some subset of the operations via libcalls
to a ``__sync_*`` function. Such functions *must* not use locks in their
implementation, because unlike the ``__atomic_*`` routines used by
AtomicExpandPass, these may be mixed-and-matched with native instructions by the
target lowering.

Further, these routines do not need to be shared, as they are stateless. So,
there is no issue with having multiple copies included in one binary. Thus,
typically these routines are implemented by the statically-linked compiler
runtime support library.

LLVM will emit a call to an appropriate ``__sync_*`` routine if the target
ISelLowering code has set the corresponding ``ATOMIC_CMPXCHG``, ``ATOMIC_SWAP``,
or ``ATOMIC_LOAD_*`` operation to "Expand", and if it has opted into the
availability of those library functions via a call to ``initSyncLibcalls()``.

The full set of functions that may be called by LLVM is (for ``N`` being 1, 2,
4, 8, or 16)::

  iN __sync_val_compare_and_swap_N(iN *ptr, iN expected, iN desired)
  iN __sync_lock_test_and_set_N(iN *ptr, iN val)
  iN __sync_fetch_and_add_N(iN *ptr, iN val)
  iN __sync_fetch_and_sub_N(iN *ptr, iN val)
  iN __sync_fetch_and_and_N(iN *ptr, iN val)
  iN __sync_fetch_and_or_N(iN *ptr, iN val)
  iN __sync_fetch_and_xor_N(iN *ptr, iN val)
  iN __sync_fetch_and_nand_N(iN *ptr, iN val)
  iN __sync_fetch_and_max_N(iN *ptr, iN val)
  iN __sync_fetch_and_umax_N(iN *ptr, iN val)
  iN __sync_fetch_and_min_N(iN *ptr, iN val)
  iN __sync_fetch_and_umin_N(iN *ptr, iN val)

This list doesn't include any function for atomic load or store; all known
architectures support atomic loads and stores directly (possibly by emitting a
fence on either side of a normal load or store).

There's also, somewhat separately, the possibility to lower ``ATOMIC_FENCE`` to
``__sync_synchronize()``. This may happen or not happen independently of all
the above, controlled purely by ``setOperationAction(ISD::ATOMIC_FENCE, ...)``.