annotate docs/AMDGPUUsage.rst @ 120:1172e4bd9c6f

update 4.0.0
author mir3636
date Fri, 25 Nov 2016 19:14:25 +0900
parents afa8332a0e37
children 803732b1fca8
95
afa8332a0e37 LLVM 3.8
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
==============================
User Guide for AMDGPU Back-end
==============================

Introduction
============

The AMDGPU back-end provides ISA code generation for AMD GPUs, starting with
the R600 family up until the current Volcanic Islands (GCN Gen 3).

Refer to `AMDGPU section in Architecture & Platform Information for Compiler Writers <CompilerWriterInfo.html#amdgpu>`_
for additional documentation.

Conventions
===========

Address Spaces
--------------

The AMDGPU back-end uses the following address space mapping:

  ============= ============================================
  Address Space Memory Space
  ============= ============================================
  0             Private
  1             Global
  2             Constant
  3             Local
  4             Generic (Flat)
  5             Region
  ============= ============================================

The terminology in the table, aside from the region memory space, is from the
OpenCL standard.


Assembler
=========

The AMDGPU back-end has an LLVM-MC based assembler, which is currently in
development. It supports the Southern Islands, Sea Islands and Volcanic Islands
ISAs.

This document describes the general syntax for instructions and operands. For
more information about instructions, their semantics and supported combinations
of operands, refer to one of the Instruction Set Architecture manuals.

An instruction has the following syntax (register operands are
normally comma-separated while extra operands are space-separated):

    *<opcode> <register_operand0>, ... <extra_operand0> ...*

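As a rough illustration of this layout, the line below reuses one of the MUBUF
examples from this document and labels its parts. The column comment is an
annotation added here for reading only; it is not something the assembler
requires.

.. code-block:: nasm

  ; opcode             register operands         extra operands
  buffer_store_dwordx4 v[1:4], v2, ttmp[4:7], s1 offen offset:4 glc tfe
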
Operands
--------

The following syntax for register operands is supported:

* SGPR registers: s0, ... or s[0], ...
* VGPR registers: v0, ... or v[0], ...
* TTMP registers: ttmp0, ... or ttmp[0], ...
* Special registers: exec (exec_lo, exec_hi), vcc (vcc_lo, vcc_hi), flat_scratch (flat_scratch_lo, flat_scratch_hi)
* Special trap registers: tba (tba_lo, tba_hi), tma (tma_lo, tma_hi)
* Register pairs, quads, etc: s[2:3], v[10:11], ttmp[5:6], s[4:7], v[12:15], ttmp[4:7], s[8:15], ...
* Register lists: [s0, s1], [ttmp0, ttmp1, ttmp2, ttmp3]
* Register index expressions: v[2*2], s[1-1:2-1]
* 'off' indicates that an operand is not enabled

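The sketch below pairs several of these spellings. Going by the list above,
the bracketed, expression and list forms are expected to name the same
registers as the plain forms noted in the comments. The instructions are chosen
only for illustration and have not been checked against every supported target.

.. code-block:: nasm

  s_mov_b32 s1, s2                      ; plain register names
  s_mov_b32 s[1], s[2]                  ; bracketed form of s1 and s2
  v_mov_b32 v[2*2], v2                  ; index expression, expected to select v4
  s_mov_b64 s[1-1:2-1], s[2:3]          ; range with expressions, expected to mean s[0:1]
  s_mov_b64 [s0, s1], s[2:3]            ; register list covering the pair s0, s1
  buffer_load_dword v1, off, s[4:7], s1 ; 'off' in place of the address VGPR
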
The following extra operands are supported:

* offset, offset0, offset1
* idxen, offen bits
* glc, slc, tfe bits
* waitcnt: integer or combination of counter values
* VOP3 modifiers:

  - abs (\| \|), neg (\-)

* DPP modifiers:

  - row_shl, row_shr, row_ror, row_rol
  - row_mirror, row_half_mirror, row_bcast
  - wave_shl, wave_shr, wave_ror, wave_rol, quad_perm
  - row_mask, bank_mask, bound_ctrl

* SDWA modifiers:

  - dst_sel, src0_sel, src1_sel (BYTE_N, WORD_M, DWORD)
  - dst_unused (UNUSED_PAD, UNUSED_SEXT, UNUSED_PRESERVE)
  - abs, neg, sext

DS Instructions Examples
------------------------

.. code-block:: nasm

  ds_add_u32 v2, v4 offset:16
  ds_write_src2_b64 v2 offset0:4 offset1:8
  ds_cmpst_f32 v2, v4, v6
  ds_min_rtn_f64 v[8:9], v2, v[4:5]

For the full list of supported instructions, refer to "LDS/GDS instructions" in the ISA Manual.

FLAT Instruction Examples
-------------------------

.. code-block:: nasm

  flat_load_dword v1, v[3:4]
  flat_store_dwordx3 v[3:4], v[5:7]
  flat_atomic_swap v1, v[3:4], v5 glc
  flat_atomic_cmpswap v1, v[3:4], v[5:6] glc slc
  flat_atomic_fmax_x2 v[1:2], v[3:4], v[5:6] glc

For the full list of supported instructions, refer to "FLAT instructions" in the ISA Manual.

MUBUF Instruction Examples
--------------------------

.. code-block:: nasm

  buffer_load_dword v1, off, s[4:7], s1
  buffer_store_dwordx4 v[1:4], v2, ttmp[4:7], s1 offen offset:4 glc tfe
  buffer_store_format_xy v[1:2], off, s[4:7], s1
  buffer_wbinvl1
  buffer_atomic_inc v1, v2, s[8:11], s4 idxen offset:4 slc

For the full list of supported instructions, refer to "MUBUF Instructions" in the ISA Manual.

SMRD/SMEM Instruction Examples
------------------------------

.. code-block:: nasm

  s_load_dword s1, s[2:3], 0xfc
  s_load_dwordx8 s[8:15], s[2:3], s4
  s_load_dwordx16 s[88:103], s[2:3], s4
  s_dcache_inv_vol
  s_memtime s[4:5]

For the full list of supported instructions, refer to "Scalar Memory Operations" in the ISA Manual.

SOP1 Instruction Examples
-------------------------

.. code-block:: nasm

  s_mov_b32 s1, s2
  s_mov_b64 s[0:1], 0x80000000
  s_cmov_b32 s1, 200
  s_wqm_b64 s[2:3], s[4:5]
  s_bcnt0_i32_b64 s1, s[2:3]
  s_swappc_b64 s[2:3], s[4:5]
  s_cbranch_join s[4:5]

For the full list of supported instructions, refer to "SOP1 Instructions" in the ISA Manual.

SOP2 Instruction Examples
-------------------------

.. code-block:: nasm

  s_add_u32 s1, s2, s3
  s_and_b64 s[2:3], s[4:5], s[6:7]
  s_cselect_b32 s1, s2, s3
  s_andn2_b32 s2, s4, s6
  s_lshr_b64 s[2:3], s[4:5], s6
  s_ashr_i32 s2, s4, s6
  s_bfm_b64 s[2:3], s4, s6
  s_bfe_i64 s[2:3], s[4:5], s6
  s_cbranch_g_fork s[4:5], s[6:7]

For the full list of supported instructions, refer to "SOP2 Instructions" in the ISA Manual.

SOPC Instruction Examples
-------------------------

.. code-block:: nasm

  s_cmp_eq_i32 s1, s2
  s_bitcmp1_b32 s1, s2
  s_bitcmp0_b64 s[2:3], s4
  s_setvskip s3, s5

For the full list of supported instructions, refer to "SOPC Instructions" in the ISA Manual.

SOPP Instruction Examples
-------------------------

.. code-block:: nasm

  s_barrier
  s_nop 2
  s_endpgm
  s_waitcnt 0 ; Wait for all counters to be 0
  s_waitcnt vmcnt(0) & expcnt(0) & lgkmcnt(0) ; Equivalent to above
  s_waitcnt vmcnt(1) ; Wait for vmcnt counter to be 1.
  s_sethalt 9
  s_sleep 10
  s_sendmsg 0x1
  s_sendmsg sendmsg(MSG_INTERRUPT)
  s_trap 1

For the full list of supported instructions, refer to "SOPP Instructions" in the ISA Manual.

Unless otherwise mentioned, little verification is performed on the operands
of SOPP instructions, so it is up to the programmer to be familiar with the
range of acceptable values.

Vector ALU Instruction Examples
-------------------------------

For vector ALU instruction opcodes (VOP1, VOP2, VOP3, VOPC, VOP_DPP, VOP_SDWA),
the assembler will automatically use the optimal encoding based on the operands.
To force a specific encoding, add a suffix to the opcode of the instruction:

* _e32 for 32-bit VOP1/VOP2/VOPC
* _e64 for 64-bit VOP3
* _dpp for VOP_DPP
* _sdwa for VOP_SDWA

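As a small sketch of the suffixes, the same opcode can be written with and
without a forced encoding. The choice of v_add_f32 is only an example, and the
comment on the unsuffixed form restates the rule above rather than a checked
result.

.. code-block:: nasm

  v_add_f32 v1, v2, v3     ; no suffix: the assembler chooses the encoding
  v_add_f32_e32 v1, v2, v3 ; force the 32-bit VOP2 encoding
  v_add_f32_e64 v1, v2, v3 ; force the 64-bit VOP3 encoding
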
VOP1/VOP2/VOP3/VOPC examples:

.. code-block:: nasm

  v_mov_b32 v1, v2
  v_mov_b32_e32 v1, v2
  v_nop
  v_cvt_f64_i32_e32 v[1:2], v2
  v_floor_f32_e32 v1, v2
  v_bfrev_b32_e32 v1, v2
  v_add_f32_e32 v1, v2, v3
  v_mul_i32_i24_e64 v1, v2, 3
  v_mul_i32_i24_e32 v1, -3, v3
  v_mul_i32_i24_e32 v1, -100, v3
  v_addc_u32 v1, s[0:1], v2, v3, s[2:3]
  v_max_f16_e32 v1, v2, v3

VOP_DPP examples:

.. code-block:: nasm

  v_mov_b32 v0, v0 quad_perm:[0,2,1,1]
  v_sin_f32 v0, v0 row_shl:1 row_mask:0xa bank_mask:0x1 bound_ctrl:0
  v_mov_b32 v0, v0 wave_shl:1
  v_mov_b32 v0, v0 row_mirror
  v_mov_b32 v0, v0 row_bcast:31
  v_mov_b32 v0, v0 quad_perm:[1,3,0,1] row_mask:0xa bank_mask:0x1 bound_ctrl:0
  v_add_f32 v0, v0, |v0| row_shl:1 row_mask:0xa bank_mask:0x1 bound_ctrl:0
  v_max_f16 v1, v2, v3 row_shl:1 row_mask:0xa bank_mask:0x1 bound_ctrl:0

VOP_SDWA examples:

.. code-block:: nasm

  v_mov_b32 v1, v2 dst_sel:BYTE_0 dst_unused:UNUSED_PRESERVE src0_sel:DWORD
  v_min_u32 v200, v200, v1 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD
  v_sin_f32 v0, v0 dst_unused:UNUSED_PAD src0_sel:WORD_1
  v_fract_f32 v0, |v0| dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_1
  v_cmpx_le_u32 vcc, v1, v2 src0_sel:BYTE_2 src1_sel:WORD_0

For the full list of supported instructions, refer to "Vector ALU instructions" in the ISA Manual.

HSA Code Object Directives
--------------------------

The AMDGPU ABI defines auxiliary data in the output code object. In assembly
source, this data can be specified with assembler directives.

.hsa_code_object_version major, minor
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*major* and *minor* are integers that specify the version of the HSA code
object that will be generated by the assembler.

.hsa_code_object_isa [major, minor, stepping, vendor, arch]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*major*, *minor*, and *stepping* are all integers that describe the instruction
set architecture (ISA) version of the assembly program.

*vendor* and *arch* are quoted strings. *vendor* should always be equal to
"AMD" and *arch* should always be equal to "AMDGPU".

By default, the assembler will derive the ISA version, *vendor*, and *arch*
from the value of the -mcpu option that is passed to the assembler.

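For illustration, an explicit form of the directive might look like the line
below. The version numbers 7,0,0 are placeholder values chosen for this sketch
and should be replaced with the ISA version of the actual target; only the
*vendor* and *arch* strings are fixed by the text above.

.. code-block:: none

  .hsa_code_object_isa 7,0,0,"AMD","AMDGPU"
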
.amdgpu_hsa_kernel (name)
^^^^^^^^^^^^^^^^^^^^^^^^^

This directive specifies that the symbol with the given name is a kernel entry
point (label) and that the object should contain a corresponding symbol of type
STT_AMDGPU_HSA_KERNEL.

afa8332a0e37 LLVM 3.8
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
294 .amd_kernel_code_t
afa8332a0e37 LLVM 3.8
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
295 ^^^^^^^^^^^^^^^^^^
afa8332a0e37 LLVM 3.8
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
296
afa8332a0e37 LLVM 3.8
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
297 This directive marks the beginning of a list of key / value pairs that are used
afa8332a0e37 LLVM 3.8
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
diff changeset
298 to specify the amd_kernel_code_t object that will be emitted by the assembler.
afa8332a0e37 LLVM 3.8
The list must be terminated by the *.end_amd_kernel_code_t* directive. Any
amd_kernel_code_t value that is left unspecified takes a default value. The
default value for all keys is 0, with the following exceptions:

- *kernel_code_version_major* defaults to 1.
- *machine_kind* defaults to 1.
- *machine_version_major*, *machine_version_minor*, and
  *machine_version_stepping* are derived from the value of the -mcpu option
  that is passed to the assembler.
- *kernel_code_entry_byte_offset* defaults to 256.
- *wavefront_size* defaults to 6.
- *kernarg_segment_alignment*, *group_segment_alignment*, and
  *private_segment_alignment* default to 4. Note that alignments are specified
  as a power of two, so a value of **n** means an alignment of 2^\ **n**.

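The default rules above can be modelled in a few lines of Python. This is only
an illustrative sketch, not code taken from LLVM: the names ``DEFAULTS``,
``make_lookup`` and ``alignment_from_exponent`` are hypothetical, and the
*machine_version_* keys are left out because their values depend on the -mcpu
option.

```python
# Hypothetical model of the documented default rules; this is not LLVM code.
# The machine_version_* keys are omitted because they are derived from -mcpu.
DEFAULTS = {
    "kernel_code_version_major": 1,
    "machine_kind": 1,
    "kernel_code_entry_byte_offset": 256,
    "wavefront_size": 6,
    "kernarg_segment_alignment": 4,
    "group_segment_alignment": 4,
    "private_segment_alignment": 4,
}


def make_lookup(specified):
    """Return a function that resolves a key: explicit value, else default, else 0."""
    def lookup(key):
        if key in specified:
            return specified[key]
        return DEFAULTS.get(key, 0)
    return lookup


def alignment_from_exponent(n):
    """Alignments are stored as a power of two, so n means 2**n."""
    return 1 << n


# Only two keys are given explicitly; everything else falls back to a default.
lookup = make_lookup({"is_ptr64": 1, "kernarg_segment_byte_size": 8})
kernarg_alignment = alignment_from_exponent(lookup("kernarg_segment_alignment"))
```

Under these assumptions, a key that is neither specified nor listed as an
exception, such as *workitem_vgpr_count*, resolves to 0, and the default
*kernarg_segment_alignment* of 4 corresponds to an alignment of 16.
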
The *.amd_kernel_code_t* directive must be placed immediately after the
function label and before any instructions.

For a full list of amd_kernel_code_t keys, refer to the AMDGPU ABI document,
the comments in lib/Target/AMDGPU/AmdKernelCodeT.h, and
test/CodeGen/AMDGPU/hsa.s.

Here is an example of a minimal amd_kernel_code_t specification:

.. code-block:: none

   .hsa_code_object_version 1,0
   .hsa_code_object_isa

   .hsatext
   .globl  hello_world
   .p2align 8
   .amdgpu_hsa_kernel hello_world

   hello_world:

      .amd_kernel_code_t
         enable_sgpr_kernarg_segment_ptr = 1
         is_ptr64 = 1
         compute_pgm_rsrc1_vgprs = 0
         compute_pgm_rsrc1_sgprs = 0
         compute_pgm_rsrc2_user_sgpr = 2
         kernarg_segment_byte_size = 8
         wavefront_sgpr_count = 2
         workitem_vgpr_count = 3
      .end_amd_kernel_code_t

      // Load the 64-bit pointer stored at offset 0 of the kernarg segment.
      s_load_dwordx2 s[0:1], s[0:1] 0x0
      v_mov_b32 v0, 3.14159
      // Wait for the scalar load to complete before using s[0:1].
      s_waitcnt lgkmcnt(0)
      v_mov_b32 v1, s0
      v_mov_b32 v2, s1
      // Store the constant to the loaded address.
      flat_store_dword v[1:2], v0
      s_endpgm
   .Lfunc_end0:
      .size   hello_world, .Lfunc_end0-hello_world
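
In this example, *wavefront_sgpr_count* and *workitem_vgpr_count* line up with
the registers that the instructions name explicitly. The Python sketch below
illustrates that relationship by scanning the instruction text for the highest
SGPR and VGPR index. It is a hypothetical helper, not how the assembler
computes anything, and it ignores registers that an instruction may use
implicitly, so treat it as a sanity check for small hand-written kernels only.

```python
import re

# Instruction text copied from the example above.
ASM = """\
s_load_dwordx2 s[0:1], s[0:1] 0x0
v_mov_b32 v0, 3.14159
s_waitcnt lgkmcnt(0)
v_mov_b32 v1, s0
v_mov_b32 v2, s1
flat_store_dword v[1:2], v0
s_endpgm
"""

# Matches a single register (v0, s1) or a register range (v[1:2], s[0:1]).
REGISTER = re.compile(r"\b([sv])(?:(\d+)|\[(\d+):(\d+)\])")


def register_counts(asm):
    """Return the highest explicitly named register index plus one, per kind."""
    highest = {"s": -1, "v": -1}
    for kind, single, _low, high in REGISTER.findall(asm):
        index = int(single) if single else int(high)
        highest[kind] = max(highest[kind], index)
    return {kind: top + 1 for kind, top in highest.items()}


counts = register_counts(ASM)
```

For the instructions above, ``counts["s"]`` is 2 and ``counts["v"]`` is 3,
which agrees with the *wavefront_sgpr_count* and *workitem_vgpr_count* values
in the example.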