annotate polly/docs/UsingPollyWithClang.rst @ 150:1d019706d866

LLVM10
author anatofuz
date Thu, 13 Feb 2020 15:10:13 +0900
parents
children c4bab56944e8

======================
Using Polly with Clang
======================

This documentation discusses how Polly can be used in Clang to automatically
optimize C/C++ code during compilation.


.. warning::

  clang, LLVM, and Polly need to be in sync (compiled from the same source
  revision).

Make Polly available from Clang
===============================

Polly is available through clang, opt, and bugpoint, if Polly was checked out
into tools/polly before compilation. No further configuration is needed.

Optimizing with Polly
=====================

Optimizing with Polly is as easy as adding -O3 -mllvm -polly to your compiler
flags (Polly is not available unless optimizations are enabled, such as
-O1, -O2, or -O3; optimizing for size with -Os or -Oz is not recommended).

.. code-block:: console

  clang -O3 -mllvm -polly file.c
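
As an illustration of the kind of input Polly targets, the following is a
minimal sketch of a loop nest whose bounds and array subscripts are affine
functions of the loop counters. The function name ``matmul`` and the size ``N``
are assumptions made for this example and are not part of Polly; whether Polly
transforms such a kernel depends on its analyses and profitability heuristics.

.. code-block:: c

  #define N 64 /* assumed problem size, chosen for illustration */

  /* Hypothetical kernel: C = A * B for N x N matrices. */
  void matmul(double A[N][N], double B[N][N], double C[N][N]) {
    for (int i = 0; i < N; i++)
      for (int j = 0; j < N; j++) {
        C[i][j] = 0.0;
        for (int k = 0; k < N; k++)
          C[i][j] += A[i][k] * B[k][j];
      }
  }

Saved as file.c, this sketch could be compiled with the command above by adding
-c, since it does not define a main function.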

Automatic OpenMP code generation
================================

To automatically detect parallel loops and generate OpenMP code for them, you
also need to add -mllvm -polly-parallel -lgomp to your CFLAGS.

.. code-block:: console

  clang -O3 -mllvm -polly -mllvm -polly-parallel -lgomp file.c
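
A loop is a candidate for this kind of parallelization when its iterations do
not depend on each other. The following is a minimal sketch of such a loop; the
function name ``scale_rows`` and the sizes ``ROWS`` and ``COLS`` are assumptions
made for illustration. Provided the two arrays do not overlap, each iteration of
the outer loop touches a different row, so no outer iteration reads data
written by another.

.. code-block:: c

  #define ROWS 128 /* assumed sizes, chosen for illustration */
  #define COLS 256

  /* Hypothetical kernel: multiply every element of row i by s[i]. */
  void scale_rows(float a[ROWS][COLS], const float s[ROWS]) {
    for (int i = 0; i < ROWS; i++)   /* rows are independent of each other */
      for (int j = 0; j < COLS; j++)
        a[i][j] *= s[i];
  }

Whether the outer loop is actually emitted as OpenMP code depends on Polly's
dependence analysis and heuristics for the surrounding program.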

Switching the OpenMP backend
----------------------------

The following command-line switch selects Polly's OpenMP backend:

       -polly-omp-backend[=BACKEND]
              choose the OpenMP backend; BACKEND can be 'GNU' (the default) or 'LLVM';

The OpenMP backends can be further influenced using the following command-line
switches:


       -polly-num-threads[=NUM]
              set the number of threads to use; NUM may be any positive integer (default: 0, which leaves the choice to the OpenMP runtime);

       -polly-scheduling[=SCHED]
              set the OpenMP scheduling type; SCHED can be 'static', 'dynamic', 'guided' or 'runtime' (the default);

       -polly-scheduling-chunksize[=CHUNK]
              set the chunk size (for the selected scheduling type); CHUNK may be any strictly positive integer (otherwise it will default to 1);

Note that, at the time of writing, the GNU backend supports only the
``polly-num-threads`` and ``polly-scheduling`` switches, and the latter has
to be set to "runtime".

Example: use the alternative (LLVM) backend with dynamic scheduling, four
threads, and a chunk size of one (additional switches).

.. code-block:: console

  -mllvm -polly-omp-backend=LLVM -mllvm -polly-num-threads=4
  -mllvm -polly-scheduling=dynamic -mllvm -polly-scheduling-chunksize=1

Automatic Vector code generation
================================

Automatic vector code generation can be enabled by adding -mllvm
-polly-vectorizer=stripmine to your CFLAGS.

.. code-block:: console

  clang -O3 -mllvm -polly -mllvm -polly-vectorizer=stripmine file.c
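
Strip-mining, in general, splits a loop into an outer loop over fixed-size
chunks and an inner loop with a constant trip count, which a vectorizer can
then turn into vector instructions. The following hand-written sketch shows the
general technique only; it is not the code Polly generates, and the names
``add_plain`` and ``add_stripmined`` as well as the chunk width ``VF`` are
assumptions made for illustration.

.. code-block:: c

  #define VF 4 /* assumed chunk width, chosen for illustration */

  /* Original loop: one element per iteration. */
  void add_plain(int n, const float *x, const float *y, float *out) {
    for (int i = 0; i < n; i++)
      out[i] = x[i] + y[i];
  }

  /* The same computation, strip-mined by hand into chunks of VF elements. */
  void add_stripmined(int n, const float *x, const float *y, float *out) {
    int i = 0;
    for (; i + VF <= n; i += VF)     /* outer loop over full chunks */
      for (int v = 0; v < VF; v++)   /* inner loop with constant trip count */
        out[i + v] = x[i + v] + y[i + v];
    for (; i < n; i++)               /* remainder that does not fill a chunk */
      out[i] = x[i] + y[i];
  }

Both functions compute the same result; only the loop structure differs.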

Isolate the Polly passes
========================

Polly's analysis and transformation passes are run with many other
passes of the pass manager's pipeline. Some of the passes that run before
Polly are essential for its working, for instance the canonicalization
of loops. Therefore Polly is unable to optimize code straight out of
clang's -O0 output.

To get the LLVM-IR that Polly sees in the optimization pipeline, use the
command:

.. code-block:: console

  clang file.c -c -O3 -mllvm -polly -mllvm -polly-dump-before-file=before-polly.ll

This writes a file 'before-polly.ll' containing the LLVM-IR as passed to
Polly, after SSA transformation, loop canonicalization, inlining and
other passes.

Thereafter, any Polly pass can be run over 'before-polly.ll' using the
'opt' tool. To find out which Polly passes are active in the standard
pipeline, see the output of

.. code-block:: console

  clang file.c -c -O3 -mllvm -polly -mllvm -debug-pass=Arguments

Polly's passes are those between '-polly-detect' and
'-polly-codegen'. Analysis passes can be omitted. At the time of this
writing, the default Polly pass pipeline is:

.. code-block:: console

  opt before-polly.ll -polly-simplify -polly-optree -polly-delicm -polly-simplify -polly-prune-unprofitable -polly-opt-isl -polly-codegen

Note that this uses LLVM's old/legacy pass manager.

For completeness, here are some other methods that generate IR
suitable for processing with Polly from C/C++/Objective C source code.
The previous method is the recommended one.

The following generates unoptimized LLVM-IR ('-O0', which is the
default) and runs the canonicalizing passes on it
('-polly-canonicalize'). This does *not* include all the passes that run
before Polly in the default pass pipeline. The '-disable-O0-optnone'
option is required because otherwise clang adds an 'optnone' attribute
to all functions such that they are skipped by most optimization passes.
This is meant to stop LTO builds from optimizing these functions in the
linking phase anyway.

.. code-block:: console

  clang file.c -c -O0 -Xclang -disable-O0-optnone -emit-llvm -S -o - | opt -polly-canonicalize -S

The option '-disable-llvm-passes' disables all LLVM passes, even those
that run at -O0. Passing -O1 (or any optimization level other than -O0)
prevents the 'optnone' attribute from being added.

.. code-block:: console

  clang file.c -c -O1 -Xclang -disable-llvm-passes -emit-llvm -S -o - | opt -polly-canonicalize -S

As another alternative, Polly can be pushed in front of the pass
pipeline, and then its output dumped. This implicitly runs the
'-polly-canonicalize' passes.

.. code-block:: console

  clang file.c -c -O3 -mllvm -polly -mllvm -polly-position=early -mllvm -polly-dump-before-file=before-polly.ll

Further options
===============
Polly supports further options that are mainly useful for the development or the
analysis of Polly. The relevant options can be added to clang by appending
-mllvm -option-name to the CFLAGS or the clang command line.

Limit Polly to a single function
--------------------------------

To limit the execution of Polly to a single function, use the option
-polly-only-func=functionname.

Disable LLVM-IR generation
--------------------------

Polly normally regenerates LLVM-IR from the polyhedral representation. To see
only the effects of the preparing transformations and disable Polly's code
generation, add the option polly-no-codegen.

Graphical view of the SCoPs
---------------------------
Polly can use graphviz to show the SCoPs it detects in a program. The relevant
options are -polly-show, -polly-show-only, -polly-dot and -polly-dot-only. The
'show' options automatically run dotty or another graphviz viewer to show the
SCoPs graphically. The 'dot' options store for each function a dot file that
highlights the detected SCoPs. If 'only' is appended at the end of the option,
the basic blocks are shown without the statements they contain.

Change/Disable the Optimizer
----------------------------

By default, Polly uses the isl scheduling optimizer. The isl optimizer optimizes
for data-locality and parallelism using the Pluto algorithm.
To disable the optimizer entirely, use the option -polly-optimizer=none.

Disable tiling in the optimizer
-------------------------------

By default, the optimizer performs tiling if possible. If this is not wanted,
the option -polly-tiling=false can be used to disable it.
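
Tiling, in general, partitions the iteration space of a loop nest into blocks
so that data touched within one block can be reused while it is still in cache.
The following hand-written sketch illustrates the general technique on a matrix
transpose; it is not Polly's output, and the names ``transpose_plain`` and
``transpose_tiled`` as well as the sizes ``N`` and ``TILE`` are assumptions made
for illustration.

.. code-block:: c

  #define N 64    /* assumed matrix size, chosen for illustration */
  #define TILE 16 /* assumed tile size; N must be a multiple of TILE here */

  /* Original loop nest: walks the whole matrix row by row. */
  void transpose_plain(double in[N][N], double out[N][N]) {
    for (int i = 0; i < N; i++)
      for (int j = 0; j < N; j++)
        out[j][i] = in[i][j];
  }

  /* The same computation, tiled by hand into TILE x TILE blocks. */
  void transpose_tiled(double in[N][N], double out[N][N]) {
    for (int ii = 0; ii < N; ii += TILE)       /* loops over tiles */
      for (int jj = 0; jj < N; jj += TILE)
        for (int i = ii; i < ii + TILE; i++)   /* loops within one tile */
          for (int j = jj; j < jj + TILE; j++)
            out[j][i] = in[i][j];
  }

Both functions produce the same output; the tiled version only changes the
order in which the elements are visited.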

Import / Export
---------------

The flags -polly-import and -polly-export allow the export and reimport of the
polyhedral representation. By exporting, modifying and reimporting the
polyhedral representation, externally calculated transformations can be
applied. This enables external optimizers or the manual optimization of
specific SCoPs.

Viewing Polly Diagnostics with opt-viewer
-----------------------------------------

The flag -fsave-optimization-record generates .opt.yaml files when compiling
your program. These YAML files contain information about each emitted remark.
Ensure that you have Python 2.7 with the PyYAML and Pygments packages installed.
To run opt-viewer:

.. code-block:: console

  llvm/tools/opt-viewer/opt-viewer.py -source-dir /path/to/program/src/ \
     /path/to/program/src/foo.opt.yaml \
     /path/to/program/src/bar.opt.yaml \
     -o ./output

To view all diagnostics from your program in opt-viewer, include all YAML files
(use \*.opt.yaml when specifying which YAML files to view). Compile with `PGO
<https://clang.llvm.org/docs/UsersManual.html#profiling-with-instrumentation>`_ to view
hotness information in opt-viewer. The resulting HTML files can be viewed in a web browser.