comparison docs/Passes.rst @ 3:9ad51c7bc036
1st commit. remove git dir and add all files.
author | Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp> |
---|---|
date | Wed, 15 May 2013 06:43:32 +0900 |
-1:000000000000 | 3:9ad51c7bc036 |
---|---|
1 .. | |
2 If Passes.html is up to date, the following "one-liner" should print | |
3 an empty diff. | |
4 | |
5 egrep -e '^<tr><td><a href="#.*">-.*</a></td><td>.*</td></tr>$' \ | |
6 -e '^ <a name=".*">.*</a>$' < Passes.html >html; \ | |
7 perl >help <<'EOT' && diff -u help html; rm -f help html | |
8 open HTML, "<Passes.html" or die "open: Passes.html: $!\n"; | |
9 while (<HTML>) { | |
10 m:^<tr><td><a href="#(.*)">-.*</a></td><td>.*</td></tr>$: or next; | |
11 $order{$1} = sprintf("%03d", 1 + scalar(keys %order)); | |
12 } | |
13 open HELP, "../Release/bin/opt -help|" or die "open: opt -help: $!\n"; | |
14 while (<HELP>) { | |
15 m:^ -([^ ]+) +- (.*)$: or next; | |
16 my $o = $order{$1}; | |
17 $o = "000" unless defined $o; | |
18 push @x, "$o<tr><td><a href=\"#$1\">-$1</a></td><td>$2</td></tr>\n"; | |
19 push @y, "$o <a name=\"$1\">-$1: $2</a>\n"; | |
20 } | |
21 @x = map { s/^\d\d\d//; $_ } sort @x; | |
22 @y = map { s/^\d\d\d//; $_ } sort @y; | |
23 print @x, @y; | |
24 EOT | |
25 | |
26 This (real) one-liner can also be helpful when converting comments to HTML: | |
27 | |
28 perl -e '$/ = undef; for (split(/\n/, <>)) { s:^ *///? ?::; print " <p>\n" if !$on && $_ =~ /\S/; print " </p>\n" if $on && $_ =~ /^\s*$/; print " $_\n"; $on = ($_ =~ /\S/); } print " </p>\n" if $on' | |
29 | |
30 ==================================== | |
31 LLVM's Analysis and Transform Passes | |
32 ==================================== | |
33 | |
34 .. contents:: | |
35 :local: | |
36 | |
37 Introduction | |
38 ============ | |
39 | |
40 This document serves as a high-level summary of the optimization features that | |
41 LLVM provides. Optimizations are implemented as Passes that traverse some | |
42 portion of a program to either collect information or transform the program. | |
43 The table below divides the passes that LLVM provides into three categories. | |
44 Analysis passes compute information that other passes can use, or that helps | |
45 with debugging or program visualization. Transform passes can use (or invalidate) | |
46 the analysis passes. Transform passes all mutate the program in some way. | |
47 Utility passes provide some utility but don't otherwise fit categorization. | |
48 For example, passes to extract functions to bitcode or write a module to bitcode | |
49 are neither analysis nor transform passes. The table of contents above | |
50 provides a quick summary of each pass and links to the more complete pass | |
51 description later in the document. | |
52 | |
53 Analysis Passes | |
54 =============== | |
55 | |
56 This section describes the LLVM Analysis Passes. | |
57 | |
58 ``-aa-eval``: Exhaustive Alias Analysis Precision Evaluator | |
59 ----------------------------------------------------------- | |
60 | |
61 This is a simple N^2 alias analysis accuracy evaluator. Basically, for each | |
62 function in the program, it simply queries to see how the alias analysis | |
63 implementation answers alias queries between each pair of pointers in the | |
64 function. | |
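
The pairwise querying described above can be sketched as follows. This is a minimal illustration, not the pass itself: the ``AliasVerdict`` enum, the ``AliasCounts`` struct, the integer "pointers" and the ``query`` callback are all hypothetical stand-ins for whatever alias analysis implementation is being evaluated.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical alias verdicts; the real pass uses LLVM's own result type.
enum class AliasVerdict { NoAlias, MayAlias, MustAlias };

// Tally of the answers received from the analysis under evaluation.
struct AliasCounts {
  std::size_t no = 0, may = 0, must = 0;
};

// Query every unordered pair of pointers once: n * (n - 1) / 2 queries.
// Pointers are modelled as plain integers (an assumption of this sketch).
inline AliasCounts evaluatePairs(
    const std::vector<int> &pointers,
    const std::function<AliasVerdict(int, int)> &query) {
  AliasCounts counts;
  for (std::size_t i = 0; i < pointers.size(); ++i) {
    for (std::size_t j = i + 1; j < pointers.size(); ++j) {
      switch (query(pointers[i], pointers[j])) {
      case AliasVerdict::NoAlias: ++counts.no; break;
      case AliasVerdict::MayAlias: ++counts.may; break;
      case AliasVerdict::MustAlias: ++counts.must; break;
      }
    }
  }
  return counts;
}
```

The quadratic growth in the number of queries is why this kind of evaluator is best suited to measuring precision on small inputs.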
65 | |
66 This is inspired by and adapted from code by Naveen Neelakantam, Francesco | |
67 Spadini, and Wojciech Stryjewski. | |
68 | |
69 ``-basicaa``: Basic Alias Analysis (stateless AA impl) | |
70 ------------------------------------------------------ | |
71 | |
72 A basic alias analysis pass that implements identities (two different globals | |
73 cannot alias, etc), but does no stateful analysis. | |
74 | |
75 ``-basiccg``: Basic CallGraph Construction | |
76 ------------------------------------------ | |
77 | |
78 Yet to be written. | |
79 | |
80 ``-count-aa``: Count Alias Analysis Query Responses | |
81 --------------------------------------------------- | |
82 | |
83 A pass which can be used to count how many alias queries are being made and how | |
84 the alias analysis implementation being used responds. | |
85 | |
86 ``-da``: Dependence Analysis | |
87 ---------------------------- | |
88 | |
89 Dependence analysis framework, which is used to detect dependences in memory | |
90 accesses. | |
91 | |
92 ``-debug-aa``: AA use debugger | |
93 ------------------------------ | |
94 | |
95 This simple pass checks alias analysis users to ensure that if they create a | |
96 new value, they do not query AA without informing it of the value. It acts as | |
97 a shim over any other AA pass you want. | |
98 | |
99 Yes, keeping track of every value in the program is expensive, but this is a | |
100 debugging pass. | |
101 | |
102 ``-domfrontier``: Dominance Frontier Construction | |
103 ------------------------------------------------- | |
104 | |
105 This pass is a simple dominator construction algorithm for finding forward | |
106 dominator frontiers. | |
107 | |
108 ``-domtree``: Dominator Tree Construction | |
109 ----------------------------------------- | |
110 | |
111 This pass is a simple dominator construction algorithm for finding forward | |
112 dominators. | |
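
As a rough illustration of what "forward dominators" means, the sketch below computes dominator sets with the textbook fixed-point iteration. It is not necessarily the algorithm the pass uses. The predecessor-list CFG encoding and the name ``computeDominators`` are assumptions of this example, and it expects at least one block, with every block reachable from block 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes dominator sets by iterating to a fixed point.
// preds[b] lists the predecessors of block b; block 0 is the entry.
// In the result, dom[b][d] is true when block d dominates block b.
inline std::vector<std::vector<bool> >
computeDominators(const std::vector<std::vector<int> > &preds) {
  const std::size_t n = preds.size();
  std::vector<std::vector<bool> > dom(n, std::vector<bool>(n, true));
  dom[0].assign(n, false);
  dom[0][0] = true;  // the entry block is dominated only by itself

  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t b = 1; b < n; ++b) {
      // Intersect the dominator sets of all predecessors...
      std::vector<bool> next(n, true);
      for (int p : preds[b])
        for (std::size_t d = 0; d < n; ++d)
          next[d] = next[d] && dom[p][d];
      next[b] = true;  // ...and add the block itself.
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}
```

For a diamond-shaped CFG (entry, two branches, one join), the join block ends up dominated by the entry and by itself, but by neither branch.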
113 | |
114 | |
115 ``-dot-callgraph``: Print Call Graph to "dot" file | |
116 -------------------------------------------------- | |
117 | |
118 This pass, only available in ``opt``, prints the call graph into a ``.dot`` | |
119 graph. This graph can then be processed with the :program:`dot` tool to convert it to | |
120 postscript or some other suitable format. | |
121 | |
122 ``-dot-cfg``: Print CFG of function to "dot" file | |
123 ------------------------------------------------- | |
124 | |
125 This pass, only available in ``opt``, prints the control flow graph into a | |
126 ``.dot`` graph. This graph can then be processed with the :program:`dot` tool | |
127 to convert it to postscript or some other suitable format. | |
128 | |
129 ``-dot-cfg-only``: Print CFG of function to "dot" file (with no function bodies) | |
130 -------------------------------------------------------------------------------- | |
131 | |
132 This pass, only available in ``opt``, prints the control flow graph into a | |
133 ``.dot`` graph, omitting the function bodies. This graph can then be processed | |
134 with the :program:`dot` tool to convert it to postscript or some other suitable | |
135 format. | |
136 | |
137 ``-dot-dom``: Print dominance tree of function to "dot" file | |
138 ------------------------------------------------------------ | |
139 | |
140 This pass, only available in ``opt``, prints the dominator tree into a ``.dot`` | |
141 graph. This graph can then be processed with the :program:`dot` tool to | |
142 convert it to postscript or some other suitable format. | |
143 | |
144 ``-dot-dom-only``: Print dominance tree of function to "dot" file (with no function bodies) | |
145 ------------------------------------------------------------------------------------------- | |
146 | |
147 This pass, only available in ``opt``, prints the dominator tree into a ``.dot`` | |
148 graph, omitting the function bodies. This graph can then be processed with the | |
149 :program:`dot` tool to convert it to postscript or some other suitable format. | |
150 | |
151 ``-dot-postdom``: Print postdominance tree of function to "dot" file | |
152 -------------------------------------------------------------------- | |
153 | |
154 This pass, only available in ``opt``, prints the post dominator tree into a | |
155 ``.dot`` graph. This graph can then be processed with the :program:`dot` tool | |
156 to convert it to postscript or some other suitable format. | |
157 | |
158 ``-dot-postdom-only``: Print postdominance tree of function to "dot" file (with no function bodies) | |
159 --------------------------------------------------------------------------------------------------- | |
160 | |
161 This pass, only available in ``opt``, prints the post dominator tree into a | |
162 ``.dot`` graph, omitting the function bodies. This graph can then be processed | |
163 with the :program:`dot` tool to convert it to postscript or some other suitable | |
164 format. | |
165 | |
166 ``-globalsmodref-aa``: Simple mod/ref analysis for globals | |
167 ---------------------------------------------------------- | |
168 | |
169 This simple pass provides alias and mod/ref information for global values that | |
170 do not have their address taken, and keeps track of whether functions read or | |
171 write memory (are "pure"). For this simple (but very common) case, we can | |
172 provide pretty accurate and useful information. | |
173 | |
174 ``-instcount``: Counts the various types of ``Instruction``\ s | |
175 -------------------------------------------------------------- | |
176 | |
177 This pass collects the count of all instructions and reports them. | |
178 | |
179 ``-intervals``: Interval Partition Construction | |
180 ----------------------------------------------- | |
181 | |
182 This analysis calculates and represents the interval partition of a function, | |
183 or a preexisting interval partition. | |
184 | |
185 In this way, the interval partition may be used to reduce a flow graph down to | |
186 its degenerate single node interval partition (unless it is irreducible). | |
187 | |
188 ``-iv-users``: Induction Variable Users | |
189 --------------------------------------- | |
190 | |
191 Bookkeeping for "interesting" users of expressions computed from induction | |
192 variables. | |
193 | |
194 ``-lazy-value-info``: Lazy Value Information Analysis | |
195 ----------------------------------------------------- | |
196 | |
197 Interface for lazy computation of value constraint information. | |
198 | |
199 ``-libcall-aa``: LibCall Alias Analysis | |
200 --------------------------------------- | |
201 | |
202 LibCall Alias Analysis. | |
203 | |
204 ``-lint``: Statically lint-checks LLVM IR | |
205 ----------------------------------------- | |
206 | |
207 This pass statically checks for common and easily-identified constructs which | |
208 produce undefined or likely unintended behavior in LLVM IR. | |
209 | |
210 It is not a guarantee of correctness, in two ways. First, it isn't | |
211 comprehensive. There are checks which could be done statically which are not | |
212 yet implemented. Some of these are indicated by TODO comments, but those | |
213 aren't comprehensive either. Second, many conditions cannot be checked | |
214 statically. This pass does no dynamic instrumentation, so it can't check for | |
215 all possible problems. | |
216 | |
217 Another limitation is that it assumes all code will be executed. A store | |
218 through a null pointer in a basic block which is never reached is harmless, but | |
219 this pass will warn about it anyway. | |
220 | |
221 Optimization passes may make conditions that this pass checks for more or less | |
222 obvious. If an optimization pass appears to be introducing a warning, it may | |
223 be that the optimization pass is merely exposing an existing condition in the | |
224 code. | |
225 | |
226 This code may be run before :ref:`instcombine <passes-instcombine>`. In many | |
227 cases, instcombine checks for the same kinds of things and turns instructions | |
228 with undefined behavior into unreachable (or equivalent). Because of this, | |
229 this pass makes some effort to look through bitcasts and so on. | |
230 | |
231 ``-loops``: Natural Loop Information | |
232 ------------------------------------ | |
233 | |
234 This analysis is used to identify natural loops and determine the loop depth of | |
235 various nodes of the CFG. Note that the loops identified may actually be | |
236 several natural loops that share the same header node, not just a single | |
237 natural loop. | |
238 | |
239 ``-memdep``: Memory Dependence Analysis | |
240 --------------------------------------- | |
241 | |
242 An analysis that determines, for a given memory operation, what preceding | |
243 memory operations it depends on. It builds on alias analysis information, and | |
244 tries to provide a lazy, caching interface to a common kind of alias | |
245 information query. | |
246 | |
247 ``-module-debuginfo``: Decodes module-level debug info | |
248 ------------------------------------------------------ | |
249 | |
250 This pass decodes the debug info metadata in a module and prints it in a | |
251 (sufficiently-prepared-) human-readable form. | |
252 | |
253 For example, run this pass from ``opt`` along with the ``-analyze`` option, and | |
254 it'll print to standard output. | |
255 | |
256 ``-no-aa``: No Alias Analysis (always returns 'may' alias) | |
257 ---------------------------------------------------------- | |
258 | |
259 This is the default implementation of the Alias Analysis interface. It always | |
260 returns "I don't know" for alias queries. NoAA is unlike other alias analysis | |
261 implementations, in that it does not chain to a previous analysis. As such it | |
262 doesn't follow many of the rules that other alias analyses must. | |
263 | |
264 ``-no-profile``: No Profile Information | |
265 --------------------------------------- | |
266 | |
267 The default "no profile" implementation of the abstract ``ProfileInfo`` | |
268 interface. | |
269 | |
270 ``-postdomfrontier``: Post-Dominance Frontier Construction | |
271 ---------------------------------------------------------- | |
272 | |
273 This pass is a simple post-dominator construction algorithm for finding | |
274 post-dominator frontiers. | |
275 | |
276 ``-postdomtree``: Post-Dominator Tree Construction | |
277 -------------------------------------------------- | |
278 | |
279 This pass is a simple post-dominator construction algorithm for finding | |
280 post-dominators. | |
281 | |
282 ``-print-alias-sets``: Alias Set Printer | |
283 ---------------------------------------- | |
284 | |
285 Yet to be written. | |
286 | |
287 ``-print-callgraph``: Print a call graph | |
288 ---------------------------------------- | |
289 | |
290 This pass, only available in ``opt``, prints the call graph to standard error | |
291 in a human-readable form. | |
292 | |
293 ``-print-callgraph-sccs``: Print SCCs of the Call Graph | |
294 ------------------------------------------------------- | |
295 | |
296 This pass, only available in ``opt``, prints the SCCs of the call graph to | |
297 standard error in a human-readable form. | |
298 | |
299 ``-print-cfg-sccs``: Print SCCs of each function CFG | |
300 ---------------------------------------------------- | |
301 | |
302 This pass, only available in ``opt``, prints the SCCs of each function CFG to | |
303 standard error in a human-readable form. | |
304 | |
305 ``-print-dbginfo``: Print debug info in human readable form | |
306 ----------------------------------------------------------- | |
307 | |
308 Pass that prints instructions, and associated debug info: | |
309 | |
310 #. source/line/col information | |
311 #. original variable name | |
312 #. original type name | |
313 | |
314 ``-print-dom-info``: Dominator Info Printer | |
315 ------------------------------------------- | |
316 | |
317 Dominator Info Printer. | |
318 | |
319 ``-print-externalfnconstants``: Print external fn callsites passed constants | |
320 ---------------------------------------------------------------------------- | |
321 | |
322 This pass, only available in ``opt``, prints out call sites to external | |
323 functions that are called with constant arguments. This can be useful when | |
324 looking for standard library functions we should constant fold or handle in | |
325 alias analyses. | |
326 | |
327 ``-print-function``: Print function to stderr | |
328 --------------------------------------------- | |
329 | |
330 The ``PrintFunctionPass`` class is designed to be pipelined with other | |
331 ``FunctionPasses``, and prints out the functions of the module as they are | |
332 processed. | |
333 | |
334 ``-print-module``: Print module to stderr | |
335 ----------------------------------------- | |
336 | |
337 This pass simply prints out the entire module when it is executed. | |
338 | |
339 .. _passes-print-used-types: | |
340 | |
341 ``-print-used-types``: Find Used Types | |
342 -------------------------------------- | |
343 | |
344 This pass is used to seek out all of the types in use by the program. Note | |
345 that this analysis explicitly does not include types only used by the symbol | |
346 table. | |
347 | |
348 ``-profile-estimator``: Estimate profiling information | |
349 ------------------------------------------------------ | |
350 | |
351 An implementation of profiling information that produces its estimates in a very | |
352 crude and unimaginative way. | |
353 | |
354 ``-profile-loader``: Load profile information from ``llvmprof.out`` | |
355 ------------------------------------------------------------------- | |
356 | |
357 A concrete implementation of profiling information that loads the information | |
358 from a profile dump file. | |
359 | |
360 ``-profile-verifier``: Verify profiling information | |
361 --------------------------------------------------- | |
362 | |
363 Pass that checks profiling information for plausibility. | |
364 | |
365 ``-regions``: Detect single entry single exit regions | |
366 ----------------------------------------------------- | |
367 | |
368 The ``RegionInfo`` pass detects single entry single exit regions in a function, | |
369 where a region is defined as any subgraph that is connected to the remaining | |
370 graph at only two spots. Furthermore, a hierarchical region tree is built. | |
371 | |
372 ``-scalar-evolution``: Scalar Evolution Analysis | |
373 ------------------------------------------------ | |
374 | |
375 The ``ScalarEvolution`` analysis can be used to analyze and categorize scalar | |
376 expressions in loops. It specializes in recognizing general induction | |
377 variables, representing them with the abstract and opaque ``SCEV`` class. | |
378 Given this analysis, trip counts of loops and other important properties can be | |
379 obtained. | |
380 | |
381 This analysis is primarily useful for induction variable substitution and | |
382 strength reduction. | |
383 | |
384 ``-scev-aa``: ScalarEvolution-based Alias Analysis | |
385 -------------------------------------------------- | |
386 | |
387 Simple alias analysis implemented in terms of ``ScalarEvolution`` queries. | |
388 | |
389 This differs from traditional loop dependence analysis in that it tests for | |
390 dependencies within a single iteration of a loop, rather than dependencies | |
391 between different iterations. | |
392 | |
393 ``ScalarEvolution`` has a more complete understanding of pointer arithmetic | |
394 than ``BasicAliasAnalysis``' collection of ad-hoc analyses. | |
395 | |
396 ``-targetdata``: Target Data Layout | |
397 ----------------------------------- | |
398 | |
399 Provides other passes access to information on the size and alignment | |
400 required by the target ABI for various data types. | |
401 | |
402 Transform Passes | |
403 ================ | |
404 | |
405 This section describes the LLVM Transform Passes. | |
406 | |
407 ``-adce``: Aggressive Dead Code Elimination | |
408 ------------------------------------------- | |
409 | |
410 ADCE aggressively tries to eliminate code. This pass is similar to :ref:`DCE | |
411 <passes-dce>` but it assumes that values are dead until proven otherwise. This | |
412 is similar to :ref:`SCCP <passes-sccp>`, except applied to the liveness of | |
413 values. | |
414 | |
415 ``-always-inline``: Inliner for ``always_inline`` functions | |
416 ----------------------------------------------------------- | |
417 | |
418 A custom inliner that handles only functions that are marked as "always | |
419 inline". | |
420 | |
421 ``-argpromotion``: Promote 'by reference' arguments to scalars | |
422 -------------------------------------------------------------- | |
423 | |
424 This pass promotes "by reference" arguments to be "by value" arguments. In | |
425 practice, this means looking for internal functions that have pointer | |
426 arguments. If it can prove, through the use of alias analysis, that an | |
427 argument is *only* loaded, then it can pass the value into the function instead | |
428 of the address of the value. This can cause recursive simplification of code | |
429 and lead to the elimination of allocas (especially in C++ template code like | |
430 the STL). | |
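
A source-level analogy may help here. The pass itself works on LLVM IR, not on C++ source, so the functions below are purely illustrative and all of their names are hypothetical. They show the shape of the code before and after a "by reference" argument that is only loaded gets promoted to "by value".

```cpp
#include <cassert>

// Before promotion: the argument is a pointer that the callee only loads from.
inline int addOneByReference(const int *value) { return *value + 1; }

// After promotion: the caller performs the load and passes the value itself.
inline int addOneByValue(int value) { return value + 1; }

// A caller before the transformation needs an addressable stack slot.
inline int callerBefore() {
  int slot = 41;
  return addOneByReference(&slot);
}

// After the transformation the slot's address is never taken, so later
// passes are free to keep the value in a register.
inline int callerAfter() {
  int slot = 41;
  return addOneByValue(slot);
}
```

Both callers compute the same result; the difference is only that the second no longer forces ``slot`` to live in memory.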
431 | |
432 This pass also handles aggregate arguments that are passed into a function, | |
433 scalarizing them if the elements of the aggregate are only loaded. Note that | |
434 it refuses to scalarize aggregates which would require passing in more than | |
435 three operands to the function, because passing thousands of operands for a | |
436 large array or structure is unprofitable! | |
437 | |
438 Note that this transformation could also be done for arguments that are only | |
439 stored to (returning the value instead), but does not currently. This case | |
440 would be best handled when and if LLVM starts supporting multiple return values | |
441 from functions. | |
442 | |
443 ``-bb-vectorize``: Basic-Block Vectorization | |
444 -------------------------------------------- | |
445 | |
446 This pass combines instructions inside basic blocks to form vector | |
447 instructions. It iterates over each basic block, attempting to pair compatible | |
448 instructions, repeating this process until no additional pairs are selected for | |
449 vectorization. When the outputs of some pair of compatible instructions are | |
450 used as inputs by some other pair of compatible instructions, those pairs are | |
451 part of a potential vectorization chain. Instruction pairs are only fused into | |
452 vector instructions when they are part of a chain longer than some threshold | |
453 length. Moreover, the pass attempts to find the best possible chain for each | |
454 pair of compatible instructions. These heuristics are intended to prevent | |
455 vectorization in cases where it would not yield a performance increase of the | |
456 resulting code. | |
457 | |
458 ``-block-placement``: Profile Guided Basic Block Placement | |
459 ---------------------------------------------------------- | |
460 | |
461 This pass is a very simple profile guided basic block placement algorithm. The | |
462 idea is to put frequently executed blocks together at the start of the function | |
463 and hopefully increase the number of fall-through conditional branches. If | |
464 there is no profile information for a particular function, this pass basically | |
465 orders blocks in depth-first order. | |
466 | |
467 ``-break-crit-edges``: Break critical edges in CFG | |
468 -------------------------------------------------- | |
469 | |
470 Break all of the critical edges in the CFG by inserting a dummy basic block. | |
471 It may be "required" by passes that cannot deal with critical edges. This | |
472 transformation obviously invalidates the CFG, but can update forward dominator | |
473 (set, immediate dominators, tree, and frontier) information. | |
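
For readers unfamiliar with the term, an edge is critical when its source block has more than one successor and its destination block has more than one predecessor. The sketch below, under the assumption of a toy successor-list CFG (``ToyCfg`` and the function names are invented for this example), detects such edges and splits each one with a new empty block. It does not attempt the dominator updates mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// succs[b] lists the successor blocks of block b.
typedef std::vector<std::vector<std::size_t> > ToyCfg;

inline std::size_t countPredecessors(const ToyCfg &succs, std::size_t block) {
  std::size_t count = 0;
  for (const std::vector<std::size_t> &targets : succs)
    for (std::size_t target : targets)
      if (target == block) ++count;
  return count;
}

// An edge is critical when its source has several successors and its
// destination has several predecessors.
inline bool isCriticalEdge(const ToyCfg &succs, std::size_t from, std::size_t to) {
  return succs[from].size() > 1 && countPredecessors(succs, to) > 1;
}

// Splits every critical edge by routing it through a fresh, empty block.
// Returns the number of blocks inserted.
inline std::size_t splitCriticalEdges(ToyCfg &succs) {
  // Collect (source block, successor slot) pairs first, then rewrite.
  std::vector<std::pair<std::size_t, std::size_t> > critical;
  for (std::size_t from = 0; from < succs.size(); ++from)
    for (std::size_t slot = 0; slot < succs[from].size(); ++slot)
      if (isCriticalEdge(succs, from, succs[from][slot]))
        critical.push_back(std::make_pair(from, slot));
  for (const std::pair<std::size_t, std::size_t> &edge : critical) {
    const std::size_t to = succs[edge.first][edge.second];
    succs.push_back(std::vector<std::size_t>(1, to));   // dummy block -> old target
    succs[edge.first][edge.second] = succs.size() - 1;  // source -> dummy block
  }
  return critical.size();
}
```

Splitting one edge never changes whether another edge is critical, because the dummy block simply takes the place of the original source among the destination's predecessors.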
474 | |
475 ``-codegenprepare``: Optimize for code generation | |
476 ------------------------------------------------- | |
477 | |
478 This pass munges the code in the input function to better prepare it for | |
479 SelectionDAG-based code generation. This works around limitations in its | |
480 basic-block-at-a-time approach. It should eventually be removed. | |
481 | |
482 ``-constmerge``: Merge Duplicate Global Constants | |
483 ------------------------------------------------- | |
484 | |
485 Merges duplicate global constants together into a single constant that is | |
486 shared. This is useful because some passes (e.g., TraceValues) insert a lot of | |
487 string constants into the program, regardless of whether or not an existing | |
488 string is available. | |
489 | |
490 ``-constprop``: Simple constant propagation | |
491 ------------------------------------------- | |
492 | |
493 This pass implements constant propagation and merging. It looks for | |
494 instructions involving only constant operands and replaces them with a constant | |
495 value instead of an instruction. For example: | |
496 | |
497 .. code-block:: llvm | |
498 | |
499 add i32 1, 2 | |
500 | |
501 becomes | |
502 | |
503 .. code-block:: llvm | |
504 | |
505 i32 3 | |
506 | |
507 NOTE: this pass has a habit of making definitions be dead. It is a good idea | |
508 to run a :ref:`Dead Instruction Elimination <passes-die>` pass sometime | |
509 after running this pass. | |
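
The folding step can be sketched on a toy model of an instruction. This is an illustration only: ``ToyOperand`` and ``foldBinary`` are hypothetical names, only three opcodes are handled, and 32-bit wrap-around arithmetic is assumed.

```cpp
#include <cassert>
#include <cstdint>

// A toy operand: either a known constant or an unknown run-time value.
struct ToyOperand {
  bool isConstant;
  std::int32_t value;  // meaningful only when isConstant is true
};

// Folds "lhs <opcode> rhs" when both operands are constants.
// Returns a non-constant operand when nothing can be folded.
inline ToyOperand foldBinary(char opcode, ToyOperand lhs, ToyOperand rhs) {
  const ToyOperand unknown = {false, 0};
  if (!lhs.isConstant || !rhs.isConstant)
    return unknown;
  // Use unsigned arithmetic so overflow wraps instead of being undefined.
  const std::uint32_t a = static_cast<std::uint32_t>(lhs.value);
  const std::uint32_t b = static_cast<std::uint32_t>(rhs.value);
  std::uint32_t result = 0;
  switch (opcode) {
  case '+': result = a + b; break;
  case '-': result = a - b; break;
  case '*': result = a * b; break;
  default: return unknown;  // opcode not modelled in this sketch
  }
  const ToyOperand folded = {true, static_cast<std::int32_t>(result)};
  return folded;
}
```

Once an instruction has been replaced by its folded constant, the instruction itself has no remaining uses, which is why a dead instruction cleanup is suggested afterwards.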
510 | |
511 .. _passes-dce: | |
512 | |
513 ``-dce``: Dead Code Elimination | |
514 ------------------------------- | |
515 | |
516 Dead code elimination is similar to :ref:`dead instruction elimination | |
517 <passes-die>`, but it rechecks instructions that were used by removed | |
518 instructions to see if they are newly dead. | |
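
The difference between the two approaches can be sketched on a toy instruction list. Everything here is an assumption made for illustration (``ToyInst``, the index-based operands, the function names). The first routine makes a single sweep; the second rechecks the operands of every instruction it removes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A toy instruction: the indices of the instructions it uses, plus a flag for
// side effects (stores, calls, returns) that keep it alive on their own.
struct ToyInst {
  std::vector<std::size_t> operands;
  bool hasSideEffects;
  bool removed;
};

// Number of not-yet-removed instructions that use instruction `target`.
inline std::size_t countUses(const std::vector<ToyInst> &insts, std::size_t target) {
  std::size_t uses = 0;
  for (const ToyInst &inst : insts)
    if (!inst.removed)
      for (std::size_t op : inst.operands)
        if (op == target) ++uses;
  return uses;
}

inline bool isTriviallyDead(const std::vector<ToyInst> &insts, std::size_t i) {
  return !insts[i].removed && !insts[i].hasSideEffects && countUses(insts, i) == 0;
}

// Single sweep: returns how many instructions were removed.
inline std::size_t runSinglePass(std::vector<ToyInst> &insts) {
  std::size_t removedCount = 0;
  for (std::size_t i = 0; i < insts.size(); ++i)
    if (isTriviallyDead(insts, i)) { insts[i].removed = true; ++removedCount; }
  return removedCount;
}

// Worklist variant: after removing an instruction, recheck what it used.
inline std::size_t runWithWorklist(std::vector<ToyInst> &insts) {
  std::vector<std::size_t> worklist;
  for (std::size_t i = 0; i < insts.size(); ++i) worklist.push_back(i);
  std::size_t removedCount = 0;
  while (!worklist.empty()) {
    const std::size_t i = worklist.back();
    worklist.pop_back();
    if (!isTriviallyDead(insts, i)) continue;
    insts[i].removed = true;
    ++removedCount;
    for (std::size_t op : insts[i].operands) worklist.push_back(op);
  }
  return removedCount;
}
```

On a chain where each instruction feeds only the next one and the last result is unused, the single sweep removes just the final instruction, while the worklist variant removes the whole chain.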
519 | |
520 ``-deadargelim``: Dead Argument Elimination | |
521 ------------------------------------------- | |
522 | |
523 This pass deletes dead arguments from internal functions. Dead argument | |
524 elimination removes arguments which are directly dead, as well as arguments | |
525 only passed into function calls as dead arguments of other functions. This | |
526 pass also deletes dead return values in a similar way. | |
527 | |
528 This pass is often useful as a cleanup pass to run after aggressive | |
529 interprocedural passes, which add possibly-dead arguments. | |
530 | |
531 ``-deadtypeelim``: Dead Type Elimination | |
532 ---------------------------------------- | |
533 | |
534 This pass is used to clean up the output of GCC. It eliminates names for types | |
535 that are unused in the entire translation unit, using the :ref:`find used types | |
536 <passes-print-used-types>` pass. | |
537 | |
538 .. _passes-die: | |
539 | |
540 ``-die``: Dead Instruction Elimination | |
541 -------------------------------------- | |
542 | |
543 Dead instruction elimination performs a single pass over the function, removing | |
544 instructions that are obviously dead. | |
545 | |
546 ``-dse``: Dead Store Elimination | |
547 -------------------------------- | |
548 | |
549 A trivial dead store elimination that only considers basic-block local | |
550 redundant stores. | |
551 | |
552 ``-functionattrs``: Deduce function attributes | |
553 ---------------------------------------------- | |
554 | |
555 A simple interprocedural pass which walks the call-graph, looking for functions | |
556 which do not access or only read non-local memory, and marking them | |
557 ``readnone``/``readonly``. In addition, it marks function arguments (of | |
558 pointer type) "``nocapture``" if a call to the function does not create any | |
559 copies of the pointer value that outlive the call. This more or less means | |
560 that the pointer is only dereferenced, and not returned from the function or | |
561 stored in a global. This pass is implemented as a bottom-up traversal of the | |
562 call-graph. | |
563 | |
564 ``-globaldce``: Dead Global Elimination | |
565 --------------------------------------- | |
566 | |
567 This transform is designed to eliminate unreachable internal globals from the | |
568 program. It uses an aggressive algorithm, searching out globals that are known | |
569 to be alive. After it finds all of the globals which are needed, it deletes | |
570 whatever is left over. This allows it to delete recursive chunks of the | |
571 program which are unreachable. | |
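
The "find what is alive, delete the rest" strategy can be sketched as a reachability walk. This is a minimal model, assuming that externally visible globals are the roots; ``ToyGlobal`` and ``findLiveGlobals`` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A toy global: the indices of the globals it references, and whether it is
// externally visible (and therefore a root that must be kept).
struct ToyGlobal {
  std::vector<std::size_t> references;
  bool externallyVisible;
};

// Returns alive[i] == true for every global reachable from a root.
inline std::vector<bool> findLiveGlobals(const std::vector<ToyGlobal> &globals) {
  std::vector<bool> alive(globals.size(), false);
  std::vector<std::size_t> worklist;
  for (std::size_t i = 0; i < globals.size(); ++i) {
    if (globals[i].externallyVisible) {
      alive[i] = true;
      worklist.push_back(i);
    }
  }
  while (!worklist.empty()) {
    const std::size_t current = worklist.back();
    worklist.pop_back();
    for (std::size_t ref : globals[current].references) {
      if (!alive[ref]) {
        alive[ref] = true;
        worklist.push_back(ref);
      }
    }
  }
  return alive;  // anything still false is left over and can be deleted
}
```

Two internal functions that only call each other are never reached from a root, so both stay marked dead even though each one has a user. A use-counting approach would keep them.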
572 | |
573 ``-globalopt``: Global Variable Optimizer | |
574 ----------------------------------------- | |
575 | |
576 This pass transforms simple global variables that never have their address | |
577 taken. If obviously true, it marks read/write globals as constant, deletes | |
578 variables only stored to, etc. | |
579 | |
580 ``-gvn``: Global Value Numbering | |
581 -------------------------------- | |
582 | |
583 This pass performs global value numbering to eliminate fully and partially | |
584 redundant instructions. It also performs redundant load elimination. | |
585 | |
586 .. _passes-indvars: | |
587 | |
588 ``-indvars``: Canonicalize Induction Variables | |
589 ---------------------------------------------- | |
590 | |
591 This transformation analyzes and transforms the induction variables (and | |
592 computations derived from them) into simpler forms suitable for subsequent | |
593 analysis and transformation. | |
594 | |
595 This transformation makes the following changes to each loop with an | |
596 identifiable induction variable: | |
597 | |
598 * All loops are transformed to have a *single* canonical induction variable | |
599 which starts at zero and steps by one. | |
600 * The canonical induction variable is guaranteed to be the first PHI node in | |
601 the loop header block. | |
602 * Any pointer arithmetic recurrences are raised to use array subscripts. | |
603 | |
604 If the trip count of a loop is computable, this pass also makes the following | |
605 changes: | |
606 | |
607 * The exit condition for the loop is canonicalized to compare the induction | |
608 value against the exit value. This turns loops like: | |
609 | |
610 .. code-block:: c++ | |
611 | |
612 for (i = 7; i*i < 1000; ++i) | |
613 | |
614 into | |
615 | |
616 .. code-block:: c++ | |
617 | |
618 for (i = 0; i != 25; ++i) | |
619 | |
620 * Any use outside of the loop of an expression derived from the indvar is | |
621 changed to compute the derived value outside of the loop, eliminating the | |
622 dependence on the exit value of the induction variable. If the only purpose | |
623 of the loop is to compute the exit value of some derived expression, this | |
624 transformation will make the loop dead. | |
625 | |
626 This transformation should be followed by strength reduction after all of the | |
627 desired loop transformations have been performed. Additionally, on targets | |
628 where it is profitable, the loop could be transformed to count down to zero | |
629 (the "do loop" optimization). | |
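
The exit-condition example above can be checked directly: the original loop runs for ``i = 7`` through ``i = 31`` (since 31 * 31 = 961 and 32 * 32 = 1024), which is 25 iterations, the same as the canonical loop. The helper names below are hypothetical and exist only for this check.

```cpp
#include <cassert>

// Trip count of the original loop from the example above.
inline int countOriginalIterations() {
  int trips = 0;
  for (int i = 7; i * i < 1000; ++i)
    ++trips;
  return trips;
}

// Trip count of the canonicalized loop, which starts at zero, steps by one,
// and compares against the exit value with "!=".
inline int countCanonicalIterations() {
  int trips = 0;
  for (int i = 0; i != 25; ++i)
    ++trips;
  return trips;
}
```

Any use of the original ``i`` inside the loop body would be rewritten in terms of the canonical variable plus 7.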
630 | |
631 ``-inline``: Function Integration/Inlining | |
632 ------------------------------------------ | |
633 | |
634 Bottom-up inlining of functions into their callers. | |
635 | |
636 ``-insert-edge-profiling``: Insert instrumentation for edge profiling | |
637 --------------------------------------------------------------------- | |
638 | |
639 This pass instruments the specified program with counters for edge profiling. | |
640 Edge profiling can give a reasonable approximation of the hot paths through a | |
641 program, and is used for a wide variety of program transformations. | |
642 | |
643 Note that this implementation is very naïve. It inserts a counter for *every* | |
644 edge in the program, instead of using control flow information to prune the | |
645 number of counters inserted. | |
646 | |
647 ``-insert-optimal-edge-profiling``: Insert optimal instrumentation for edge profiling | |
648 ------------------------------------------------------------------------------------- | |
649 | |
650 This pass instruments the specified program with counters for edge profiling. | |
651 Edge profiling can give a reasonable approximation of the hot paths through a | |
652 program, and is used for a wide variety of program transformations. | |
653 | |
654 .. _passes-instcombine: | |
655 | |
656 ``-instcombine``: Combine redundant instructions | |
657 ------------------------------------------------ | |
658 | |
659 Combine instructions to form fewer, simpler instructions. This pass does not | |
660 modify the CFG. This pass is where algebraic simplification happens. | |
661 | |
662 This pass combines things like: | |
663 | |
664 .. code-block:: llvm | |
665 | |
666 %Y = add i32 %X, 1 | |
667 %Z = add i32 %Y, 1 | |
668 | |
669 into: | |
670 | |
671 .. code-block:: llvm | |
672 | |
673 %Z = add i32 %X, 2 | |
674 | |
675 This is a simple worklist-driven algorithm. | |
676 | |
677 This pass guarantees that the following canonicalizations are performed on the | |
678 program: | |
679 | |
680 #. If a binary operator has a constant operand, it is moved to the right-hand | |
681 side. | |
682 #. Bitwise operators with constant operands are always grouped so that shifts | |
683 are performed first, then ``or``\ s, then ``and``\ s, then ``xor``\ s. | |
684 #. Compare instructions are converted from ``<``, ``>``, ``≤``, or ``≥`` to | |
685 ``=`` or ``≠`` if possible. | |
686 #. All ``cmp`` instructions on boolean values are replaced with logical | |
687 operations. | |
688 #. ``add X, X`` is represented as ``mul X, 2`` ⇒ ``shl X, 1`` | |
689 #. Multiplies with a constant power-of-two argument are transformed into | |
690 shifts. | |
691 #. … etc. | |
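
The arithmetic identities behind a few of these canonicalizations can be
checked at the source level. The following is only a sketch in C++, not LLVM
code: the function names are made up for illustration, and unsigned 32-bit
arithmetic is assumed so that wraparound is well defined.

.. code-block:: c++

    #include <cstdint>

    // Hypothetical helpers, one pair per rewrite; none of these names exist in LLVM.

    // Two chained adds of a constant, as in the IR example above.
    std::uint32_t add_twice(std::uint32_t x) {
        std::uint32_t y = x + 1;
        return y + 1;
    }
    // The folded form: a single add of the combined constant.
    std::uint32_t add_folded(std::uint32_t x) { return x + 2; }

    // "add X, X" and its canonical form "shl X, 1".
    std::uint32_t add_self(std::uint32_t x) { return x + x; }
    std::uint32_t shl_one(std::uint32_t x) { return x << 1; }

    // A multiply by a constant power of two and the equivalent shift.
    std::uint32_t mul_by_8(std::uint32_t x) { return x * 8u; }
    std::uint32_t shl_by_3(std::uint32_t x) { return x << 3; }

Each pair returns the same value for every input, which is what makes the
rewrite safe.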
692 | |
693 ``-internalize``: Internalize Global Symbols | |
694 -------------------------------------------- | |
695 | |
696 This pass loops over all of the functions in the input module, looking for a | |
697 main function. If a main function is found, all other functions and all global | |
698 variables with initializers are marked as internal. | |
699 | |
700 ``-ipconstprop``: Interprocedural constant propagation | |
701 ------------------------------------------------------ | |
702 | |
703 This pass implements an *extremely* simple interprocedural constant propagation | |
704 pass. It could certainly be improved in many different ways, like using a | |
705 worklist. This pass makes arguments dead, but does not remove them. The | |
706 existing dead argument elimination pass should be run after this to clean up | |
707 the mess. | |
708 | |
709 ``-ipsccp``: Interprocedural Sparse Conditional Constant Propagation | |
710 -------------------------------------------------------------------- | |
711 | |
712 An interprocedural variant of :ref:`Sparse Conditional Constant Propagation | |
713 <passes-sccp>`. | |
714 | |
715 ``-jump-threading``: Jump Threading | |
716 ----------------------------------- | |
717 | |
718 Jump threading tries to find distinct threads of control flow running through a | |
719 basic block. This pass looks at blocks that have multiple predecessors and | |
720 multiple successors. If one or more of the predecessors of the block can be | |
721 proven to always cause a jump to one of the successors, we forward the edge | |
722 from the predecessor to the successor by duplicating the contents of this | |
723 block. | |
724 | |
725 An example of when this can occur is code like this: | |
726 | |
727 .. code-block:: c++ | |
728 | |
729 if () { ... | |
730 X = 4; | |
731 } | |
732 if (X < 3) { | |
733 | |
734 In this case, the unconditional branch at the end of the first if can be | |
735 revectored to the false side of the second if. | |
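
To make the effect concrete, here is a hand-written sketch of one possible
complete version of that fragment, before and after threading. The function
names, the ``cond`` parameter and the return values are assumptions added for
illustration; the pass itself works on LLVM IR, not on C++ source.

.. code-block:: c++

    // Hypothetical "before" code: the second test runs even when x was just set to 4.
    int before_threading(bool cond, int x) {
        int result = 0;
        if (cond) {
            x = 4;
        }
        if (x < 3) {
            result = 1;
        } else {
            result = 2;
        }
        return result;
    }

    // Hand-threaded version: the path through the first "if" is known to fail
    // "x < 3", so it goes straight to the false side of the second "if".
    int after_threading(bool cond, int x) {
        if (cond) {
            return 2;  // x == 4 here, so x < 3 is false
        }
        return (x < 3) ? 1 : 2;
    }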
736 | |
737 ``-lcssa``: Loop-Closed SSA Form Pass | |
738 ------------------------------------- | |
739 | |
740 This pass transforms loops by placing phi nodes at the end of the loops for all | |
741 values that are live across the loop boundary. For example, it turns the left | |
742 into the right code: | |
743 | |
744 .. code-block:: c++ | |
745 | |
746 for (...)                for (...) | |
747   if (c)                   if (c) | |
748     X1 = ...                 X1 = ... | |
749   else                     else | |
750     X2 = ...                 X2 = ... | |
751   X3 = phi(X1, X2)         X3 = phi(X1, X2) | |
752 ... = X3 + 4             X4 = phi(X3) | |
753                          ... = X4 + 4 | |
754 | |
755 This is still valid LLVM; the extra phi nodes are purely redundant, and will be | |
756 trivially eliminated by ``InstCombine``. The major benefit of this | |
757 transformation is that it makes many other loop optimizations, such as | |
758 ``LoopUnswitch``\ ing, simpler. | |
759 | |
760 .. _passes-licm: | |
761 | |
762 ``-licm``: Loop Invariant Code Motion | |
763 ------------------------------------- | |
764 | |
765 This pass performs loop invariant code motion, attempting to remove as much | |
766 code from the body of a loop as possible. It does this by either hoisting code | |
767 into the preheader block, or by sinking code to the exit blocks if it is safe. | |
768 This pass also promotes must-aliased memory locations in the loop to live in | |
769 registers, thus hoisting and sinking "invariant" loads and stores. | |
770 | |
771 This pass uses alias analysis for two purposes: | |
772 | |
773 #. Moving loop invariant loads and calls out of loops. If we can determine | |
774 that a load or call inside of a loop never aliases anything stored to, we | |
775 can hoist it or sink it like any other instruction. | |
776 | |
777 #. Scalar Promotion of Memory. If there is a store instruction inside of the | |
778 loop, we try to move the store to happen AFTER the loop instead of inside of | |
779 the loop. This can only happen if a few conditions are true: | |
780 | |
781 #. The pointer stored through is loop invariant. | |
782 #. There are no stores or loads in the loop which *may* alias the pointer. | |
783 There are no calls in the loop which mod/ref the pointer. | |
784 | |
785 If these conditions are true, we can promote the loads and stores in the | |
786 loop of the pointer to use a temporary alloca'd variable. We then use the | |
787 :ref:`mem2reg <passes-mem2reg>` functionality to construct the appropriate | |
788 SSA form for the variable. | |
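
Scalar promotion can be pictured at the source level as follows. This is a
hand-written sketch with hypothetical function names, and it assumes that
``sum`` does not alias any element of ``data``, which is the kind of fact the
pass has to establish through alias analysis before promoting.

.. code-block:: c++

    // Hypothetical "before" code: every iteration loads from and stores to *sum.
    void accumulate_in_memory(const int *data, int n, int *sum) {
        for (int i = 0; i < n; ++i) {
            *sum += data[i];
        }
    }

    // Promoted by hand: the value lives in a local for the duration of the loop.
    void accumulate_promoted(const int *data, int n, int *sum) {
        int tmp = *sum;  // load hoisted in front of the loop
        for (int i = 0; i < n; ++i) {
            tmp += data[i];
        }
        *sum = tmp;      // store sunk to after the loop
    }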
789 | |
790 ``-loop-deletion``: Delete dead loops | |
791 ------------------------------------- | |
792 | |
793 This pass implements dead loop deletion. It is responsible for eliminating | |
794 loops with non-infinite computable trip counts that have no side effects or | |
795 volatile instructions, and do not contribute to the computation of the | |
796 function's return value. | |
797 | |
798 .. _passes-loop-extract: | |
799 | |
800 ``-loop-extract``: Extract loops into new functions | |
801 --------------------------------------------------- | |
802 | |
803 A pass wrapper around the ``ExtractLoop()`` scalar transformation to extract | |
804 each top-level loop into its own new function. If the loop is the *only* loop | |
805 in a given function, it is not touched. This is a pass most useful for | |
806 debugging via bugpoint. | |
807 | |
808 ``-loop-extract-single``: Extract at most one loop into a new function | |
809 ---------------------------------------------------------------------- | |
810 | |
811 Similar to :ref:`Extract loops into new functions <passes-loop-extract>`, this | |
812 pass extracts one natural loop from the program into a function if it can. | |
813 This is used by :program:`bugpoint`. | |
814 | |
815 ``-loop-reduce``: Loop Strength Reduction | |
816 ----------------------------------------- | |
817 | |
818 This pass performs a strength reduction on array references inside loops that | |
819 have as one or more of their components the loop induction variable. This is | |
820 accomplished by creating a new value to hold the initial value of the array | |
821 access for the first iteration, and then creating a new GEP instruction in the | |
822 loop to increment the value by the appropriate amount. | |
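
A source-level analogy may help. In the sketch below (hypothetical function
names, written by hand rather than produced by the pass) the indexed access is
replaced by a pointer that starts at the first element and is bumped once per
iteration, which plays the role of the new GEP described above.

.. code-block:: c++

    // Hypothetical "before" code: each access is computed from the induction variable.
    int sum_indexed(const int *a, int n) {
        int s = 0;
        for (int i = 0; i < n; ++i) {
            s += a[i];
        }
        return s;
    }

    // Strength-reduced by hand: the address is carried along and incremented.
    int sum_strength_reduced(const int *a, int n) {
        int s = 0;
        const int *p = a;  // initial value of the array access
        for (int i = 0; i < n; ++i) {
            s += *p;
            ++p;           // advance by one element instead of recomputing a + i
        }
        return s;
    }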
823 | |
824 ``-loop-rotate``: Rotate Loops | |
825 ------------------------------ | |
826 | |
827 A simple loop rotation transformation. | |
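
As a reminder of what loop rotation conventionally means, the sketch below
turns a loop that tests its condition at the top into a guarded loop that
tests it at the bottom. It illustrates the general technique with hypothetical
function names; it is not a description of every detail of this pass.

.. code-block:: c++

    // Hypothetical "before" code: the exit test sits at the top of the loop.
    int count_while(int n) {
        int i = 0;
        int steps = 0;
        while (i < n) {
            ++steps;
            ++i;
        }
        return steps;
    }

    // Rotated by hand: one guard test up front, then a bottom-tested loop.
    int count_rotated(int n) {
        int i = 0;
        int steps = 0;
        if (i < n) {
            do {
                ++steps;
                ++i;
            } while (i < n);
        }
        return steps;
    }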
828 | |
829 ``-loop-simplify``: Canonicalize natural loops | |
830 ---------------------------------------------- | |
831 | |
832 This pass performs several transformations to transform natural loops into a | |
833 simpler form, which makes subsequent analyses and transformations simpler and | |
834 more effective. | |
835 | |
836 Loop pre-header insertion guarantees that there is a single, non-critical entry | |
837 edge from outside of the loop to the loop header. This simplifies a number of | |
838 analyses and transformations, such as :ref:`LICM <passes-licm>`. | |
839 | |
840 Loop exit-block insertion guarantees that all exit blocks from the loop (blocks | |
841 which are outside of the loop that have predecessors inside of the loop) only | |
842 have predecessors from inside of the loop (and are thus dominated by the loop | |
843 header). This simplifies transformations such as store-sinking that are built | |
844 into LICM. | |
845 | |
846 This pass also guarantees that loops will have exactly one backedge. | |
847 | |
848 Note that the :ref:`simplifycfg <passes-simplifycfg>` pass will clean up blocks | |
849 which are split out but end up being unnecessary, so usage of this pass should | |
850 not pessimize generated code. | |
851 | |
852 This pass obviously modifies the CFG, but updates loop information and | |
853 dominator information. | |
854 | |
855 ``-loop-unroll``: Unroll loops | |
856 ------------------------------ | |
857 | |
858 This pass implements a simple loop unroller. It works best when loops have | |
859 been canonicalized by the :ref:`indvars <passes-indvars>` pass, allowing it to | |
860 determine the trip counts of loops easily. | |
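
The following hand-written sketch shows what unrolling looks like at the
source level. The unroll factor of 4, the remainder loop and the function
names are all assumptions chosen for illustration; the factor the pass
actually picks depends on the loop and on its thresholds.

.. code-block:: c++

    // Hypothetical "before" code: one element per iteration.
    int sum_rolled(const int *a, int n) {
        int s = 0;
        for (int i = 0; i < n; ++i) {
            s += a[i];
        }
        return s;
    }

    // Unrolled by hand with an assumed factor of 4.
    int sum_unrolled_by_4(const int *a, int n) {
        int s = 0;
        int i = 0;
        for (; i + 3 < n; i += 4) {  // four copies of the body per iteration
            s += a[i];
            s += a[i + 1];
            s += a[i + 2];
            s += a[i + 3];
        }
        for (; i < n; ++i) {         // leftover iterations when n is not a multiple of 4
            s += a[i];
        }
        return s;
    }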
861 | |
862 ``-loop-unswitch``: Unswitch loops | |
863 ---------------------------------- | |
864 | |
865 This pass transforms loops that contain branches on loop-invariant conditions | |
866 to have multiple loops. For example, it turns the left into the right code: | |
867 | |
868 .. code-block:: c++ | |
869 | |
870 for (...)                  if (lic) | |
871     A                          for (...) | |
872     if (lic)                       A; B; C | |
873         B                  else | |
874     C                          for (...) | |
875                                    A; C | |
876 | |
877 This can increase the size of the code exponentially (doubling it every time a | |
878 loop is unswitched) so we only unswitch if the resultant code will be smaller | |
879 than a threshold. | |
880 | |
881 This pass expects :ref:`LICM <passes-licm>` to be run before it to hoist | |
882 invariant conditions out of the loop, to make the unswitching opportunity | |
883 obvious. | |
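
A concrete instance of the schematic above, written by hand: the statements
standing in for ``A``, ``B`` and ``C`` and the function names are arbitrary
choices made for illustration.

.. code-block:: c++

    // Hypothetical "before" code: the invariant condition is tested on every iteration.
    int run_before(int n, bool lic) {
        int acc = 0;
        for (int i = 0; i < n; ++i) {
            acc += 1;      // A
            if (lic) {
                acc *= 2;  // B
            }
            acc -= 3;      // C
        }
        return acc;
    }

    // Unswitched by hand: the test is made once and each loop body is branch-free.
    int run_unswitched(int n, bool lic) {
        int acc = 0;
        if (lic) {
            for (int i = 0; i < n; ++i) {
                acc += 1;  // A
                acc *= 2;  // B
                acc -= 3;  // C
            }
        } else {
            for (int i = 0; i < n; ++i) {
                acc += 1;  // A
                acc -= 3;  // C
            }
        }
        return acc;
    }

The loop body appears twice after the transformation, which is the code growth
that the size threshold guards against.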
884 | |
885 ``-loweratomic``: Lower atomic intrinsics to non-atomic form | |
886 ------------------------------------------------------------ | |
887 | |
888 This pass lowers atomic intrinsics to non-atomic form for use in a known | |
889 non-preemptible environment. | |
890 | |
891 The pass does not verify that the environment is non-preemptible (in general | |
892 this would require knowledge of the entire call graph of the program including | |
893 any libraries which may not be available in bitcode form); it simply lowers | |
894 every atomic intrinsic. | |
895 | |
896 ``-lowerinvoke``: Lower invoke and unwind, for unwindless code generators | |
897 ------------------------------------------------------------------------- | |
898 | |
899 This transformation is designed for use by code generators which do not yet | |
900 support stack unwinding. This pass supports two models of exception handling | |
901 lowering, the "cheap" support and the "expensive" support. | |
902 | |
903 "Cheap" exception handling support gives the program the ability to execute any | |
904 program which does not "throw an exception", by turning "``invoke``" | |
905 instructions into calls and by turning "``unwind``" instructions into calls to | |
906 ``abort()``. If the program does dynamically use the "``unwind``" instruction, | |
907 the program will print a message then abort. | |
908 | |
909 "Expensive" exception handling support gives the full exception handling | |
910 support to the program at the cost of making the "``invoke``" instruction | |
911 really expensive. It basically inserts ``setjmp``/``longjmp`` calls to emulate | |
912 the exception handling as necessary. | |
913 | |
914 Because the "expensive" support slows down programs a lot, and EH is only used | |
915 for a subset of the programs, it must be specifically enabled by the | |
916 ``-enable-correct-eh-support`` option. | |
917 | |
918 Note that after this pass runs the CFG is not entirely accurate (exceptional | |
919 control flow edges are not correct anymore) so only very simple things should | |
920 be done after the ``lowerinvoke`` pass has run (like generation of native | |
921 code). This should not be used as a general purpose "my LLVM-to-LLVM pass | |
922 doesn't support the ``invoke`` instruction yet" lowering pass. | |
923 | |
924 ``-lowerswitch``: Lower ``SwitchInst``\ s to branches | |
925 ----------------------------------------------------- | |
926 | |
927 Rewrites switch instructions with a sequence of branches, which allows targets | |
928 to get away with not implementing the switch instruction until it is | |
929 convenient. | |
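
At the source level the effect resembles replacing a ``switch`` with a chain of
comparisons. The sketch below uses hypothetical function names and the
simplest possible shape, a linear chain; the pass is free to order or
structure the comparisons differently.

.. code-block:: c++

    // Hypothetical "before" code using a switch.
    int classify_switch(int v) {
        switch (v) {
        case 0:
            return 10;
        case 1:
            return 20;
        case 5:
            return 50;
        default:
            return -1;
        }
    }

    // The same logic written by hand with conditional branches only.
    int classify_branches(int v) {
        if (v == 0) return 10;
        if (v == 1) return 20;
        if (v == 5) return 50;
        return -1;  // default case
    }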
930 | |
931 .. _passes-mem2reg: | |
932 | |
933 ``-mem2reg``: Promote Memory to Register | |
934 ---------------------------------------- | |
935 | |
936 This pass promotes memory references to be register references. It promotes | |
937 alloca instructions which only have loads and stores as uses. An ``alloca`` is | |
938 transformed by using dominator frontiers to place phi nodes, then traversing | |
939 the function in depth-first order to rewrite loads and stores as appropriate. | |
940 This is just the standard SSA construction algorithm to construct "pruned" SSA | |
941 form. | |
942 | |
943 ``-memcpyopt``: MemCpy Optimization | |
944 ----------------------------------- | |
945 | |
946 This pass performs various transformations related to eliminating ``memcpy`` | |
947 calls, or transforming sets of stores into ``memset``\ s. | |
948 | |
949 ``-mergefunc``: Merge Functions | |
950 ------------------------------- | |
951 | |
952 This pass looks for equivalent functions that are mergeable and folds them. | |
953 | |
954 A hash is computed from the function, based on its type and number of basic | |
955 blocks. | |
956 | |
957 Once all hashes are computed, we perform an expensive equality comparison on | |
958 each function pair. This takes n^2/2 comparisons per bucket, so it's important | |
959 that the hash function be high quality. The equality comparison iterates | |
960 through each instruction in each basic block. | |
961 | |
962 When a match is found the functions are folded. If both functions are | |
963 overridable, we move the functionality into a new internal function and leave | |
964 two overridable thunks to it. | |
965 | |
966 ``-mergereturn``: Unify function exit nodes | |
967 ------------------------------------------- | |
968 | |
969 Ensure that functions have at most one ``ret`` instruction in them. | |
970 Additionally, it keeps track of which node is the new exit node of the CFG. | |
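
A rough source-level picture of the result, using hypothetical function names:
every early ``return`` is redirected to one shared exit that returns a single
merged value.

.. code-block:: c++

    // Hypothetical "before" code with three separate returns.
    int sign_multi_exit(int x) {
        if (x < 0) return -1;
        if (x == 0) return 0;
        return 1;
    }

    // Rewritten by hand so that the function has exactly one return.
    int sign_single_exit(int x) {
        int result;
        if (x < 0) {
            result = -1;
        } else if (x == 0) {
            result = 0;
        } else {
            result = 1;
        }
        return result;  // the unified exit
    }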
971 | |
972 ``-partial-inliner``: Partial Inliner | |
973 ------------------------------------- | |
974 | |
975 This pass performs partial inlining, typically by inlining an ``if`` statement | |
976 that surrounds the body of the function. | |
977 | |
978 ``-prune-eh``: Remove unused exception handling info | |
979 ---------------------------------------------------- | |
980 | |
981 This is a simple interprocedural pass which walks the call-graph, | |
982 turning invoke instructions into call instructions if and only if the callee | |
983 cannot throw an exception. It implements this as a bottom-up traversal of the | |
984 call-graph. | |
985 | |
986 ``-reassociate``: Reassociate expressions | |
987 ----------------------------------------- | |
988 | |
989 This pass reassociates commutative expressions in an order that is designed to | |
990 promote better constant propagation, GCSE, :ref:`LICM <passes-licm>`, PRE, etc. | |
991 | |
992 For example: 4 + (x + 5) ⇒ x + (4 + 5) | |
993 | |
994 In the implementation of this algorithm, constants are assigned rank = 0, | |
995 function arguments are rank = 1, and other values are assigned ranks | |
996 corresponding to the reverse post order traversal of current function (starting | |
997 at 2), which effectively gives values in deep loops higher rank than values not | |
998 in loops. | |
999 | |
1000 ``-reg2mem``: Demote all values to stack slots | |
1001 ---------------------------------------------- | |
1002 | |
1003 This pass demotes all registers to memory references. It is intended to be the | |
1004 inverse of :ref:`mem2reg <passes-mem2reg>`. By converting to ``load`` | |
1005 instructions, the only values live across basic blocks are ``alloca`` | |
1006 instructions and ``load`` instructions before ``phi`` nodes. It is intended | |
1007 that this should make CFG hacking much easier. To make later hacking easier, | |
1008 the entry block is split into two, such that all introduced ``alloca`` | |
1009 instructions (and nothing else) are in the entry block. | |
1010 | |
1011 ``-scalarrepl``: Scalar Replacement of Aggregates (DT) | |
1012 ------------------------------------------------------ | |
1013 | |
1014 The well-known scalar replacement of aggregates transformation. This transform | |
1015 breaks up ``alloca`` instructions of aggregate type (structure or array) into | |
1016 individual ``alloca`` instructions for each member if possible. Then, if | |
1017 possible, it transforms the individual ``alloca`` instructions into nice clean | |
1018 scalar SSA form. | |
1019 | |
1020 This combines a simple scalar replacement of aggregates algorithm with the | |
1021 :ref:`mem2reg <passes-mem2reg>` algorithm because they often interact, | |
1022 especially for C++ programs. As such, iterating between ``scalarrepl``, then | |
1023 :ref:`mem2reg <passes-mem2reg>` until we run out of things to promote works | |
1024 well. | |
1025 | |
1026 .. _passes-sccp: | |
1027 | |
1028 ``-sccp``: Sparse Conditional Constant Propagation | |
1029 -------------------------------------------------- | |
1030 | |
1031 Sparse conditional constant propagation and merging, which can be summarized | |
1032 as: | |
1033 | |
1034 * Assumes values are constant unless proven otherwise | |
1035 * Assumes BasicBlocks are dead unless proven otherwise | |
1036 * Proves values to be constant, and replaces them with constants | |
1037 * Proves conditional branches to be unconditional | |
1038 | |
1039 Note that this pass has a habit of making definitions be dead. It is a good | |
1040 idea to run a :ref:`DCE <passes-dce>` pass sometime after running this pass. | |
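
The value of the optimistic assumptions shows up in loops. In the hand-written
sketch below (hypothetical function names), ``x`` is only ever reassigned
inside a branch that can be taken only if ``x`` has already changed. Starting
from the assumption that ``x`` is the constant 1 and that the branch body is
dead, nothing contradicts either assumption, so both hold.

.. code-block:: c++

    // Hypothetical "before" code: x looks variable, but the assignment is unreachable.
    int before_sccp(int n) {
        int x = 1;
        for (int i = 0; i < n; ++i) {
            if (x != 1) {  // never true while x is 1
                x = 2;     // the only place x could change
            }
        }
        return x;
    }

    // What remains once x is replaced by 1 and the dead code is cleaned up.
    int after_sccp_and_cleanup(int /*n*/) {
        return 1;
    }

An analysis that starts from "``x`` is unknown" cannot reach this conclusion,
because the value merged at the loop header then depends on itself.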
1041 | |
1042 ``-simplify-libcalls``: Simplify well-known library calls | |
1043 --------------------------------------------------------- | |
1044 | |
1045 Applies a variety of small optimizations for calls to specific well-known | |
1046 functions (e.g. runtime library functions). For example, a call | |
1047 ``exit(3)`` that occurs within the ``main()`` function can be transformed into | |
1048 simply ``return 3``. | |
1049 | |
1050 .. _passes-simplifycfg: | |
1051 | |
1052 ``-simplifycfg``: Simplify the CFG | |
1053 ---------------------------------- | |
1054 | |
1055 Performs dead code elimination and basic block merging. Specifically: | |
1056 | |
1057 * Removes basic blocks with no predecessors. | |
1058 * Merges a basic block into its predecessor if there is only one and the | |
1059 predecessor only has one successor. | |
1060 * Eliminates PHI nodes for basic blocks with a single predecessor. | |
1061 * Eliminates a basic block that only contains an unconditional branch. | |
1062 | |
1063 ``-sink``: Code sinking | |
1064 ----------------------- | |
1065 | |
1066 This pass moves instructions into successor blocks, when possible, so that they | |
1067 aren't executed on paths where their results aren't needed. | |
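
A hand-written source-level sketch of the idea, with hypothetical function
names: the multiplication is needed on only one path, so it is moved into the
block that uses it.

.. code-block:: c++

    // Hypothetical "before" code: the product is computed even when it is unused.
    int before_sinking(int a, int b, bool use) {
        int product = a * b;
        if (use) {
            return product + 1;
        }
        return 0;
    }

    // Sunk by hand: the product is computed only on the path that needs it.
    int after_sinking(int a, int b, bool use) {
        if (use) {
            int product = a * b;
            return product + 1;
        }
        return 0;
    }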
1068 | |
1069 ``-strip``: Strip all symbols from a module | |
1070 ------------------------------------------- | |
1071 | |
1072 Performs code stripping. This transformation can delete: | |
1073 | |
1074 * names for virtual registers | |
1075 * symbols for internal globals and functions | |
1076 * debug information | |
1077 | |
1078 Note that this transformation makes code much less readable, so it should only | |
1079 be used in situations where the strip utility would be used, such as reducing | |
1080 code size or making it harder to reverse engineer code. | |
1081 | |
1082 ``-strip-dead-debug-info``: Strip debug info for unused symbols | |
1083 --------------------------------------------------------------- | |
1084 | |
1085 .. FIXME: this description is the same as for -strip | |
1086 | |
1087 Performs code stripping. This transformation can delete: | |
1088 | |
1089 * names for virtual registers | |
1090 * symbols for internal globals and functions | |
1091 * debug information | |
1092 | |
1093 Note that this transformation makes code much less readable, so it should only | |
1094 be used in situations where the strip utility would be used, such as reducing | |
1095 code size or making it harder to reverse engineer code. | |
1096 | |
1097 ``-strip-dead-prototypes``: Strip Unused Function Prototypes | |
1098 ------------------------------------------------------------ | |
1099 | |
1100 This pass loops over all of the functions in the input module, looking for dead | |
1101 declarations and removes them. Dead declarations are declarations of functions | |
1102 for which no implementation is available (i.e., declarations for unused library | |
1103 functions). | |
1104 | |
1105 ``-strip-debug-declare``: Strip all ``llvm.dbg.declare`` intrinsics | |
1106 ------------------------------------------------------------------- | |
1107 | |
1108 .. FIXME: this description is the same as for -strip | |
1109 | |
1110 This pass implements code stripping. Specifically, it can delete: | |
1111 | |
1112 #. names for virtual registers | |
1113 #. symbols for internal globals and functions | |
1114 #. debug information | |
1115 | |
1116 Note that this transformation makes code much less readable, so it should only | |
1117 be used in situations where the 'strip' utility would be used, such as reducing | |
1118 code size or making it harder to reverse engineer code. | |
1119 | |
1120 ``-strip-nondebug``: Strip all symbols, except dbg symbols, from a module | |
1121 ------------------------------------------------------------------------- | |
1122 | |
1123 .. FIXME: this description is the same as for -strip | |
1124 | |
1125 This pass implements code stripping. Specifically, it can delete: | |
1126 | |
1127 #. names for virtual registers | |
1128 #. symbols for internal globals and functions | |
1129 #. debug information | |
1130 | |
1131 Note that this transformation makes code much less readable, so it should only | |
1132 be used in situations where the 'strip' utility would be used, such as reducing | |
1133 code size or making it harder to reverse engineer code. | |
1134 | |
1135 ``-tailcallelim``: Tail Call Elimination | |
1136 ---------------------------------------- | |
1137 | |
1138 This pass transforms calls of the current function (self recursion) followed by | |
1139 a return instruction with a branch to the entry of the function, creating a | |
1140 loop. This pass also implements the following extensions to the basic | |
1141 algorithm: | |
1142 | |
1143 #. Trivial instructions between the call and return do not prevent the | |
1144 transformation from taking place, though currently the analysis cannot | |
1145 support moving any really useful instructions (only dead ones). | |
1146 #. This pass transforms functions that are prevented from being tail recursive | |
1147 by an associative expression to use an accumulator variable, thus compiling | |
1148 the typical naive factorial or fib implementation into efficient code. | |
1149 #. TRE is performed if the function returns void, if the return returns the | |
1150 result returned by the call, or if the function returns a run-time constant | |
1151 on all exits from the function. It is possible, though unlikely, that the | |
1152 return returns something else (like constant 0), and can still be TRE'd. It | |
1153 can be TRE'd if *all other* return instructions in the function return the | |
1154 exact same value. | |
1155 #. If it can prove that callees do not access their caller stack frame, they | |
1156 are marked as eligible for tail call elimination (by the code generator). | |
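
The accumulator extension can be pictured with the naive factorial mentioned
above. The sketch is hand-written and the function names are hypothetical. The
recursive call is not in tail position because its result is multiplied
afterwards, but since multiplication is associative the pending multiplies can
be folded into a running accumulator.

.. code-block:: c++

    // Hypothetical "before" code: the multiply happens after the recursive call returns.
    unsigned long long fact_recursive(unsigned n) {
        if (n <= 1) {
            return 1;
        }
        return n * fact_recursive(n - 1);
    }

    // Rewritten by hand as a loop with an accumulator variable.
    unsigned long long fact_accumulator(unsigned n) {
        unsigned long long acc = 1;
        while (n > 1) {
            acc *= n;  // fold the pending multiply into the accumulator
            --n;       // "call" again with n - 1 by looping back to the top
        }
        return acc;
    }

Both versions agree for every ``n`` whose factorial fits in the return type.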
1157 | |
1158 Utility Passes | |
1159 ============== | |
1160 | |
1161 This section describes the LLVM Utility Passes. | |
1162 | |
1163 ``-deadarghaX0r``: Dead Argument Hacking (BUGPOINT USE ONLY; DO NOT USE) | |
1164 ------------------------------------------------------------------------ | |
1165 | |
1166 Same as dead argument elimination, but deletes arguments to functions which are | |
1167 external. This is only for use by :doc:`bugpoint <Bugpoint>`. | |
1168 | |
1169 ``-extract-blocks``: Extract Basic Blocks From Module (for bugpoint use) | |
1170 ------------------------------------------------------------------------ | |
1171 | |
1172 This pass is used by bugpoint to extract all blocks from the module into their | |
1173 own functions. | |
1174 | |
1175 ``-instnamer``: Assign names to anonymous instructions | |
1176 ------------------------------------------------------ | |
1177 | |
1178 This is a little utility pass that gives instructions names. This is mostly | |
1179 useful when diffing the effect of an optimization because deleting an unnamed | |
1180 instruction can change all other instruction numbering, making the diff very | |
1181 noisy. | |
1182 | |
1183 ``-preverify``: Preliminary module verification | |
1184 ----------------------------------------------- | |
1185 | |
1186 Ensures that the module is in the form required by the :ref:`Module Verifier | |
1187 <passes-verify>` pass. Running the verifier runs this pass automatically, so | |
1188 there should be no need to use it directly. | |
1189 | |
1190 .. _passes-verify: | |
1191 | |
1192 ``-verify``: Module Verifier | |
1193 ---------------------------- | |
1194 | |
1195 Verifies LLVM IR code. This is useful to run after an optimization which is | |
1196 undergoing testing. Note that llvm-as verifies its input before emitting | |
1197 bitcode, and also that malformed bitcode is likely to make LLVM crash. All | |
1198 language front-ends are therefore encouraged to verify their output before | |
1199 performing optimizing transformations. | |
1200 | |
1201 #. Both of a binary operator's parameters are of the same type. | |
1202 #. Verify that the indices of mem access instructions match other operands. | |
1203 #. Verify that arithmetic and other things are only performed on first-class | |
1204 types. Verify that shifts and logicals only happen on integrals, for example. | |
1205 #. All of the constants in a switch statement are of the correct type. | |
1206 #. The code is in valid SSA form. | |
1207 #. It is illegal to put a label into any other type (like a structure) or to | |
1208 return one. | |
1209 #. Only phi nodes can be self referential: ``%x = add i32 %x, %x`` is | |
1210 invalid. | |
1211 #. PHI nodes must have an entry for each predecessor, with no extras. | |
1212 #. PHI nodes must be the first thing in a basic block, all grouped together. | |
1213 #. PHI nodes must have at least one entry. | |
1214 #. All basic blocks should only end with terminator insts, not contain them. | |
1215 #. The entry node to a function must not have predecessors. | |
1216 #. All Instructions must be embedded into a basic block. | |
1217 #. Functions cannot take a void-typed parameter. | |
1218 #. Verify that a function's argument list agrees with its declared type. | |
1219 #. It is illegal to specify a name for a void value. | |
1220 #. It is illegal to have an internal global value with no initializer. | |
1221 #. It is illegal to have a ``ret`` instruction that returns a value that does | |
1222 not agree with the function return value type. | |
1223 #. Function call argument types match the function prototype. | |
1224 #. All other things that are tested by asserts spread about the code. | |
1225 | |
1226 Note that this does not provide full security verification (like Java), but | |
1227 instead just tries to ensure that code is well-formed. | |
1228 | |
1229 ``-view-cfg``: View CFG of function | |
1230 ----------------------------------- | |
1231 | |
1232 Displays the control flow graph using the GraphViz tool. | |
1233 | |
1234 ``-view-cfg-only``: View CFG of function (with no function bodies) | |
1235 ------------------------------------------------------------------ | |
1236 | |
1237 Displays the control flow graph using the GraphViz tool, but omitting function | |
1238 bodies. | |
1239 | |
1240 ``-view-dom``: View dominance tree of function | |
1241 ---------------------------------------------- | |
1242 | |
1243 Displays the dominator tree using the GraphViz tool. | |
1244 | |
1245 ``-view-dom-only``: View dominance tree of function (with no function bodies) | |
1246 ----------------------------------------------------------------------------- | |
1247 | |
1248 Displays the dominator tree using the GraphViz tool, but omitting function | |
1249 bodies. | |
1250 | |
1251 ``-view-postdom``: View postdominance tree of function | |
1252 ------------------------------------------------------ | |
1253 | |
1254 Displays the post dominator tree using the GraphViz tool. | |
1255 | |
1256 ``-view-postdom-only``: View postdominance tree of function (with no function bodies) | |
1257 ------------------------------------------------------------------------------------- | |
1258 | |
1259 Displays the post dominator tree using the GraphViz tool, but omitting function | |
1260 bodies. | |
1261 |