Mercurial > hg > CbC > CbC_llvm
comparison docs/Bugpoint.rst @ 3:9ad51c7bc036
1st commit. remove git dir and add all files.
author | Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp> |
---|---|
date | Wed, 15 May 2013 06:43:32 +0900 |
parents | |
children | 54457678186b |
comparison
equal
deleted
inserted
replaced
-1:000000000000 | 3:9ad51c7bc036 |
---|---|
1 ==================================== | |
2 LLVM bugpoint tool: design and usage | |
3 ==================================== | |
4 | |
5 .. contents:: | |
6 :local: | |
7 | |
8 Description | |
9 =========== | |
10 | |
11 ``bugpoint`` narrows down the source of problems in LLVM tools and passes. It | |
12 can be used to debug three types of failures: optimizer crashes, miscompilations | |
13 by optimizers, or bad native code generation (including problems in the static | |
14 and JIT compilers). It aims to reduce large test cases to small, useful ones. | |
15 For example, if ``opt`` crashes while optimizing a file, it will identify the | |
16 optimization (or combination of optimizations) that causes the crash, and reduce | |
17 the file down to a small example which triggers the crash. | |
18 | |
19 For detailed case scenarios, such as debugging ``opt``, or one of the LLVM code | |
20 generators, see `How To Submit a Bug Report document <HowToSubmitABug.html>`_. | |
21 | |
22 Design Philosophy | |
23 ================= | |
24 | |
25 ``bugpoint`` is designed to be a useful tool without requiring any hooks into | |
26 the LLVM infrastructure at all. It works with any and all LLVM passes and code | |
27 generators, and does not need to "know" how they work. Because of this, it may | |
28 appear to do stupid things or miss obvious simplifications. ``bugpoint`` is | |
29 also designed to trade off programmer time for computer time in the | |
30 compiler-debugging process; consequently, it may take a long period of | |
31 (unattended) time to reduce a test case, but we feel it is still worth it. Note | |
32 that ``bugpoint`` is generally very quick unless debugging a miscompilation | |
33 where each test of the program (which requires executing it) takes a long time. | |
34 | |
35 Automatic Debugger Selection | |
36 ---------------------------- | |
37 | |
38 ``bugpoint`` reads each ``.bc`` or ``.ll`` file specified on the command line | |
39 and links them together into a single module, called the test program. If any | |
40 LLVM passes are specified on the command line, it runs these passes on the test | |
41 program. If any of the passes crash, or if they produce malformed output (which | |
42 causes the verifier to abort), ``bugpoint`` starts the `crash debugger`_. | |
43 | |
44 Otherwise, if the ``-output`` option was not specified, ``bugpoint`` runs the | |
45 test program with the "safe" backend (which is assumed to generate good code) to | |
46 generate a reference output. Once ``bugpoint`` has a reference output for the | |
47 test program, it tries executing it with the selected code generator. If the | |
48 selected code generator crashes, ``bugpoint`` starts the `crash debugger`_ on | |
49 the code generator. Otherwise, if the resulting output differs from the | |
50 reference output, it assumes the difference resulted from a code generator | |
51 failure, and starts the `code generator debugger`_. | |
52 | |
53 Finally, if the output of the selected code generator matches the reference | |
54 output, ``bugpoint`` runs the test program after all of the LLVM passes have | |
55 been applied to it. If its output differs from the reference output, it assumes | |
56 the difference resulted from a failure in one of the LLVM passes, and enters the | |
57 `miscompilation debugger`_. Otherwise, there is no problem ``bugpoint`` can | |
58 debug. | |
59 | |
60 .. _crash debugger: | |
61 | |
62 Crash debugger | |
63 -------------- | |
64 | |
65 If an optimizer or code generator crashes, ``bugpoint`` will try as hard as it | |
66 can to reduce the list of passes (for optimizer crashes) and the size of the | |
67 test program. First, ``bugpoint`` figures out which combination of optimizer | |
68 passes triggers the bug. This is useful when debugging a problem exposed by | |
69 ``opt``, for example, because it runs over 38 passes. | |
70 | |
71 Next, ``bugpoint`` tries removing functions from the test program, to reduce its | |
72 size. Usually it is able to reduce a test program to a single function, when | |
73 debugging intraprocedural optimizations. Once the number of functions has been | |
74 reduced, it attempts to delete various edges in the control flow graph, to | |
75 reduce the size of the function as much as possible. Finally, ``bugpoint`` | |
76 deletes any individual LLVM instructions whose absence does not eliminate the | |
77 failure. At the end, ``bugpoint`` should tell you what passes crash, give you a | |
78 bitcode file, and give you instructions on how to reproduce the failure with | |
79 ``opt`` or ``llc``. | |
80 | |
81 .. _code generator debugger: | |
82 | |
83 Code generator debugger | |
84 ----------------------- | |
85 | |
86 The code generator debugger attempts to narrow down the amount of code that is | |
87 being miscompiled by the selected code generator. To do this, it takes the test | |
88 program and partitions it into two pieces: one piece which it compiles with the | |
89 "safe" backend (into a shared object), and one piece which it runs with either | |
90 the JIT or the static LLC compiler. It uses several techniques to reduce the | |
91 amount of code pushed through the LLVM code generator, to reduce the potential | |
92 scope of the problem. After it is finished, it emits two bitcode files (called | |
93 "test" [to be compiled with the code generator] and "safe" [to be compiled with | |
94 the "safe" backend], respectively), and instructions for reproducing the | |
95 problem. The code generator debugger assumes that the "safe" backend produces | |
96 good code. | |
97 | |
98 .. _miscompilation debugger: | |
99 | |
100 Miscompilation debugger | |
101 ----------------------- | |
102 | |
103 The miscompilation debugger works similarly to the code generator debugger. It | |
104 works by splitting the test program into two pieces, running the optimizations | |
105 specified on one piece, linking the two pieces back together, and then executing | |
106 the result. It attempts to narrow down the list of passes to the one (or few) | |
107 which are causing the miscompilation, then reduce the portion of the test | |
108 program which is being miscompiled. The miscompilation debugger assumes that | |
109 the selected code generator is working properly. | |
110 | |
111 Advice for using bugpoint | |
112 ========================= | |
113 | |
114 ``bugpoint`` can be a remarkably useful tool, but it sometimes works in | |
115 non-obvious ways. Here are some hints and tips: | |
116 | |
117 * In the code generator and miscompilation debuggers, ``bugpoint`` only works | |
118 with programs that have deterministic output. Thus, if the program outputs | |
119 ``argv[0]``, the date, time, or any other "random" data, ``bugpoint`` may | |
120 misinterpret differences in these data, when output, as the result of a | |
121 miscompilation. Programs should be temporarily modified to disable outputs | |
122 that are likely to vary from run to run. | |
123 | |
124 * In the code generator and miscompilation debuggers, debugging will go faster | |
125 if you manually modify the program or its inputs to reduce the runtime, but | |
126 still exhibit the problem. | |
127 | |
128 * ``bugpoint`` is extremely useful when working on a new optimization: it helps | |
129 track down regressions quickly. To avoid having to relink ``bugpoint`` every | |
130 time you change your optimization however, have ``bugpoint`` dynamically load | |
131 your optimization with the ``-load`` option. | |
132 | |
133 * ``bugpoint`` can generate a lot of output and run for a long period of time. | |
134 It is often useful to capture the output of the program to file. For example, | |
135 in the C shell, you can run: | |
136 | |
137 .. code-block:: console | |
138 | |
139 $ bugpoint ... |& tee bugpoint.log | |
140 | |
141 to get a copy of ``bugpoint``'s output in the file ``bugpoint.log``, as well | |
142 as on your terminal. | |
143 | |
144 * ``bugpoint`` cannot debug problems with the LLVM linker. If ``bugpoint`` | |
145 crashes before you see its "All input ok" message, you might try ``llvm-link | |
146 -v`` on the same set of input files. If that also crashes, you may be | |
147 experiencing a linker bug. | |
148 | |
149 * ``bugpoint`` is useful for proactively finding bugs in LLVM. Invoking | |
150 ``bugpoint`` with the ``-find-bugs`` option will cause the list of specified | |
151 optimizations to be randomized and applied to the program. This process will | |
152 repeat until a bug is found or the user kills ``bugpoint``. | |
153 | |
154 What to do when bugpoint isn't enough | |
155 ===================================== | |
156 | |
157 Sometimes, ``bugpoint`` is not enough. In particular, InstCombine and | |
158 TargetLowering both have visitor structured code with lots of potential | |
159 transformations. If the process of using bugpoint has left you with still too | |
160 much code to figure out and the problem seems to be in instcombine, the | |
161 following steps may help. These same techniques are useful with TargetLowering | |
162 as well. | |
163 | |
164 Turn on ``-debug-only=instcombine`` and see which transformations within | |
165 instcombine are firing by selecting out lines with "``IC``" in them. | |
166 | |
167 At this point, you have a decision to make. Is the number of transformations | |
168 small enough to step through them using a debugger? If so, then try that. | |
169 | |
170 If there are too many transformations, then a source modification approach may | |
171 be helpful. In this approach, you can modify the source code of instcombine to | |
172 disable just those transformations that are being performed on your test input | |
173 and perform a binary search over the set of transformations. One set of places | |
174 to modify are the "``visit*``" methods of ``InstCombiner`` (*e.g.* | |
175 ``visitICmpInst``) by adding a "``return false``" as the first line of the | |
176 method. | |
177 | |
178 If that still doesn't remove enough, then change the caller of | |
179 ``InstCombiner::DoOneIteration``, ``InstCombiner::runOnFunction`` to limit the | |
180 number of iterations. | |
181 | |
182 You may also find it useful to use "``-stats``" now to see what parts of | |
183 instcombine are firing. This can guide where to put additional reporting code. | |
184 | |
185 At this point, if the amount of transformations is still too large, then | |
186 inserting code to limit whether or not to execute the body of the code in the | |
187 visit function can be helpful. Add a static counter which is incremented on | |
188 every invocation of the function. Then add code which simply returns false on | |
189 desired ranges. For example: | |
190 | |
191 .. code-block:: c++ | |
192 | |
193 | |
194 static int calledCount = 0; | |
195 calledCount++; | |
196 DEBUG(if (calledCount < 212) return false); | |
197 DEBUG(if (calledCount > 217) return false); | |
198 DEBUG(if (calledCount == 213) return false); | |
199 DEBUG(if (calledCount == 214) return false); | |
200 DEBUG(if (calledCount == 215) return false); | |
201 DEBUG(if (calledCount == 216) return false); | |
202 DEBUG(dbgs() << "visitXOR calledCount: " << calledCount << "\n"); | |
203 DEBUG(dbgs() << "I: "; I->dump()); | |
204 | |
205 could be added to ``visitXOR`` to limit ``visitXor`` to being applied only to | |
206 calls 212 and 217. This is from an actual test case and raises an important | |
207 point---a simple binary search may not be sufficient, as transformations that | |
208 interact may require isolating more than one call. In TargetLowering, use | |
209 ``return SDNode();`` instead of ``return false;``. | |
210 | |
211 Now that that the number of transformations is down to a manageable number, try | |
212 examining the output to see if you can figure out which transformations are | |
213 being done. If that can be figured out, then do the usual debugging. If which | |
214 code corresponds to the transformation being performed isn't obvious, set a | |
215 breakpoint after the call count based disabling and step through the code. | |
216 Alternatively, you can use "``printf``" style debugging to report waypoints. |