comparison docs/FAQ.rst @ 0:95c75e76d11b LLVM3.4
LLVM 3.4
author | Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp> |
---|---|
date | Thu, 12 Dec 2013 13:56:28 +0900 |
parents | |
children | 54457678186b |
-1:000000000000 | 0:95c75e76d11b |
---|---|
1 ================================ | |
2 Frequently Asked Questions (FAQ) | |
3 ================================ | |
4 | |
5 .. contents:: | |
6 :local: | |
7 | |
8 | |
9 License | |
10 ======= | |
11 | |
12 Does the University of Illinois Open Source License really qualify as an "open source" license? | |
13 ----------------------------------------------------------------------------------------------- | |
14 Yes, the license is `certified | |
15 <http://www.opensource.org/licenses/UoI-NCSA.php>`_ by the Open Source | |
16 Initiative (OSI). | |
17 | |
18 | |
19 Can I modify LLVM source code and redistribute the modified source? | |
20 ------------------------------------------------------------------- | |
21 Yes. The modified source distribution must retain the copyright notice and | |
22 follow the three bulleted conditions listed in the `LLVM license | |
23 <http://llvm.org/svn/llvm-project/llvm/trunk/LICENSE.TXT>`_. | |
24 | |
25 | |
26 Can I modify the LLVM source code and redistribute binaries or other tools based on it, without redistributing the source? | |
27 -------------------------------------------------------------------------------------------------------------------------- | |
28 Yes. This is why we distribute LLVM under a less restrictive license than GPL, | |
29 as explained in the first question above. | |
30 | |
31 | |
32 Source Code | |
33 =========== | |
34 | |
35 In what language is LLVM written? | |
36 --------------------------------- | |
37 All of the LLVM tools and libraries are written in C++ with extensive use of | |
38 the STL. | |
39 | |
40 | |
41 How portable is the LLVM source code? | |
42 ------------------------------------- | |
43 The LLVM source code should be portable to most modern Unix-like operating | |
44 systems. Most of the code is written in standard C++ with operating system | |
45 services abstracted to a support library. The tools required to build and | |
46 test LLVM have been ported to a plethora of platforms. | |
47 | |
48 Some porting problems may exist in the following areas: | |
49 | |
50 * The autoconf/makefile build system relies heavily on UNIX shell tools, | |
51 like the Bourne Shell and sed. Porting to systems without these tools | |
52 (MacOS 9, Plan 9) will require more effort. | |
53 | |
54 What API do I use to store a value to one of the virtual registers in LLVM IR's SSA representation? | |
55 --------------------------------------------------------------------------------------------------- | |
56 | |
57 In short: you can't. It's actually kind of a silly question once you grok | |
58 what's going on. Basically, in code like: | |
59 | |
60 .. code-block:: llvm | |
61 | |
62 %result = add i32 %foo, %bar | |
63 | |
64 , ``%result`` is just a name given to the ``Value`` of the ``add`` | |
65 instruction. In other words, ``%result`` *is* the add instruction. The | |
66 "assignment" doesn't explicitly "store" anything to any "virtual register"; | |
67 the "``=``" is more like the mathematical sense of equality. | |
68 | |
69 Longer explanation: In order to generate a textual representation of the | |
70 IR, some kind of name has to be given to each instruction so that other | |
71 instructions can textually reference it. However, the isomorphic in-memory | |
72 representation that you manipulate from C++ has no such restriction since | |
73 instructions can simply keep pointers to any other ``Value``'s that they | |
74 reference. In fact, the names of dummy numbered temporaries like ``%1`` are | |
75 not explicitly represented in the in-memory representation at all (see | |
76 ``Value::getName()``). | |
77 | |
78 Build Problems | |
79 ============== | |
80 | |
81 When I run configure, it finds the wrong C compiler. | |
82 ---------------------------------------------------- | |
83 The ``configure`` script attempts to locate first ``gcc`` and then ``cc``, | |
84 unless it finds compiler paths set in ``CC`` and ``CXX`` for the C and C++ | |
85 compiler, respectively. | |
86 | |
87 If ``configure`` finds the wrong compiler, either adjust your ``PATH`` | |
88 environment variable or set ``CC`` and ``CXX`` explicitly. | |
89 | |
90 | |
91 The ``configure`` script finds the right C compiler, but it uses the LLVM tools from a previous build. What do I do? | |
92 --------------------------------------------------------------------------------------------------------------------- | |
93 The ``configure`` script uses the ``PATH`` to find executables, so if it's | |
94 grabbing the wrong linker/assembler/etc, there are two ways to fix it: | |
95 | |
96 #. Adjust your ``PATH`` environment variable so that the correct program | |
97 appears first in the ``PATH``. This may work, but may not be convenient | |
98 when you want them *first* in your path for other work. | |
99 | |
100 #. Run ``configure`` with an alternative ``PATH`` that is correct. In a | |
101 Bourne compatible shell, the syntax would be: | |
102 | |
103 .. code-block:: console | |
104 | |
105 % PATH=[the path without the bad program] ./configure ... | |
106 | |
107 This is still somewhat inconvenient, but it allows ``configure`` to do its | |
108 work without having to adjust your ``PATH`` permanently. | |
109 | |
110 | |
111 When creating a dynamic library, I get a strange GLIBC error. | |
112 ------------------------------------------------------------- | |
113 Under some operating systems (e.g., Linux), libtool does not work correctly if | |
114 GCC was compiled with the ``--disable-shared`` option. To work around this, | |
115 install your own version of GCC that has shared libraries enabled by default. | |
116 | |
117 | |
118 I've updated my source tree from Subversion, and now my build is trying to use a file/directory that doesn't exist. | |
119 ------------------------------------------------------------------------------------------------------------------- | |
120 You need to re-run configure in your object directory. When new Makefiles | |
121 are added to the source tree, they have to be copied over to the object tree | |
122 in order to be used by the build. | |
123 | |
124 | |
125 I've modified a Makefile in my source tree, but my build tree keeps using the old version. What do I do? | |
126 --------------------------------------------------------------------------------------------------------- | |
127 If the Makefile already exists in your object tree, you can just run the | |
128 following command in the top level directory of your object tree: | |
129 | |
130 .. code-block:: console | |
131 | |
132 % ./config.status <relative path to Makefile> | |
133 | |
134 If the Makefile is new, you will have to modify the configure script to copy | |
135 it over. | |
136 | |
137 | |
138 I've upgraded to a new version of LLVM, and I get strange build errors. | |
139 ----------------------------------------------------------------------- | |
140 Sometimes, changes to the LLVM source code alter how the build system works. | |
141 Changes in ``libtool``, ``autoconf``, or header file dependencies are | |
142 especially prone to this sort of problem. | |
143 | |
144 The best thing to try is to remove the old files and re-build. In most cases, | |
145 this takes care of the problem. To do this, just type ``make clean`` and then | |
146 ``make`` in the directory that fails to build. | |
147 | |
148 | |
149 I've built LLVM and am testing it, but the tests freeze. | |
150 -------------------------------------------------------- | |
151 This is most likely occurring because you built a profile or release | |
152 (optimized) build of LLVM and have not specified the same information on the | |
153 ``gmake`` command line. | |
154 | |
155 For example, if you built LLVM with the command: | |
156 | |
157 .. code-block:: console | |
158 | |
159 % gmake ENABLE_PROFILING=1 | |
160 | |
161 ...then you must run the tests with the following commands: | |
162 | |
163 .. code-block:: console | |
164 | |
165 % cd llvm/test | |
166 % gmake ENABLE_PROFILING=1 | |
167 | |
168 Why do test results differ when I perform different types of builds? | |
169 -------------------------------------------------------------------- | |
170 The LLVM test suite is dependent upon several features of the LLVM tools and | |
171 libraries. | |
172 | |
173 First, the debugging assertions in code are not enabled in optimized or | |
174 profiling builds. Hence, tests that used to fail may pass. | |
175 | |
176 Second, some tests may rely upon debugging options or behavior that is only | |
177 available in the debug build. These tests will fail in an optimized or | |
178 profile build. | |
179 | |
180 | |
181 Compiling LLVM with GCC 3.3.2 fails, what should I do? | |
182 ------------------------------------------------------ | |
183 This is `a bug in GCC <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13392>`_, | |
184 and affects projects other than LLVM. Try upgrading or downgrading your GCC. | |
185 | |
186 | |
187 Compiling LLVM with GCC succeeds, but the resulting tools do not work, what can be wrong? | |
188 ----------------------------------------------------------------------------------------- | |
189 Several versions of GCC are known to miscompile the LLVM | |
190 codebase. Please consult your compiler version (``gcc --version``) to find | |
191 out whether it is `broken <GettingStarted.html#brokengcc>`_. If so, your only | |
192 option is to upgrade GCC to a known good version. | |
193 | |
194 | |
195 After Subversion update, rebuilding gives the error "No rule to make target". | |
196 ----------------------------------------------------------------------------- | |
197 The error is of the form: | |
198 | |
199 .. code-block:: console | |
200 | |
201 gmake[2]: *** No rule to make target `/path/to/somefile', | |
202 needed by `/path/to/another/file.d'. | |
203 Stop. | |
204 | |
205 This may occur anytime files are moved within the Subversion repository or | |
206 removed entirely. In this case, the best solution is to erase all ``.d`` | |
207 files, which list dependencies for source files, and rebuild: | |
208 | |
209 .. code-block:: console | |
210 | |
211 % cd $LLVM_OBJ_DIR | |
212 % rm -f `find . -name \*\.d` | |
213 % gmake | |
214 | |
215 In other cases, it may be necessary to run ``make clean`` before rebuilding. | |
216 | |
217 | |
218 Source Languages | |
219 ================ | |
220 | |
221 What source languages are supported? | |
222 ------------------------------------ | |
223 LLVM currently has full support for C and C++ source languages. These are | |
224 available through both `Clang <http://clang.llvm.org/>`_ and `DragonEgg | |
225 <http://dragonegg.llvm.org/>`_. | |
226 | |
227 The PyPy developers are working on integrating LLVM into the PyPy backend so | |
228 that the PyPy language can translate to LLVM. | |
229 | |
230 | |
231 I'd like to write a self-hosting LLVM compiler. How should I interface with the LLVM middle-end optimizers and back-end code generators? | |
232 ---------------------------------------------------------------------------------------------------------------------------------------- | |
233 Your compiler front-end will communicate with LLVM by creating a module in the | |
234 LLVM intermediate representation (IR) format. Assuming you want to write your | |
235 language's compiler in the language itself (rather than C++), there are 3 | |
236 major ways to tackle generating LLVM IR from a front-end: | |
237 | |
238 1. **Call into the LLVM library code using your language's FFI (foreign | |
239 function interface).** | |
240 | |
241 * *for:* best tracks changes to the LLVM IR, .ll syntax, and .bc format | |
242 | |
243 * *for:* enables running LLVM optimization passes without an emit/parse | |
244 overhead | |
245 | |
246 * *for:* adapts well to a JIT context | |
247 | |
248 * *against:* lots of ugly glue code to write | |
249 | |
250 2. **Emit LLVM assembly from your compiler's native language.** | |
251 | |
252 * *for:* very straightforward to get started | |
253 | |
254 * *against:* the .ll parser is slower than the bitcode reader when | |
255 interfacing to the middle end | |
256 | |
257 * *against:* it may be harder to track changes to the IR | |
258 | |
259 3. **Emit LLVM bitcode from your compiler's native language.** | |
260 | |
261 * *for:* can use the more-efficient bitcode reader when interfacing to the | |
262 middle end | |
263 | |
264 * *against:* you'll have to re-engineer the LLVM IR object model and bitcode | |
265 writer in your language | |
266 | |
267 * *against:* it may be harder to track changes to the IR | |
268 | |
269 If you go with the first option, the C bindings in include/llvm-c should help | |
270 a lot, since most languages have strong support for interfacing with C. The | |
271 most common hurdle with calling C from managed code is interfacing with the | |
272 garbage collector. The C interface was designed to require very little memory | |
273 management, and so is straightforward in this regard. | |
274 | |
275 What support is there for higher-level source language constructs for building a compiler? | |
276 ------------------------------------------------------------------------------------------ | |
277 Currently, there isn't much. LLVM supports an intermediate representation | |
278 which is useful for code representation but will not support the high level | |
279 (abstract syntax tree) representation needed by most compilers. There are no | |
280 facilities for lexical or semantic analysis. | |
281 | |
282 | |
283 I don't understand the ``GetElementPtr`` instruction. Help! | |
284 ----------------------------------------------------------- | |
285 See `The Often Misunderstood GEP Instruction <GetElementPtr.html>`_. | |
286 | |
287 | |
288 Using the C and C++ Front Ends | |
289 ============================== | |
290 | |
291 Can I compile C or C++ code to platform-independent LLVM bitcode? | |
292 ----------------------------------------------------------------- | |
293 No. C and C++ are inherently platform-dependent languages. The most obvious | |
294 example of this is the preprocessor. A very common way that C code is made | |
295 portable is by using the preprocessor to include platform-specific code. In | |
296 practice, information about other platforms is lost after preprocessing, so | |
297 the result is inherently dependent on the platform that the preprocessing was | |
298 targeting. | |
299 | |
300 Another example is ``sizeof``. It's common for ``sizeof(long)`` to vary | |
301 between platforms. In most C front-ends, ``sizeof`` is expanded to a | |
302 constant immediately, thus hard-wiring a platform-specific detail. | |
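
As a minimal sketch of the ``sizeof`` point (the helper name ``long_bits`` is hypothetical and not part of LLVM), the C function below has its ``sizeof`` expression folded to a constant for whichever platform the front-end targets, so the emitted IR would typically contain that constant and no portable size query:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: the width of `long` in bits
 * on the platform this file is compiled for. A C front-end evaluates
 * sizeof(long) at compile time, so the value is fixed per target. */
size_t long_bits(void) {
    return sizeof(long) * CHAR_BIT;
}
```

The result depends on the target's data model; for example, targets with a 32-bit ``long`` and targets with a 64-bit ``long`` would produce different constants from the same source.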
303 | |
304 Also, since many platforms define their ABIs in terms of C, and since LLVM is | |
305 lower-level than C, front-ends currently must emit platform-specific IR in | |
306 order to have the result conform to the platform ABI. | |
307 | |
308 | |
309 Questions about code generated by the demo page | |
310 =============================================== | |
311 | |
312 What is this ``llvm.global_ctors`` and ``_GLOBAL__I_a...`` stuff that happens when I ``#include <iostream>``? | |
313 ------------------------------------------------------------------------------------------------------------- | |
314 If you ``#include`` the ``<iostream>`` header into a C++ translation unit, | |
315 the file will probably use the ``std::cin``/``std::cout``/... global objects. | |
316 However, C++ does not guarantee an order of initialization between static | |
317 objects in different translation units, so if a static ctor/dtor in your .cpp | |
318 file used ``std::cout``, for example, the object would not necessarily be | |
319 automatically initialized before your use. | |
320 | |
321 To make ``std::cout`` and friends work correctly in these scenarios, the STL | |
322 that we use declares a static object that gets created in every translation | |
323 unit that includes ``<iostream>``. This object has a static constructor | |
324 and destructor that initialize and destroy the global iostream objects | |
325 before they could possibly be used in the file. The code that you see in the | |
326 ``.ll`` file corresponds to the constructor and destructor registration code. | |
327 | |
328 If you would like to make it easier to *understand* the LLVM code generated | |
329 by the compiler in the demo page, consider using ``printf()`` instead of | |
330 ``iostream``\s to print values. | |
331 | |
332 | |
333 Where did all of my code go?? | |
334 ----------------------------- | |
335 If you are using the LLVM demo page, you may often wonder what happened to | |
336 all of the code that you typed in. Remember that the demo script is running | |
337 the code through the LLVM optimizers, so if your code doesn't actually do | |
338 anything useful, it might all be deleted. | |
339 | |
340 To prevent this, make sure that the code is actually needed. For example, if | |
341 you are computing some expression, return the value from the function instead | |
342 of leaving it in a local variable. If you really want to constrain the | |
343 optimizer, you can read from and assign to ``volatile`` global variables. | |
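
The two approaches above can be sketched as follows. The names ``sink``, ``sum_to`` and ``sum_to_sink`` are hypothetical and chosen only for illustration:

```c
/* Hypothetical global. Accesses to a volatile object count as observable
 * behavior, so the optimizer must keep every read and write of it. */
volatile int sink;

/* The result is returned to the caller, so the computation is needed
 * and cannot simply be deleted as dead code. */
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i)
        total += i;
    return total;
}

/* The result is stored to a volatile global, so the store (and the
 * value feeding it) has to be preserved. */
void sum_to_sink(int n) {
    sink = sum_to(n);
}
```

Note that the optimizer may still simplify how the value is computed (for instance, by replacing the loop with a closed-form expression); it just cannot discard the result.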
344 | |
345 | |
346 What is this "``undef``" thing that shows up in my code? | |
347 -------------------------------------------------------- | |
348 ``undef`` is the LLVM way of representing a value that is not defined. You | |
349 can get these if you do not initialize a variable before you use it. For | |
350 example, the C function: | |
351 | |
352 .. code-block:: c | |
353 | |
354 int X() { int i; return i; } | |
355 | |
356 is compiled to "``ret i32 undef``" because "``i``" never has a value specified | |
357 for it. | |
358 | |
359 | |
360 Why does instcombine + simplifycfg turn a call to a function with a mismatched calling convention into "unreachable"? Why not make the verifier reject it? | |
361 ---------------------------------------------------------------------------------------------------------------------------------------------------------- | |
362 This is a common problem run into by authors of front-ends that are using | |
363 custom calling conventions: you need to make sure to set the right calling | |
364 convention on both the function and on each call to the function. For | |
365 example, this code: | |
366 | |
367 .. code-block:: llvm | |
368 | |
369 define fastcc void @foo() { | |
370 ret void | |
371 } | |
372 define void @bar() { | |
373 call void @foo() | |
374 ret void | |
375 } | |
376 | |
377 is optimized to: | |
378 | |
379 .. code-block:: llvm | |
380 | |
381 define fastcc void @foo() { | |
382 ret void | |
383 } | |
384 define void @bar() { | |
385 unreachable | |
386 } | |
387 | |
388 ... with "``opt -instcombine -simplifycfg``". This often bites people because | |
389 "all their code disappears". Setting the calling convention on the caller and | |
390 callee is required for indirect calls to work, so people often ask why not | |
391 make the verifier reject this sort of thing. | |
392 | |
393 The answer is that this code has undefined behavior, but it is not illegal. | |
394 If we made it illegal, then every transformation that could potentially create | |
395 this would have to ensure that it doesn't, and there is valid code that can | |
396 create this sort of construct (in dead code). The sorts of things that can | |
397 cause this to happen are fairly contrived, but we still need to accept them. | |
398 Here's an example: | |
399 | |
400 .. code-block:: llvm | |
401 | |
402 define fastcc void @foo() { | |
403 ret void | |
404 } | |
405 define internal void @bar(void()* %FP, i1 %cond) { | |
406 br i1 %cond, label %T, label %F | |
407 T: | |
408 call void %FP() | |
409 ret void | |
410 F: | |
411 call fastcc void %FP() | |
412 ret void | |
413 } | |
414 define void @test() { | |
415 %X = or i1 false, false | |
416 call void @bar(void()* @foo, i1 %X) | |
417 ret void | |
418 } | |
419 | |
420 In this example, "test" always passes ``@foo``/``false`` into ``bar``, which | |
421 ensures that it is dynamically called with the right calling convention | |
422 (thus, the code is perfectly well defined). If you run this through the | |
423 inliner, you get this (the explicit "or" is there so that the inliner | |
424 doesn't dead code eliminate a bunch of stuff): | |
425 | |
426 .. code-block:: llvm | |
427 | |
428 define fastcc void @foo() { | |
429 ret void | |
430 } | |
431 define void @test() { | |
432 %X = or i1 false, false | |
433 br i1 %X, label %T.i, label %F.i | |
434 T.i: | |
435 call void @foo() | |
436 br label %bar.exit | |
437 F.i: | |
438 call fastcc void @foo() | |
439 br label %bar.exit | |
440 bar.exit: | |
441 ret void | |
442 } | |
443 | |
444 Here you can see that the inlining pass made an undefined call to ``@foo`` | |
445 with the wrong calling convention. We really don't want to make the inliner | |
446 have to know about this sort of thing, so it needs to be valid code. In this | |
447 case, dead code elimination can trivially remove the undefined code. However, | |
448 if ``%X`` were an input argument to ``@test``, the inliner would produce this: | |
449 | |
450 .. code-block:: llvm | |
451 | |
452 define fastcc void @foo() { | |
453 ret void | |
454 } | |
455 | |
456 define void @test(i1 %X) { | |
457 br i1 %X, label %T.i, label %F.i | |
458 T.i: | |
459 call void @foo() | |
460 br label %bar.exit | |
461 F.i: | |
462 call fastcc void @foo() | |
463 br label %bar.exit | |
464 bar.exit: | |
465 ret void | |
466 } | |
467 | |
468 The interesting thing about this is that ``%X`` *must* be false for the | |
469 code to be well-defined, but no amount of dead code elimination will be able | |
470 to delete the broken call as unreachable. However, since | |
471 ``instcombine``/``simplifycfg`` turns the undefined call into unreachable, we | |
472 end up with a branch on a condition that goes to unreachable: a branch to | |
473 unreachable can never happen, so "``-inline -instcombine -simplifycfg``" is | |
474 able to produce: | |
475 | |
476 .. code-block:: llvm | |
477 | |
478 define fastcc void @foo() { | |
479 ret void | |
480 } | |
481 define void @test(i1 %X) { | |
482 F.i: | |
483 call fastcc void @foo() | |
484 ret void | |
485 } |