comparison docs/ExceptionHandling.rst @ 0:95c75e76d11b LLVM3.4
LLVM 3.4
author | Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp> |
---|---|
date | Thu, 12 Dec 2013 13:56:28 +0900 |
parents | |
children | 54457678186b |
-1:000000000000 | 0:95c75e76d11b |
---|---|
1 ========================== | |
2 Exception Handling in LLVM | |
3 ========================== | |
4 | |
5 .. contents:: | |
6 :local: | |
7 | |
8 Introduction | |
9 ============ | |
10 | |
11 This document is the central repository for all information pertaining to | |
12 exception handling in LLVM. It describes the format that LLVM exception | |
13 handling information takes, which is useful for those interested in creating | |
14 front-ends or dealing directly with the information. Further, this document | |
15 provides specific examples of what exception handling information is used for in | |
16 C and C++. | |
17 | |
18 Itanium ABI Zero-cost Exception Handling | |
19 ---------------------------------------- | |
20 | |
21 Exception handling for most programming languages is designed to recover from | |
22 conditions that rarely occur during general use of an application. To that end, | |
23 exception handling should not interfere with the main flow of an application's | |
24 algorithm by performing checkpointing tasks, such as saving the current pc or | |
25 register state. | |
26 | |
27 The Itanium ABI Exception Handling Specification defines a methodology for | |
28 providing outlying data in the form of exception tables without inlining | |
29 speculative exception handling code in the flow of an application's main | |
30 algorithm. Thus, the scheme is said to be "zero-cost": it adds no overhead to | |
31 the normal execution of an application. | |
32 | |
33 A more complete description of the Itanium ABI exception handling runtime | |
34 support can be found at `Itanium C++ ABI: Exception Handling | |
35 <http://mentorembedded.github.com/cxx-abi/abi-eh.html>`_. A description of the | |
36 exception frame format can be found at `Exception Frames | |
37 <http://refspecs.linuxfoundation.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html>`_, | |
38 with details of the DWARF 4 specification at `DWARF 4 Standard | |
39 <http://dwarfstd.org/Dwarf4Std.php>`_. A description for the C++ exception | |
40 table formats can be found at `Exception Handling Tables | |
41 <http://mentorembedded.github.com/cxx-abi/exceptions.pdf>`_. | |
42 | |
43 Setjmp/Longjmp Exception Handling | |
44 --------------------------------- | |
45 | |
46 Setjmp/Longjmp (SJLJ) based exception handling uses LLVM intrinsics | |
47 `llvm.eh.sjlj.setjmp`_ and `llvm.eh.sjlj.longjmp`_ to handle control flow for | |
48 exception handling. | |
49 | |
50 Each function that does exception processing, be it ``try``/``catch`` blocks | |
51 or cleanups, registers itself on a global frame list. When an exception is | |
52 unwinding, the runtime uses this list to identify which functions need | |
53 processing. | |
54 | |
55 Landing pad selection is encoded in the call site entry of the function | |
56 context. The runtime returns to the function via `llvm.eh.sjlj.longjmp`_, where | |
57 a switch table transfers control to the appropriate landing pad based on the | |
58 index stored in the function context. | |
59 | |
60 In contrast to DWARF exception handling, which encodes exception regions and | |
61 frame information in out-of-line tables, SJLJ exception handling builds and | |
62 removes the unwind frame context at runtime. This results in faster exception | |
63 handling at the expense of slower execution when no exceptions are thrown. As | |
64 exceptions are, by their nature, intended for uncommon code paths, DWARF | |
65 exception handling is generally preferred to SJLJ. | |
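
The registration and dispatch scheme described above can be modeled in plain C++. The following is only a sketch under stated assumptions: it uses the standard ``setjmp``/``longjmp`` as a stand-in for the ``llvm.eh.sjlj.*`` intrinsics, and ``FunctionContext``, ``g_frame_list``, ``sjlj_throw``, ``may_throw`` and ``guarded`` are hypothetical names that do not reflect the layout used by any real runtime.

```cpp
#include <csetjmp>
#include <cstdlib>

// Hypothetical per-function context, linked into a global frame list.
struct FunctionContext {
    FunctionContext* prev;         // previously registered frame
    volatile int call_site;        // index of the call currently in progress
    volatile int exception_value;  // payload stored by the "runtime"
    std::jmp_buf jbuf;             // saved context to return to
};

static FunctionContext* g_frame_list = nullptr;

// Stand-in for the runtime: jump back to the innermost registered frame.
[[noreturn]] void sjlj_throw(int value) {
    FunctionContext* ctx = g_frame_list;
    if (ctx == nullptr) std::abort();  // nothing registered: terminate
    ctx->exception_value = value;
    std::longjmp(ctx->jbuf, 1);
}

void may_throw(bool do_throw) {
    if (do_throw) sjlj_throw(7);
}

int guarded(bool throw_first, bool throw_second) {
    FunctionContext ctx;
    ctx.prev = g_frame_list;
    ctx.call_site = 0;
    ctx.exception_value = 0;
    g_frame_list = &ctx;               // register on the global frame list

    int result = 0;
    if (setjmp(ctx.jbuf) == 0) {
        ctx.call_site = 1;             // record the call site before each call
        may_throw(throw_first);
        ctx.call_site = 2;
        may_throw(throw_second);
    } else {
        // Re-entered via longjmp: a switch on the stored call site index
        // selects the "landing pad".
        switch (ctx.call_site) {
            case 1:  result = 100 + ctx.exception_value; break;
            case 2:  result = 200 + ctx.exception_value; break;
            default: result = -1; break;
        }
    }

    g_frame_list = ctx.prev;           // unregister, even if nothing was thrown
    return result;
}
```

Note that ``guarded`` pays for registration and unregistration on every call, whether or not anything is thrown. This is the runtime cost that the out-of-line DWARF tables avoid.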
66 | |
67 Overview | |
68 -------- | |
69 | |
70 When an exception is thrown in LLVM code, the runtime attempts to find a | |
71 handler suited to processing it. | |
72 | |
73 The runtime first attempts to find an *exception frame* corresponding to the | |
74 function where the exception was thrown. If the programming language supports | |
75 exception handling (e.g. C++), the exception frame contains a reference to an | |
76 exception table describing how to process the exception. If the language does | |
77 not support exception handling (e.g. C), or if the exception needs to be | |
78 forwarded to a prior activation, the exception frame contains information about | |
79 how to unwind the current activation and restore the state of the prior | |
80 activation. This process is repeated until the exception is handled. If the | |
81 exception is not handled and no activations remain, then the application is | |
82 terminated with an appropriate error message. | |
83 | |
84 Because different programming languages have different behaviors when handling | |
85 exceptions, the exception handling ABI provides a mechanism for | |
86 supplying *personalities*. An exception handling personality is defined by | |
87 way of a *personality function* (e.g. ``__gxx_personality_v0`` in C++), | |
88 which receives the context of the exception, an *exception structure* | |
89 containing the exception object type and value, and a reference to the exception | |
90 table for the current function. The personality function for the current | |
91 compile unit is specified in a *common exception frame*. | |
92 | |
93 The organization of an exception table is language dependent. For C++, an | |
94 exception table is organized as a series of code ranges defining what to do if | |
95 an exception occurs in that range. Typically, the information associated with a | |
96 range defines which types of exception objects (using C++ *type info*) are | |
97 handled in that range, and an associated action that should take place. Actions | |
98 typically pass control to a *landing pad*. | |
99 | |
100 A landing pad corresponds roughly to the code found in the ``catch`` portion of | |
101 a ``try``/``catch`` sequence. When execution resumes at a landing pad, it | |
102 receives an *exception structure* and a *selector value* corresponding to the | |
103 *type* of exception thrown. The selector is then used to determine which *catch* | |
104 should actually process the exception. | |
105 | |
106 LLVM Code Generation | |
107 ==================== | |
108 | |
109 From a C++ developer's perspective, exceptions are defined in terms of the | |
110 ``throw`` and ``try``/``catch`` statements. In this section we will describe the | |
111 implementation of LLVM exception handling in terms of C++ examples. | |
112 | |
113 Throw | |
114 ----- | |
115 | |
116 Languages that support exception handling typically provide a ``throw`` | |
117 operation to initiate the exception process. Internally, a ``throw`` operation | |
118 breaks down into two steps. | |
119 | |
120 #. A request is made to allocate exception space for an exception structure. | |
121 This structure needs to survive beyond the current activation. This structure | |
122 will contain the type and value of the object being thrown. | |
123 | |
124 #. A call is made to the runtime to raise the exception, passing the exception | |
125 structure as an argument. | |
126 | |
127 In C++, the allocation of the exception structure is done by the | |
128 ``__cxa_allocate_exception`` runtime function. The exception raising is handled | |
129 by ``__cxa_throw``. The type of the exception is represented using a C++ RTTI | |
130 structure. | |
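
As an illustration, the two steps can be written out by hand in C++. This is a minimal sketch that assumes an Itanium C++ ABI platform whose ``<cxxabi.h>`` declares these entry points (as libstdc++ and libc++abi do). The function name ``throw_int_by_hand`` is hypothetical, and ordinary code should simply use ``throw``.

```cpp
#include <cxxabi.h>
#include <new>
#include <typeinfo>

// Roughly what `throw value;` expands to for an int (assumption: Itanium ABI).
int throw_int_by_hand(int value) {
    try {
        // Step 1: allocate exception space that outlives this activation.
        void* storage = abi::__cxa_allocate_exception(sizeof(int));
        new (storage) int(value);  // construct the thrown object in place

        // Step 2: raise the exception, passing the RTTI for the thrown type.
        // An int needs no destructor, so the last argument is null.
        abi::__cxa_throw(storage,
                         const_cast<std::type_info*>(&typeid(int)),
                         nullptr);
    } catch (int caught) {
        return caught;             // matched through the int type info
    }
    return -1;                     // not reached: __cxa_throw does not return
}
```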
131 | |
132 Try/Catch | |
133 --------- | |
134 | |
135 A call within the scope of a *try* statement can potentially raise an | |
136 exception. In those circumstances, the LLVM C++ front-end replaces the call with | |
137 an ``invoke`` instruction. Unlike a call, the ``invoke`` has two potential | |
138 continuation points: | |
139 | |
140 #. where to continue when the call succeeds as per normal, and | |
141 | |
142 #. where to continue if the call raises an exception, either by a throw or the | |
143 unwinding of a throw | |
144 | |
145 The place where an ``invoke`` continues after an | |
146 exception is called a *landing pad*. LLVM landing pads are conceptually | |
147 alternative function entry points where an exception structure reference and a | |
148 type info index are passed in as arguments. The landing pad saves the exception | |
149 structure reference and then proceeds to select the catch block that corresponds | |
150 to the type info of the exception object. | |
151 | |
152 The LLVM :ref:`i_landingpad` is used to convey information about the landing | |
153 pad to the back end. For C++, the ``landingpad`` instruction returns a pointer | |
154 and integer pair corresponding to the pointer to the *exception structure* and | |
155 the *selector value* respectively. | |
156 | |
157 The ``landingpad`` instruction takes a reference to the personality function to | |
158 be used for this ``try``/``catch`` sequence. The remainder of the instruction is | |
159 a list of *cleanup*, *catch*, and *filter* clauses. The exception is tested | |
160 against the clauses sequentially from first to last. The selector value is a | |
161 positive number if the exception matched a type info, a negative number if it | |
162 matched a filter, and zero if it matched a cleanup. If nothing is matched, the | |
163 behavior of the program is `undefined`_. If a type info matched, then the | |
164 selector value is the index of the type info in the exception table, which can | |
165 be obtained using the `llvm.eh.typeid.for`_ intrinsic. | |
166 | |
167 Once the landing pad has the type info selector, the code branches to the code | |
168 for the first catch. The catch then checks the value of the type info selector | |
169 against the index of type info for that catch. Since the type info index is not | |
170 known until all the type infos have been gathered in the backend, the catch code | |
171 must call the `llvm.eh.typeid.for`_ intrinsic to determine the index for a given | |
172 type info. If the catch fails to match the selector then control is passed on to | |
173 the next catch. | |
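
At the source level, this sequential test corresponds to the order in which the ``catch`` clauses are written. The sketch below uses a hypothetical function, ``which_handler``, to show that order from the C++ side: each handler is tried in turn, and the first one whose type matches takes the exception.

```cpp
#include <stdexcept>
#include <string>

// Reports which handler received the exception for a given input.
std::string which_handler(int kind) {
    try {
        switch (kind) {
            case 0:  throw 42;
            case 1:  throw std::runtime_error("boom");
            case 2:  throw std::string("text");
            default: return "no exception";
        }
    } catch (int) {
        return "int";             // first catch: tested first
    } catch (const std::exception&) {
        return "std::exception";  // also matches derived classes
    } catch (...) {
        return "catch-all";       // taken when no earlier catch matched
    }
}
```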
174 | |
175 Finally, the entry and exit of catch code is bracketed with calls to | |
176 ``__cxa_begin_catch`` and ``__cxa_end_catch``. | |
177 | |
178 * ``__cxa_begin_catch`` takes an exception structure reference as an argument | |
179 and returns the value of the exception object. | |
180 | |
181 * ``__cxa_end_catch`` takes no arguments. This function: | |
182 | |
183 #. Locates the most recently caught exception and decrements its handler | |
184 count, | |
185 | |
186 #. Removes the exception from the *caught* stack if the handler count goes to | |
187 zero, and | |
188 | |
189 #. Destroys the exception if the handler count goes to zero and the exception | |
190 was not re-thrown by throw. | |
191 | |
192 .. note:: | |
193 | |
194 A rethrow from within the catch may replace this call with a | |
195 ``__cxa_rethrow``. | |
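
The effect of this bookkeeping can be observed from C++ by counting live exception objects. The following sketch relies on hypothetical helpers (``Tracked``, ``Observed``, ``observe_rethrow``) and assumes that the compiler constructs the thrown object directly in the exception storage, as mainstream compilers do. A rethrown exception stays alive until the last handler that holds it has exited.

```cpp
// Counts how many Tracked objects currently exist.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

struct Observed {
    int in_inner;  // live objects seen inside the inner handler
    int in_outer;  // live objects seen inside the outer handler
    int after;     // live objects once both handlers have exited
};

Observed observe_rethrow() {
    Observed o{};
    try {
        try {
            throw Tracked();
        } catch (const Tracked&) {
            o.in_inner = Tracked::live;
            throw;  // rethrow: the exception object must not be destroyed
        }
    } catch (const Tracked&) {
        o.in_outer = Tracked::live;
    }               // last handler exits: the exception is destroyed here
    o.after = Tracked::live;
    return o;
}
```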
196 | |
197 Cleanups | |
198 -------- | |
199 | |
200 A cleanup is extra code which needs to be run as part of unwinding a scope. C++ | |
201 destructors are a typical example, but other languages and language extensions | |
202 provide a variety of different kinds of cleanups. In general, a landing pad may | |
203 need to run arbitrary amounts of cleanup code before actually entering a catch | |
204 block. To indicate the presence of cleanups, a :ref:`i_landingpad` should have | |
205 a *cleanup* clause. Otherwise, the unwinder will not stop at the landing pad if | |
206 there are no catches or filters that require it to. | |
207 | |
208 .. note:: | |
209 | |
210 Do not allow a new exception to propagate out of the execution of a | |
211 cleanup. This can corrupt the internal state of the unwinder. Different | |
212 languages describe different high-level semantics for these situations: for | |
213 example, C++ requires that the process be terminated, whereas Ada cancels both | |
214 exceptions and throws a third. | |
215 | |
216 When all cleanups are finished, if the exception is not handled by the current | |
217 function, resume unwinding by executing the `resume | |
218 instruction <LangRef.html#i_resume>`_, passing in the result of the | |
219 ``landingpad`` instruction for the original landing pad. | |
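
In C++, the most familiar cleanups are destructors of local objects. The sketch below uses hypothetical names (``Guard``, ``inner``, ``run_cleanups``) to record the order of events: cleanups in the unwound frame run in reverse order of construction, and only then does the handler in the caller run.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Appends a message to the log when its scope is left.
struct Guard {
    std::vector<std::string>& log;
    const char* name;
    ~Guard() { log.push_back(std::string("cleanup ") + name); }
};

void inner(std::vector<std::string>& log) {
    Guard a{log, "a"};
    Guard b{log, "b"};
    throw std::runtime_error("fail");  // unwinding must run ~b, then ~a
}

std::vector<std::string> run_cleanups() {
    std::vector<std::string> log;
    try {
        inner(log);
    } catch (const std::runtime_error&) {
        log.push_back("caught");       // runs after the cleanups in inner()
    }
    return log;
}
```

Because the exception is caught in ``run_cleanups``, the cleanups are guaranteed to run; the log holds ``cleanup b``, ``cleanup a``, ``caught``, in that order.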
220 | |
221 Throw Filters | |
222 ------------- | |
223 | |
224 C++ allows the specification of which exception types may be thrown from a | |
225 function. To represent this, a top level landing pad may exist to filter out | |
226 invalid types. To express this in LLVM code the :ref:`i_landingpad` will have a | |
227 filter clause. The clause consists of an array of type infos. | |
228 The ``landingpad`` instruction returns a negative selector value | |
229 if the exception does not match any of the type infos. In that case, | |
230 a call to ``__cxa_call_unexpected`` should be made; otherwise, unwinding | |
231 resumes via ``_Unwind_Resume``. Each of these functions requires a reference to the | |
232 exception structure. Note that the most general form of a ``landingpad`` | |
233 instruction can have any number of catch, cleanup, and filter clauses (though | |
234 having more than one cleanup is pointless). The LLVM C++ front-end can generate | |
235 such ``landingpad`` instructions due to inlining creating nested exception | |
236 handling scopes. | |
237 | |
238 .. _undefined: | |
239 | |
240 Restrictions | |
241 ------------ | |
242 | |
243 The unwinder delegates the decision of whether to stop in a call frame to that | |
244 call frame's language-specific personality function. Not all unwinders guarantee | |
245 that they will stop to perform cleanups. For example, the GNU C++ unwinder | |
246 doesn't do so unless the exception is actually caught somewhere further up the | |
247 stack. | |
248 | |
249 In order for inlining to behave correctly, landing pads must be prepared to | |
250 handle selector results that they did not originally advertise. Suppose that a | |
251 function catches exceptions of type ``A``, and it's inlined into a function that | |
252 catches exceptions of type ``B``. The inliner will update the ``landingpad`` | |
253 instruction for the inlined landing pad to include the fact that ``B`` is also | |
254 caught. If that landing pad assumes that it will only be entered to catch an | |
255 ``A``, it's in for a rude awakening. Consequently, landing pads must test for | |
256 the selector results they understand and then resume exception propagation with | |
257 the `resume instruction <LangRef.html#i_resume>`_ if none of the conditions | |
258 match. | |
259 | |
260 Exception Handling Intrinsics | |
261 ============================= | |
262 | |
263 In addition to the ``landingpad`` and ``resume`` instructions, LLVM uses several | |
264 intrinsic functions (name prefixed with ``llvm.eh``) to provide exception | |
265 handling information at various points in generated code. | |
266 | |
267 .. _llvm.eh.typeid.for: | |
268 | |
269 ``llvm.eh.typeid.for`` | |
270 ---------------------- | |
271 | |
272 .. code-block:: llvm | |
273 | |
274 i32 @llvm.eh.typeid.for(i8* %type_info) | |
275 | |
276 | |
277 This intrinsic returns the type info index in the exception table of the current | |
278 function. This value can be used to compare against the result of | |
279 the ``landingpad`` instruction. The single argument is a reference to a type info. | |
280 | |
281 .. _llvm.eh.sjlj.setjmp: | |
282 | |
283 ``llvm.eh.sjlj.setjmp`` | |
284 ----------------------- | |
285 | |
286 .. code-block:: llvm | |
287 | |
288 i32 @llvm.eh.sjlj.setjmp(i8* %setjmp_buf) | |
289 | |
290 For SJLJ based exception handling, this intrinsic forces register saving for the | |
291 current function and stores the address of the following instruction for use as | |
292 a destination address by `llvm.eh.sjlj.longjmp`_. The buffer format and the | |
293 overall functioning of this intrinsic are compatible with the GCC | |
294 ``__builtin_setjmp`` implementation, allowing code built with clang and GCC | |
295 to interoperate. | |
296 | |
297 The single parameter is a pointer to a five word buffer in which the calling | |
298 context is saved. The front end places the frame pointer in the first word, and | |
299 the target implementation of this intrinsic should place the destination address | |
300 for a `llvm.eh.sjlj.longjmp`_ in the second word. The following three words are | |
301 available for use in a target-specific manner. | |
302 | |
303 .. _llvm.eh.sjlj.longjmp: | |
304 | |
305 ``llvm.eh.sjlj.longjmp`` | |
306 ------------------------ | |
307 | |
308 .. code-block:: llvm | |
309 | |
310 void @llvm.eh.sjlj.longjmp(i8* %setjmp_buf) | |
311 | |
312 For SJLJ based exception handling, the ``llvm.eh.sjlj.longjmp`` intrinsic is | |
313 used to implement ``__builtin_longjmp()``. The single parameter is a pointer to | |
314 a buffer populated by `llvm.eh.sjlj.setjmp`_. The frame pointer and stack | |
315 pointer are restored from the buffer, then control is transferred to the | |
316 destination address. | |
317 | |
318 ``llvm.eh.sjlj.lsda`` | |
319 --------------------- | |
320 | |
321 .. code-block:: llvm | |
322 | |
323 i8* @llvm.eh.sjlj.lsda() | |
324 | |
325 For SJLJ based exception handling, the ``llvm.eh.sjlj.lsda`` intrinsic returns | |
326 the address of the Language Specific Data Area (LSDA) for the current | |
327 function. The SJLJ front-end code stores this address in the exception handling | |
328 function context for use by the runtime. | |
329 | |
330 ``llvm.eh.sjlj.callsite`` | |
331 ------------------------- | |
332 | |
333 .. code-block:: llvm | |
334 | |
335 void @llvm.eh.sjlj.callsite(i32 %call_site_num) | |
336 | |
337 For SJLJ based exception handling, the ``llvm.eh.sjlj.callsite`` intrinsic | |
338 identifies the callsite value associated with the following ``invoke`` | |
339 instruction. This is used to ensure that landing pad entries in the LSDA are | |
340 generated in matching order. | |
341 | |
342 Asm Table Formats | |
343 ================= | |
344 | |
345 There are two tables that are used by the exception handling runtime to | |
346 determine which actions should be taken when an exception is thrown. | |
347 | |
348 Exception Handling Frame | |
349 ------------------------ | |
350 | |
351 An exception handling frame ``eh_frame`` is very similar to the unwind frame | |
352 used by DWARF debug info. The frame contains all the information necessary to | |
353 tear down the current frame and restore the state of the prior frame. There is | |
354 an exception handling frame for each function in a compile unit, plus a common | |
355 exception handling frame that defines information common to all functions in the | |
356 unit. | |
357 | |
358 Exception Tables | |
359 ---------------- | |
360 | |
361 An exception table contains information about what actions to take when an | |
362 exception is thrown in a particular part of a function's code. There is one | |
363 exception table per function, except for leaf functions and functions that | |
364 call only non-throwing functions; these do not need an exception table. | |