@c Copyright (C) 2002, 2003, 2004, 2007, 2008, 2009
@c Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@node Type Information
@chapter Memory Management and Type Information
@cindex GGC
@findex GTY

GCC uses some fairly sophisticated memory management techniques, which
involve determining information about GCC's data structures from GCC's
source code and using this information to perform garbage collection and
implement precompiled headers.

A full C parser would be too complicated for this task, so a limited
subset of C is interpreted and special markers are used to determine
what parts of the source to look at. All @code{struct} and
@code{union} declarations that define data structures that are
allocated under control of the garbage collector must be marked. All
global variables that hold pointers to garbage-collected memory must
also be marked. Finally, all global variables that need to be saved
and restored by a precompiled header must be marked. (The precompiled
header mechanism can only save static variables if they're scalar.
Complex data structures must be allocated in garbage-collected memory
to be saved in a precompiled header.)

The full format of a marker is
@smallexample
GTY (([@var{option}] [(@var{param})], [@var{option}] [(@var{param})] @dots{}))
@end smallexample
@noindent
but in most cases no options are needed. The outer double parentheses
are still necessary, though: @code{GTY(())}. Markers can appear:

@itemize @bullet
@item
In a structure definition, before the open brace;
@item
In a global variable declaration, after the keyword @code{static} or
@code{extern}; and
@item
In a structure field definition, before the name of the field.
@end itemize

Here are some examples of marking simple data structures and globals.

@smallexample
struct @var{tag} GTY(())
@{
  @var{fields}@dots{}
@};

typedef struct @var{tag} GTY(())
@{
  @var{fields}@dots{}
@} *@var{typename};

static GTY(()) struct @var{tag} *@var{list};   /* @r{points to GC memory} */
static GTY(()) int @var{counter};        /* @r{save counter in a PCH} */
@end smallexample

The parser understands simple typedefs such as
@code{typedef struct @var{tag} *@var{name};} and
@code{typedef int @var{name};}.
These don't need to be marked.

@menu
* GTY Options::         What goes inside a @code{GTY(())}.
* GGC Roots::           Making global variables GGC roots.
* Files::               How the generated files work.
* Invoking the garbage collector::   How to invoke the garbage collector.
@end menu

@node GTY Options
@section The Inside of a @code{GTY(())}

Sometimes the C code is not enough to fully describe the type
structure. Extra information can be provided with @code{GTY} options
and additional markers. Some options take a parameter, which may be
either a string or a type name, depending on the parameter. If an
option takes no parameter, it is acceptable either to omit the
parameter entirely, or to provide an empty string as a parameter. For
example, @code{@w{GTY ((skip))}} and @code{@w{GTY ((skip ("")))}} are
equivalent.

When the parameter is a string, often it is a fragment of C code. Four
special escapes may be used in these strings, to refer to pieces of
the data structure being marked:

@cindex % in GTY option
@table @code
@item %h
The current structure.
@item %1
The structure that immediately contains the current structure.
@item %0
The outermost structure that contains the current structure.
@item %a
A partial expression of the form @code{[i1][i2]@dots{}} that indexes
the array item currently being marked.
@end table

For instance, suppose that you have a structure of the form
@smallexample
struct A @{
  @dots{}
@};
struct B @{
  struct A foo[12];
@};
@end smallexample
@noindent
and @code{b} is a variable of type @code{struct B}. When marking
@samp{b.foo[11]}, @code{%h} would expand to @samp{b.foo[11]},
@code{%0} and @code{%1} would both expand to @samp{b}, and @code{%a}
would expand to @samp{[11]}.
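The expansion described above can be pictured as plain textual
substitution. The following is only a sketch under that assumption:
@code{expand_escapes} is a hypothetical helper written for this
illustration, not a routine from @file{gengtype.c}, and the real
generator may well work differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of gengtype: copy TMPL into OUT
   (capacity OUTSZ), replacing %h, %1, %0 and %a by the given strings.
   Returns 0 on success, -1 if the result does not fit.  Unknown
   escapes are copied through unchanged.  */
int
expand_escapes (const char *tmpl, const char *h, const char *one,
                const char *zero, const char *a, char *out, size_t outsz)
{
  size_t used = 0;

  if (outsz == 0)
    return -1;

  while (*tmpl)
    {
      const char *piece = NULL;
      size_t len = 1;

      if (tmpl[0] == '%' && tmpl[1] != '\0')
        switch (tmpl[1])
          {
          case 'h': piece = h; break;
          case '1': piece = one; break;
          case '0': piece = zero; break;
          case 'a': piece = a; break;
          default: break;
          }

      if (piece)
        {
          /* A recognized escape: substitute and skip both characters.  */
          len = strlen (piece);
          tmpl += 2;
        }
      else
        /* An ordinary character: copy it as is.  */
        piece = tmpl++;

      if (used + len >= outsz)
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }
  out[used] = '\0';
  return 0;
}
```

With the values from the @code{struct B} example, the template
@samp{%h.num_elem} would come out as @samp{b.foo[11].num_elem}.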

As in ordinary C, adjacent strings will be concatenated; this is
helpful when you have a complicated expression.
@smallexample
@group
GTY ((chain_next ("TREE_CODE (&%h.generic) == INTEGER_TYPE"
                  " ? TYPE_NEXT_VARIANT (&%h.generic)"
                  " : TREE_CHAIN (&%h.generic)")))
@end group
@end smallexample

The available options are:

@table @code
@findex length
@item length ("@var{expression}")

There are two places the type machinery will need to be explicitly told
the length of an array. The first case is when a structure ends in a
variable-length array, like this:
@smallexample
struct rtvec_def GTY(()) @{
  int num_elem;         /* @r{number of elements} */
  rtx GTY ((length ("%h.num_elem"))) elem[1];
@};
@end smallexample

In this case, the @code{length} option is used to override the specified
array length (which should usually be @code{1}). The parameter of the
option is a fragment of C code that calculates the length.
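As a rough illustration of why the override matters, here is a minimal
sketch of the trailing-array idiom outside GCC. It assumes that
@code{GTY} expands to nothing for the C compiler, uses an opaque
stand-in for @code{rtx}, and allocates with @code{calloc} rather than a
GGC allocator; @code{alloc_rtvec} and @code{count_nonnull} are
hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

#define GTY(x) /* assumption: expands to nothing for the C compiler */

typedef struct rtx_def *rtx;    /* opaque stand-in for GCC's rtx */

struct rtvec_def GTY(())
{
  int num_elem;                                 /* number of elements */
  rtx GTY ((length ("%h.num_elem"))) elem[1];   /* really num_elem long */
};

/* Hypothetical allocator: room for N elements, using calloc instead
   of a GGC allocation routine.  */
struct rtvec_def *
alloc_rtvec (int n)
{
  size_t extra = n > 1 ? (size_t) (n - 1) * sizeof (rtx) : 0;
  struct rtvec_def *v = calloc (1, sizeof (struct rtvec_def) + extra);

  if (v)
    v->num_elem = n;
  return v;
}

/* What a walker has to do conceptually: visit num_elem entries, not
   the single entry that the declared array size suggests.  */
int
count_nonnull (const struct rtvec_def *v)
{
  int i, n = 0;

  for (i = 0; i < v->num_elem; i++)
    if (v->elem[i] != NULL)
      n++;
  return n;
}
```

Without the @code{length} expression, a walker that trusted the
declared size would look only at @code{elem[0]}.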

The second case is when a structure or a global variable contains a
pointer to an array, like this:
@smallexample
tree *
  GTY ((length ("%h.regno_pointer_align_length"))) regno_decl;
@end smallexample
In this case, @code{regno_decl} has been allocated by writing something like
@smallexample
  x->regno_decl =
    ggc_alloc (x->regno_pointer_align_length * sizeof (tree));
@end smallexample
@noindent
and the @code{length} provides the length of the field.

This second use of @code{length} also works on global variables, like:
@verbatim
  static GTY((length ("reg_base_value_size")))
    rtx *reg_base_value;
@end verbatim

@findex skip
@item skip

If @code{skip} is applied to a field, the type machinery will ignore it.
This is somewhat dangerous; the only safe use is in a union when one
field really isn't ever used.

@findex desc
@findex tag
@findex default
@item desc ("@var{expression}")
@itemx tag ("@var{constant}")
@itemx default

The type machinery needs to be told which field of a @code{union} is
currently active. This is done by giving each field a constant
@code{tag} value, and then specifying a discriminator using @code{desc}.
The value of the expression given by @code{desc} is compared against
each @code{tag} value, each of which should be different. If no
@code{tag} is matched, the field marked with @code{default} is used if
there is one; otherwise no field in the union will be marked.

In the @code{desc} option, the ``current structure'' is the union that
it discriminates. Use @code{%1} to mean the structure containing it.
There are no escapes available to the @code{tag} option, since it is a
constant.

For example,
@smallexample
struct tree_binding GTY(())
@{
  struct tree_common common;
  union tree_binding_u @{
    tree GTY ((tag ("0"))) scope;
    struct cp_binding_level * GTY ((tag ("1"))) level;
  @} GTY ((desc ("BINDING_HAS_LEVEL_P ((tree)&%0)"))) xscope;
  tree value;
@};
@end smallexample

In this example, the value of @code{BINDING_HAS_LEVEL_P} when applied to a
@code{struct tree_binding *} is presumed to be 0 or 1. If 1, the type
mechanism will treat the field @code{level} as being present and if 0,
will treat the field @code{scope} as being present.
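The effect of @code{desc} and @code{tag} can be sketched as a
@code{switch} on the discriminator. The types and the
@code{mark_binding} routine below are hypothetical and much simpler than
what the type machinery generates; @code{GTY} is assumed to expand to
nothing for the C compiler, and ``marking'' is reduced to setting a flag.

```c
#include <stddef.h>

#define GTY(x) /* assumption: expands to nothing for the C compiler */

struct scope_info { int marked; };
struct level_info { int marked; };

struct binding GTY(())
{
  int has_level;   /* discriminator: 0 selects scope, 1 selects level */
  union binding_u {
    struct scope_info * GTY ((tag ("0"))) scope;
    struct level_info * GTY ((tag ("1"))) level;
  } GTY ((desc ("%1.has_level"))) u;
};

/* Hand-written equivalent of a marker for struct binding: follow only
   the union member whose tag equals the discriminator value.  */
void
mark_binding (struct binding *b)
{
  switch (b->has_level)
    {
    case 0:
      if (b->u.scope)
        b->u.scope->marked = 1;
      break;
    case 1:
      if (b->u.level)
        b->u.level->marked = 1;
      break;
    default:
      /* No tag matched and there is no default field: mark nothing.  */
      break;
    }
}
```

Here the @code{desc} string uses @code{%1} because, inside @code{desc},
the current structure is the union and the discriminator lives in the
structure that contains it.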

@findex param_is
@findex use_param
@item param_is (@var{type})
@itemx use_param

Sometimes it's convenient to define some data structure to work on
generic pointers (that is, @code{PTR}) and then use it with a specific
type. @code{param_is} specifies the real type pointed to, and
@code{use_param} says where in the generic data structure that type
should be put.

For instance, to have a @code{htab_t} that points to trees, one would
write the definition of @code{htab_t} like this:
@smallexample
typedef struct GTY(()) @{
  @dots{}
  void ** GTY ((use_param, @dots{})) entries;
  @dots{}
@} htab_t;
@end smallexample
and then declare variables like this:
@smallexample
  static htab_t GTY ((param_is (union tree_node))) ict;
@end smallexample

@findex param@var{n}_is
@findex use_param@var{n}
@item param@var{n}_is (@var{type})
@itemx use_param@var{n}

In more complicated cases, the data structure might need to work on
several different types, which might not necessarily all be pointers.
For this, @code{param1_is} through @code{param9_is} may be used to
specify the real type of a field identified by @code{use_param1} through
@code{use_param9}.

@findex use_params
@item use_params

When a structure contains another structure that is parameterized,
there's no need to do anything special; the inner structure inherits the
parameters of the outer one. When a structure contains a pointer to a
parameterized structure, the type machinery won't automatically detect
this (it could, it just doesn't yet), so it's necessary to tell it that
the pointed-to structure should use the same parameters as the outer
structure. This is done by marking the pointer with the
@code{use_params} option.

@findex deletable
@item deletable

@code{deletable}, when applied to a global variable, indicates that when
garbage collection runs, there's no need to mark anything pointed to
by this variable; it can just be set to @code{NULL} instead. This is used
to keep a list of free structures around for re-use.
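A free list of this kind might look like the sketch below. All names
are hypothetical, @code{GTY} is assumed to expand to nothing for the C
compiler, and @code{calloc} and @code{free} stand in for GGC, so
@code{drop_free_nodes} has to release the nodes by hand where a real
collection would simply clear the variable and reclaim the unmarked
memory.

```c
#include <stdlib.h>

#define GTY(x) /* assumption: expands to nothing for the C compiler */

struct node GTY(())
{
  struct node *next;
  int value;
};

/* Cache of released nodes.  Nothing else depends on its contents, so
   losing them at collection time is harmless.  */
static GTY ((deletable)) struct node *free_nodes;

/* Take a node from the cache if possible, otherwise allocate one.  */
struct node *
get_node (void)
{
  struct node *n = free_nodes;

  if (n != NULL)
    free_nodes = n->next;
  else
    n = calloc (1, sizeof *n);   /* stand-in for a GGC allocation */
  return n;
}

/* Put a node back on the cache for later re-use.  */
void
release_node (struct node *n)
{
  n->next = free_nodes;
  free_nodes = n;
}

/* Stand-in for the effect of a collection on a deletable root.  */
void
drop_free_nodes (void)
{
  while (free_nodes != NULL)
    {
      struct node *n = free_nodes;
      free_nodes = n->next;
      free (n);
    }
}

/* Nonzero if the cache currently holds no nodes.  */
int
free_nodes_empty_p (void)
{
  return free_nodes == NULL;
}
```

Code that uses such a cache has to cope with it being empty at any
time, which @code{get_node} does by falling back to allocation.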

@findex if_marked
@item if_marked ("@var{expression}")

Suppose you want some kinds of object to be unique, and so you put them
in a hash table. If garbage collection marks the hash table, these
objects will never be freed, even if the last other reference to them
goes away. GGC has special handling to deal with this: if you use the
@code{if_marked} option on a global hash table, GGC will call the
routine whose name is the parameter to the option on each hash table
entry. If the routine returns nonzero, the hash table entry will
be marked as usual. If the routine returns zero, the hash table entry
will be deleted.

The routine @code{ggc_marked_p} can be used to determine if an element
has been marked already; in fact, the usual case is to use
@code{if_marked ("ggc_marked_p")}.
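The behavior described here amounts to a sweep over weak entries. The
sketch below is not GGC code: @code{obj_marked_p} is a hypothetical
stand-in for @code{ggc_marked_p}, and the ``hash table'' is reduced to
an array of slots.

```c
#include <stddef.h>

struct obj
{
  int marked;   /* set when something else has marked the object */
  int key;
};

/* Hypothetical stand-in for ggc_marked_p.  */
int
obj_marked_p (const void *p)
{
  return ((const struct obj *) p)->marked;
}

/* Sweep N weak slots: keep the entries that IF_MARKED accepts and
   delete the others.  Returns the number of entries kept.  */
size_t
sweep_weak_table (void **slots, size_t n, int (*if_marked) (const void *))
{
  size_t i, kept = 0;

  for (i = 0; i < n; i++)
    {
      if (slots[i] == NULL)
        continue;
      if (if_marked (slots[i]))
        kept++;
      else
        slots[i] = NULL;
    }
  return kept;
}
```

An entry therefore survives only while something other than the table
itself still refers to the object.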

@findex mark_hook
@item mark_hook ("@var{hook-routine-name}")

If provided for a structure or union type, the given
@var{hook-routine-name} (between double-quotes) is the name of a
routine called when the garbage collector has just marked the data as
reachable. This routine should not change the data, or call any ggc
routine. Its only argument is a pointer to the just marked (const)
structure or union.

@findex maybe_undef
@item maybe_undef

When applied to a field, @code{maybe_undef} indicates that it's OK if
the structure that this field points to is never defined, so long as
this field is always @code{NULL}. This is used to avoid requiring
backends to define certain optional structures. It doesn't work with
language frontends.

@findex nested_ptr
@item nested_ptr (@var{type}, "@var{to expression}", "@var{from expression}")

The type machinery expects all pointers to point to the start of an
object. Sometimes for abstraction purposes it's convenient to have
a pointer which points inside an object. So long as it's possible to
convert the original object to and from the pointer, such pointers
can still be used. @var{type} is the type of the original object,
the @var{to expression} returns the pointer given the original object,
and the @var{from expression} returns the original object given
the pointer. The pointer will be available using the @code{%h}
escape.

@findex chain_next
@findex chain_prev
@findex chain_circular
@item chain_next ("@var{expression}")
@itemx chain_prev ("@var{expression}")
@itemx chain_circular ("@var{expression}")

It's helpful for the type machinery to know if objects are often
chained together in long lists; this lets it generate code that uses
less stack space by iterating along the list instead of recursing down
it. @code{chain_next} is an expression for the next item in the list,
@code{chain_prev} is an expression for the previous item. For singly
linked lists, use only @code{chain_next}; for doubly linked lists, use
both. The machinery requires that taking the next item of the
previous item gives the original item. @code{chain_circular} is similar
to @code{chain_next}, but can be used for circular singly linked lists.
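The stack-space argument can be seen in a hand-written marker. This is
only a sketch: @code{struct item} and @code{mark_chain} are
hypothetical, @code{GTY} is assumed to expand to nothing for the C
compiler, and marking is reduced to setting a flag.

```c
#include <stddef.h>

#define GTY(x) /* assumption: expands to nothing for the C compiler */

struct item GTY ((chain_next ("%h.next")))
{
  struct item *next;
  int marked;
};

/* Walk the chain in a loop instead of recursing on x->next, so stack
   use stays constant however long the list is.  Stopping at an item
   that is already marked also ends the walk on a circular list.  */
void
mark_chain (struct item *x)
{
  while (x != NULL && !x->marked)
    {
      x->marked = 1;
      x = x->next;
    }
}
```

A recursive marker would instead need one stack frame per list element.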

@findex reorder
@item reorder ("@var{function name}")

Some data structures depend on the relative ordering of pointers. If
the precompiled header machinery needs to change that ordering, it
will call the function referenced by the @code{reorder} option, before
changing the pointers in the object that's pointed to by the field the
option applies to. The function must take four arguments, with the
signature @samp{@w{void *, void *, gt_pointer_operator, void *}}.
The first parameter is a pointer to the structure that contains the
object being updated, or the object itself if there is no containing
structure. The second parameter is a cookie that should be ignored.
The third parameter is a routine that, given a pointer, will update it
to its correct new value. The fourth parameter is a cookie that must
be passed as the second argument to that routine.

PCH cannot handle data structures that depend on the absolute values
of pointers. @code{reorder} functions can be expensive. When
possible, it is better to depend on properties of the data, like an ID
number or the hash of a string, instead.
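A @code{reorder} function for an array kept sorted by address might be
sketched as follows. Everything here is an assumption made for
illustration: the @code{gt_pointer_operator} typedef is taken to be
@code{void (*) (void *, void *)}, receiving the address of a pointer and
the cookie, and @code{struct sorted_fields}, @code{reorder_fields},
@code{struct remap} and @code{demo_remap} are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed shape of the operator: (address of a pointer, cookie).  */
typedef void (*gt_pointer_operator) (void *, void *);

struct field { int id; };

struct sorted_fields
{
  size_t len;
  struct field *elts[4];   /* kept sorted by address */
};

/* qsort has no user-data argument, so pass these through statics.  */
static gt_pointer_operator resort_op;
static void *resort_cookie;

/* Compare two elements by the addresses they will have afterwards.  */
static int
cmp_by_new_address (const void *pa, const void *pb)
{
  struct field *a = *(struct field *const *) pa;
  struct field *b = *(struct field *const *) pb;
  uintptr_t ua, ub;

  resort_op (&a, resort_cookie);   /* translate local copies only */
  resort_op (&b, resort_cookie);
  ua = (uintptr_t) a;
  ub = (uintptr_t) b;
  return (ua > ub) - (ua < ub);
}

/* The reorder function: OBJ is the structure, ORIG_OBJ is ignored.  */
void
reorder_fields (void *obj, void *orig_obj, gt_pointer_operator op,
                void *cookie)
{
  struct sorted_fields *sf = obj;

  (void) orig_obj;
  resort_op = op;
  resort_cookie = cookie;
  qsort (sf->elts, sf->len, sizeof sf->elts[0], cmp_by_new_address);
}

/* Demonstration-only operator: maps old_base[i] to new_base[n-1-i],
   which reverses the relative order of the objects.  */
struct remap { struct field *old_base; struct field *new_base; size_t n; };

void
demo_remap (void *ptr_p, void *cookie)
{
  struct field **p = ptr_p;
  const struct remap *r = cookie;
  size_t idx = (size_t) (*p - r->old_base);

  *p = r->new_base + (r->n - 1 - idx);
}
```

Note that the array still holds the old pointers afterwards; the
function only arranges them in the order their new values will require.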

@findex special
@item special ("@var{name}")

The @code{special} option is used to mark types that have to be dealt
with by special case machinery. The parameter is the name of the
special case. See @file{gengtype.c} for further details. Avoid
adding new special cases unless there is no other alternative.
@end table

@node GGC Roots
@section Marking Roots for the Garbage Collector
@cindex roots, marking
@cindex marking roots

In addition to keeping track of types, the type machinery also locates
the global variables (@dfn{roots}) that the garbage collector starts
at. Roots must be declared using one of the following syntaxes:

@itemize @bullet
@item
@code{extern GTY(([@var{options}])) @var{type} @var{name};}
@item
@code{static GTY(([@var{options}])) @var{type} @var{name};}
@end itemize
@noindent
The syntax
@itemize @bullet
@item
@code{GTY(([@var{options}])) @var{type} @var{name};}
@end itemize
@noindent
is @emph{not} accepted. There should be an @code{extern} declaration
of such a variable in a header somewhere---mark that, not the
definition. Or, if the variable is only used in one file, make it
@code{static}.

@node Files
@section Source Files Containing Type Information
@cindex generated files
@cindex files, generated

Whenever you add @code{GTY} markers to a source file that previously
had none, or create a new source file containing @code{GTY} markers,
there are three things you need to do:

@enumerate
@item
You need to add the file to the list of source files the type
machinery scans. There are four cases:

@enumerate a
@item
For a back-end file, this is usually done
automatically; if not, you should add it to @code{target_gtfiles} in
the appropriate port's entries in @file{config.gcc}.

@item
For files shared by all front ends, add the filename to the
@code{GTFILES} variable in @file{Makefile.in}.

@item
For files that are part of one front end, add the filename to the
@code{gtfiles} variable defined in the appropriate
@file{config-lang.in}. For C, the file is @file{c-config-lang.in}.
Headers should appear before non-headers in this list.

@item
For files that are part of some but not all front ends, add the
filename to the @code{gtfiles} variable of @emph{all} the front ends
that use it.
@end enumerate

@item
If the file was a header file, you'll need to check that it's included
in the right place to be visible to the generated files. For a back-end
header file, this should be done automatically. For a front-end header
file, it needs to be included by the same file that includes
@file{gtype-@var{lang}.h}. For other header files, it needs to be
included in @file{gtype-desc.c}, which is a generated file, so add it to
@code{ifiles} in @code{open_base_file} in @file{gengtype.c}.

For source files that aren't header files, the machinery will generate a
header file that should be included in the source file you just changed.
The file will be called @file{gt-@var{path}.h} where @var{path} is the
pathname relative to the @file{gcc} directory with slashes replaced by
@verb{|-|}, so for example the header file to be included in
@file{cp/parser.c} is called @file{gt-cp-parser.h}. The
generated header file should be included after everything else in the
source file. Don't forget to mention this file as a dependency in the
@file{Makefile}!

@end enumerate
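The naming rule for the generated header can be written down as a small
function. @code{gt_header_name} is a hypothetical helper, not something
the build system provides, and it assumes, going by the
@file{cp/parser.c} example, that the source extension is replaced by
@file{.h}.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the name of the generated header for
   PATH (relative to the gcc directory) into OUT, of capacity OUTSZ.
   Returns 0 on success, -1 if the name does not fit.  */
int
gt_header_name (const char *path, char *out, size_t outsz)
{
  const char *dot = strrchr (path, '.');
  const char *slash = strrchr (path, '/');
  size_t stem, i;
  int n;

  /* A dot that belongs to a directory name is not an extension.  */
  if (dot == NULL || (slash != NULL && dot < slash))
    dot = path + strlen (path);
  stem = (size_t) (dot - path);

  n = snprintf (out, outsz, "gt-%.*s.h", (int) stem, path);
  if (n < 0 || (size_t) n >= outsz)
    return -1;

  /* Replace slashes in the copied stem, which starts after "gt-".  */
  for (i = 3; i < 3 + stem; i++)
    if (out[i] == '/')
      out[i] = '-';
  return 0;
}
```

Under that assumption, @file{cp/parser.c} maps to
@file{gt-cp-parser.h}.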

For language frontends, there is another file that needs to be included
somewhere. It will be called @file{gtype-@var{lang}.h}, where
@var{lang} is the name of the subdirectory the language is contained in.

@node Invoking the garbage collector
@section How to invoke the garbage collector
@cindex garbage collector, invocation
@findex ggc_collect

The GCC garbage collector GGC is only invoked explicitly. In contrast
with many other garbage collectors, it is not implicitly invoked by
allocation routines when a lot of memory has been consumed. So the
only way to have GGC reclaim storage is to call the @code{ggc_collect}
function explicitly. This call is an expensive operation, as it may
have to scan the entire heap. Beware that, unlike many other garbage
collectors, GGC does not follow local variables (on the GCC call stack)
during such an invocation: you should reference all your data from static
or external @code{GTY}-ed variables, and it is advised to call
@code{ggc_collect} with a shallow call stack. GGC is an exact mark
and sweep garbage collector (so it does not scan the call stack for
pointers). In practice GCC passes don't often call @code{ggc_collect}
themselves, because it is called by the pass manager between passes.
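The consequence for local variables can be shown with a toy collector.
This is in no way GGC's implementation: the fixed-size heap, the single
@code{root} variable and all the @code{toy_} routines are hypothetical,
and each object holds at most one reference.

```c
#include <stddef.h>

#define MAX_OBJS 8

struct obj
{
  int live;          /* slot is allocated */
  int marked;        /* reached from the root in this collection */
  struct obj *ref;   /* the single outgoing reference, or NULL */
};

static struct obj heap[MAX_OBJS];
static struct obj *root;   /* plays the role of a GTY-marked global */

/* Hand out a free slot, or NULL if the toy heap is full.  */
struct obj *
toy_alloc (void)
{
  size_t i;

  for (i = 0; i < MAX_OBJS; i++)
    if (!heap[i].live)
      {
        heap[i].live = 1;
        heap[i].marked = 0;
        heap[i].ref = NULL;
        return &heap[i];
      }
  return NULL;
}

void
toy_set_root (struct obj *o)
{
  root = o;
}

int
toy_is_live (const struct obj *o)
{
  return o->live;
}

/* Explicit collection: mark from the root only, then sweep.  The
   caller's local variables are never examined.  */
void
toy_collect (void)
{
  struct obj *o;
  size_t i;

  for (i = 0; i < MAX_OBJS; i++)
    heap[i].marked = 0;
  for (o = root; o != NULL && !o->marked; o = o->ref)
    o->marked = 1;
  for (i = 0; i < MAX_OBJS; i++)
    if (heap[i].live && !heap[i].marked)
      heap[i].live = 0;
}
```

An object that is referenced only from a local variable of the caller is
swept by @code{toy_collect}, which is the hazard the paragraph above
warns about.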