===============================
MCJIT Design and Implementation
===============================

Introduction
============

This document describes the internal workings of the MCJIT execution
engine and the RuntimeDyld component. It is intended as a high-level
overview of the implementation, showing the flow and interactions of
objects throughout the code generation and dynamic loading process.

Engine Creation
===============

In most cases, an EngineBuilder object is used to create an instance of
the MCJIT execution engine. The EngineBuilder takes an llvm::Module
object as an argument to its constructor. The client may then set various
options that will later be passed along to the MCJIT engine, including
the selection of MCJIT as the engine type to be created. Of particular
interest is the EngineBuilder::setMCJITMemoryManager function. If the
client does not explicitly create a memory manager at this time, a
default memory manager (specifically SectionMemoryManager) will be
created when the MCJIT engine is instantiated.

Once the options have been set, a client calls EngineBuilder::create to
create an instance of the MCJIT engine. If the client does not use the
form of this function that takes a TargetMachine as a parameter, a new
TargetMachine will be created based on the target triple associated with
the Module that was used to create the EngineBuilder.

.. image:: MCJIT-engine-builder.png

EngineBuilder::create will call the static MCJIT::createJIT function,
passing in its pointers to the module, memory manager and target machine
objects, all of which will subsequently be owned by the MCJIT object.

The MCJIT class has a member variable, Dyld, which contains an instance of
the RuntimeDyld wrapper class. This member will be used for
communications between MCJIT and the actual RuntimeDyldImpl object that
gets created when an object is loaded.

.. image:: MCJIT-creation.png

Upon creation, MCJIT holds a pointer to the Module object that it received
from EngineBuilder but it does not immediately generate code for this
module. Code generation is deferred until either the
MCJIT::finalizeObject method is called explicitly or a function such as
MCJIT::getPointerToFunction is called which requires the code to have been
generated.

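The deferral described above can be illustrated with a small standalone
model. This is only a sketch: LazyEngine, its made-up addresses and its
counters are hypothetical and do not use the LLVM API. It shows just the
control flow, in which either an explicit finalizeObject call or a
lookup that needs generated code triggers generation exactly once.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical toy engine; it models only the deferral, not real codegen.
class LazyEngine {
public:
  // Explicit trigger. Calling it again after generation is a no-op.
  void finalizeObject() {
    if (Generated)
      return;
    // Stand-in for code generation: record made-up function addresses.
    Addresses["main"] = 0x1000;
    Addresses["helper"] = 0x1040;
    Generated = true;
    ++GenerationCount;
  }

  // Implicit trigger: a lookup needs generated code, so it forces it.
  // Returns 0 for an unknown name.
  std::uint64_t getPointerToFunction(const std::string &Name) {
    finalizeObject();
    assert(Generated && "lookup must never see ungenerated code");
    auto It = Addresses.find(Name);
    return It == Addresses.end() ? 0 : It->second;
  }

  bool isGenerated() const { return Generated; }
  int generationCount() const { return GenerationCount; }

private:
  bool Generated = false;
  int GenerationCount = 0;
  std::map<std::string, std::uint64_t> Addresses;
};
```

In this model, constructing the engine does no work, and repeated
triggers do not regenerate code.
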
Code Generation
===============

When code generation is triggered, as described above, MCJIT will first
attempt to retrieve an object image from its ObjectCache member, if one
has been set. If a cached object image cannot be retrieved, MCJIT will
call its emitObject method. MCJIT::emitObject uses a local PassManager
instance and creates a new ObjectBufferStream instance, both of which it
passes to TargetMachine::addPassesToEmitMC before calling PassManager::run
on the Module with which it was created.

.. image:: MCJIT-load.png

The PassManager::run call causes the MC code generation mechanisms to emit
a complete relocatable binary object image (in either ELF or MachO
format, depending on the target) into the ObjectBufferStream object, which
is flushed to complete the process. If an ObjectCache is being used, the
image will be passed to the ObjectCache here.

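The cache-then-emit flow can be sketched as follows. Everything in this
example is hypothetical: ToyObjectCache, ToyCompiler and getOrEmitObject
are illustrative names, the cache is keyed by a plain string, and the
"object image" is just a byte vector. The real ObjectCache interface and
emitObject differ in their signatures.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using ObjectBytes = std::vector<unsigned char>;

// Hypothetical cache keyed by a module identifier.
class ToyObjectCache {
public:
  // Returns the cached image, or nullptr on a miss.
  const ObjectBytes *lookup(const std::string &ModuleId) const {
    auto It = Store.find(ModuleId);
    return It == Store.end() ? nullptr : &It->second;
  }
  void store(const std::string &ModuleId, const ObjectBytes &Obj) {
    Store[ModuleId] = Obj;
  }

private:
  std::map<std::string, ObjectBytes> Store;
};

// Stand-in for the code generator; it counts how often it runs.
struct ToyCompiler {
  int Runs = 0;
  ObjectBytes emitObject(const std::string &ModuleId) {
    ++Runs;
    return ObjectBytes(ModuleId.begin(), ModuleId.end());
  }
};

// Try the cache first (if one is set), otherwise emit and hand the
// freshly emitted image to the cache.
ObjectBytes getOrEmitObject(const std::string &ModuleId, ToyCompiler &CG,
                            ToyObjectCache *Cache) {
  if (Cache) {
    if (const ObjectBytes *Cached = Cache->lookup(ModuleId))
      return *Cached;
  }
  ObjectBytes Obj = CG.emitObject(ModuleId);
  assert(!Obj.empty() && "toy compiler always emits something");
  if (Cache)
    Cache->store(ModuleId, Obj);
  return Obj;
}
```

With a cache set, a second request for the same module skips the
compiler. Without one, every request compiles again.
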
At this point, the ObjectBufferStream contains the raw object image.
Before the code can be executed, the code and data sections from this
image must be loaded into suitable memory, relocations must be applied,
and memory permission changes and code cache invalidation (if required)
must be completed.

Object Loading
==============

Once an object image has been obtained, either through code generation or
having been retrieved from an ObjectCache, it is passed to RuntimeDyld to
be loaded. The RuntimeDyld wrapper class examines the object to determine
its file format and creates an instance of either RuntimeDyldELF or
RuntimeDyldMachO (both of which derive from the RuntimeDyldImpl base
class) and calls the RuntimeDyldImpl::loadObject method to perform the
actual loading.

.. image:: MCJIT-dyld-load.png

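As an illustration of how a loader can tell the two formats apart, the
sketch below inspects the leading magic bytes of an image. The function
detectFormat and the ObjectFormat enum are hypothetical. The magic
values themselves are the standard ones (0x7F 'E' 'L' 'F' for ELF, and
0xFEEDFACE or 0xFEEDFACF in either byte order for Mach-O), but this
hand-rolled check is an assumption made for the example and is not a
description of how RuntimeDyld performs the test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

enum class ObjectFormat { ELF, MachO, Unknown };

// Classify an object image by its first four bytes.
ObjectFormat detectFormat(const unsigned char *Data, std::size_t Size) {
  assert((Data != nullptr || Size == 0) && "null buffer must be empty");
  if (Size < 4)
    return ObjectFormat::Unknown;

  static const unsigned char ElfMagic[4] = {0x7f, 'E', 'L', 'F'};
  if (std::memcmp(Data, ElfMagic, 4) == 0)
    return ObjectFormat::ELF;

  // Assemble the first four bytes as a little-endian 32-bit word.
  std::uint32_t Magic = static_cast<std::uint32_t>(Data[0]) |
                        static_cast<std::uint32_t>(Data[1]) << 8 |
                        static_cast<std::uint32_t>(Data[2]) << 16 |
                        static_cast<std::uint32_t>(Data[3]) << 24;
  switch (Magic) {
  case 0xFEEDFACEu: // 32-bit Mach-O, little-endian file
  case 0xFEEDFACFu: // 64-bit Mach-O, little-endian file
  case 0xCEFAEDFEu: // 32-bit Mach-O, big-endian file
  case 0xCFFAEDFEu: // 64-bit Mach-O, big-endian file
    return ObjectFormat::MachO;
  default:
    return ObjectFormat::Unknown;
  }
}
```

A dispatcher built on such a check would then construct the matching
format-specific loader and hand it the image.
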
RuntimeDyldImpl::loadObject begins by creating an ObjectImage instance
from the ObjectBuffer it received. ObjectImage, which wraps the
ObjectFile class, is a helper class which parses the binary object image
and provides access to the information contained in the format-specific
headers, including section, symbol and relocation information.

RuntimeDyldImpl::loadObject then iterates through the symbols in the
image. Information about common symbols is collected for later use. For
each function or data symbol, the associated section is loaded into memory
and the symbol is stored in a symbol table map data structure. When the
iteration is complete, a section is emitted for the common symbols.

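One way to picture the section emitted for common symbols is as a
single block in which each collected symbol receives an aligned offset.
The sketch below is an assumption about how such a layout could be
computed. CommonSymbol, CommonLayout and layoutCommonSymbols are
hypothetical names, and the packing order (the order in which symbols
were collected) is chosen for simplicity.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// A common symbol as collected during the symbol walk.
struct CommonSymbol {
  std::string Name;
  std::uint64_t Size;
  std::uint64_t Align; // must be a power of two
};

struct CommonLayout {
  std::map<std::string, std::uint64_t> Offsets; // symbol -> offset in section
  std::uint64_t TotalSize = 0;                  // size of the emitted section
};

// Assign each symbol an offset, padding as needed to honour alignment.
CommonLayout layoutCommonSymbols(const std::vector<CommonSymbol> &Symbols) {
  CommonLayout Layout;
  for (const CommonSymbol &S : Symbols) {
    assert(S.Align != 0 && (S.Align & (S.Align - 1)) == 0 &&
           "alignment must be a power of two");
    // Round the running size up to the next multiple of S.Align.
    Layout.TotalSize = (Layout.TotalSize + S.Align - 1) & ~(S.Align - 1);
    Layout.Offsets[S.Name] = Layout.TotalSize;
    Layout.TotalSize += S.Size;
  }
  return Layout;
}
```

Each recorded offset, added to the address of the emitted section, would
give the address stored for that symbol in the symbol table.
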
Next, RuntimeDyldImpl::loadObject iterates through the sections in the
object image and, for each section, iterates through the relocations for
that section. For each relocation, it calls the format-specific
processRelocationRef method, which will examine the relocation and store
it in one of two data structures: a section-based relocation list map or
an external symbol relocation map.

.. image:: MCJIT-load-object.png

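The two relocation data structures can be modelled with ordinary maps.
The following is a simplified sketch and not the RuntimeDyldImpl data
layout. RelocationEntry, SymbolInfo, RelocationTables and
recordRelocation are hypothetical, and folding the symbol's offset into
the addend is a choice made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Where a relocation must be applied, plus the value adjustment.
struct RelocationEntry {
  unsigned SectionID;   // section containing the bytes to patch
  std::uint64_t Offset; // offset of those bytes within that section
  std::int64_t Addend;  // added to the resolved target address
};

// Toy symbol table entry: the defining section and offset within it.
struct SymbolInfo {
  unsigned SectionID;
  std::uint64_t Offset;
};
using SymbolTable = std::map<std::string, SymbolInfo>;

struct RelocationTables {
  // Keyed by the section that contains the referenced symbol.
  std::map<unsigned, std::vector<RelocationEntry>> BySection;
  // Keyed by the name of a symbol this object does not define.
  std::map<std::string, std::vector<RelocationEntry>> ByExternalSymbol;
};

// File a relocation under the section of its target symbol if the
// symbol is known, otherwise under the symbol's name.
void recordRelocation(const SymbolTable &Symbols, const std::string &Target,
                      RelocationEntry RE, RelocationTables &Tables) {
  assert(!Target.empty() && "relocation must name a target symbol");
  auto It = Symbols.find(Target);
  if (It != Symbols.end()) {
    // Fold the symbol's offset in, so only the section's address is
    // needed when the relocation is eventually applied.
    RE.Addend += static_cast<std::int64_t>(It->second.Offset);
    Tables.BySection[It->second.SectionID].push_back(RE);
  } else {
    Tables.ByExternalSymbol[Target].push_back(RE);
  }
}
```

Keying locally defined targets by section means that a later change to
a section's address only requires revisiting that section's list.
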
When RuntimeDyldImpl::loadObject returns, all of the code and data
sections for the object will have been loaded into memory allocated by the
memory manager and relocation information will have been prepared, but the
relocations have not yet been applied and the generated code is still not
ready to be executed.

[Currently (as of August 2013) the MCJIT engine will immediately apply
relocations when loadObject completes. However, this shouldn't be
happening. Because the code may have been generated for a remote target,
the client should be given a chance to re-map the section addresses before
relocations are applied. It is possible to apply relocations multiple
times, but in the case where addresses are to be re-mapped, this first
application is wasted effort.]

Address Remapping
=================

At any time after initial code has been generated and before
finalizeObject is called, the client can remap the addresses of sections
in the object. Typically this is done because the code was generated for
an external process and is being mapped into that process's address space.
The client remaps the section address by calling MCJIT::mapSectionAddress.
This should happen before the section memory is copied to its new
location.

When MCJIT::mapSectionAddress is called, MCJIT passes the call on to
RuntimeDyldImpl (via its Dyld member). RuntimeDyldImpl stores the new
address in an internal data structure but does not update the code at this
time, since other sections are likely to change.

When the client is finished remapping section addresses, it will call
MCJIT::finalizeObject to complete the remapping process.

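The record-now, patch-later behaviour can be sketched with a small
model. ToyDyld and SectionEntry are hypothetical, addresses are plain
integers, and resolveRelocations here only counts passes instead of
patching anything. The sketch is meant to show that remapping merely
updates bookkeeping until the client finalizes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct SectionEntry {
  std::uint64_t LocalAddress; // where the bytes sit in the host process
  std::uint64_t LoadAddress;  // where the code is expected to run
};

// Hypothetical loader model that tracks section addresses only.
class ToyDyld {
public:
  // A freshly loaded section is assumed to run where it was loaded.
  std::size_t addSection(std::uint64_t LocalAddress) {
    SectionEntry S;
    S.LocalAddress = LocalAddress;
    S.LoadAddress = LocalAddress;
    Sections.push_back(S);
    return Sections.size() - 1;
  }

  // Record the new target address. No code is patched here, because
  // other sections may still be remapped. Returns false if no section
  // is loaded at LocalAddress.
  bool mapSectionAddress(std::uint64_t LocalAddress,
                         std::uint64_t TargetAddress) {
    for (SectionEntry &S : Sections) {
      if (S.LocalAddress == LocalAddress) {
        S.LoadAddress = TargetAddress;
        return true;
      }
    }
    return false;
  }

  // Stand-in for the single patching pass run at finalization.
  void resolveRelocations() { ++PatchPasses; }

  std::uint64_t loadAddress(std::size_t ID) const {
    assert(ID < Sections.size() && "unknown section ID");
    return Sections[ID].LoadAddress;
  }
  int patchPasses() const { return PatchPasses; }

private:
  std::vector<SectionEntry> Sections;
  int PatchPasses = 0;
};
```

In this model, any number of remapping calls cost one patching pass in
total, performed after the last address is known.
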
Final Preparations
==================

When MCJIT::finalizeObject is called, MCJIT calls
RuntimeDyld::resolveRelocations. This function will attempt to locate any
external symbols and then apply all relocations for the object.

External symbols are resolved by calling the memory manager's
getPointerToNamedFunction method. The memory manager will return the
address of the requested symbol in the target address space. (Note, this
may not be a valid pointer in the host process.) RuntimeDyld will then
iterate through the list of relocations it has stored which are associated
with this symbol and invoke the resolveRelocation method which, through a
format-specific implementation, will apply the relocation to the loaded
section memory.

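A minimal sketch of the external symbol pass follows. It assumes a
resolver callback standing in for the memory manager's lookup, and it
models section memory as a vector of 64-bit slots. PendingPatch,
SymbolResolver and resolveExternals are hypothetical names, and treating
an address of 0 as "not found" is a simplification made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Stand-in for the memory manager's symbol lookup; returns 0 if unknown.
using SymbolResolver = std::function<std::uint64_t(const std::string &)>;

// One stored relocation against an external symbol.
struct PendingPatch {
  std::size_t Slot;    // index of the 64-bit slot to overwrite
  std::int64_t Addend; // added to the resolved address
};

// Resolve each external symbol once, then apply every relocation stored
// under it. Returns the names that could not be resolved.
std::vector<std::string> resolveExternals(
    const std::map<std::string, std::vector<PendingPatch>> &Pending,
    const SymbolResolver &Resolve, std::vector<std::uint64_t> &Slots) {
  std::vector<std::string> Missing;
  for (const auto &Entry : Pending) {
    std::uint64_t Address = Resolve(Entry.first);
    if (Address == 0) {
      Missing.push_back(Entry.first);
      continue;
    }
    for (const PendingPatch &P : Entry.second) {
      assert(P.Slot < Slots.size() && "patch outside section memory");
      Slots[P.Slot] = Address + static_cast<std::uint64_t>(P.Addend);
    }
  }
  return Missing;
}
```

The resolved address is a value in the target address space, so the
sketch only stores it and never dereferences it.
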
Next, RuntimeDyld::resolveRelocations iterates through the list of
sections and, for each section, iterates through the saved list of
relocations that reference that section, calling resolveRelocation for
each entry in the list. The relocation list here is a list of
relocations for which the symbol associated with the relocation is located
in the section associated with the list. Each of these relocations will
have a target location at which the relocation will be applied, and that
location is likely to be in a different section.

.. image:: MCJIT-resolve-relocations.png

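To make the section-based pass concrete, the sketch below applies a
simple absolute 64-bit relocation: the patched bytes receive the load
address of the referenced section plus an addend, written in
little-endian order. LoadedSection, Reloc, applyAbs64 and
resolveSectionRelocations are hypothetical, and real relocation types
vary by format and target; this models only the simplest case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct LoadedSection {
  std::vector<unsigned char> Bytes; // local copy of the section contents
  std::uint64_t LoadAddress = 0;    // address in the target address space
};

// A saved relocation: which section to patch, where, and the addend.
struct Reloc {
  std::size_t PatchedSection;
  std::uint64_t Offset;
  std::int64_t Addend;
};

// Write Value as eight little-endian bytes at Offset.
void applyAbs64(LoadedSection &Patched, std::uint64_t Offset,
                std::uint64_t Value) {
  assert(Offset + 8 <= Patched.Bytes.size() && "patch out of bounds");
  for (int I = 0; I < 8; ++I)
    Patched.Bytes[Offset + I] =
        static_cast<unsigned char>((Value >> (8 * I)) & 0xff);
}

// Apply every relocation whose symbol lives in ReferencedSection. The
// patched bytes are usually in some other section.
void resolveSectionRelocations(std::vector<LoadedSection> &Sections,
                               std::size_t ReferencedSection,
                               const std::vector<Reloc> &List) {
  assert(ReferencedSection < Sections.size());
  std::uint64_t Base = Sections[ReferencedSection].LoadAddress;
  for (const Reloc &R : List) {
    assert(R.PatchedSection < Sections.size());
    applyAbs64(Sections[R.PatchedSection], R.Offset,
               Base + static_cast<std::uint64_t>(R.Addend));
  }
}
```

Because the value is recomputed from the current load address on each
call, running the pass again after a section has been remapped simply
overwrites the earlier result.
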
Once relocations have been applied as described above, MCJIT calls
RuntimeDyld::getEHFrameSection, and if a non-zero result is returned
passes the section data to the memory manager's registerEHFrames method.
This allows the memory manager to call any desired target-specific
functions, such as registering the EH frame information with a debugger.

Finally, MCJIT calls the memory manager's finalizeMemory method. In this
method, the memory manager will invalidate the target code cache, if
necessary, and apply final permissions to the memory pages it has
allocated for code and data memory.