=======================
Writing an LLVM Backend
=======================

.. toctree::
   :hidden:

   HowToUseInstrMappings

.. contents::
   :local:

Introduction
============

This document describes techniques for writing compiler backends that convert
the LLVM Intermediate Representation (IR) to code for a specified machine or
other languages. Code intended for a specific machine can take the form of
either assembly code or binary code (usable for a JIT compiler).

The backend of LLVM features a target-independent code generator that may
create output for several types of target CPUs --- including X86, PowerPC,
ARM, and SPARC. The backend may also be used to generate code targeted at SPUs
of the Cell processor or GPUs to support the execution of compute kernels.

The document focuses on existing examples found in subdirectories of
``llvm/lib/Target`` in a downloaded LLVM release. In particular, this document
focuses on the example of creating a static compiler (one that emits text
assembly) for a SPARC target, because SPARC has fairly standard
characteristics, such as a RISC instruction set and straightforward calling
conventions.

Audience
--------

The audience for this document is anyone who needs to write an LLVM backend to
generate code for a specific hardware or software target.

Prerequisite Reading
--------------------

These essential documents must be read before reading this document:

* `LLVM Language Reference Manual <LangRef.html>`_ --- a reference manual for
  the LLVM assembly language.

* :doc:`CodeGenerator` --- a guide to the components (classes and code
  generation algorithms) for translating the LLVM internal representation into
  machine code for a specified target. Pay particular attention to the
  descriptions of code generation stages: Instruction Selection, Scheduling and
  Formation, SSA-based Optimization, Register Allocation, Prolog/Epilog Code
  Insertion, Late Machine Code Optimizations, and Code Emission.

* :doc:`TableGen/index` --- a document that describes the TableGen
  (``tblgen``) application that manages domain-specific information to support
  LLVM code generation. TableGen processes input from a target description
  file (``.td`` suffix) and generates C++ code that can be used for code
  generation.

* :doc:`WritingAnLLVMPass` --- The assembly printer is a ``FunctionPass``, as
  are several ``SelectionDAG`` processing steps.

To follow the SPARC examples in this document, have a copy of `The SPARC
Architecture Manual, Version 8 <http://www.sparc.org/standards/V8.pdf>`_ for
reference. For details about the ARM instruction set, refer to the `ARM
Architecture Reference Manual <http://infocenter.arm.com/>`_. For more about
the GNU Assembler format (``GAS``), see `Using As
<http://sourceware.org/binutils/docs/as/index.html>`_, especially for the
assembly printer. "Using As" contains a list of target machine dependent
features.

Basic Steps
-----------

To write a compiler backend for LLVM that converts the LLVM IR to code for a
specified target (machine or other language), follow these steps:

* Create a subclass of the ``TargetMachine`` class that describes
  characteristics of your target machine. Copy existing examples of specific
  ``TargetMachine`` class and header files; for example, start with
  ``SparcTargetMachine.cpp`` and ``SparcTargetMachine.h``, but change the file
  names for your target. Similarly, change code that references "``Sparc``" to
  reference your target.

* Describe the register set of the target. Use TableGen to generate code for
  register definition, register aliases, and register classes from a
  target-specific ``RegisterInfo.td`` input file. You should also write
  additional code for a subclass of the ``TargetRegisterInfo`` class that
  represents the register file data used for register allocation and also
  describes the interactions between registers.

* Describe the instruction set of the target. Use TableGen to generate code
  for target-specific instructions from target-specific versions of
  ``TargetInstrFormats.td`` and ``TargetInstrInfo.td``. You should write
  additional code for a subclass of the ``TargetInstrInfo`` class to represent
  machine instructions supported by the target machine.

* Describe the selection and conversion of the LLVM IR from a Directed Acyclic
  Graph (DAG) representation of instructions to native target-specific
  instructions. Use TableGen to generate code that matches patterns and
  selects instructions based on additional information in a target-specific
  version of ``TargetInstrInfo.td``. Write code for ``XXXISelDAGToDAG.cpp``,
  where ``XXX`` identifies the specific target, to perform pattern matching and
  DAG-to-DAG instruction selection. Also write code in ``XXXISelLowering.cpp``
  to replace or remove operations and data types that are not supported
  natively in a SelectionDAG.

* Write code for an assembly printer that converts LLVM IR to a GAS format for
  your target machine. You should add assembly strings to the instructions
  defined in your target-specific version of ``TargetInstrInfo.td``. You
  should also write code for a subclass of ``AsmPrinter`` that performs the
  LLVM-to-assembly conversion and a trivial subclass of ``TargetAsmInfo``.

* Optionally, add support for subtargets (i.e., variants with different
  capabilities). You should also write code for a subclass of the
  ``TargetSubtarget`` class, which allows you to use the ``-mcpu=`` and
  ``-mattr=`` command-line options.

* Optionally, add JIT support and create a machine code emitter (subclass of
  ``TargetJITInfo``) that is used to emit binary code directly into memory.

In the ``.cpp`` and ``.h`` files, initially stub up these methods and then
implement them later. Initially, you may not know which private members the
class will need and which components will need to be subclassed.

Preliminaries
-------------

To actually create your compiler backend, you need to create and modify a few
files. The absolute minimum is discussed here. But to actually use the LLVM
target-independent code generator, you must perform the steps described in the
:doc:`LLVM Target-Independent Code Generator <CodeGenerator>` document.

First, you should create a subdirectory under ``lib/Target`` to hold all the
files related to your target. If your target is called "Dummy", create the
directory ``lib/Target/Dummy``.

In this new directory, create a ``CMakeLists.txt``. It is easiest to copy a
``CMakeLists.txt`` of another target and modify it. It should at least contain
the ``LLVM_TARGET_DEFINITIONS`` variable. The library can be named ``LLVMDummy``
(for example, see the MIPS target). Alternatively, you can split the library
into ``LLVMDummyCodeGen`` and ``LLVMDummyAsmPrinter``, the latter of which
should be implemented in a subdirectory below ``lib/Target/Dummy`` (for example,
see the PowerPC target).

Note that these two naming schemes are hardcoded into ``llvm-config``. Using
any other naming scheme will confuse ``llvm-config`` and produce a lot of
(seemingly unrelated) linker errors when linking ``llc``.

To make your target actually do something, you need to implement a subclass of
``TargetMachine``. This implementation should typically be in the file
``lib/Target/DummyTargetMachine.cpp``, but any file in the ``lib/Target``
directory will be built and should work. To use LLVM's target-independent code
generator, you should do what all current machine backends do: create a
subclass of ``LLVMTargetMachine``. (To create a target from scratch, create a
subclass of ``TargetMachine``.)

To get LLVM to actually build and link your target, you need to run ``cmake``
with ``-DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=Dummy``. This will build your
target without needing to add it to the list of all the targets.

Once your target is stable, you can add it to the ``LLVM_ALL_TARGETS`` variable
located in the main ``CMakeLists.txt``.

Target Machine
==============

``LLVMTargetMachine`` is designed as a base class for targets implemented with
the LLVM target-independent code generator. The ``LLVMTargetMachine`` class
should be specialized by a concrete target class that implements the various
virtual methods. ``LLVMTargetMachine`` is defined as a subclass of
``TargetMachine`` in ``include/llvm/Target/TargetMachine.h``. The
``TargetMachine`` class implementation (``TargetMachine.cpp``) also processes
numerous command-line options.

To create a concrete target-specific subclass of ``LLVMTargetMachine``, start
by copying an existing ``TargetMachine`` class and header. You should name the
files that you create to reflect your specific target. For instance, for the
SPARC target, name the files ``SparcTargetMachine.h`` and
``SparcTargetMachine.cpp``.

For a target machine ``XXX``, the implementation of ``XXXTargetMachine`` must
have access methods to obtain objects that represent target components. These
methods are named ``get*Info``, and are intended to obtain the instruction set
(``getInstrInfo``), register set (``getRegisterInfo``), stack frame layout
(``getFrameInfo``), and similar information. ``XXXTargetMachine`` must also
implement the ``getDataLayout`` method to access an object with target-specific
data characteristics, such as data type size and alignment requirements.

For instance, for the SPARC target, the header file ``SparcTargetMachine.h``
declares prototypes for several ``get*Info`` and ``getDataLayout`` methods that
simply return a class member.

.. code-block:: c++

  namespace llvm {

  class Module;

  class SparcTargetMachine : public LLVMTargetMachine {
    const DataLayout DataLayout;       // Calculates type size & alignment
    SparcSubtarget Subtarget;
    SparcInstrInfo InstrInfo;
    TargetFrameInfo FrameInfo;

  protected:
    virtual const TargetAsmInfo *createTargetAsmInfo() const;

  public:
    SparcTargetMachine(const Module &M, const std::string &FS);

    virtual const SparcInstrInfo *getInstrInfo() const { return &InstrInfo; }
    virtual const TargetFrameInfo *getFrameInfo() const { return &FrameInfo; }
    virtual const TargetSubtarget *getSubtargetImpl() const { return &Subtarget; }
    virtual const TargetRegisterInfo *getRegisterInfo() const {
      return &InstrInfo.getRegisterInfo();
    }
    virtual const DataLayout *getDataLayout() const { return &DataLayout; }
    static unsigned getModuleMatchQuality(const Module &M);

    // Pass Pipeline Configuration
    virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
    virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
  };

  } // end namespace llvm

The accessors declared in this header are:

* ``getInstrInfo()``
* ``getRegisterInfo()``
* ``getFrameInfo()``
* ``getDataLayout()``
* ``getSubtargetImpl()``

For some targets, you also need to support the following methods:

* ``getTargetLowering()``
* ``getJITInfo()``

Some architectures, such as GPUs, do not support jumping to an arbitrary
program location. They implement branching using masked execution and implement
loops using special instructions around the loop body. In order to avoid CFG
modifications that introduce irreducible control flow not handled by such
hardware, a target must call ``setRequiresStructuredCFG(true)`` when being
initialized.

In addition, the ``XXXTargetMachine`` constructor should specify a
``TargetDescription`` string that determines the data layout for the target
machine, including characteristics such as pointer size, alignment, and
endianness. For example, the constructor for ``SparcTargetMachine`` contains
the following:

.. code-block:: c++

  SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
    : DataLayout("E-p:32:32-f128:128:128"),
      Subtarget(M, FS), InstrInfo(Subtarget),
      FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
  }

Hyphens separate portions of the ``TargetDescription`` string.

* An upper-case "``E``" in the string indicates a big-endian target data model.
  A lower-case "``e``" indicates little-endian.

* "``p:``" is followed by pointer information: size, ABI alignment, and
  preferred alignment. If only two figures follow "``p:``", then the first
  value is pointer size, and the second value is both ABI and preferred
  alignment.

* Then a letter for numeric type alignment: "``i``", "``f``", "``v``", or
  "``a``" (corresponding to integer, floating point, vector, or aggregate).
  "``i``", "``v``", or "``a``" are followed by ABI alignment and preferred
  alignment. "``f``" is followed by three values: the first indicates the size
  of a long double, then ABI alignment, and then ABI preferred alignment.
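
As a rough illustration of how such a string decomposes, the following
standalone sketch splits a layout string on hyphens and reads the endianness
and pointer fields. It is not LLVM's ``DataLayout`` parser: ``LayoutSummary``,
``splitFields``, and ``parseLayout`` are hypothetical names, and only the
``E``/``e`` and ``p:`` fields are handled.

.. code-block:: c++

  #include <sstream>
  #include <string>
  #include <vector>

  // Hypothetical summary of the fields this sketch understands (all in bits).
  struct LayoutSummary {
    bool BigEndian = false;
    unsigned PointerSize = 0;
    unsigned PointerABIAlign = 0;
    unsigned PointerPrefAlign = 0;
  };

  // Split Text at every occurrence of Sep.
  static std::vector<std::string> splitFields(const std::string &Text, char Sep) {
    std::vector<std::string> Parts;
    std::stringstream Stream(Text);
    std::string Part;
    while (std::getline(Stream, Part, Sep))
      Parts.push_back(Part);
    return Parts;
  }

  // Illustrative only: handles "E", "e", and "p:<size>:<abi>[:<pref>]".
  LayoutSummary parseLayout(const std::string &Desc) {
    LayoutSummary Summary;
    for (const std::string &Field : splitFields(Desc, '-')) {
      if (Field == "E") {
        Summary.BigEndian = true;
      } else if (Field == "e") {
        Summary.BigEndian = false;
      } else if (!Field.empty() && Field[0] == 'p') {
        std::vector<std::string> Values = splitFields(Field, ':');
        if (Values.size() < 3)
          continue; // malformed pointer field; ignored in this sketch
        Summary.PointerSize = std::stoul(Values[1]);
        Summary.PointerABIAlign = std::stoul(Values[2]);
        // With only two figures, the second is both ABI and preferred alignment.
        Summary.PointerPrefAlign =
            Values.size() > 3 ? std::stoul(Values[3]) : Summary.PointerABIAlign;
      }
    }
    return Summary;
  }

Calling ``parseLayout("E-p:32:32-f128:128:128")`` on the SPARC string reports a
big-endian layout with 32-bit pointers whose ABI and preferred alignment are
both 32 bits.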

Target Registration
===================

You must also register your target with the ``TargetRegistry``, which is what
other LLVM tools use to look up and use your target at runtime. The
``TargetRegistry`` can be used directly, but for most targets there are helper
templates which should take care of the work for you.

All targets should declare a global ``Target`` object which is used to
represent the target during registration. Then, in the target's ``TargetInfo``
library, the target should define that object and use the ``RegisterTarget``
template to register the target. For example, the Sparc registration code
looks like this:

.. code-block:: c++

  Target llvm::getTheSparcTarget();

  extern "C" void LLVMInitializeSparcTargetInfo() {
    RegisterTarget<Triple::sparc, /*HasJIT=*/false>
      X(getTheSparcTarget(), "sparc", "Sparc");
  }

This allows the ``TargetRegistry`` to look up the target by name or by target
triple. In addition, most targets will also register additional features which
are available in separate libraries. These registration steps are separate,
because some clients may wish to only link in some parts of the target --- the
JIT code generator does not require the use of the assembly printer, for
example. Here is an example of registering the Sparc assembly printer:

.. code-block:: c++

  extern "C" void LLVMInitializeSparcAsmPrinter() {
    RegisterAsmPrinter<SparcAsmPrinter> X(getTheSparcTarget());
  }

For more information, see "`llvm/Target/TargetRegistry.h
</doxygen/TargetRegistry_8h-source.html>`_".
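
The registration idiom relies on a helper object whose constructor records the
target in a global table. The following standalone sketch mimics that idea
with a hypothetical ``MiniRegistry``. None of the names below are LLVM APIs,
and the real ``TargetRegistry`` does considerably more (for example, lookup by
target triple).

.. code-block:: c++

  #include <map>
  #include <string>

  // Hypothetical stand-in for a target description.
  struct MiniTarget {
    std::string Name;
    std::string Description;
    bool HasJIT = false;
  };

  // Hypothetical registry: maps a target name to its description.
  class MiniRegistry {
  public:
    static void add(MiniTarget &T) { table()[T.Name] = &T; }

    // Returns the registered target, or nullptr if the name is unknown.
    static const MiniTarget *lookup(const std::string &Name) {
      auto It = table().find(Name);
      return It == table().end() ? nullptr : It->second;
    }

  private:
    // A function-local static avoids static initialization order problems.
    static std::map<std::string, MiniTarget *> &table() {
      static std::map<std::string, MiniTarget *> Table;
      return Table;
    }
  };

  // Constructing this helper registers the target as a side effect.
  template <bool HasJIT> struct RegisterMiniTarget {
    RegisterMiniTarget(MiniTarget &T, const char *Name, const char *Desc) {
      T.Name = Name;
      T.Description = Desc;
      T.HasJIT = HasJIT;
      MiniRegistry::add(T);
    }
  };

  MiniTarget &getTheDummyTarget() {
    static MiniTarget TheDummyTarget;
    return TheDummyTarget;
  }

  // Analogous in spirit to an LLVMInitialize...TargetInfo() entry point.
  void initializeDummyTargetInfo() {
    RegisterMiniTarget</*HasJIT=*/false> X(getTheDummyTarget(), "dummy", "Dummy");
  }

Because each initializer is a separate function, a client in this sketch links
and calls only the pieces it needs, which is the motivation given above for
keeping the registration steps separate.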

Register Set and Register Classes
=================================

You should describe a concrete target-specific class that represents the
register file of a target machine. This class is called ``XXXRegisterInfo``
(where ``XXX`` identifies the target) and represents the register file data
that is used for register allocation. It also describes the interactions
between registers.

You also need to define register classes to categorize related registers. A
register class should be added for groups of registers that are all treated the
same way for some instruction. Typical examples are register classes for
integer, floating-point, or vector registers. The register allocator may give
an instruction any register from the specified register class, because all of
its members behave alike for that instruction. Virtual registers are allocated
to instructions from these sets, and the target-independent register allocator
then chooses the actual registers automatically.

Much of the code for registers, including register definition, register
aliases, and register classes, is generated by TableGen from
``XXXRegisterInfo.td`` input files and placed in ``XXXGenRegisterInfo.h.inc``
and ``XXXGenRegisterInfo.inc`` output files. Some of the code in the
implementation of ``XXXRegisterInfo`` requires hand-coding.

Defining a Register
-------------------

The ``XXXRegisterInfo.td`` file typically starts with register definitions for
a target machine. The ``Register`` class (specified in ``Target.td``) is used
to define an object for each register. The specified string ``n`` becomes the
``Name`` of the register. The basic ``Register`` object does not have any
subregisters and does not specify any aliases.

.. code-block:: text

  class Register<string n> {
    string Namespace = "";
    string AsmName = n;
    string Name = n;
    int SpillSize = 0;
    int SpillAlignment = 0;
    list<Register> Aliases = [];
    list<Register> SubRegs = [];
    list<int> DwarfNumbers = [];
  }

For example, in the ``X86RegisterInfo.td`` file, there are register definitions
that utilize the ``Register`` class, such as:

.. code-block:: text

  def AL : Register<"AL">, DwarfRegNum<[0, 0, 0]>;

This defines the register ``AL`` and assigns it values (with ``DwarfRegNum``)
that are used by ``gcc``, ``gdb``, or a debug information writer to identify a
register. For register ``AL``, ``DwarfRegNum`` takes an array of 3 values
representing 3 different modes: the first element is for X86-64, the second for
exception handling (EH) on X86-32, and the third is generic. -1 is a special
Dwarf number that indicates the gcc number is undefined, and -2 indicates the
register number is invalid for this mode.
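
To make the per-mode numbering concrete, here is a small standalone sketch of
a lookup that treats -1 and -2 as "no usable number". The type and function
names (``DwarfFlavour``, ``DwarfNumbers``, ``getDwarfNumber``) are hypothetical
and are not part of the code that TableGen generates.

.. code-block:: c++

  // The three modes named in the text, in DwarfRegNum order.
  enum DwarfFlavour { X86_64 = 0, X86_32_EH = 1, Generic = 2 };

  // Sentinels described in the text.
  const int DwarfUndefined = -1; // the gcc number is undefined
  const int DwarfInvalid = -2;   // the register number is invalid for this mode

  // Hypothetical per-register record holding one Dwarf number per mode.
  struct DwarfNumbers {
    int PerMode[3];
  };

  // Stores the Dwarf number for Mode in Out and returns true, or returns
  // false (leaving Out untouched) when the entry is one of the sentinels.
  bool getDwarfNumber(const DwarfNumbers &Nums, DwarfFlavour Mode, int &Out) {
    int Value = Nums.PerMode[Mode];
    if (Value == DwarfUndefined || Value == DwarfInvalid)
      return false;
    Out = Value;
    return true;
  }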

From the previously described line in the ``X86RegisterInfo.td`` file, TableGen
generates this code in the ``X86GenRegisterInfo.inc`` file:

.. code-block:: c++

  static const unsigned GR8[] = { X86::AL, ... };

  const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };

  const TargetRegisterDesc RegisterDescriptors[] = {
    ...
  { "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...

From the register info file, TableGen generates a ``TargetRegisterDesc`` object
for each register. ``TargetRegisterDesc`` is defined in
``include/llvm/Target/TargetRegisterInfo.h`` with the following fields:

.. code-block:: c++

  struct TargetRegisterDesc {
    const char     *AsmName;      // Assembly language name for the register
    const char     *Name;         // Printable name for the reg (for debugging)
    const unsigned *AliasSet;     // Register Alias Set
    const unsigned *SubRegs;      // Sub-register set
    const unsigned *ImmSubRegs;   // Immediate sub-register set
    const unsigned *SuperRegs;    // Super-register set
  };

TableGen uses the entire target description file (``.td``) to determine text
names for the register (in the ``AsmName`` and ``Name`` fields of
``TargetRegisterDesc``) and the relationships of other registers to the defined
register (in the other ``TargetRegisterDesc`` fields). In this example, other
definitions establish the registers "``AX``", "``EAX``", and "``RAX``" as
aliases for one another, so TableGen generates a null-terminated array
(``AL_AliasSet``) for this register alias set.
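
A null-terminated alias set is walked until the terminating ``0`` is reached.
The sketch below shows that traversal in isolation. The ``MiniReg`` enum and
the helper functions are hypothetical, and the register numbers are made up
for the example; the only assumption carried over from the generated code is
that ``0`` never denotes a real register.

.. code-block:: c++

  #include <cstddef>

  // Hypothetical register numbers; 0 is reserved as the list terminator.
  enum MiniReg : unsigned { NoRegister = 0, AL, AX, EAX, RAX };

  // Zero-terminated alias list, in the style of the generated AL_AliasSet.
  static const unsigned AL_AliasSet[] = {AX, EAX, RAX, 0};

  // Counts the entries before the terminating 0.
  std::size_t aliasCount(const unsigned *AliasSet) {
    std::size_t Count = 0;
    for (const unsigned *Alias = AliasSet; *Alias != 0; ++Alias)
      ++Count;
    return Count;
  }

  // Returns true if Reg appears in the zero-terminated alias list.
  bool isAlias(const unsigned *AliasSet, unsigned Reg) {
    for (const unsigned *Alias = AliasSet; *Alias != 0; ++Alias)
      if (*Alias == Reg)
        return true;
    return false;
  }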

The ``Register`` class is commonly used as a base class for more complex
classes. In ``Target.td``, the ``Register`` class is the base for the
``RegisterWithSubRegs`` class that is used to define registers that need to
specify subregisters in the ``SubRegs`` list, as shown here:

.. code-block:: text

  class RegisterWithSubRegs<string n, list<Register> subregs> : Register<n> {
    let SubRegs = subregs;
  }

In ``SparcRegisterInfo.td``, additional register classes are defined for SPARC:
a ``Register`` subclass, ``SparcReg``, and further subclasses: ``Ri``, ``Rf``,
and ``Rd``. SPARC registers are identified by 5-bit ID numbers, which is a
feature common to these subclasses. Note the use of "``let``" expressions to
override values that are initially defined in a superclass (such as the
``SubRegs`` field in the ``Rd`` class).

.. code-block:: text

  class SparcReg<string n> : Register<n> {
    field bits<5> Num;
    let Namespace = "SP";
  }
  // Ri - 32-bit integer registers
  class Ri<bits<5> num, string n> : SparcReg<n> {
    let Num = num;
  }
  // Rf - 32-bit floating-point registers
  class Rf<bits<5> num, string n> : SparcReg<n> {
    let Num = num;
  }
  // Rd - Slots in the FP register file for 64-bit floating-point values.
  class Rd<bits<5> num, string n, list<Register> subregs> : SparcReg<n> {
    let Num = num;
    let SubRegs = subregs;
  }

In the ``SparcRegisterInfo.td`` file, there are register definitions that
utilize these subclasses of ``Register``, such as:

.. code-block:: text

  def G0 : Ri< 0, "G0">, DwarfRegNum<[0]>;
  def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;
  ...
  def F0 : Rf< 0, "F0">, DwarfRegNum<[32]>;
  def F1 : Rf< 1, "F1">, DwarfRegNum<[33]>;
  ...
  def D0 : Rd< 0, "F0", [F0, F1]>, DwarfRegNum<[32]>;
  def D1 : Rd< 2, "F2", [F2, F3]>, DwarfRegNum<[34]>;

The last two registers shown above (``D0`` and ``D1``) are double-precision
floating-point registers that are aliases for pairs of single-precision
floating-point sub-registers. In addition to aliases, the sub-register and
super-register relationships of the defined register are in fields of a
register's ``TargetRegisterDesc``.
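
The pairing of double-precision registers with single-precision halves can be
expressed as simple index arithmetic. The sketch below assumes that the
pattern shown for ``D0`` and ``D1`` continues (``D<n>`` covers ``F<2n>`` and
``F<2n+1>``); the function names are hypothetical and do not correspond to
generated code.

.. code-block:: c++

  #include <utility>

  // Indices of the two single-precision sub-registers of D<DIndex>,
  // assuming D<n> covers F<2n> and F<2n+1>.
  std::pair<unsigned, unsigned> subRegsOfD(unsigned DIndex) {
    return std::make_pair(2 * DIndex, 2 * DIndex + 1);
  }

  // Index of the double-precision super-register that contains F<FIndex>.
  unsigned superRegOfF(unsigned FIndex) { return FIndex / 2; }

  // True if writing D<DIndex> would clobber F<FIndex>.
  bool overlaps(unsigned DIndex, unsigned FIndex) {
    return superRegOfF(FIndex) == DIndex;
  }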

Defining a Register Class
-------------------------

The ``RegisterClass`` class (specified in ``Target.td``) is used to define an
object that represents a group of related registers and also defines the
default allocation order of the registers. A target description file
``XXXRegisterInfo.td`` that uses ``Target.td`` can construct register classes
using the following class:

.. code-block:: text

  class RegisterClass<string namespace,
  list<ValueType> regTypes, int alignment, dag regList> {
    string Namespace = namespace;
    list<ValueType> RegTypes = regTypes;
    int Size = 0;  // spill size, in bits; zero lets tblgen pick the size
    int Alignment = alignment;

    // CopyCost is the cost of copying a value between two registers
    // default value 1 means a single instruction
    // A negative value means copying is extremely expensive or impossible
    int CopyCost = 1;
    dag MemberList = regList;

    // for register classes that are subregisters of this class
    list<RegisterClass> SubRegClassList = [];

    code MethodProtos = [{}];  // to insert arbitrary code
    code MethodBodies = [{}];
  }

To define a ``RegisterClass``, use the following 4 arguments:

* The first argument of the definition is the name of the namespace.

* The second argument is a list of ``ValueType`` register type values that are
  defined in ``include/llvm/CodeGen/ValueTypes.td``. Defined values include
  integer types (such as ``i16``, ``i32``, and ``i1`` for Boolean),
  floating-point types (``f32``, ``f64``), and vector types (for example,
  ``v8i16`` for an ``8 x i16`` vector). All registers in a ``RegisterClass``
  must have the same ``ValueType``, but some registers may store vector data in
  different configurations. For example, a register that can process a 128-bit
  vector may be able to handle 16 8-bit integer elements, 8 16-bit integers, 4
  32-bit integers, and so on.

* The third argument of the ``RegisterClass`` definition specifies the
  alignment required of the registers when they are stored or loaded to
  memory.

* The final argument, ``regList``, specifies which registers are in this class.
  If an alternative allocation order method is not specified, then ``regList``
  also defines the order of allocation used by the register allocator. Besides
  simply listing registers with ``(add R0, R1, ...)``, more advanced set
  operators are available. See ``include/llvm/Target/Target.td`` for more
  information.

In ``SparcRegisterInfo.td``, three ``RegisterClass`` objects are defined:
``FPRegs``, ``DFPRegs``, and ``IntRegs``. For all three register classes, the
first argument defines the namespace with the string "``SP``". ``FPRegs``
defines a group of 32 single-precision floating-point registers (``F0`` to
``F31``); ``DFPRegs`` defines a group of 16 double-precision registers
(``D0-D15``).

.. code-block:: text

  // F0, F1, F2, ..., F31
  def FPRegs : RegisterClass<"SP", [f32], 32, (sequence "F%u", 0, 31)>;

  def DFPRegs : RegisterClass<"SP", [f64], 64,
                              (add D0, D1, D2, D3, D4, D5, D6, D7, D8,
                                   D9, D10, D11, D12, D13, D14, D15)>;

  def IntRegs : RegisterClass<"SP", [i32], 32,
      (add L0, L1, L2, L3, L4, L5, L6, L7,
           I0, I1, I2, I3, I4, I5,
           O0, O1, O2, O3, O4, O5, O7,
           G1,
           // Non-allocatable regs:
           G2, G3, G4,
           O6,        // stack ptr
           I6,        // frame ptr
           I7,        // return address
           G0,        // constant zero
           G5, G6, G7 // reserved for kernel
      )>;

Using ``SparcRegisterInfo.td`` with TableGen generates several output files
that are intended for inclusion in other source code that you write.
``SparcRegisterInfo.td`` generates ``SparcGenRegisterInfo.h.inc``, which should
be included in the header file for the SPARC register implementation that you
write (``SparcRegisterInfo.h``). In ``SparcGenRegisterInfo.h.inc`` a new
structure is defined called ``SparcGenRegisterInfo`` that uses
``TargetRegisterInfo`` as its base. It also specifies types, based upon the
defined register classes: ``DFPRegsClass``, ``FPRegsClass``, and
``IntRegsClass``.

``SparcRegisterInfo.td`` also generates ``SparcGenRegisterInfo.inc``, which is
included at the bottom of ``SparcRegisterInfo.cpp``, the SPARC register
implementation. The code below shows only the generated integer registers and
associated register classes. The order of registers in ``IntRegs`` reflects
the order in the definition of ``IntRegs`` in the target description file.

.. code-block:: c++

  // IntRegs Register Class...
  static const unsigned IntRegs[] = {
    SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
    SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3,
    SP::I4, SP::I5, SP::O0, SP::O1, SP::O2, SP::O3,
    SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3,
    SP::G4, SP::O6, SP::I6, SP::I7, SP::G0, SP::G5,
    SP::G6, SP::G7,
  };

  // IntRegsVTs Register Class Value Types...
  static const MVT::ValueType IntRegsVTs[] = {
    MVT::i32, MVT::Other
  };

  namespace SP {   // Register class instances
    DFPRegsClass    DFPRegsRegClass;
    FPRegsClass     FPRegsRegClass;
    IntRegsClass    IntRegsRegClass;
  ...
    // IntRegs Sub-register Classes...
    static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
      NULL
    };
  ...
    // IntRegs Super-register Classes...
    static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
      NULL
    };
  ...
    // IntRegs Register Class sub-classes...
    static const TargetRegisterClass* const IntRegsSubclasses [] = {
      NULL
    };
  ...
    // IntRegs Register Class super-classes...
    static const TargetRegisterClass* const IntRegsSuperclasses [] = {
      NULL
    };

    IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
      IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
      IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
  }

The register allocators will avoid using reserved registers, and callee-saved
registers are not used until all the volatile registers have been used. That
is usually good enough, but in some cases it may be necessary to provide custom
allocation orders.
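
The default policy described above can be pictured as a scan over an ordered
list that skips reserved registers and reaches callee-saved registers last.
The following standalone sketch illustrates only that policy; it is not how
LLVM's register allocators are implemented. ``buildOrder`` and
``pickRegister`` are hypothetical helpers, and register number ``0`` is
assumed to mean "no register".

.. code-block:: c++

  #include <set>
  #include <vector>

  // Builds an order that lists volatile (caller-saved) registers first, so
  // callee-saved registers are only reached once the volatile ones are taken.
  std::vector<unsigned> buildOrder(const std::vector<unsigned> &Volatile,
                                   const std::vector<unsigned> &CalleeSaved) {
    std::vector<unsigned> Order(Volatile);
    Order.insert(Order.end(), CalleeSaved.begin(), CalleeSaved.end());
    return Order;
  }

  // Picks the first register in Order that is neither reserved nor in use.
  // Returns 0 when no register is available.
  unsigned pickRegister(const std::vector<unsigned> &Order,
                        const std::set<unsigned> &Reserved,
                        const std::set<unsigned> &InUse) {
    for (unsigned Reg : Order) {
      if (Reserved.count(Reg) || InUse.count(Reg))
        continue;
      return Reg;
    }
    return 0;
  }

A custom allocation order, in this picture, amounts to passing a different
``Order`` vector to ``pickRegister``.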

Implement a subclass of ``TargetRegisterInfo``
----------------------------------------------

The final step is to hand code portions of ``XXXRegisterInfo``, which
implements the interface described in ``TargetRegisterInfo.h`` (see
:ref:`TargetRegisterInfo`). These functions return ``0``, ``NULL``, or
``false``, unless overridden. Here is a list of functions that are overridden
for the SPARC implementation in ``SparcRegisterInfo.cpp``:

* ``getCalleeSavedRegs`` --- Returns a list of callee-saved registers in the
  order of the desired callee-save stack frame offset.

* ``getReservedRegs`` --- Returns a bitset indexed by physical register
  numbers, indicating if a particular register is unavailable.

* ``hasFP`` --- Returns a Boolean indicating if a function should have a
  dedicated frame pointer register.

* ``eliminateCallFramePseudoInstr`` --- If call frame setup or destroy pseudo
  instructions are used, this can be called to eliminate them.

* ``eliminateFrameIndex`` --- Eliminates abstract frame indices from
  instructions that may use them.

* ``emitPrologue`` --- Inserts prologue code into the function.

* ``emitEpilogue`` --- Inserts epilogue code into the function.
|
|
653
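As a rough illustration of the bitset that ``getReservedRegs`` returns, the
following standalone sketch marks a few registers as unavailable. The register
enum and the choice of reserved registers are hypothetical and do not reproduce
``SparcRegisterInfo.cpp``; ``std::bitset`` is used here only to keep the
example free of LLVM headers.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical physical register numbers for a toy target.
enum ToyReg : std::size_t { R0, R1, R2, SP, FP, NumToyRegs };

// Returns a bitset indexed by physical register number; a set bit means
// the register allocator must not use that register.
std::bitset<NumToyRegs> getReservedRegs() {
  std::bitset<NumToyRegs> Reserved;
  Reserved.set(SP); // stack pointer (assumed reserved for this toy target)
  Reserved.set(FP); // frame pointer (assumed reserved for this toy target)
  return Reserved;
}

// Convenience query: a register is allocatable when its bit is clear.
bool isAllocatable(ToyReg Reg) { return !getReservedRegs().test(Reg); }
```

A register allocator would consult such a bitset before assigning a physical
register, which is why the bits are indexed by register number.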
|
|
654 .. _instruction-set:
|
|
655
|
|
656 Instruction Set
|
|
657 ===============
|
|
658
|
|
659 During the early stages of code generation, the LLVM IR code is converted to a
|
|
660 ``SelectionDAG`` with nodes that are instances of the ``SDNode`` class
|
|
containing target instructions. An ``SDNode`` has an opcode, operands, type
requirements, and operation properties (for example, whether an operation is
commutative or whether it loads from memory). The various operation node
|
|
664 types are described in the ``include/llvm/CodeGen/SelectionDAGNodes.h`` file
|
|
665 (values of the ``NodeType`` enum in the ``ISD`` namespace).
|
|
666
|
|
667 TableGen uses the following target description (``.td``) input files to
|
|
668 generate much of the code for instruction definition:
|
|
669
|
|
670 * ``Target.td`` --- Where the ``Instruction``, ``Operand``, ``InstrInfo``, and
|
|
671 other fundamental classes are defined.
|
|
672
|
|
673 * ``TargetSelectionDAG.td`` --- Used by ``SelectionDAG`` instruction selection
|
|
674 generators, contains ``SDTC*`` classes (selection DAG type constraint),
|
|
675 definitions of ``SelectionDAG`` nodes (such as ``imm``, ``cond``, ``bb``,
|
|
676 ``add``, ``fadd``, ``sub``), and pattern support (``Pattern``, ``Pat``,
|
|
  ``PatFrag``, ``PatLeaf``, ``ComplexPattern``).
|
|
678
|
|
679 * ``XXXInstrFormats.td`` --- Patterns for definitions of target-specific
|
|
680 instructions.
|
|
681
|
|
682 * ``XXXInstrInfo.td`` --- Target-specific definitions of instruction templates,
|
|
683 condition codes, and instructions of an instruction set. For architecture
|
|
684 modifications, a different file name may be used. For example, for Pentium
|
|
685 with SSE instruction, this file is ``X86InstrSSE.td``, and for Pentium with
|
|
686 MMX, this file is ``X86InstrMMX.td``.
|
|
687
|
|
688 There is also a target-specific ``XXX.td`` file, where ``XXX`` is the name of
|
|
689 the target. The ``XXX.td`` file includes the other ``.td`` input files, but
|
|
690 its contents are only directly important for subtargets.
|
|
691
|
|
692 You should describe a concrete target-specific class ``XXXInstrInfo`` that
|
|
693 represents machine instructions supported by a target machine.
|
|
694 ``XXXInstrInfo`` contains an array of ``XXXInstrDescriptor`` objects, each of
|
|
695 which describes one instruction. An instruction descriptor defines:
|
|
696
|
|
697 * Opcode mnemonic
|
|
698 * Number of operands
|
|
699 * List of implicit register definitions and uses
|
|
700 * Target-independent properties (such as memory access, is commutable)
|
|
701 * Target-specific flags
|
|
702
|
|
The ``Instruction`` class (defined in ``Target.td``) is mostly used as a base for
|
|
704 more complex instruction classes.
|
|
705
|
|
706 .. code-block:: text
|
|
707
|
|
708 class Instruction {
|
|
709 string Namespace = "";
|
|
710 dag OutOperandList; // A dag containing the MI def operand list.
|
|
711 dag InOperandList; // A dag containing the MI use operand list.
|
|
712 string AsmString = ""; // The .s format to print the instruction with.
|
|
713 list<dag> Pattern; // Set to the DAG pattern for this instruction.
|
|
714 list<Register> Uses = [];
|
|
715 list<Register> Defs = [];
|
|
716 list<Predicate> Predicates = []; // predicates turned into isel match code
|
|
717 ... remainder not shown for space ...
|
|
718 }
|
|
719
|
|
720 A ``SelectionDAG`` node (``SDNode``) should contain an object representing a
|
|
721 target-specific instruction that is defined in ``XXXInstrInfo.td``. The
|
|
722 instruction objects should represent instructions from the architecture manual
|
|
723 of the target machine (such as the SPARC Architecture Manual for the SPARC
|
|
724 target).
|
|
725
|
|
726 A single instruction from the architecture manual is often modeled as multiple
|
|
727 target instructions, depending upon its operands. For example, a manual might
|
|
728 describe an add instruction that takes a register or an immediate operand. An
|
|
729 LLVM target could model this with two instructions named ``ADDri`` and
|
|
730 ``ADDrr``.
|
|
731
|
|
732 You should define a class for each instruction category and define each opcode
|
|
733 as a subclass of the category with appropriate parameters such as the fixed
|
|
734 binary encoding of opcodes and extended opcodes. You should map the register
|
|
735 bits to the bits of the instruction in which they are encoded (for the JIT).
|
|
736 Also you should specify how the instruction should be printed when the
|
|
737 automatic assembly printer is used.
|
|
738
|
|
739 As is described in the SPARC Architecture Manual, Version 8, there are three
|
|
740 major 32-bit formats for instructions. Format 1 is only for the ``CALL``
|
|
741 instruction. Format 2 is for branch on condition codes and ``SETHI`` (set high
|
|
742 bits of a register) instructions. Format 3 is for other instructions.
|
|
743
|
|
Each of these formats has corresponding classes in ``SparcInstrFormats.td``.
``InstSP`` is a base class for other instruction classes. Additional base
classes are specified for more precise formats: for example in
``SparcInstrFormats.td``, ``F2_1`` is for ``SETHI``, and ``F2_2`` is for
|
|
748 branches. There are three other base classes: ``F3_1`` for register/register
|
|
749 operations, ``F3_2`` for register/immediate operations, and ``F3_3`` for
|
|
750 floating-point operations. ``SparcInstrInfo.td`` also adds the base class
|
|
751 ``Pseudo`` for synthetic SPARC instructions.
|
|
752
|
|
753 ``SparcInstrInfo.td`` largely consists of operand and instruction definitions
|
|
754 for the SPARC target. In ``SparcInstrInfo.td``, the following target
|
|
755 description file entry, ``LDrr``, defines the Load Integer instruction for a
|
|
756 Word (the ``LD`` SPARC opcode) from a memory address to a register. The first
|
|
757 parameter, the value 3 (``11``\ :sub:`2`), is the operation value for this
|
|
758 category of operation. The second parameter (``000000``\ :sub:`2`) is the
|
|
759 specific operation value for ``LD``/Load Word. The third parameter is the
|
|
760 output destination, which is a register operand and defined in the ``Register``
|
|
761 target description file (``IntRegs``).
|
|
762
|
|
763 .. code-block:: text
|
|
764
|
|
765 def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
|
|
766 "ld [$addr], $dst",
|
|
767 [(set i32:$dst, (load ADDRrr:$addr))]>;
|
|
768
|
|
769 The fourth parameter is the input source, which uses the address operand
|
|
770 ``MEMrr`` that is defined earlier in ``SparcInstrInfo.td``:
|
|
771
|
|
772 .. code-block:: text
|
|
773
|
|
774 def MEMrr : Operand<i32> {
|
|
775 let PrintMethod = "printMemOperand";
|
|
776 let MIOperandInfo = (ops IntRegs, IntRegs);
|
|
777 }
|
|
778
|
|
779 The fifth parameter is a string that is used by the assembly printer and can be
|
|
780 left as an empty string until the assembly printer interface is implemented.
|
|
781 The sixth and final parameter is the pattern used to match the instruction
|
|
782 during the SelectionDAG Select Phase described in :doc:`CodeGenerator`.
|
|
783 This parameter is detailed in the next section, :ref:`instruction-selector`.
|
|
784
|
|
785 Instruction class definitions are not overloaded for different operand types,
|
|
786 so separate versions of instructions are needed for register, memory, or
|
|
immediate value operands. For example, to perform a Load Integer instruction
for a Word from an address formed by a register and an immediate offset, the
following instruction class is defined:
|
|
790
|
|
791 .. code-block:: text
|
|
792
|
|
793 def LDri : F3_2 <3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
|
|
794 "ld [$addr], $dst",
|
|
795 [(set i32:$dst, (load ADDRri:$addr))]>;
|
|
796
|
|
797 Writing these definitions for so many similar instructions can involve a lot of
|
|
798 cut and paste. In ``.td`` files, the ``multiclass`` directive enables the
|
|
799 creation of templates to define several instruction classes at once (using the
|
|
800 ``defm`` directive). For example in ``SparcInstrInfo.td``, the ``multiclass``
|
|
801 pattern ``F3_12`` is defined to create 2 instruction classes each time
|
|
802 ``F3_12`` is invoked:
|
|
803
|
|
804 .. code-block:: text
|
|
805
|
|
806 multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
|
|
807 def rr : F3_1 <2, Op3Val,
|
|
808 (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
|
|
809 !strconcat(OpcStr, " $b, $c, $dst"),
|
|
810 [(set i32:$dst, (OpNode i32:$b, i32:$c))]>;
|
|
811 def ri : F3_2 <2, Op3Val,
|
|
812 (outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
|
|
813 !strconcat(OpcStr, " $b, $c, $dst"),
|
|
814 [(set i32:$dst, (OpNode i32:$b, simm13:$c))]>;
|
|
815 }
|
|
816
|
|
817 So when the ``defm`` directive is used for the ``XOR`` and ``ADD``
|
|
818 instructions, as seen below, it creates four instruction objects: ``XORrr``,
|
|
819 ``XORri``, ``ADDrr``, and ``ADDri``.
|
|
820
|
|
821 .. code-block:: text
|
|
822
|
|
823 defm XOR : F3_12<"xor", 0b000011, xor>;
|
|
824 defm ADD : F3_12<"add", 0b000000, add>;
|
|
825
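The naming effect of ``defm`` can be pictured with a small standalone sketch.
The function ``expandF3_12`` is hypothetical and only models how record names
are formed (the ``defm`` name followed by the name of each ``def`` inside the
``multiclass``); it is not how TableGen itself is implemented.

```cpp
#include <string>
#include <vector>

// Mimics the naming effect of `defm NAME : F3_12<...>`: each `def` inside
// the multiclass contributes one record whose name is NAME + suffix.
std::vector<std::string> expandF3_12(const std::string &DefmName) {
  const char *Suffixes[] = {"rr", "ri"}; // the two defs inside F3_12
  std::vector<std::string> Records;
  for (const char *Suffix : Suffixes)
    Records.push_back(DefmName + Suffix);
  return Records;
}
```

Calling ``expandF3_12("XOR")`` and ``expandF3_12("ADD")`` produces the four
names listed above.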
|
|
826 ``SparcInstrInfo.td`` also includes definitions for condition codes that are
|
|
referenced by branch instructions. The following definitions in
``SparcInstrInfo.td`` give the encoded value of each SPARC condition code.
For example, the value 10 represents the "greater than" condition for
integers, and the value 22 represents the "greater than" condition for
floats.
|
|
832
|
|
833 .. code-block:: text
|
|
834
|
|
835 def ICC_NE : ICC_VAL< 9>; // Not Equal
|
|
836 def ICC_E : ICC_VAL< 1>; // Equal
|
|
837 def ICC_G : ICC_VAL<10>; // Greater
|
|
838 ...
|
|
839 def FCC_U : FCC_VAL<23>; // Unordered
|
|
840 def FCC_G : FCC_VAL<22>; // Greater
|
|
841 def FCC_UG : FCC_VAL<21>; // Unordered or Greater
|
|
842 ...
|
|
843
|
|
844 (Note that ``Sparc.h`` also defines enums that correspond to the same SPARC
|
|
845 condition codes. Care must be taken to ensure the values in ``Sparc.h``
|
|
846 correspond to the values in ``SparcInstrInfo.td``. I.e., ``SPCC::ICC_NE = 9``,
|
|
847 ``SPCC::FCC_U = 23`` and so on.)
|
|
848
|
|
849 Instruction Operand Mapping
|
|
850 ---------------------------
|
|
851
|
|
852 The code generator backend maps instruction operands to fields in the
|
|
853 instruction. Operands are assigned to unbound fields in the instruction in the
|
|
854 order they are defined. Fields are bound when they are assigned a value. For
|
|
855 example, the Sparc target defines the ``XNORrr`` instruction as a ``F3_1``
|
|
856 format instruction having three operands.
|
|
857
|
|
858 .. code-block:: text
|
|
859
|
|
860 def XNORrr : F3_1<2, 0b000111,
|
|
861 (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
|
|
862 "xnor $b, $c, $dst",
|
|
863 [(set i32:$dst, (not (xor i32:$b, i32:$c)))]>;
|
|
864
|
|
865 The instruction templates in ``SparcInstrFormats.td`` show the base class for
|
|
866 ``F3_1`` is ``InstSP``.
|
|
867
|
|
868 .. code-block:: text
|
|
869
|
|
870 class InstSP<dag outs, dag ins, string asmstr, list<dag> pattern> : Instruction {
|
|
871 field bits<32> Inst;
|
|
872 let Namespace = "SP";
|
|
873 bits<2> op;
|
|
874 let Inst{31-30} = op;
|
|
875 dag OutOperandList = outs;
|
|
876 dag InOperandList = ins;
|
|
877 let AsmString = asmstr;
|
|
878 let Pattern = pattern;
|
|
879 }
|
|
880
|
|
881 ``InstSP`` leaves the ``op`` field unbound.
|
|
882
|
|
883 .. code-block:: text
|
|
884
|
|
885 class F3<dag outs, dag ins, string asmstr, list<dag> pattern>
|
|
886 : InstSP<outs, ins, asmstr, pattern> {
|
|
887 bits<5> rd;
|
|
888 bits<6> op3;
|
|
889 bits<5> rs1;
|
|
890 let op{1} = 1; // Op = 2 or 3
|
|
891 let Inst{29-25} = rd;
|
|
892 let Inst{24-19} = op3;
|
|
893 let Inst{18-14} = rs1;
|
|
894 }
|
|
895
|
|
``F3`` binds bit 1 of the ``op`` field and defines the ``rd``, ``op3``, and ``rs1``
fields. ``F3`` format instructions will bind operands to the ``rd``, ``op3``, and
``rs1`` fields.
|
|
899
|
|
900 .. code-block:: text
|
|
901
|
|
902 class F3_1<bits<2> opVal, bits<6> op3val, dag outs, dag ins,
|
|
903 string asmstr, list<dag> pattern> : F3<outs, ins, asmstr, pattern> {
|
|
904 bits<8> asi = 0; // asi not currently used
|
|
905 bits<5> rs2;
|
|
906 let op = opVal;
|
|
907 let op3 = op3val;
|
|
908 let Inst{13} = 0; // i field = 0
|
|
909 let Inst{12-5} = asi; // address space identifier
|
|
910 let Inst{4-0} = rs2;
|
|
911 }
|
|
912
|
|
``F3_1`` binds the ``op`` and ``op3`` fields and defines the ``rs2`` field. ``F3_1``
|
|
914 format instructions will bind the operands to the ``rd``, ``rs1``, and ``rs2``
|
|
915 fields. This results in the ``XNORrr`` instruction binding ``$dst``, ``$b``,
|
|
916 and ``$c`` operands to the ``rd``, ``rs1``, and ``rs2`` fields respectively.
|
|
917
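To make the field layout concrete, the sketch below packs the fields of a
format-3 register/register instruction into a 32-bit word using the bit
positions from the ``InstSP``, ``F3``, and ``F3_1`` classes. The function name
``encodeF3_1`` and the register numbers in the usage note are assumptions made
for illustration; the real encoder is generated by TableGen.

```cpp
#include <cassert>
#include <cstdint>

// Packs the fields of a SPARC format-3 register/register instruction using
// the bit positions given in the InstSP, F3, and F3_1 class definitions.
std::uint32_t encodeF3_1(unsigned Op, unsigned Op3, unsigned Rd,
                         unsigned Rs1, unsigned Rs2, unsigned Asi = 0) {
  // Each value must fit in the width of its field.
  assert(Op < 4 && Op3 < 64 && Rd < 32 && Rs1 < 32 && Rs2 < 32 && Asi < 256);
  std::uint32_t Inst = 0;
  Inst |= std::uint32_t(Op) << 30;  // Inst{31-30} = op
  Inst |= std::uint32_t(Rd) << 25;  // Inst{29-25} = rd
  Inst |= std::uint32_t(Op3) << 19; // Inst{24-19} = op3
  Inst |= std::uint32_t(Rs1) << 14; // Inst{18-14} = rs1
  // Inst{13} (the i field) stays 0 for the register/register form.
  Inst |= std::uint32_t(Asi) << 5;  // Inst{12-5} = asi
  Inst |= std::uint32_t(Rs2);       // Inst{4-0} = rs2
  return Inst;
}
```

With ``op = 2`` and ``op3 = 0b000111`` from the ``XNORrr`` definition, and
illustrative register numbers ``rd = 3``, ``rs1 = 1``, and ``rs2 = 2``,
``encodeF3_1(2, 7, 3, 1, 2)`` yields ``0x86384002``.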
|
|
918 Instruction Operand Name Mapping
|
|
919 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
920
|
|
TableGen will also generate a function called ``getNamedOperandIdx()``, which
can be used to look up an operand's index in a ``MachineInstr`` based on its
TableGen name. Setting the ``UseNamedOperandTable`` bit in an instruction's
TableGen definition will add all of its operands to an enumeration in the
``llvm::XXX::OpName`` namespace and also add an entry for it into the ``OperandMap``
table, which can be queried using ``getNamedOperandIdx()``.
|
|
927
|
|
928 .. code-block:: text
|
|
929
|
|
930 int DstIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::dst); // => 0
|
|
931 int BIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::b); // => 1
|
|
932 int CIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::c); // => 2
|
|
933 int DIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::d); // => -1
|
|
934
|
|
935 ...
|
|
936
|
|
The entries in the ``OpName`` enum are taken verbatim from the TableGen definitions,
so operands with lowercase names will have lowercase entries in the enum.
|
|
939
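The lookup can be thought of as a table indexed by opcode and operand name.
The following standalone sketch is a simplified model: the ``Toy`` namespace,
its enums, and the table layout are hypothetical, and the table that TableGen
actually generates may be organized differently.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the generated opcode and OpName enums.
namespace Toy {
enum Opcode : std::uint16_t { XNORrr, NumOpcodes };
namespace OpName {
enum Name : std::uint16_t { dst, b, c, d, NumNames };
} // namespace OpName

// One row per opcode, one column per operand name; -1 marks a name that
// the instruction does not have.
static const std::int16_t OperandMap[NumOpcodes][OpName::NumNames] = {
    /* XNORrr */ {0, 1, 2, -1},
};

// Returns the operand index for the given name, or -1 if the instruction
// has no operand with that name.
std::int16_t getNamedOperandIdx(std::uint16_t Opc, std::uint16_t NamedIdx) {
  if (Opc >= NumOpcodes || NamedIdx >= OpName::NumNames)
    return -1;
  return OperandMap[Opc][NamedIdx];
}
} // namespace Toy
```

Returning ``-1`` for a missing name lets callers test for the presence of an
operand before using its index.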
|
|
940 To include the getNamedOperandIdx() function in your backend, you will need
|
|
941 to define a few preprocessor macros in XXXInstrInfo.cpp and XXXInstrInfo.h.
|
|
942 For example:
|
|
943
|
|
944 XXXInstrInfo.cpp:
|
|
945
|
|
946 .. code-block:: c++
|
|
947
|
|
948 #define GET_INSTRINFO_NAMED_OPS // For getNamedOperandIdx() function
|
|
949 #include "XXXGenInstrInfo.inc"
|
|
950
|
|
951 XXXInstrInfo.h:
|
|
952
|
|
953 .. code-block:: c++
|
|
954
|
|
955 #define GET_INSTRINFO_OPERAND_ENUM // For OpName enum
|
|
956 #include "XXXGenInstrInfo.inc"
|
|
957
|
|
958 namespace XXX {
|
|
959 int16_t getNamedOperandIdx(uint16_t Opcode, uint16_t NamedIndex);
|
|
960 } // End namespace XXX
|
|
961
|
|
962 Instruction Operand Types
|
|
963 ^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
964
|
|
965 TableGen will also generate an enumeration consisting of all named Operand
|
|
966 types defined in the backend, in the llvm::XXX::OpTypes namespace.
|
|
967 Some common immediate Operand types (for instance i8, i32, i64, f32, f64)
|
|
968 are defined for all targets in ``include/llvm/Target/Target.td``, and are
|
|
969 available in each Target's OpTypes enum. Also, only named Operand types appear
|
|
970 in the enumeration: anonymous types are ignored.
|
|
971 For example, the X86 backend defines ``brtarget`` and ``brtarget8``, both
|
|
972 instances of the TableGen ``Operand`` class, which represent branch target
|
|
973 operands:
|
|
974
|
|
975 .. code-block:: text
|
|
976
|
|
977 def brtarget : Operand<OtherVT>;
|
|
978 def brtarget8 : Operand<OtherVT>;
|
|
979
|
|
980 This results in:
|
|
981
|
|
982 .. code-block:: c++
|
|
983
|
|
  namespace X86 {
  namespace OpTypes {
  enum OperandType {
    ...
    brtarget,
    brtarget8,
    ...
    i32imm,
    i64imm,
    ...
    OPERAND_TYPE_LIST_END
  };
  } // End namespace OpTypes
  } // End namespace X86
|
|
997
|
|
998 In typical TableGen fashion, to use the enum, you will need to define a
|
|
999 preprocessor macro:
|
|
1000
|
|
1001 .. code-block:: c++
|
|
1002
|
|
1003 #define GET_INSTRINFO_OPERAND_TYPES_ENUM // For OpTypes enum
|
|
1004 #include "XXXGenInstrInfo.inc"
|
|
1005
|
|
1006
|
|
1007 Instruction Scheduling
|
|
1008 ----------------------
|
|
1009
|
|
Instruction itineraries can be queried using ``MCDesc::getSchedClass()``. The
value can be named by an enumeration in the ``llvm::XXX::Sched`` namespace generated
by TableGen in ``XXXGenInstrInfo.inc``. The names of the schedule classes are
the same as provided in ``XXXSchedule.td`` plus a default ``NoItinerary`` class.
|
|
1014
|
|
1015 The schedule models are generated by TableGen by the SubtargetEmitter,
|
|
1016 using the ``CodeGenSchedModels`` class. This is distinct from the itinerary
|
|
1017 method of specifying machine resource use. The tool ``utils/schedcover.py``
|
|
1018 can be used to determine which instructions have been covered by the
|
|
1019 schedule model description and which haven't. The first step is to use the
|
|
1020 instructions below to create an output file. Then run ``schedcover.py`` on the
|
|
1021 output file:
|
|
1022
|
|
1023 .. code-block:: shell
|
|
1024
|
|
1025 $ <src>/utils/schedcover.py <build>/lib/Target/AArch64/tblGenSubtarget.with
|
|
1026 instruction, default, CortexA53Model, CortexA57Model, CycloneModel, ExynosM3Model, FalkorModel, KryoModel, ThunderX2T99Model, ThunderXT8XModel
|
|
1027 ABSv16i8, WriteV, , , CyWriteV3, M3WriteNMISC1, FalkorWr_2VXVY_2cyc, KryoWrite_2cyc_XY_XY_150ln, ,
|
|
1028 ABSv1i64, WriteV, , , CyWriteV3, M3WriteNMISC1, FalkorWr_1VXVY_2cyc, KryoWrite_2cyc_XY_noRSV_67ln, ,
|
|
1029 ...
|
|
1030
|
|
To capture the debug output from generating a schedule model, change to the
appropriate target directory and use the following command with the
``subtarget-emitter`` debug option:
|
|
1034
|
|
1035 .. code-block:: shell
|
|
1036
|
|
1037 $ <build>/bin/llvm-tblgen -debug-only=subtarget-emitter -gen-subtarget \
|
|
1038 -I <src>/lib/Target/<target> -I <src>/include \
|
|
1039 -I <src>/lib/Target <src>/lib/Target/<target>/<target>.td \
|
|
1040 -o <build>/lib/Target/<target>/<target>GenSubtargetInfo.inc.tmp \
|
|
1041 > tblGenSubtarget.dbg 2>&1
|
|
1042
|
|
Where ``<build>`` is the build directory, ``<src>`` is the source directory,
|
|
1044 and ``<target>`` is the name of the target.
|
|
1045 To double check that the above command is what is needed, one can capture the
|
|
1046 exact TableGen command from a build by using:
|
|
1047
|
|
1048 .. code-block:: shell
|
|
1049
|
|
1050 $ VERBOSE=1 make ...
|
|
1051
|
|
1052 and search for ``llvm-tblgen`` commands in the output.
|
|
1053
|
|
1054
|
|
1055 Instruction Relation Mapping
|
|
1056 ----------------------------
|
|
1057
|
|
1058 This TableGen feature is used to relate instructions with each other. It is
|
|
1059 particularly useful when you have multiple instruction formats and need to
|
|
1060 switch between them after instruction selection. This entire feature is driven
|
|
1061 by relation models which can be defined in ``XXXInstrInfo.td`` files
|
|
1062 according to the target-specific instruction set. Relation models are defined
|
|
1063 using ``InstrMapping`` class as a base. TableGen parses all the models
|
|
1064 and generates instruction relation maps using the specified information.
|
|
1065 Relation maps are emitted as tables in the ``XXXGenInstrInfo.inc`` file
|
|
along with the functions to query them. For detailed information on how to
|
|
1067 use this feature, please refer to :doc:`HowToUseInstrMappings`.
|
|
1068
|
|
1069 Implement a subclass of ``TargetInstrInfo``
|
|
1070 -------------------------------------------
|
|
1071
|
|
1072 The final step is to hand code portions of ``XXXInstrInfo``, which implements
|
|
1073 the interface described in ``TargetInstrInfo.h`` (see :ref:`TargetInstrInfo`).
|
|
1074 These functions return ``0`` or a Boolean or they assert, unless overridden.
|
|
1075 Here's a list of functions that are overridden for the SPARC implementation in
|
|
1076 ``SparcInstrInfo.cpp``:
|
|
1077
|
|
1078 * ``isLoadFromStackSlot`` --- If the specified machine instruction is a direct
|
|
1079 load from a stack slot, return the register number of the destination and the
|
|
1080 ``FrameIndex`` of the stack slot.
|
|
1081
|
|
* ``isStoreToStackSlot`` --- If the specified machine instruction is a direct
  store to a stack slot, return the register number of the source and the
  ``FrameIndex`` of the stack slot.
|
|
1085
|
|
1086 * ``copyPhysReg`` --- Copy values between a pair of physical registers.
|
|
1087
|
|
1088 * ``storeRegToStackSlot`` --- Store a register value to a stack slot.
|
|
1089
|
|
1090 * ``loadRegFromStackSlot`` --- Load a register value from a stack slot.
|
|
1091
|
|
1092 * ``storeRegToAddr`` --- Store a register value to memory.
|
|
1093
|
|
1094 * ``loadRegFromAddr`` --- Load a register value from memory.
|
|
1095
|
|
* ``foldMemoryOperand`` --- Attempt to fold a load or store into the specified
  machine instruction for the specified operand(s).
|
|
1098
|
|
1099 Branch Folding and If Conversion
|
|
1100 --------------------------------
|
|
1101
|
|
1102 Performance can be improved by combining instructions or by eliminating
|
|
1103 instructions that are never reached. The ``analyzeBranch`` method in
|
|
1104 ``XXXInstrInfo`` may be implemented to examine conditional instructions and
|
|
1105 remove unnecessary instructions. ``analyzeBranch`` looks at the end of a
|
|
1106 machine basic block (MBB) for opportunities for improvement, such as branch
|
|
1107 folding and if conversion. The ``BranchFolder`` and ``IfConverter`` machine
|
|
1108 function passes (see the source files ``BranchFolding.cpp`` and
|
|
1109 ``IfConversion.cpp`` in the ``lib/CodeGen`` directory) call ``analyzeBranch``
|
|
1110 to improve the control flow graph that represents the instructions.
|
|
1111
|
|
1112 Several implementations of ``analyzeBranch`` (for ARM, Alpha, and X86) can be
|
|
1113 examined as models for your own ``analyzeBranch`` implementation. Since SPARC
|
|
1114 does not implement a useful ``analyzeBranch``, the ARM target implementation is
|
|
1115 shown below.
|
|
1116
|
|
1117 ``analyzeBranch`` returns a Boolean value and takes four parameters:
|
|
1118
|
|
1119 * ``MachineBasicBlock &MBB`` --- The incoming block to be examined.
|
|
1120
|
|
1121 * ``MachineBasicBlock *&TBB`` --- A destination block that is returned. For a
|
|
1122 conditional branch that evaluates to true, ``TBB`` is the destination.
|
|
1123
|
|
1124 * ``MachineBasicBlock *&FBB`` --- For a conditional branch that evaluates to
|
|
1125 false, ``FBB`` is returned as the destination.
|
|
1126
|
|
1127 * ``std::vector<MachineOperand> &Cond`` --- List of operands to evaluate a
|
|
1128 condition for a conditional branch.
|
|
1129
|
|
1130 In the simplest case, if a block ends without a branch, then it falls through
|
|
1131 to the successor block. No destination blocks are specified for either ``TBB``
|
|
1132 or ``FBB``, so both parameters return ``NULL``. The start of the
|
|
1133 ``analyzeBranch`` (see code below for the ARM target) shows the function
|
|
1134 parameters and the code for the simplest case.
|
|
1135
|
|
1136 .. code-block:: c++
|
|
1137
|
|
1138 bool ARMInstrInfo::analyzeBranch(MachineBasicBlock &MBB,
|
|
1139 MachineBasicBlock *&TBB,
|
|
1140 MachineBasicBlock *&FBB,
|
|
1141 std::vector<MachineOperand> &Cond) const
|
|
1142 {
|
|
1143 MachineBasicBlock::iterator I = MBB.end();
|
|
1144 if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
|
|
1145 return false;
|
|
1146
|
|
1147 If a block ends with a single unconditional branch instruction, then
|
|
1148 ``analyzeBranch`` (shown below) should return the destination of that branch in
|
|
1149 the ``TBB`` parameter.
|
|
1150
|
|
1151 .. code-block:: c++
|
|
1152
|
|
1153 if (LastOpc == ARM::B || LastOpc == ARM::tB) {
|
|
1154 TBB = LastInst->getOperand(0).getMBB();
|
|
1155 return false;
|
|
1156 }
|
|
1157
|
|
1158 If a block ends with two unconditional branches, then the second branch is
|
|
1159 never reached. In that situation, as shown below, remove the last branch
|
|
1160 instruction and return the penultimate branch in the ``TBB`` parameter.
|
|
1161
|
|
1162 .. code-block:: c++
|
|
1163
|
|
1164 if ((SecondLastOpc == ARM::B || SecondLastOpc == ARM::tB) &&
|
|
1165 (LastOpc == ARM::B || LastOpc == ARM::tB)) {
|
|
1166 TBB = SecondLastInst->getOperand(0).getMBB();
|
|
1167 I = LastInst;
|
|
1168 I->eraseFromParent();
|
|
1169 return false;
|
|
1170 }
|
|
1171
|
|
A block may end with a single conditional branch instruction that falls through
to the successor block if the condition evaluates to false. In that case,
|
|
1174 ``analyzeBranch`` (shown below) should return the destination of that
|
|
1175 conditional branch in the ``TBB`` parameter and a list of operands in the
|
|
1176 ``Cond`` parameter to evaluate the condition.
|
|
1177
|
|
1178 .. code-block:: c++
|
|
1179
|
|
1180 if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
|
|
1181 // Block ends with fall-through condbranch.
|
|
1182 TBB = LastInst->getOperand(0).getMBB();
|
|
1183 Cond.push_back(LastInst->getOperand(1));
|
|
1184 Cond.push_back(LastInst->getOperand(2));
|
|
1185 return false;
|
|
1186 }
|
|
1187
|
|
1188 If a block ends with both a conditional branch and an ensuing unconditional
|
|
1189 branch, then ``analyzeBranch`` (shown below) should return the conditional
|
|
1190 branch destination (assuming it corresponds to a conditional evaluation of
|
|
1191 "``true``") in the ``TBB`` parameter and the unconditional branch destination
|
|
1192 in the ``FBB`` (corresponding to a conditional evaluation of "``false``"). A
|
|
1193 list of operands to evaluate the condition should be returned in the ``Cond``
|
|
1194 parameter.
|
|
1195
|
|
1196 .. code-block:: c++
|
|
1197
|
|
1198 unsigned SecondLastOpc = SecondLastInst->getOpcode();
|
|
1199
|
|
1200 if ((SecondLastOpc == ARM::Bcc && LastOpc == ARM::B) ||
|
|
1201 (SecondLastOpc == ARM::tBcc && LastOpc == ARM::tB)) {
|
|
1202 TBB = SecondLastInst->getOperand(0).getMBB();
|
|
1203 Cond.push_back(SecondLastInst->getOperand(1));
|
|
1204 Cond.push_back(SecondLastInst->getOperand(2));
|
|
1205 FBB = LastInst->getOperand(0).getMBB();
|
|
1206 return false;
|
|
1207 }
|
|
1208
|
|
1209 For the last two cases (ending with a single conditional branch or ending with
|
|
1210 one conditional and one unconditional branch), the operands returned in the
|
|
1211 ``Cond`` parameter can be passed to methods of other instructions to create new
|
|
1212 branches or perform other operations. An implementation of ``analyzeBranch``
|
|
1213 requires the helper methods ``removeBranch`` and ``insertBranch`` to manage
|
|
1214 subsequent operations.
|
|
1215
|
|
1216 ``analyzeBranch`` should return false indicating success in most circumstances.
|
|
1217 ``analyzeBranch`` should only return true when the method is stumped about what
|
|
1218 to do, for example, if a block has three terminating branches.
|
|
1219 ``analyzeBranch`` may return true if it encounters a terminator it cannot
|
|
1220 handle, such as an indirect branch.
|
|
1221
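The cases above can be summarized in one standalone sketch. It is a simplified
model, not the ARM implementation: the ``Terminator`` and ``BranchInfo`` types
are hypothetical, blocks are identified by integers instead of
``MachineBasicBlock`` pointers, and the unreachable second branch is ignored
rather than erased.

```cpp
#include <vector>

// Hypothetical, simplified model of a block's terminators; real code
// inspects MachineInstr objects instead.
enum class TermKind { Uncond, Cond, Indirect };

struct Terminator {
  TermKind Kind;
  int Target;   // destination block number (unused for Indirect)
  int CondCode; // condition operand (meaningful only for Cond)
};

struct BranchInfo {
  int TBB = -1;          // -1 plays the role of a null block pointer
  int FBB = -1;
  std::vector<int> Cond; // operands needed to evaluate the condition
};

// Returns false on success and true when the terminators cannot be
// understood, following the convention described above.
bool analyzeBranch(const std::vector<Terminator> &Terms, BranchInfo &Out) {
  Out = BranchInfo();
  if (Terms.empty())
    return false; // block falls through; TBB and FBB stay null
  for (const Terminator &T : Terms)
    if (T.Kind == TermKind::Indirect)
      return true; // a terminator this sketch cannot handle
  if (Terms.size() > 2)
    return true; // three or more terminating branches
  const Terminator &Last = Terms.back();
  if (Terms.size() == 1) {
    Out.TBB = Last.Target; // single branch, conditional or not
    if (Last.Kind == TermKind::Cond)
      Out.Cond.push_back(Last.CondCode);
    return false;
  }
  const Terminator &SecondLast = Terms.front();
  if (SecondLast.Kind == TermKind::Cond && Last.Kind == TermKind::Uncond) {
    Out.TBB = SecondLast.Target; // taken when the condition is true
    Out.Cond.push_back(SecondLast.CondCode);
    Out.FBB = Last.Target;       // taken when the condition is false
    return false;
  }
  if (SecondLast.Kind == TermKind::Uncond && Last.Kind == TermKind::Uncond) {
    Out.TBB = SecondLast.Target; // the second branch is never reached
    return false;
  }
  return true; // any other combination is not understood
}
```

Callers test the return value first and read ``TBB``, ``FBB``, and ``Cond``
only when it is false.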
|
|
1222 .. _instruction-selector:
|
|
1223
|
|
1224 Instruction Selector
|
|
1225 ====================
|
|
1226
|
|
1227 LLVM uses a ``SelectionDAG`` to represent LLVM IR instructions, and nodes of
|
|
1228 the ``SelectionDAG`` ideally represent native target instructions. During code
|
|
1229 generation, instruction selection passes are performed to convert non-native
|
|
1230 DAG instructions into native target-specific instructions. The pass described
|
|
1231 in ``XXXISelDAGToDAG.cpp`` is used to match patterns and perform DAG-to-DAG
|
|
1232 instruction selection. Optionally, a pass may be defined (in
|
|
1233 ``XXXBranchSelector.cpp``) to perform similar DAG-to-DAG operations for branch
|
|
1234 instructions. Later, the code in ``XXXISelLowering.cpp`` replaces or removes
|
|
1235 operations and data types not supported natively (legalizes) in a
|
|
1236 ``SelectionDAG``.
|
|
1237
|
|
1238 TableGen generates code for instruction selection using the following target
|
|
1239 description input files:
|
|
1240
|
|
1241 * ``XXXInstrInfo.td`` --- Contains definitions of instructions in a
|
|
1242 target-specific instruction set, generates ``XXXGenDAGISel.inc``, which is
|
|
1243 included in ``XXXISelDAGToDAG.cpp``.
|
|
1244
|
|
1245 * ``XXXCallingConv.td`` --- Contains the calling and return value conventions
|
|
1246 for the target architecture, and it generates ``XXXGenCallingConv.inc``,
|
|
1247 which is included in ``XXXISelLowering.cpp``.
|
|
1248
|
|
1249 The implementation of an instruction selection pass must include a header that
|
|
1250 declares the ``FunctionPass`` class or a subclass of ``FunctionPass``. In
|
|
1251 ``XXXTargetMachine.cpp``, a Pass Manager (PM) should add each instruction
|
|
1252 selection pass into the queue of passes to run.
|
|
1253
|
|
1254 The LLVM static compiler (``llc``) is an excellent tool for visualizing the
|
|
1255 contents of DAGs. To display the ``SelectionDAG`` before or after specific
|
|
1256 processing phases, use the command line options for ``llc``, described at
|
|
1257 :ref:`SelectionDAG-Process`.
|
|
1258
|
|
1259 To describe instruction selector behavior, you should add patterns for lowering
|
|
1260 LLVM code into a ``SelectionDAG`` as the last parameter of the instruction
|
|
1261 definitions in ``XXXInstrInfo.td``. For example, in ``SparcInstrInfo.td``,
|
|
1262 this entry defines a register store operation, and the last parameter describes
|
|
1263 a pattern with the store DAG operator.
|
|
1264
|
|
1265 .. code-block:: text
|
|
1266
|
|
1267 def STrr : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
|
|
1268 "st $src, [$addr]", [(store i32:$src, ADDRrr:$addr)]>;
|
|
1269
|
|
1270 ``ADDRrr`` is a memory mode that is also defined in ``SparcInstrInfo.td``:
|
|
1271
|
|
1272 .. code-block:: text
|
|
1273
|
|
1274 def ADDRrr : ComplexPattern<i32, 2, "SelectADDRrr", [], []>;
|
|
1275
|
|
1276 The definition of ``ADDRrr`` refers to ``SelectADDRrr``, which is a function
|
|
defined in an implementation of the Instruction Selector (such as
|
|
1278 ``SparcISelDAGToDAG.cpp``).
|
|
1279
|
|
1280 In ``lib/Target/TargetSelectionDAG.td``, the DAG operator for store is defined
|
|
1281 below:
|
|
1282
|
|
1283 .. code-block:: text
|
|
1284
|
|
1285 def store : PatFrag<(ops node:$val, node:$ptr),
|
|
1286 (st node:$val, node:$ptr), [{
|
|
1287 if (StoreSDNode *ST = dyn_cast<StoreSDNode>(N))
|
|
1288 return !ST->isTruncatingStore() &&
|
|
1289 ST->getAddressingMode() == ISD::UNINDEXED;
|
|
1290 return false;
|
|
1291 }]>;
|
|
1292
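The predicate attached to the ``store`` fragment can be read as a plain
Boolean test. The sketch below restates it outside LLVM; ``StoreNode``,
``AddrMode``, and ``isPlainStore`` are hypothetical stand-ins for
``StoreSDNode``, the ``ISD`` addressing modes, and the generated predicate.

```cpp
// Hypothetical stand-ins for the properties the predicate inspects.
enum class AddrMode { Unindexed, PreInc, PostInc };

struct StoreNode {
  bool Truncating; // true if the store narrows the value being stored
  AddrMode Mode;   // addressing mode of the store
};

// Accepts only non-truncating, unindexed stores. A null pointer models the
// case where the node is not a store at all (the failed dyn_cast path).
bool isPlainStore(const StoreNode *ST) {
  if (!ST)
    return false;
  return !ST->Truncating && ST->Mode == AddrMode::Unindexed;
}
```

A pattern that uses the fragment matches only when this test succeeds, so
truncating or indexed stores need their own patterns.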
|
|
1293 ``XXXInstrInfo.td`` also generates (in ``XXXGenDAGISel.inc``) the
|
|
1294 ``SelectCode`` method that is used to call the appropriate processing method
|
|
1295 for an instruction. In this example, ``SelectCode`` calls ``Select_ISD_STORE``
|
|
1296 for the ``ISD::STORE`` opcode.
|
|
1297
|
|
1298 .. code-block:: c++
|
|
1299
|
|
1300 SDNode *SelectCode(SDValue N) {
|
|
1301 ...
|
|
1302 MVT::ValueType NVT = N.getNode()->getValueType(0);
|
|
1303 switch (N.getOpcode()) {
|
|
1304 case ISD::STORE: {
|
|
1305 switch (NVT) {
|
|
1306 default:
|
|
1307 return Select_ISD_STORE(N);
|
|
1308 break;
|
|
1309 }
|
|
1310 break;
|
|
1311 }
|
|
1312 ...
|
|
1313
|
|
1314 The pattern for ``STrr`` is matched, so elsewhere in ``XXXGenDAGISel.inc``,
|
|
1315 code for ``STrr`` is created for ``Select_ISD_STORE``. The ``Emit_22`` method
|
|
1316 is also generated in ``XXXGenDAGISel.inc`` to complete the processing of this
|
|
1317 instruction.
|
|
1318
|
|
1319 .. code-block:: c++
|
|
1320
|
|
1321 SDNode *Select_ISD_STORE(const SDValue &N) {
|
|
1322 SDValue Chain = N.getOperand(0);
|
|
1323 if (Predicate_store(N.getNode())) {
|
|
1324 SDValue N1 = N.getOperand(1);
|
|
1325 SDValue N2 = N.getOperand(2);
|
|
1326 SDValue CPTmp0;
|
|
1327 SDValue CPTmp1;
|
|
1328
|
|
1329 // Pattern: (st:void i32:i32:$src,
|
|
1330 // ADDRrr:i32:$addr)<<P:Predicate_store>>
|
|
1331 // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
|
|
1332 // Pattern complexity = 13 cost = 1 size = 0
|
|
1333 if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &&
|
|
1334 N1.getNode()->getValueType(0) == MVT::i32 &&
|
|
1335 N2.getNode()->getValueType(0) == MVT::i32) {
|
|
1336 return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
|
|
1337 }
|
|
1338 ...
|
|
1339
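The predicate attached to the ``store`` fragment above can be pictured as an
ordinary boolean filter over a store node. The following is a minimal,
self-contained sketch of that idea. ``StoreNode``, ``AddressingMode``, and
``isPlainStore`` are hypothetical stand-ins written for this illustration;
they are not LLVM classes.

.. code-block:: c++

  // Hypothetical stand-ins for illustration; these are not LLVM types.
  enum class AddressingMode { Unindexed, PreInc, PostInc };

  struct StoreNode {
    bool Truncating;      // true if the value is narrowed before being stored
    AddressingMode Mode;  // how the address operand is updated, if at all
  };

  // Mirrors the shape of the store predicate: accept only stores that are
  // neither truncating nor indexed.
  bool isPlainStore(const StoreNode &ST) {
    return !ST.Truncating && ST.Mode == AddressingMode::Unindexed;
  }

In this model, a pattern such as the one for ``STrr`` would only be tried for
nodes where ``isPlainStore`` returns ``true``.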
|
|
1340 The SelectionDAG Legalize Phase
|
|
1341 -------------------------------
|
|
1342
|
|
1343 The Legalize phase converts a DAG to use types and operations that are natively
|
|
1344 supported by the target. For natively unsupported types and operations, you
|
|
1345 need to add code to the target-specific ``XXXTargetLowering`` implementation to
|
|
1346 convert unsupported types and operations to supported ones.
|
|
1347
|
|
1348 In the constructor for the ``XXXTargetLowering`` class, first use the
|
|
1349 ``addRegisterClass`` method to specify which types are supported and which
|
|
1350 register classes are associated with them. The code for the register classes
|
|
1351 is generated by TableGen from ``XXXRegisterInfo.td`` and placed in
|
|
1352 ``XXXGenRegisterInfo.h.inc``. For example, the implementation of the
|
|
1353 constructor for the ``SparcTargetLowering`` class (in ``SparcISelLowering.cpp``)
|
|
1354 starts with the following code:
|
|
1355
|
|
1356 .. code-block:: c++
|
|
1357
|
|
1358 addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
|
|
1359 addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
|
|
1360 addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);
|
|
1361
|
|
1362 You should examine the node types in the ``ISD`` namespace
|
|
1363 (``include/llvm/CodeGen/SelectionDAGNodes.h``) and determine which operations
|
|
1364 the target natively supports. For operations that do **not** have native
|
|
1365 support, add a callback to the constructor for the ``XXXTargetLowering`` class,
|
|
1366 so the instruction selection process knows what to do. The ``TargetLowering``
|
|
1367 class callback methods (declared in ``llvm/Target/TargetLowering.h``) are:
|
|
1368
|
|
1369 * ``setOperationAction`` --- General operation.
|
|
1370 * ``setLoadExtAction`` --- Load with extension.
|
|
1371 * ``setTruncStoreAction`` --- Truncating store.
|
|
1372 * ``setIndexedLoadAction`` --- Indexed load.
|
|
1373 * ``setIndexedStoreAction`` --- Indexed store.
|
|
1374 * ``setConvertAction`` --- Type conversion.
|
|
1375 * ``setCondCodeAction`` --- Support for a given condition code.
|
|
1376
|
|
1377 Note: on older releases, ``setLoadXAction`` is used instead of
|
|
1378 ``setLoadExtAction``. Also, on older releases, ``setCondCodeAction`` may not
|
|
1379 be supported. Examine your release to see what methods are specifically
|
|
1380 supported.
|
|
1381
|
|
1382 These callbacks are used to determine that an operation does or does not work
|
|
1383 with a specified type (or types). In all cases, the third parameter is a
|
|
1384 ``LegalizeAction`` type enum value: ``Promote``, ``Expand``, ``Custom``, or
|
|
1385 ``Legal``. ``SparcISelLowering.cpp`` contains examples of all four
|
|
1386 ``LegalizeAction`` values.
|
|
1387
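One way to picture these callbacks is as writes into a lookup table keyed by
operation and type, which the legalizer later reads. The following
self-contained sketch models that idea under simplifying assumptions:
``ActionTable``, ``Opcode``, and ``ValueType`` are hypothetical stand-ins
written for this illustration, and the real ``TargetLowering`` class stores
this information differently.

.. code-block:: c++

  #include <map>
  #include <utility>

  // Hypothetical stand-ins for illustration; these are not LLVM declarations.
  enum class Opcode { FSIN, FP_TO_SINT, CTPOP, ADD };
  enum class ValueType { i1, i32, f32 };
  enum class LegalizeAction { Legal, Promote, Expand, Custom };

  class ActionTable {
    std::map<std::pair<Opcode, ValueType>, LegalizeAction> Actions;

  public:
    // Record what the legalizer should do for Op applied to type VT.
    void setOperationAction(Opcode Op, ValueType VT, LegalizeAction Action) {
      Actions[{Op, VT}] = Action;
    }

    // Anything that was never registered is treated as natively supported.
    LegalizeAction getOperationAction(Opcode Op, ValueType VT) const {
      auto It = Actions.find({Op, VT});
      return It == Actions.end() ? LegalizeAction::Legal : It->second;
    }
  };

The lookup falls back to ``Legal`` for unregistered pairs, which matches the
statement later in this section that ``Legal`` is the default condition.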
|
|
1388 Promote
|
|
1389 ^^^^^^^
|
|
1390
|
|
1391 For an operation without native support for a given type, the specified type
|
|
1392 may be promoted to a larger type that is supported. For example, SPARC does
|
|
1393 not support a sign-extending load for Boolean values (``i1`` type), so in
|
|
1394 ``SparcISelLowering.cpp`` the third parameter below, ``Promote``, changes
|
|
1395 ``i1`` type values to a larger type before loading.
|
|
1396
|
|
1397 .. code-block:: c++
|
|
1398
|
|
1399 setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);
|
|
1400
|
|
1401 Expand
|
|
1402 ^^^^^^
|
|
1403
|
|
1404 For a type without native support, a value may need to be broken down further,
|
|
1405 rather than promoted. For an operation without native support, a combination
|
|
1406 of other operations may be used to similar effect. In SPARC, the
|
|
1407 floating-point sine and cosine trig operations are supported by expansion to
|
|
1408 other operations, as indicated by the third parameter, ``Expand``, to
|
|
1409 ``setOperationAction``:
|
|
1410
|
|
1411 .. code-block:: c++
|
|
1412
|
|
1413 setOperationAction(ISD::FSIN, MVT::f32, Expand);
|
|
1414 setOperationAction(ISD::FCOS, MVT::f32, Expand);
|
|
1415
|
|
1416 Custom
|
|
1417 ^^^^^^
|
|
1418
|
|
1419 For some operations, simple type promotion or operation expansion may be
|
|
1420 insufficient. In some cases, a special intrinsic function must be implemented.
|
|
1421
|
|
1422 For example, a constant value may require special treatment, or an operation
|
|
1423 may require spilling and restoring registers in the stack and working with
|
|
1424 register allocators.
|
|
1425
|
|
1426 As seen in the ``SparcISelLowering.cpp`` code below, to perform a type conversion
|
|
1427 from a floating-point value to a signed integer, first
|
|
1428 ``setOperationAction`` should be called with ``Custom`` as the third parameter:
|
|
1429
|
|
1430 .. code-block:: c++
|
|
1431
|
|
1432 setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
|
|
1433
|
|
1434 In the ``LowerOperation`` method, for each ``Custom`` operation, a case
|
|
1435 statement should be added to indicate what function to call. In the following
|
|
1436 code, an ``FP_TO_SINT`` opcode will call the ``LowerFP_TO_SINT`` method:
|
|
1437
|
|
1438 .. code-block:: c++
|
|
1439
|
|
1440 SDValue SparcTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) {
|
|
1441 switch (Op.getOpcode()) {
|
|
1442 case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
|
|
1443 ...
|
|
1444 }
|
|
1445 }
|
|
1446
|
|
1447 Finally, the ``LowerFP_TO_SINT`` method is implemented, using an FP register to
|
|
1448 convert the floating-point value to an integer.
|
|
1449
|
|
1450 .. code-block:: c++
|
|
1451
|
|
1452 static SDValue LowerFP_TO_SINT(SDValue Op, SelectionDAG &DAG) {
|
|
1453 assert(Op.getValueType() == MVT::i32);
|
|
1454 Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
|
|
1455 return DAG.getNode(ISD::BITCAST, MVT::i32, Op);
|
|
1456 }
|
|
1457
|
|
1458 Legal
|
|
1459 ^^^^^
|
|
1460
|
|
1461 The ``Legal`` ``LegalizeAction`` enum value simply indicates that an operation
|
|
1462 **is** natively supported. ``Legal`` represents the default condition, so it
|
|
1463 is rarely used. In ``SparcISelLowering.cpp``, the action for ``CTPOP`` (an
|
|
1464 operation to count the bits set in an integer) is natively supported only for
|
|
1465 SPARC v9. The following code enables the ``Expand`` conversion technique for
|
|
1466 non-v9 SPARC implementations.
|
|
1467
|
|
1468 .. code-block:: c++
|
|
1469
|
|
1470 setOperationAction(ISD::CTPOP, MVT::i32, Expand);
|
|
1471 ...
|
|
1472 if (TM.getSubtarget<SparcSubtarget>().isV9())
|
|
1473 setOperationAction(ISD::CTPOP, MVT::i32, Legal);
|
|
1474
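To make the effect of ``Expand`` on ``CTPOP`` more concrete, the sketch below
counts set bits using only shifts, masks, additions, and one multiplication,
all of which a target without a native population-count instruction can be
expected to support. The function name ``popCountExpanded`` is hypothetical,
and the code is meant to illustrate what replacing one operation with a
combination of others looks like, not to reproduce the node sequence that LLVM
actually emits.

.. code-block:: c++

  #include <cstdint>

  // Counts the set bits of X with the classic parallel bit-count technique.
  std::uint32_t popCountExpanded(std::uint32_t X) {
    X = X - ((X >> 1) & 0x55555555u);                  // 2-bit partial sums
    X = (X & 0x33333333u) + ((X >> 2) & 0x33333333u);  // 4-bit partial sums
    X = (X + (X >> 4)) & 0x0F0F0F0Fu;                  // 8-bit partial sums
    return (X * 0x01010101u) >> 24;                    // add the four bytes
  }

For example, ``popCountExpanded(0xB)`` evaluates to 3, because the binary
value ``1011`` has three bits set.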
|
|
1475 Calling Conventions
|
|
1476 -------------------
|
|
1477
|
|
1478 To support target-specific calling conventions, ``XXXCallingConv.td`` uses
|
|
1479 interfaces (such as ``CCIfType`` and ``CCAssignToReg``) that are defined in
|
|
1480 ``lib/Target/TargetCallingConv.td``. TableGen can take the target descriptor
|
|
1481 file ``XXXCallingConv.td`` and generate the header file
|
|
1482 ``XXXGenCallingConv.inc``, which is typically included in
|
|
1483 ``XXXISelLowering.cpp``. You can use the interfaces in
|
|
1484 ``TargetCallingConv.td`` to specify:
|
|
1485
|
|
1486 * The order of parameter allocation.
|
|
1487
|
|
1488 * Where parameters and return values are placed (that is, on the stack or in
|
|
1489 registers).
|
|
1490
|
|
1491 * Which registers may be used.
|
|
1492
|
|
1493 * Whether the caller or callee unwinds the stack.
|
|
1494
|
|
1495 The following example demonstrates the use of the ``CCIfType`` and
|
|
1496 ``CCAssignToReg`` interfaces. If the ``CCIfType`` predicate is true (that is,
|
|
1497 if the current argument is of type ``f32`` or ``f64``), then the action is
|
|
1498 performed. In this case, the ``CCAssignToReg`` action assigns the argument
|
|
1499 value to the first available register: either ``R0`` or ``R1``.
|
|
1500
|
|
1501 .. code-block:: text
|
|
1502
|
|
1503 CCIfType<[f32,f64], CCAssignToReg<[R0, R1]>>
|
|
1504
|
|
1505 ``SparcCallingConv.td`` contains definitions for a target-specific return-value
|
|
1506 calling convention (``RetCC_Sparc32``) and a basic 32-bit C calling convention
|
|
1507 (``CC_Sparc32``). The definition of ``RetCC_Sparc32`` (shown below) indicates
|
|
1508 which registers are used for specified scalar return types. A single-precision
|
|
1509 float is returned in register ``F0``, and a double-precision float is returned in
|
|
1510 register ``D0``. A 32-bit integer is returned in register ``I0`` or ``I1``.
|
|
1511
|
|
1512 .. code-block:: text
|
|
1513
|
|
1514 def RetCC_Sparc32 : CallingConv<[
|
|
1515 CCIfType<[i32], CCAssignToReg<[I0, I1]>>,
|
|
1516 CCIfType<[f32], CCAssignToReg<[F0]>>,
|
|
1517 CCIfType<[f64], CCAssignToReg<[D0]>>
|
|
1518 ]>;
|
|
1519
|
|
1520 The definition of ``CC_Sparc32`` in ``SparcCallingConv.td`` introduces
|
|
1521 ``CCAssignToStack``, which assigns the value to a stack slot with the specified
|
|
1522 size and alignment. In the example below, the first parameter, 4, indicates
|
|
1523 the size of the slot, and the second parameter, also 4, indicates the stack
|
|
1524 alignment along 4-byte units. (Special cases: if size is zero, then the ABI
|
|
1525 size is used; if alignment is zero, then the ABI alignment is used.)
|
|
1526
|
|
1527 .. code-block:: text
|
|
1528
|
|
1529 def CC_Sparc32 : CallingConv<[
|
|
1530 // All arguments get passed in integer registers if there is space.
|
|
1531 CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
|
|
1532 CCAssignToStack<4, 4>
|
|
1533 ]>;
|
|
1534
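The combined effect of the two rules in ``CC_Sparc32`` can be sketched as a
small allocator that hands out registers first and stack slots afterwards.
The code below is a simplified model written for this illustration:
``ArgLocation`` and ``assignSparc32Args`` are hypothetical names, and every
argument is assumed to occupy a single 4-byte value, which glosses over how
wider types are really handled.

.. code-block:: c++

  #include <cstddef>
  #include <string>
  #include <vector>

  // Where one argument ends up: in a named register or at a stack offset.
  struct ArgLocation {
    bool InRegister;
    std::string Register;  // meaningful when InRegister is true
    unsigned StackOffset;  // meaningful when InRegister is false
  };

  // The first six arguments take I0..I5 in order (the CCAssignToReg rule);
  // the rest take 4-byte slots aligned to 4 bytes (CCAssignToStack<4, 4>).
  std::vector<ArgLocation> assignSparc32Args(std::size_t NumArgs) {
    static const char *const Regs[] = {"I0", "I1", "I2", "I3", "I4", "I5"};
    const std::size_t NumRegs = sizeof(Regs) / sizeof(Regs[0]);
    const unsigned SlotSize = 4, SlotAlign = 4;

    std::vector<ArgLocation> Locations;
    unsigned NextOffset = 0;
    for (std::size_t I = 0; I < NumArgs; ++I) {
      if (I < NumRegs) {
        Locations.push_back({true, Regs[I], 0});
      } else {
        // Round the running offset up to the alignment, then reserve a slot.
        NextOffset = (NextOffset + SlotAlign - 1) / SlotAlign * SlotAlign;
        Locations.push_back({false, "", NextOffset});
        NextOffset += SlotSize;
      }
    }
    return Locations;
  }

With eight arguments, this model places the first six in ``I0`` through
``I5`` and the last two at stack offsets 0 and 4.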
|
|
1535 ``CCDelegateTo`` is another commonly used interface, which hands the current
|
|
1536 value to a specified sub-calling convention and succeeds if that convention
|
|
1537 matches. In the following example (in ``X86CallingConv.td``), the definition of
|
|
1538 ``RetCC_X86_32_C`` ends with ``CCDelegateTo``. If the current value is not an
|
|
1539 ``f32`` or ``f64`` that was assigned to the register ``ST0`` or ``ST1``, then
|
|
1540 ``RetCC_X86Common`` is invoked.
|
|
1541
|
|
1542 .. code-block:: text
|
|
1543
|
|
1544 def RetCC_X86_32_C : CallingConv<[
|
|
1545 CCIfType<[f32], CCAssignToReg<[ST0, ST1]>>,
|
|
1546 CCIfType<[f64], CCAssignToReg<[ST0, ST1]>>,
|
|
1547 CCDelegateTo<RetCC_X86Common>
|
|
1548 ]>;
|
|
1549
|
|
1550 ``CCIfCC`` is an interface that attempts to match the given name to the current
|
|
1551 calling convention. If the name identifies the current calling convention,
|
|
1552 then a specified action is invoked. In the following example (in
|
|
1553 ``X86CallingConv.td``), if the ``Fast`` calling convention is in use, then
|
|
1554 ``RetCC_X86_32_Fast`` is invoked. If the ``SSECall`` calling convention is in
|
|
1555 use, then ``RetCC_X86_32_SSE`` is invoked.
|
|
1556
|
|
1557 .. code-block:: text
|
|
1558
|
|
1559 def RetCC_X86_32 : CallingConv<[
|
|
1560 CCIfCC<"CallingConv::Fast", CCDelegateTo<RetCC_X86_32_Fast>>,
|
|
1561 CCIfCC<"CallingConv::X86_SSECall", CCDelegateTo<RetCC_X86_32_SSE>>,
|
|
1562 CCDelegateTo<RetCC_X86_32_C>
|
|
1563 ]>;
|
|
1564
|
|
1565 Other calling convention interfaces include:
|
|
1566
|
|
1567 * ``CCIf <predicate, action>`` --- If the predicate matches, apply the action.
|
|
1568
|
|
1569 * ``CCIfInReg <action>`` --- If the argument is marked with the "``inreg``"
|
|
1570 attribute, then apply the action.
|
|
1571
|
|
1572 * ``CCIfNest <action>`` --- If the argument is marked with the "``nest``"
|
|
1573 attribute, then apply the action.
|
|
1574
|
|
1575 * ``CCIfNotVarArg <action>`` --- If the current function does not take a
|
|
1576 variable number of arguments, apply the action.
|
|
1577
|
|
1578 * ``CCAssignToRegWithShadow <registerList, shadowList>`` --- Similar to
|
|
1579 ``CCAssignToReg``, but with a shadow list of registers.
|
|
1580
|
|
1581 * ``CCPassByVal <size, align>`` --- Assign value to a stack slot with the
|
|
1582 minimum specified size and alignment.
|
|
1583
|
|
1584 * ``CCPromoteToType <type>`` --- Promote the current value to the specified
|
|
1585 type.
|
|
1586
|
|
1587 * ``CallingConv <[actions]>`` --- Define each calling convention that is
|
|
1588 supported.
|
|
1589
|
|
1590 Assembly Printer
|
|
1591 ================
|
|
1592
|
|
1593 During the code emission stage, the code generator may utilize an LLVM pass to
|
|
1594 produce assembly output. To do this, implement the code for a
|
|
1595 printer that converts LLVM IR to a GAS-format assembly language for your target
|
|
1596 machine, using the following steps:
|
|
1597
|
|
1598 * Define all the assembly strings for your target, adding them to the
|
|
1599 instructions defined in the ``XXXInstrInfo.td`` file. (See
|
|
1600 :ref:`instruction-set`.) TableGen will produce an output file
|
|
1601 (``XXXGenAsmWriter.inc``) with an implementation of the ``printInstruction``
|
|
1602 method for the ``XXXAsmPrinter`` class.
|
|
1603
|
|
1604 * Write ``XXXTargetAsmInfo.h``, which contains the bare-bones declaration of
|
|
1605 the ``XXXTargetAsmInfo`` class (a subclass of ``TargetAsmInfo``).
|
|
1606
|
|
1607 * Write ``XXXTargetAsmInfo.cpp``, which contains target-specific values for
|
|
1608 ``TargetAsmInfo`` properties and sometimes new implementations for methods.
|
|
1609
|
|
1610 * Write ``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that
|
|
1611 performs the LLVM-to-assembly conversion.
|
|
1612
|
|
1613 The code in ``XXXTargetAsmInfo.h`` is usually a trivial declaration of the
|
|
1614 ``XXXTargetAsmInfo`` class for use in ``XXXTargetAsmInfo.cpp``. Similarly,
|
|
1615 ``XXXTargetAsmInfo.cpp`` usually has a few declarations of ``XXXTargetAsmInfo``
|
|
1616 replacement values that override the default values in ``TargetAsmInfo.cpp``.
|
|
1617 For example in ``SparcTargetAsmInfo.cpp``:
|
|
1618
|
|
1619 .. code-block:: c++
|
|
1620
|
|
1621 SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &TM) {
|
|
1622 Data16bitsDirective = "\t.half\t";
|
|
1623 Data32bitsDirective = "\t.word\t";
|
|
1624 Data64bitsDirective = 0; // .xword is only supported by V9.
|
|
1625 ZeroDirective = "\t.skip\t";
|
|
1626 CommentString = "!";
|
|
1627 ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
|
|
1628 }
|
|
1629
|
|
1630 The X86 assembly printer implementation (``X86TargetAsmInfo``) is an example
|
|
1631 where the target-specific ``TargetAsmInfo`` class uses an overridden method:
|
|
1632 ``ExpandInlineAsm``.
|
|
1633
|
|
1634 A target-specific implementation of ``AsmPrinter`` is written in
|
|
1635 ``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that converts
|
|
1636 LLVM IR to printable assembly. The implementation must include the following
|
|
1637 headers that have declarations for the ``AsmPrinter`` and
|
|
1638 ``MachineFunctionPass`` classes. The ``MachineFunctionPass`` is a subclass of
|
|
1639 ``FunctionPass``.
|
|
1640
|
|
1641 .. code-block:: c++
|
|
1642
|
|
1643 #include "llvm/CodeGen/AsmPrinter.h"
|
|
1644 #include "llvm/CodeGen/MachineFunctionPass.h"
|
|
1645
|
|
1646 As a ``FunctionPass``, ``AsmPrinter`` first calls ``doInitialization`` to set
|
|
1647 up the ``AsmPrinter``. In ``SparcAsmPrinter``, a ``Mangler`` object is
|
|
1648 instantiated to process variable names.
|
|
1649
|
|
1650 In ``XXXAsmPrinter.cpp``, the ``runOnMachineFunction`` method (declared in
|
|
1651 ``MachineFunctionPass``) must be implemented for ``XXXAsmPrinter``. In
|
|
1652 ``MachineFunctionPass``, the ``runOnFunction`` method invokes
|
|
1653 ``runOnMachineFunction``. Target-specific implementations of
|
|
1654 ``runOnMachineFunction`` differ, but generally do the following to process each
|
|
1655 machine function:
|
|
1656
|
|
1657 * Call ``SetupMachineFunction`` to perform initialization.
|
|
1658
|
|
1659 * Call ``EmitConstantPool`` to print out (to the output stream) constants which
|
|
1660 have been spilled to memory.
|
|
1661
|
|
1662 * Call ``EmitJumpTableInfo`` to print out jump tables used by the current
|
|
1663 function.
|
|
1664
|
|
1665 * Print out the label for the current function.
|
|
1666
|
|
1667 * Print out the code for the function, including basic block labels and the
|
|
1668 assembly for each instruction (using ``printInstruction``).
|
|
1669
|
|
1670 The ``XXXAsmPrinter`` implementation must also include the code generated by
|
|
1671 TableGen that is output in the ``XXXGenAsmWriter.inc`` file. The code in
|
|
1672 ``XXXGenAsmWriter.inc`` contains an implementation of the ``printInstruction``
|
|
1673 method that may call these methods:
|
|
1674
|
|
1675 * ``printOperand``
|
|
1676 * ``printMemOperand``
|
|
1677 * ``printCCOperand`` (for conditional statements)
|
|
1678 * ``printDataDirective``
|
|
1679 * ``printDeclare``
|
|
1680 * ``printImplicitDef``
|
|
1681 * ``printInlineAsm``
|
|
1682
|
|
1683 The implementations of ``printDeclare``, ``printImplicitDef``,
|
|
1684 ``printInlineAsm``, and ``printLabel`` in ``AsmPrinter.cpp`` are generally
|
|
1685 adequate for printing assembly and do not need to be overridden.
|
|
1686
|
|
1687 The ``printOperand`` method is implemented with a long ``switch``/``case``
|
|
1688 statement for the type of operand: register, immediate, basic block, external
|
|
1689 symbol, global address, constant pool index, or jump table index. For an
|
|
1690 instruction with a memory address operand, the ``printMemOperand`` method
|
|
1691 should be implemented to generate the proper output. Similarly,
|
|
1692 ``printCCOperand`` should be used to print a conditional operand.
|
|
1693
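The ``switch``/``case`` structure described above might look roughly like the
following self-contained sketch. The ``Operand`` struct and ``OperandKind``
enum are hypothetical stand-ins for this illustration (they are not LLVM's
``MachineOperand``), only a few operand kinds are modeled, and the ``%``
register prefix and ``.LBB`` label prefix are assumptions rather than
something a given target is required to use.

.. code-block:: c++

  #include <sstream>
  #include <string>

  // Hypothetical operand model for illustration.
  enum class OperandKind { Register, Immediate, BasicBlock, GlobalAddress };

  struct Operand {
    OperandKind Kind;
    std::string Name;  // register, block, or symbol name
    long Value;        // immediate value
  };

  // Formats one operand by switching on its kind.
  std::string printOperand(const Operand &MO) {
    std::ostringstream OS;
    switch (MO.Kind) {
    case OperandKind::Register:      OS << '%' << MO.Name; break;
    case OperandKind::Immediate:     OS << MO.Value; break;
    case OperandKind::BasicBlock:    OS << ".LBB" << MO.Name; break;
    case OperandKind::GlobalAddress: OS << MO.Name; break;
    }
    return OS.str();
  }

A real implementation writes to the printer's output stream instead of
returning a string; returning the text here just keeps the sketch easy to
check in isolation.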
|
|
1694 ``doFinalization`` should be overridden in ``XXXAsmPrinter``, and it should be
|
|
1695 called to shut down the assembly printer. During ``doFinalization``, global
|
|
1696 variables and constants are printed to output.
|
|
1697
|
|
1698 Subtarget Support
|
|
1699 =================
|
|
1700
|
|
1701 Subtarget support is used to inform the code generation process of instruction
|
|
1702 set variations for a given chip set. For example, the LLVM SPARC
|
|
1703 implementation provided covers three major versions of the SPARC microprocessor
|
|
1704 architecture: Version 8 (V8, which is a 32-bit architecture), Version 9 (V9, a
|
|
1705 64-bit architecture), and the UltraSPARC architecture. V8 has 16
|
|
1706 double-precision floating-point registers that are also usable as either 32
|
|
1707 single-precision or 8 quad-precision registers. V8 is also purely big-endian.
|
|
1708 V9 has 32 double-precision floating-point registers that are also usable as 16
|
|
1709 quad-precision registers, but cannot be used as single-precision registers.
|
|
1710 The UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set
|
|
1711 extensions.
|
|
1712
|
|
1713 If subtarget support is needed, you should implement a target-specific
|
|
1714 ``XXXSubtarget`` class for your architecture. This class should process the
|
|
1715 command-line options ``-mcpu=`` and ``-mattr=``.
|
|
1716
|
|
1717 TableGen uses definitions in the ``Target.td`` and ``Sparc.td`` files to
|
|
1718 generate code in ``SparcGenSubtarget.inc``. In ``Target.td``, shown below, the
|
|
1719 ``SubtargetFeature`` interface is defined. The first 4 string parameters of
|
|
1720 the ``SubtargetFeature`` interface are a feature name, an attribute set by the
|
|
1721 feature, the value of the attribute, and a description of the feature. (The
|
|
1722 fifth parameter is a list of features whose presence is implied, and its
|
|
1723 default value is an empty array.)
|
|
1724
|
|
1725 .. code-block:: text
|
|
1726
|
|
1727 class SubtargetFeature<string n, string a, string v, string d,
|
|
1728 list<SubtargetFeature> i = []> {
|
|
1729 string Name = n;
|
|
1730 string Attribute = a;
|
|
1731 string Value = v;
|
|
1732 string Desc = d;
|
|
1733 list<SubtargetFeature> Implies = i;
|
|
1734 }
|
|
1735
|
|
1736 In the ``Sparc.td`` file, the ``SubtargetFeature`` is used to define the
|
|
1737 following features.
|
|
1738
|
|
1739 .. code-block:: text
|
|
1740
|
|
1741 def FeatureV9 : SubtargetFeature<"v9", "IsV9", "true",
|
|
1742 "Enable SPARC-V9 instructions">;
|
|
1743 def FeatureV8Deprecated : SubtargetFeature<"deprecated-v8",
|
|
1744 "V8DeprecatedInsts", "true",
|
|
1745 "Enable deprecated V8 instructions in V9 mode">;
|
|
1746 def FeatureVIS : SubtargetFeature<"vis", "IsVIS", "true",
|
|
1747 "Enable UltraSPARC Visual Instruction Set extensions">;
|
|
1748
|
|
1749 Elsewhere in ``Sparc.td``, the ``Proc`` class is defined and then is used to
|
|
1750 define particular SPARC processor subtypes that may have the previously
|
|
1751 described features.
|
|
1752
|
|
1753 .. code-block:: text
|
|
1754
|
|
1755 class Proc<string Name, list<SubtargetFeature> Features>
|
|
1756 : Processor<Name, NoItineraries, Features>;
|
|
1757
|
|
1758 def : Proc<"generic", []>;
|
|
1759 def : Proc<"v8", []>;
|
|
1760 def : Proc<"supersparc", []>;
|
|
1761 def : Proc<"sparclite", []>;
|
|
1762 def : Proc<"f934", []>;
|
|
1763 def : Proc<"hypersparc", []>;
|
|
1764 def : Proc<"sparclite86x", []>;
|
|
1765 def : Proc<"sparclet", []>;
|
|
1766 def : Proc<"tsc701", []>;
|
|
1767 def : Proc<"v9", [FeatureV9]>;
|
|
1768 def : Proc<"ultrasparc", [FeatureV9, FeatureV8Deprecated]>;
|
|
1769 def : Proc<"ultrasparc3", [FeatureV9, FeatureV8Deprecated]>;
|
|
1770 def : Proc<"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;
|
|
1771
|
|
1772 From the ``Target.td`` and ``Sparc.td`` files, the resulting
|
|
1773 ``SparcGenSubtarget.inc`` specifies enum values to identify the features,
|
|
1774 arrays of constants to represent the CPU features and CPU subtypes, and the
|
|
1775 ``ParseSubtargetFeatures`` method that parses the features string that sets
|
|
1776 specified subtarget options. The generated ``SparcGenSubtarget.inc`` file
|
|
1777 should be included in ``SparcSubtarget.cpp``. The target-specific
|
|
1778 implementation of the ``XXXSubtarget`` constructor should follow this pseudocode:
|
|
1779
|
|
1780 .. code-block:: c++
|
|
1781
|
|
1782 XXXSubtarget::XXXSubtarget(const Module &M, const std::string &FS) {
|
|
1783 // Set the default features
|
|
1784 // Determine default and user specified characteristics of the CPU
|
|
1785 // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
|
|
1786 // Perform any additional operations
|
|
1787 }
|
|
1788
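To give a feel for what parsing a features string involves, the following
sketch handles a comma-separated list in the ``+name``/``-name`` form accepted
by ``-mattr=``. It is a hand-written illustration, not the code that TableGen
generates: ``SubtargetFlags`` and ``parseFeatures`` are hypothetical names,
and CPU names and implied features are not handled.

.. code-block:: c++

  #include <sstream>
  #include <string>

  // Hypothetical subtarget state; the field names echo the attributes used
  // in Sparc.td, but this struct is not LLVM's SparcSubtarget.
  struct SubtargetFlags {
    bool IsV9 = false;
    bool V8DeprecatedInsts = false;
    bool IsVIS = false;
  };

  // A leading '+' enables a feature and a leading '-' disables it.
  // Unknown or malformed entries are ignored.
  SubtargetFlags parseFeatures(const std::string &FS) {
    SubtargetFlags Flags;
    std::istringstream In(FS);
    std::string Item;
    while (std::getline(In, Item, ',')) {
      if (Item.size() < 2 || (Item[0] != '+' && Item[0] != '-'))
        continue;
      const bool Enable = Item[0] == '+';
      const std::string Name = Item.substr(1);
      if (Name == "v9")
        Flags.IsV9 = Enable;
      else if (Name == "deprecated-v8")
        Flags.V8DeprecatedInsts = Enable;
      else if (Name == "vis")
        Flags.IsVIS = Enable;
    }
    return Flags;
  }

For instance, ``parseFeatures("+v9,+vis")`` sets ``IsV9`` and ``IsVIS`` and
leaves ``V8DeprecatedInsts`` at its default of ``false``.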
|
|
1789 JIT Support
|
|
1790 ===========
|
|
1791
|
|
1792 The implementation of a target machine optionally includes a Just-In-Time (JIT)
|
|
1793 code generator that emits machine code and auxiliary structures as binary
|
|
1794 output that can be written directly to memory. To do this, implement JIT code
|
|
1795 generation by performing the following steps:
|
|
1796
|
|
1797 * Write an ``XXXCodeEmitter.cpp`` file that contains a machine function pass
|
|
1798 that transforms target-machine instructions into relocatable machine
|
|
1799 code.
|
|
1800
|
|
1801 * Write an ``XXXJITInfo.cpp`` file that implements the JIT interfaces for
|
|
1802 target-specific code-generation activities, such as emitting machine code and
|
|
1803 stubs.
|
|
1804
|
|
1805 * Modify ``XXXTargetMachine`` so that it provides a ``TargetJITInfo`` object
|
|
1806 through its ``getJITInfo`` method.
|
|
1807
|
|
1808 There are several different approaches to writing the JIT support code. For
|
|
1809 instance, TableGen and target descriptor files may be used for creating a JIT
|
|
1810 code generator, but are not mandatory. For the Alpha and PowerPC target
|
|
1811 machines, TableGen is used to generate ``XXXGenCodeEmitter.inc``, which
|
|
1812 contains the binary coding of machine instructions and the
|
|
1813 ``getBinaryCodeForInstr`` method to access those codes. Other JIT
|
|
1814 implementations do not.
|
|
1815
|
|
1816 Both ``XXXJITInfo.cpp`` and ``XXXCodeEmitter.cpp`` must include the
|
|
1817 ``llvm/CodeGen/MachineCodeEmitter.h`` header file that defines the
|
|
1818 ``MachineCodeEmitter`` class containing code for several callback functions
|
|
1819 that write data (in bytes, words, strings, etc.) to the output stream.
|
|
1820
|
|
1821 Machine Code Emitter
|
|
1822 --------------------
|
|
1823
|
|
1824 In ``XXXCodeEmitter.cpp``, a target-specific version of the ``Emitter`` class is
|
|
1825 implemented as a function pass (subclass of ``MachineFunctionPass``). The
|
|
1826 target-specific implementation of ``runOnMachineFunction`` (invoked by
|
|
1827 ``runOnFunction`` in ``MachineFunctionPass``) iterates through the
|
|
1828 ``MachineBasicBlock`` and calls ``emitInstruction`` to process each instruction and
|
|
1829 emit binary code. ``emitInstruction`` is largely implemented with case
|
|
1830 statements on the instruction types defined in ``XXXInstrInfo.h``. For
|
|
1831 example, in ``X86CodeEmitter.cpp``, the ``emitInstruction`` method is built
|
|
1832 around the following ``switch``/``case`` statements:
|
|
1833
|
|
1834 .. code-block:: c++
|
|
1835
|
|
1836 switch (Desc->TSFlags & X86II::FormMask) {
|
|
1837 case X86II::Pseudo: // for not yet implemented instructions
|
|
1838 ... // or pseudo-instructions
|
|
1839 break;
|
|
1840 case X86II::RawFrm: // for instructions with a fixed opcode value
|
|
1841 ...
|
|
1842 break;
|
|
1843 case X86II::AddRegFrm: // for instructions that have one register operand
|
|
1844 ... // added to their opcode
|
|
1845 break;
|
|
1846 case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
|
|
1847 ... // to specify a destination (register)
|
|
1848 break;
|
|
1849 case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
|
|
1850 ... // to specify a destination (memory)
|
|
1851 break;
|
|
1852 case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
|
|
1853 ... // to specify a source (register)
|
|
1854 break;
|
|
1855 case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
|
|
1856 ... // to specify a source (memory)
|
|
1857 break;
|
|
1858 case X86II::MRM0r: case X86II::MRM1r: // for instructions that operate on
|
|
1859 case X86II::MRM2r: case X86II::MRM3r: // a REGISTER r/m operand and
|
|
1860 case X86II::MRM4r: case X86II::MRM5r: // use the Mod/RM byte and a field
|
|
1861 case X86II::MRM6r: case X86II::MRM7r: // to hold extended opcode data
|
|
1862 ...
|
|
1863 break;
|
|
1864 case X86II::MRM0m: case X86II::MRM1m: // for instructions that operate on
|
|
1865 case X86II::MRM2m: case X86II::MRM3m: // a MEMORY r/m operand and
|
|
1866 case X86II::MRM4m: case X86II::MRM5m: // use the Mod/RM byte and a field
|
|
1867 case X86II::MRM6m: case X86II::MRM7m: // to hold extended opcode data
|
|
1868 ...
|
|
1869 break;
|
|
1870 case X86II::MRMInitReg: // for instructions whose source and
|
|
1871 ... // destination are the same register
|
|
1872 break;
|
|
1873 }
|
|
1874
|
|
1875 The implementations of these case statements often first emit the opcode and
|
|
1876 then get the operand(s). Then depending upon the operand, helper methods may
|
|
1877 be called to process the operand(s). For example, in ``X86CodeEmitter.cpp``,
|
|
1878 for the ``X86II::AddRegFrm`` case, the first data emitted (by ``emitByte``) is
|
|
1879 the opcode added to the register operand. Then an object representing the
|
|
1880 machine operand, ``MO1``, is extracted. The helper methods such as
|
|
1881 ``isImmediate``, ``isGlobalAddress``, ``isExternalSymbol``,
|
|
1882 ``isConstantPoolIndex``, and ``isJumpTableIndex`` determine the operand type.
|
|
1883 (``X86CodeEmitter.cpp`` also has private methods such as ``emitConstant``,
|
|
1884 ``emitGlobalAddress``, ``emitExternalSymbolAddress``, ``emitConstPoolAddress``,
|
|
1885 and ``emitJumpTableAddress`` that emit the data into the output stream.)
|
|
1886
|
|
1887 .. code-block:: c++
|
|
1888
|
|
1889 case X86II::AddRegFrm:
|
|
1890 MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));
|
|
1891
|
|
1892 if (CurOp != NumOps) {
|
|
1893 const MachineOperand &MO1 = MI.getOperand(CurOp++);
|
|
1894 unsigned Size = X86InstrInfo::sizeOfImm(Desc);
|
|
1895 if (MO1.isImmediate())
|
|
1896 emitConstant(MO1.getImm(), Size);
|
|
1897 else {
|
|
1898 unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
|
|
1899 : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
|
|
1900 if (Opcode == X86::MOV64ri)
|
|
1901 rt = X86::reloc_absolute_dword; // FIXME: add X86II flag?
|
|
1902 if (MO1.isGlobalAddress()) {
|
|
1903 bool NeedStub = isa<Function>(MO1.getGlobal());
|
|
1904 bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
|
|
1905 emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
|
|
1906 NeedStub, isLazy);
|
|
1907 } else if (MO1.isExternalSymbol())
|
|
1908 emitExternalSymbolAddress(MO1.getSymbolName(), rt);
|
|
1909 else if (MO1.isConstantPoolIndex())
|
|
1910 emitConstPoolAddress(MO1.getIndex(), rt);
|
|
1911 else if (MO1.isJumpTableIndex())
|
|
1912 emitJumpTableAddress(MO1.getIndex(), rt);
|
|
1913 }
|
|
1914 }
|
|
1915 break;
|
|
1916
|
|
1917 In the previous example, ``XXXCodeEmitter.cpp`` uses the variable ``rt``, which
|
|
1918 is a ``RelocationType`` enum that may be used to relocate addresses (for
|
|
1919 example, a global address with a PIC base offset). The ``RelocationType`` enum
|
|
1920 for that target is defined in the short target-specific ``XXXRelocations.h``
|
|
1921 file. The ``RelocationType`` is used by the ``relocate`` method defined in
|
|
1922 ``XXXJITInfo.cpp`` to rewrite addresses for referenced global symbols.
|
|
1923
|
|
1924 For example, ``X86Relocations.h`` specifies the following relocation types for
|
|
1925 the X86 addresses. In all four cases, the relocated value is added to the
|
|
1926 value already in memory. For ``reloc_pcrel_word`` and ``reloc_picrel_word``,
|
|
1927 there is an additional initial adjustment.
|
|
1928
|
|
1929 .. code-block:: c++
|
|
1930
|
|
1931 enum RelocationType {
|
|
1932 reloc_pcrel_word = 0, // add reloc value after adjusting for the PC loc
|
|
1933 reloc_picrel_word = 1, // add reloc value after adjusting for the PIC base
|
|
1934 reloc_absolute_word = 2, // absolute relocation; no additional adjustment
|
|
1935 reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
|
|
1936 };
|
|
1937
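The arithmetic behind these four relocation types can be sketched as a pure
function. This is an illustration under stated assumptions, not the code in
``X86JITInfo.cpp``: ``applyRelocation`` and its parameter names are
hypothetical, the enum is repeated only so that the sketch is self-contained,
and the PC-relative adjustment is assumed to be measured from the end of a
4-byte field.

.. code-block:: c++

  #include <cstdint>

  enum RelocationType {
    reloc_pcrel_word = 0,
    reloc_picrel_word = 1,
    reloc_absolute_word = 2,
    reloc_absolute_dword = 3
  };

  // Returns the new contents of the relocated field.
  //   Stored  - value already in memory at the relocation site
  //   Target  - address of the referenced symbol
  //   Site    - address of the relocation site itself
  //   PICBase - address used as the PIC base
  std::uint64_t applyRelocation(RelocationType Type, std::uint64_t Stored,
                                std::uint64_t Target, std::uint64_t Site,
                                std::uint64_t PICBase) {
    switch (Type) {
    case reloc_pcrel_word:  // assumed: relative to the end of the field
      return static_cast<std::uint32_t>(Stored + (Target - (Site + 4)));
    case reloc_picrel_word:
      return static_cast<std::uint32_t>(Stored + (Target - PICBase));
    case reloc_absolute_word:
      return static_cast<std::uint32_t>(Stored + Target);
    case reloc_absolute_dword:
      return Stored + Target;
    }
    return Stored;  // not reached for valid RelocationType values
  }

In every case the result is added to the value already stored, and the three
word-sized types are truncated to 32 bits while the ``dword`` type keeps the
full 64-bit sum.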
|
|
1938 Target JIT Info
|
|
1939 ---------------
|
|
1940
|
|
1941 ``XXXJITInfo.cpp`` implements the JIT interfaces for target-specific
|
|
1942 code-generation activities, such as emitting machine code and stubs. At
|
|
1943 minimum, a target-specific version of ``XXXJITInfo`` implements the following:
|
|
1944
|
|
1945 * ``getLazyResolverFunction`` --- Initializes the JIT, gives the target a
|
|
1946 function that is used for compilation.
|
|
1947
|
|
1948 * ``emitFunctionStub`` --- Returns a native function with a specified address
|
|
1949 for a callback function.
|
|
1950
|
|
1951 * ``relocate`` --- Changes the addresses of referenced globals, based on
|
|
1952 relocation types.
|
|
1953
|
|
1954 * Callback functions that are wrappers for a function stub that is used when the
|
|
1955 real target is not initially known.
|
|
1956
|
|
1957 ``getLazyResolverFunction`` is generally trivial to implement. It stores the
|
|
1958 incoming parameter as the global ``JITCompilerFunction`` and returns the
|
|
1959 callback function that will be used as a function wrapper. For the Alpha target
|
|
1960 (in ``AlphaJITInfo.cpp``), the ``getLazyResolverFunction`` implementation is
|
|
1961 simply:
|
|
1962
|
|
1963 .. code-block:: c++
|
|
1964
|
|
1965 TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
|
|
1966 JITCompilerFn F) {
|
|
1967 JITCompilerFunction = F;
|
|
1968 return AlphaCompilationCallback;
|
|
1969 }
|
|
1970
|
|
1971 For the X86 target, the ``getLazyResolverFunction`` implementation is a little
|
|
1972 more complicated, because it returns a different callback function for
|
|
1973 processors with SSE instructions and XMM registers.
|
|
1974
|
|
1975 The callback function initially saves and later restores the callee register
|
|
1976 values, incoming arguments, and frame and return address. The callback
|
|
1977 function needs low-level access to the registers or stack, so it is typically
|
|
1978 implemented with assembler.
|
|
1979
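The relationship between ``getLazyResolverFunction``, the stored compile
function, and the callback can be modeled in portable C++, even though a real
callback is written in assembly. In the sketch below, the function-pointer
signatures and the ``compilationCallback`` body are assumptions made for
illustration; the actual ``TargetJITInfo`` typedefs differ in detail, and a
real callback also saves and restores registers around the call.

.. code-block:: c++

  // Hypothetical signatures for illustration.
  using JITCompilerFn = void *(*)(void *Stub);
  using LazyResolverFn = void *(*)(void *Stub);

  static JITCompilerFn JITCompilerFunction = nullptr;

  // Stand-in for the assembly callback: forwards to the stored compile
  // function and returns the address it produces.
  static void *compilationCallback(void *Stub) {
    return JITCompilerFunction(Stub);
  }

  // Remembers the JIT's compile function and hands back the callback that
  // stubs should jump to.
  LazyResolverFn getLazyResolverFunction(JITCompilerFn F) {
    JITCompilerFunction = F;
    return compilationCallback;
  }

The point of the indirection is that a stub can be emitted before its target
has been compiled: the stub reaches the callback, and the callback asks the
JIT for the real address on first use.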
|