CbC/CbC_llvm: llvm/docs/CodeGenerator.rst annotate

annotate llvm/docs/CodeGenerator.rst @ 235:edfff9242030 cbc-llvm13

author	Shinji KONO <kono@ie.u-ryukyu.ac.jp>
date	Wed, 21 Jul 2021 11:30:30 +0900
parents	5f17cb93ff66
children	c4bab56944e8

rev	line source
150 1d019706d866 LLVM10 anatofuz parents: diff changeset	1 ==========================================
1d019706d866 LLVM10 anatofuz parents: diff changeset	2 The LLVM Target-Independent Code Generator
1d019706d866 LLVM10 anatofuz parents: diff changeset	3 ==========================================
1d019706d866 LLVM10 anatofuz parents: diff changeset	4
1d019706d866 LLVM10 anatofuz parents: diff changeset	5 .. role:: raw-html(raw)
1d019706d866 LLVM10 anatofuz parents: diff changeset	6 :format: html
1d019706d866 LLVM10 anatofuz parents: diff changeset	7
1d019706d866 LLVM10 anatofuz parents: diff changeset	8 .. raw:: html
1d019706d866 LLVM10 anatofuz parents: diff changeset	9
1d019706d866 LLVM10 anatofuz parents: diff changeset	10 <style>
1d019706d866 LLVM10 anatofuz parents: diff changeset	11 .unknown { background-color: #C0C0C0; text-align: center; }
1d019706d866 LLVM10 anatofuz parents: diff changeset	12 .unknown:before { content: "?" }
1d019706d866 LLVM10 anatofuz parents: diff changeset	13 .no { background-color: #C11B17 }
1d019706d866 LLVM10 anatofuz parents: diff changeset	14 .no:before { content: "N" }
1d019706d866 LLVM10 anatofuz parents: diff changeset	15 .partial { background-color: #F88017 }
1d019706d866 LLVM10 anatofuz parents: diff changeset	16 .yes { background-color: #0F0; }
1d019706d866 LLVM10 anatofuz parents: diff changeset	17 .yes:before { content: "Y" }
1d019706d866 LLVM10 anatofuz parents: diff changeset	18 .na { background-color: #6666FF; }
1d019706d866 LLVM10 anatofuz parents: diff changeset	19 .na:before { content: "N/A" }
1d019706d866 LLVM10 anatofuz parents: diff changeset	20 </style>
1d019706d866 LLVM10 anatofuz parents: diff changeset	21
1d019706d866 LLVM10 anatofuz parents: diff changeset	22 .. contents::
1d019706d866 LLVM10 anatofuz parents: diff changeset	23 :local:
1d019706d866 LLVM10 anatofuz parents: diff changeset	24
1d019706d866 LLVM10 anatofuz parents: diff changeset	25 .. warning::
1d019706d866 LLVM10 anatofuz parents: diff changeset	26 This is a work in progress.
1d019706d866 LLVM10 anatofuz parents: diff changeset	27
1d019706d866 LLVM10 anatofuz parents: diff changeset	28 Introduction
1d019706d866 LLVM10 anatofuz parents: diff changeset	29 ============
1d019706d866 LLVM10 anatofuz parents: diff changeset	30
1d019706d866 LLVM10 anatofuz parents: diff changeset	31 The LLVM target-independent code generator is a framework that provides a suite
1d019706d866 LLVM10 anatofuz parents: diff changeset	32 of reusable components for translating the LLVM internal representation to the
1d019706d866 LLVM10 anatofuz parents: diff changeset	33 machine code for a specified target---either in assembly form (suitable for a
1d019706d866 LLVM10 anatofuz parents: diff changeset	34 static compiler) or in binary machine code format (usable for a JIT
1d019706d866 LLVM10 anatofuz parents: diff changeset	35 compiler). The LLVM target-independent code generator consists of six main
1d019706d866 LLVM10 anatofuz parents: diff changeset	36 components:
1d019706d866 LLVM10 anatofuz parents: diff changeset	37
1d019706d866 LLVM10 anatofuz parents: diff changeset	38 1. `Abstract target description`_ interfaces which capture important properties
1d019706d866 LLVM10 anatofuz parents: diff changeset	39 about various aspects of the machine, independently of how they will be used.
1d019706d866 LLVM10 anatofuz parents: diff changeset	40 These interfaces are defined in ``include/llvm/Target/``.
1d019706d866 LLVM10 anatofuz parents: diff changeset	41
1d019706d866 LLVM10 anatofuz parents: diff changeset	42 2. Classes used to represent the `code being generated`_ for a target. These
1d019706d866 LLVM10 anatofuz parents: diff changeset	43 classes are intended to be abstract enough to represent the machine code for
1d019706d866 LLVM10 anatofuz parents: diff changeset	44 any target machine. These classes are defined in
1d019706d866 LLVM10 anatofuz parents: diff changeset	45 ``include/llvm/CodeGen/``. At this level, concepts like "constant pool
1d019706d866 LLVM10 anatofuz parents: diff changeset	46 entries" and "jump tables" are explicitly exposed.
1d019706d866 LLVM10 anatofuz parents: diff changeset	47
1d019706d866 LLVM10 anatofuz parents: diff changeset	48 3. Classes and algorithms used to represent code at the object file level, the
1d019706d866 LLVM10 anatofuz parents: diff changeset	49 `MC Layer`_. These classes represent assembly level constructs like labels,
1d019706d866 LLVM10 anatofuz parents: diff changeset	50 sections, and instructions. At this level, concepts like "constant pool
1d019706d866 LLVM10 anatofuz parents: diff changeset	51 entries" and "jump tables" don't exist.
1d019706d866 LLVM10 anatofuz parents: diff changeset	52
1d019706d866 LLVM10 anatofuz parents: diff changeset	53 4. `Target-independent algorithms`_ used to implement various phases of native
1d019706d866 LLVM10 anatofuz parents: diff changeset	54 code generation (register allocation, scheduling, stack frame representation,
1d019706d866 LLVM10 anatofuz parents: diff changeset	55 etc.). This code lives in ``lib/CodeGen/``.
1d019706d866 LLVM10 anatofuz parents: diff changeset	56
1d019706d866 LLVM10 anatofuz parents: diff changeset	57 5. `Implementations of the abstract target description interfaces`_ for
1d019706d866 LLVM10 anatofuz parents: diff changeset	58 particular targets. These machine descriptions make use of the components
1d019706d866 LLVM10 anatofuz parents: diff changeset	59 provided by LLVM, and can optionally provide custom target-specific passes,
1d019706d866 LLVM10 anatofuz parents: diff changeset	60 to build complete code generators for a specific target. Target descriptions
1d019706d866 LLVM10 anatofuz parents: diff changeset	61 live in ``lib/Target/``.
1d019706d866 LLVM10 anatofuz parents: diff changeset	62
1d019706d866 LLVM10 anatofuz parents: diff changeset	63 6. The target-independent JIT components. The LLVM JIT is completely target
1d019706d866 LLVM10 anatofuz parents: diff changeset	64 independent (it uses the ``TargetJITInfo`` structure to interface for
1d019706d866 LLVM10 anatofuz parents: diff changeset	65 target-specific issues). The code for the target-independent JIT lives in
1d019706d866 LLVM10 anatofuz parents: diff changeset	66 ``lib/ExecutionEngine/JIT``.
1d019706d866 LLVM10 anatofuz parents: diff changeset	67
1d019706d866 LLVM10 anatofuz parents: diff changeset	68 Depending on which part of the code generator you are interested in working on,
1d019706d866 LLVM10 anatofuz parents: diff changeset	69 different pieces of this will be useful to you. In any case, you should be
1d019706d866 LLVM10 anatofuz parents: diff changeset	70 familiar with the `target description`_ and `machine code representation`_
1d019706d866 LLVM10 anatofuz parents: diff changeset	71 classes. If you want to add a backend for a new target, you will need to
1d019706d866 LLVM10 anatofuz parents: diff changeset	72 `implement the target description`_ classes for your new target and understand
1d019706d866 LLVM10 anatofuz parents: diff changeset	73 the :doc:`LLVM code representation <LangRef>`. If you are interested in
1d019706d866 LLVM10 anatofuz parents: diff changeset	74 implementing a new `code generation algorithm`_, it should only depend on the
1d019706d866 LLVM10 anatofuz parents: diff changeset	75 target-description and machine code representation classes, ensuring that it is
1d019706d866 LLVM10 anatofuz parents: diff changeset	76 portable.
1d019706d866 LLVM10 anatofuz parents: diff changeset	77
1d019706d866 LLVM10 anatofuz parents: diff changeset	78 Required components in the code generator
1d019706d866 LLVM10 anatofuz parents: diff changeset	79 -----------------------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	80
1d019706d866 LLVM10 anatofuz parents: diff changeset	81 The two pieces of the LLVM code generator are the high-level interface to the
1d019706d866 LLVM10 anatofuz parents: diff changeset	82 code generator and the set of reusable components that can be used to build
1d019706d866 LLVM10 anatofuz parents: diff changeset	83 target-specific backends. The two most important interfaces (:raw-html:`<tt>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	84 `TargetMachine`_ :raw-html:`</tt>` and :raw-html:`<tt>` `DataLayout`_
1d019706d866 LLVM10 anatofuz parents: diff changeset	85 :raw-html:`</tt>`) are the only ones that are required to be defined for a
1d019706d866 LLVM10 anatofuz parents: diff changeset	86 backend to fit into the LLVM system, but the others must be defined if the
1d019706d866 LLVM10 anatofuz parents: diff changeset	87 reusable code generator components are going to be used.
1d019706d866 LLVM10 anatofuz parents: diff changeset	88
1d019706d866 LLVM10 anatofuz parents: diff changeset	89 This design has two important implications. The first is that LLVM can support
1d019706d866 LLVM10 anatofuz parents: diff changeset	90 completely non-traditional code generation targets. For example, the C backend
1d019706d866 LLVM10 anatofuz parents: diff changeset	91 does not require register allocation, instruction selection, or any of the other
1d019706d866 LLVM10 anatofuz parents: diff changeset	92 standard components provided by the system. As such, it only implements these
1d019706d866 LLVM10 anatofuz parents: diff changeset	93 two interfaces, and does its own thing. Note that the C backend was removed from
1d019706d866 LLVM10 anatofuz parents: diff changeset	94 trunk as of the LLVM 3.1 release. Another example of a code generator like this is a
1d019706d866 LLVM10 anatofuz parents: diff changeset	95 (purely hypothetical) backend that converts LLVM to the GCC RTL form and uses
1d019706d866 LLVM10 anatofuz parents: diff changeset	96 GCC to emit machine code for a target.
1d019706d866 LLVM10 anatofuz parents: diff changeset	97
1d019706d866 LLVM10 anatofuz parents: diff changeset	98 This design also implies that it is possible to design and implement radically
1d019706d866 LLVM10 anatofuz parents: diff changeset	99 different code generators in the LLVM system that do not make use of any of the
1d019706d866 LLVM10 anatofuz parents: diff changeset	100 built-in components. Doing so is not recommended at all, but could be required
1d019706d866 LLVM10 anatofuz parents: diff changeset	101 for radically different targets that do not fit into the LLVM machine
1d019706d866 LLVM10 anatofuz parents: diff changeset	102 description model: FPGAs for example.
1d019706d866 LLVM10 anatofuz parents: diff changeset	103
1d019706d866 LLVM10 anatofuz parents: diff changeset	104 .. _high-level design of the code generator:
1d019706d866 LLVM10 anatofuz parents: diff changeset	105
1d019706d866 LLVM10 anatofuz parents: diff changeset	106 The high-level design of the code generator
1d019706d866 LLVM10 anatofuz parents: diff changeset	107 -------------------------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	108
1d019706d866 LLVM10 anatofuz parents: diff changeset	109 The LLVM target-independent code generator is designed to support efficient and
1d019706d866 LLVM10 anatofuz parents: diff changeset	110 quality code generation for standard register-based microprocessors. Code
1d019706d866 LLVM10 anatofuz parents: diff changeset	111 generation in this model is divided into the following stages:
1d019706d866 LLVM10 anatofuz parents: diff changeset	112
1d019706d866 LLVM10 anatofuz parents: diff changeset	113 1. `Instruction Selection`_ --- This phase determines an efficient way to
1d019706d866 LLVM10 anatofuz parents: diff changeset	114 express the input LLVM code in the target instruction set. This stage
1d019706d866 LLVM10 anatofuz parents: diff changeset	115 produces the initial code for the program in the target instruction set, using
1d019706d866 LLVM10 anatofuz parents: diff changeset	116 virtual registers in SSA form and physical registers that
1d019706d866 LLVM10 anatofuz parents: diff changeset	117 represent any required register assignments due to target constraints or
1d019706d866 LLVM10 anatofuz parents: diff changeset	118 calling conventions. This step turns the LLVM code into a DAG of target
1d019706d866 LLVM10 anatofuz parents: diff changeset	119 instructions.
1d019706d866 LLVM10 anatofuz parents: diff changeset	120
1d019706d866 LLVM10 anatofuz parents: diff changeset	121 2. `Scheduling and Formation`_ --- This phase takes the DAG of target
1d019706d866 LLVM10 anatofuz parents: diff changeset	122 instructions produced by the instruction selection phase, determines an
1d019706d866 LLVM10 anatofuz parents: diff changeset	123 ordering of the instructions, then emits the instructions as :raw-html:`<tt>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	124 `MachineInstr`_\s :raw-html:`</tt>` with that ordering. Note that we
1d019706d866 LLVM10 anatofuz parents: diff changeset	125 describe this in the `instruction selection section`_ because it operates on
1d019706d866 LLVM10 anatofuz parents: diff changeset	126 a `SelectionDAG`_.
1d019706d866 LLVM10 anatofuz parents: diff changeset	127
1d019706d866 LLVM10 anatofuz parents: diff changeset	128 3. `SSA-based Machine Code Optimizations`_ --- This optional stage consists of a
1d019706d866 LLVM10 anatofuz parents: diff changeset	129 series of machine-code optimizations that operate on the SSA-form produced by
1d019706d866 LLVM10 anatofuz parents: diff changeset	130 the instruction selector. Optimizations like modulo-scheduling or peephole
1d019706d866 LLVM10 anatofuz parents: diff changeset	131 optimization work here.
1d019706d866 LLVM10 anatofuz parents: diff changeset	132
1d019706d866 LLVM10 anatofuz parents: diff changeset	133 4. `Register Allocation`_ --- The target code is transformed from an infinite
1d019706d866 LLVM10 anatofuz parents: diff changeset	134 virtual register file in SSA form to the concrete register file used by the
1d019706d866 LLVM10 anatofuz parents: diff changeset	135 target. This phase introduces spill code and eliminates all virtual register
1d019706d866 LLVM10 anatofuz parents: diff changeset	136 references from the program.
1d019706d866 LLVM10 anatofuz parents: diff changeset	137
1d019706d866 LLVM10 anatofuz parents: diff changeset	138 5. `Prolog/Epilog Code Insertion`_ --- Once the machine code has been generated
1d019706d866 LLVM10 anatofuz parents: diff changeset	139 for the function and the amount of stack space required is known (used for
1d019706d866 LLVM10 anatofuz parents: diff changeset	140 LLVM alloca's and spill slots), the prolog and epilog code for the function
1d019706d866 LLVM10 anatofuz parents: diff changeset	141 can be inserted and "abstract stack location references" can be eliminated.
1d019706d866 LLVM10 anatofuz parents: diff changeset	142 This stage is responsible for implementing optimizations like frame-pointer
1d019706d866 LLVM10 anatofuz parents: diff changeset	143 elimination and stack packing.
1d019706d866 LLVM10 anatofuz parents: diff changeset	144
1d019706d866 LLVM10 anatofuz parents: diff changeset	145 6. `Late Machine Code Optimizations`_ --- Optimizations that operate on "final"
1d019706d866 LLVM10 anatofuz parents: diff changeset	146 machine code can go here, such as spill code scheduling and peephole
1d019706d866 LLVM10 anatofuz parents: diff changeset	147 optimizations.
1d019706d866 LLVM10 anatofuz parents: diff changeset	148
1d019706d866 LLVM10 anatofuz parents: diff changeset	149 7. `Code Emission`_ --- The final stage actually puts out the code for the
1d019706d866 LLVM10 anatofuz parents: diff changeset	150 current function, either in the target assembler format or in machine
1d019706d866 LLVM10 anatofuz parents: diff changeset	151 code.
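
The ordering of the stages above can be sketched as a small table. This is only an illustrative model and not LLVM's pass manager: ``Stage``, ``codegenStages`` and ``planPipeline`` are hypothetical names, and treating stage 6 as skippable is an assumption of the sketch (the text only calls stage 3 optional).

```cpp
#include <string>
#include <vector>

// Hypothetical model of the stages listed above; not an LLVM API.
struct Stage {
  std::string name;
  bool optional;  // true if the sketch allows the stage to be skipped
};

// The seven stages, in the order the text gives them.
std::vector<Stage> codegenStages() {
  return {
      {"Instruction Selection", false},
      {"Scheduling and Formation", false},
      {"SSA-based Machine Code Optimizations", true},
      {"Register Allocation", false},
      {"Prolog/Epilog Code Insertion", false},
      {"Late Machine Code Optimizations", true},  // assumption: skippable
      {"Code Emission", false},
  };
}

// Returns the names of the stages that would run, in order.
// With optimize == false the optional stages are dropped.
std::vector<std::string> planPipeline(bool optimize) {
  std::vector<std::string> plan;
  for (const Stage &s : codegenStages())
    if (optimize || !s.optional)
      plan.push_back(s.name);
  return plan;
}
```

A fast JIT-style configuration would correspond to ``planPipeline(false)``, while an offline compiler would use ``planPipeline(true)``.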
1d019706d866 LLVM10 anatofuz parents: diff changeset	152
1d019706d866 LLVM10 anatofuz parents: diff changeset	153 The code generator is based on the assumption that the instruction selector will
1d019706d866 LLVM10 anatofuz parents: diff changeset	154 use an optimal pattern matching selector to create high-quality sequences of
1d019706d866 LLVM10 anatofuz parents: diff changeset	155 native instructions. Alternative code generator designs based on pattern
1d019706d866 LLVM10 anatofuz parents: diff changeset	156 expansion and aggressive iterative peephole optimization are much slower. This
1d019706d866 LLVM10 anatofuz parents: diff changeset	157 design permits efficient compilation (important for JIT environments) and
1d019706d866 LLVM10 anatofuz parents: diff changeset	158 aggressive optimization (used when generating code offline) by allowing
1d019706d866 LLVM10 anatofuz parents: diff changeset	159 components of varying levels of sophistication to be used for any step of
1d019706d866 LLVM10 anatofuz parents: diff changeset	160 compilation.
1d019706d866 LLVM10 anatofuz parents: diff changeset	161
1d019706d866 LLVM10 anatofuz parents: diff changeset	162 In addition to these stages, target implementations can insert arbitrary
1d019706d866 LLVM10 anatofuz parents: diff changeset	163 target-specific passes into the flow. For example, the X86 target uses a
1d019706d866 LLVM10 anatofuz parents: diff changeset	164 special pass to handle the 80x87 floating point stack architecture. Other
1d019706d866 LLVM10 anatofuz parents: diff changeset	165 targets with unusual requirements can be supported with custom passes as needed.
1d019706d866 LLVM10 anatofuz parents: diff changeset	166
1d019706d866 LLVM10 anatofuz parents: diff changeset	167 Using TableGen for target description
1d019706d866 LLVM10 anatofuz parents: diff changeset	168 -------------------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	169
1d019706d866 LLVM10 anatofuz parents: diff changeset	170 The target description classes require a detailed description of the target
1d019706d866 LLVM10 anatofuz parents: diff changeset	171 architecture. These target descriptions often have a large amount of common
1d019706d866 LLVM10 anatofuz parents: diff changeset	172 information (e.g., an ``add`` instruction is almost identical to a ``sub``
1d019706d866 LLVM10 anatofuz parents: diff changeset	173 instruction). In order to allow the maximum amount of commonality to be
1d019706d866 LLVM10 anatofuz parents: diff changeset	174 factored out, the LLVM code generator uses the
1d019706d866 LLVM10 anatofuz parents: diff changeset	175 :doc:`TableGen/index` tool to describe big chunks of the
1d019706d866 LLVM10 anatofuz parents: diff changeset	176 target machine, which allows the use of domain-specific and target-specific
1d019706d866 LLVM10 anatofuz parents: diff changeset	177 abstractions to reduce the amount of repetition.
1d019706d866 LLVM10 anatofuz parents: diff changeset	178
1d019706d866 LLVM10 anatofuz parents: diff changeset	179 As LLVM continues to be developed and refined, we plan to move more and more of
1d019706d866 LLVM10 anatofuz parents: diff changeset	180 the target description to the ``.td`` form. Doing so gives us a number of
1d019706d866 LLVM10 anatofuz parents: diff changeset	181 advantages. The most important is that it makes it easier to port LLVM because
1d019706d866 LLVM10 anatofuz parents: diff changeset	182 it reduces the amount of C++ code that has to be written, and the surface area
1d019706d866 LLVM10 anatofuz parents: diff changeset	183 of the code generator that needs to be understood before someone can get
1d019706d866 LLVM10 anatofuz parents: diff changeset	184 something working. Second, it makes it easier to change things. In particular,
1d019706d866 LLVM10 anatofuz parents: diff changeset	185 if tables and other things are all emitted by ``tblgen``, we only need a change
1d019706d866 LLVM10 anatofuz parents: diff changeset	186 in one place (``tblgen``) to update all of the targets to a new interface.
1d019706d866 LLVM10 anatofuz parents: diff changeset	187
1d019706d866 LLVM10 anatofuz parents: diff changeset	188 .. _Abstract target description:
1d019706d866 LLVM10 anatofuz parents: diff changeset	189 .. _target description:
1d019706d866 LLVM10 anatofuz parents: diff changeset	190
1d019706d866 LLVM10 anatofuz parents: diff changeset	191 Target description classes
1d019706d866 LLVM10 anatofuz parents: diff changeset	192 ==========================
1d019706d866 LLVM10 anatofuz parents: diff changeset	193
1d019706d866 LLVM10 anatofuz parents: diff changeset	194 The LLVM target description classes (located in the ``include/llvm/Target``
1d019706d866 LLVM10 anatofuz parents: diff changeset	195 directory) provide an abstract description of the target machine independent of
1d019706d866 LLVM10 anatofuz parents: diff changeset	196 any particular client. These classes are designed to capture the abstract
1d019706d866 LLVM10 anatofuz parents: diff changeset	197 properties of the target (such as the instructions and registers it has), and do
1d019706d866 LLVM10 anatofuz parents: diff changeset	198 not incorporate any particular pieces of code generation algorithms.
1d019706d866 LLVM10 anatofuz parents: diff changeset	199
1d019706d866 LLVM10 anatofuz parents: diff changeset	200 All of the target description classes (except the :raw-html:`<tt>` `DataLayout`_
1d019706d866 LLVM10 anatofuz parents: diff changeset	201 :raw-html:`</tt>` class) are designed to be subclassed by the concrete target
1d019706d866 LLVM10 anatofuz parents: diff changeset	202 implementation, and have virtual methods implemented. To get to these
1d019706d866 LLVM10 anatofuz parents: diff changeset	203 implementations, the :raw-html:`<tt>` `TargetMachine`_ :raw-html:`</tt>` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	204 provides accessors that should be implemented by the target.
1d019706d866 LLVM10 anatofuz parents: diff changeset	205
1d019706d866 LLVM10 anatofuz parents: diff changeset	206 .. _TargetMachine:
1d019706d866 LLVM10 anatofuz parents: diff changeset	207
1d019706d866 LLVM10 anatofuz parents: diff changeset	208 The ``TargetMachine`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	209 ---------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	210
1d019706d866 LLVM10 anatofuz parents: diff changeset	211 The ``TargetMachine`` class provides virtual methods that are used to access the
1d019706d866 LLVM10 anatofuz parents: diff changeset	212 target-specific implementations of the various target description classes via
1d019706d866 LLVM10 anatofuz parents: diff changeset	213 the ``get*Info`` methods (``getInstrInfo``, ``getRegisterInfo``,
1d019706d866 LLVM10 anatofuz parents: diff changeset	214 ``getFrameInfo``, etc.). This class is designed to be specialized by a concrete
1d019706d866 LLVM10 anatofuz parents: diff changeset	215 target implementation (e.g., ``X86TargetMachine``) which implements the various
1d019706d866 LLVM10 anatofuz parents: diff changeset	216 virtual methods. The only required target description class is the
1d019706d866 LLVM10 anatofuz parents: diff changeset	217 :raw-html:`<tt>` `DataLayout`_ :raw-html:`</tt>` class, but if the code
1d019706d866 LLVM10 anatofuz parents: diff changeset	218 generator components are to be used, the other interfaces should be implemented
1d019706d866 LLVM10 anatofuz parents: diff changeset	219 as well.
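
The accessor pattern described above might look roughly like the following. It is a self-contained sketch, not the real LLVM headers: every class name here (``ToyTargetMachine``, ``DemoTargetMachine`` and the ``Toy*Info`` types) is hypothetical, and the actual class layout varies between LLVM versions.

```cpp
// Hypothetical stand-ins for two target description classes.
struct ToyInstrInfo {
  virtual ~ToyInstrInfo() = default;
  virtual unsigned getNumOpcodes() const = 0;
};

struct ToyRegisterInfo {
  virtual ~ToyRegisterInfo() = default;
  virtual unsigned getNumRegs() const = 0;
};

// Base class: accessors default to "not provided" (nullptr) in this sketch.
class ToyTargetMachine {
public:
  virtual ~ToyTargetMachine() = default;
  virtual const ToyInstrInfo *getInstrInfo() const { return nullptr; }
  virtual const ToyRegisterInfo *getRegisterInfo() const { return nullptr; }
};

// Concrete descriptions for an imaginary target.
struct DemoInstrInfo final : ToyInstrInfo {
  unsigned getNumOpcodes() const override { return 3; }
};

struct DemoRegisterInfo final : ToyRegisterInfo {
  unsigned getNumRegs() const override { return 8; }
};

// Concrete target machine: owns the descriptions and overrides the accessors.
class DemoTargetMachine final : public ToyTargetMachine {
  DemoInstrInfo instrInfo;
  DemoRegisterInfo registerInfo;

public:
  const ToyInstrInfo *getInstrInfo() const override { return &instrInfo; }
  const ToyRegisterInfo *getRegisterInfo() const override {
    return &registerInfo;
  }
};
```

Target-independent code would hold only a ``const ToyTargetMachine &`` and reach the target-specific tables through the virtual accessors.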
1d019706d866 LLVM10 anatofuz parents: diff changeset	220
1d019706d866 LLVM10 anatofuz parents: diff changeset	221 .. _DataLayout:
1d019706d866 LLVM10 anatofuz parents: diff changeset	222
1d019706d866 LLVM10 anatofuz parents: diff changeset	223 The ``DataLayout`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	224 ------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	225
1d019706d866 LLVM10 anatofuz parents: diff changeset	226 The ``DataLayout`` class is the only required target description class, and it
1d019706d866 LLVM10 anatofuz parents: diff changeset	227 is the only class that is not extensible (you cannot derive a new class from
1d019706d866 LLVM10 anatofuz parents: diff changeset	228 it). ``DataLayout`` specifies information about how the target lays out memory
1d019706d866 LLVM10 anatofuz parents: diff changeset	229 for structures, the alignment requirements for various data types, the size of
1d019706d866 LLVM10 anatofuz parents: diff changeset	230 pointers in the target, and whether the target is little-endian or
1d019706d866 LLVM10 anatofuz parents: diff changeset	231 big-endian.
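
To illustrate how size and alignment information determines structure layout, here is a minimal sketch of a conventional C-style layout computation. It is not the ``DataLayout`` implementation; ``FieldType``, ``ToyStructLayout``, ``alignTo`` and ``layoutStruct`` are hypothetical helpers, and the sizes and alignments are supplied by the caller rather than read from a target.

```cpp
#include <cstdint>
#include <vector>

// Size and alignment of one field, in bytes (hypothetical helper type).
struct FieldType {
  uint64_t size;
  uint64_t align;  // assumed to be at least 1
};

struct ToyStructLayout {
  std::vector<uint64_t> offsets;  // byte offset of each field
  uint64_t size;                  // total size including tail padding
  uint64_t align;                 // alignment of the whole struct
};

// Rounds value up to the next multiple of align.
uint64_t alignTo(uint64_t value, uint64_t align) {
  return (value + align - 1) / align * align;
}

// Places each field at the next offset that satisfies its alignment, then
// pads the struct so its size is a multiple of its largest field alignment.
ToyStructLayout layoutStruct(const std::vector<FieldType> &fields) {
  ToyStructLayout out{{}, 0, 1};
  for (const FieldType &f : fields) {
    out.size = alignTo(out.size, f.align);
    out.offsets.push_back(out.size);
    out.size += f.size;
    if (f.align > out.align)
      out.align = f.align;
  }
  out.size = alignTo(out.size, out.align);
  return out;
}
```

The same field list can produce different layouts on different targets: an 8-byte integer that only requires 4-byte alignment packs more tightly than one that requires 8-byte alignment, which is why this information has to come from the target.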
1d019706d866 LLVM10 anatofuz parents: diff changeset	232
1d019706d866 LLVM10 anatofuz parents: diff changeset	233 .. _TargetLowering:
1d019706d866 LLVM10 anatofuz parents: diff changeset	234
1d019706d866 LLVM10 anatofuz parents: diff changeset	235 The ``TargetLowering`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	236 ----------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	237
1d019706d866 LLVM10 anatofuz parents: diff changeset	238 The ``TargetLowering`` class is used by SelectionDAG based instruction selectors
1d019706d866 LLVM10 anatofuz parents: diff changeset	239 primarily to describe how LLVM code should be lowered to SelectionDAG
1d019706d866 LLVM10 anatofuz parents: diff changeset	240 operations. Among other things, this class indicates:
1d019706d866 LLVM10 anatofuz parents: diff changeset	241
1d019706d866 LLVM10 anatofuz parents: diff changeset	242 * an initial register class to use for various ``ValueType``\s,
1d019706d866 LLVM10 anatofuz parents: diff changeset	243
1d019706d866 LLVM10 anatofuz parents: diff changeset	244 * which operations are natively supported by the target machine,
1d019706d866 LLVM10 anatofuz parents: diff changeset	245
1d019706d866 LLVM10 anatofuz parents: diff changeset	246 * the return type of ``setcc`` operations,
1d019706d866 LLVM10 anatofuz parents: diff changeset	247
1d019706d866 LLVM10 anatofuz parents: diff changeset	248 * the type to use for shift amounts, and
1d019706d866 LLVM10 anatofuz parents: diff changeset	249
1d019706d866 LLVM10 anatofuz parents: diff changeset	250 * various high-level characteristics, like whether it is profitable to turn
1d019706d866 LLVM10 anatofuz parents: diff changeset	251 division by a constant into a multiplication sequence.
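
As a worked example of the last item, unsigned 32-bit division by the constant 3 can be replaced by a widening multiply and a shift. The function below is an illustrative sketch (``divideBy3`` is a hypothetical name); whether this rewrite is actually profitable is exactly the kind of target-specific decision the text describes.

```cpp
#include <cstdint>

// Computes x / 3 for any 32-bit unsigned x without a divide instruction.
// 0xAAAAAAAB equals (2^33 + 1) / 3, so (x * 0xAAAAAAAB) >> 33 differs from
// the exact quotient x / 3 by less than 1/3 and therefore truncates to the
// same integer.
uint32_t divideBy3(uint32_t x) {
  const uint64_t magic = 0xAAAAAAABull;
  return static_cast<uint32_t>((static_cast<uint64_t>(x) * magic) >> 33);
}
```

On a target where multiplication is much cheaper than division this sequence is a win; on a target with a fast divider it may not be.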
1d019706d866 LLVM10 anatofuz parents: diff changeset	252
1d019706d866 LLVM10 anatofuz parents: diff changeset	253 .. _TargetRegisterInfo:
1d019706d866 LLVM10 anatofuz parents: diff changeset	254
1d019706d866 LLVM10 anatofuz parents: diff changeset	255 The ``TargetRegisterInfo`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	256 --------------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	257
1d019706d866 LLVM10 anatofuz parents: diff changeset	258 The ``TargetRegisterInfo`` class is used to describe the register file of the
1d019706d866 LLVM10 anatofuz parents: diff changeset	259 target and any interactions between the registers.
1d019706d866 LLVM10 anatofuz parents: diff changeset	260
1d019706d866 LLVM10 anatofuz parents: diff changeset	261 Registers are represented in the code generator by unsigned integers. Physical
1d019706d866 LLVM10 anatofuz parents: diff changeset	262 registers (those that actually exist in the target description) are unique
1d019706d866 LLVM10 anatofuz parents: diff changeset	263 small numbers, and virtual registers are generally large. Note that
1d019706d866 LLVM10 anatofuz parents: diff changeset	264 register ``#0`` is reserved as a flag value.
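
One way to realize this numbering scheme is to tag virtual registers with a high bit, which keeps physical register numbers small and makes virtual register numbers large. The sketch below is a simplified model: the constant and function names are assumptions of this example, and it ignores any other reserved number ranges a real implementation may have.

```cpp
#include <cstdint>

// Register #0 is the reserved "no register" flag value mentioned above.
constexpr uint32_t NoRegister = 0;

// Assumption of this sketch: virtual registers have the top bit set.
constexpr uint32_t VirtualRegFlag = 1u << 31;

constexpr bool isVirtual(uint32_t reg) {
  return (reg & VirtualRegFlag) != 0;
}

constexpr bool isPhysical(uint32_t reg) {
  return reg != NoRegister && !isVirtual(reg);
}

// Maps a dense index (0, 1, 2, ...) to a virtual register number and back.
constexpr uint32_t indexToVirtReg(uint32_t index) {
  return index | VirtualRegFlag;
}

constexpr uint32_t virtRegToIndex(uint32_t reg) {
  return reg & ~VirtualRegFlag;
}

static_assert(!isPhysical(NoRegister) && !isVirtual(NoRegister),
              "register #0 is neither physical nor virtual");
static_assert(virtRegToIndex(indexToVirtReg(42)) == 42,
              "index mapping round-trips");
```

Because the two kinds of register occupy disjoint ranges, a single unsigned operand field can hold either kind without an extra tag.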
1d019706d866 LLVM10 anatofuz parents: diff changeset	265
1d019706d866 LLVM10 anatofuz parents: diff changeset	266 Each register in the processor description has an associated
1d019706d866 LLVM10 anatofuz parents: diff changeset	267 ``TargetRegisterDesc`` entry, which provides a textual name for the register
1d019706d866 LLVM10 anatofuz parents: diff changeset	268 (used for assembly output and debugging dumps) and a set of aliases (used to
1d019706d866 LLVM10 anatofuz parents: diff changeset	269 indicate whether one register overlaps with another).
1d019706d866 LLVM10 anatofuz parents: diff changeset	270
1d019706d866 LLVM10 anatofuz parents: diff changeset	271 In addition to the per-register description, the ``TargetRegisterInfo`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	272 exposes a set of processor specific register classes (instances of the
1d019706d866 LLVM10 anatofuz parents: diff changeset	273 ``TargetRegisterClass`` class). Each register class contains sets of registers
1d019706d866 LLVM10 anatofuz parents: diff changeset	274 that have the same properties (for example, they are all 32-bit integer
1d019706d866 LLVM10 anatofuz parents: diff changeset	275 registers). Each SSA virtual register created by the instruction selector has
1d019706d866 LLVM10 anatofuz parents: diff changeset	276 an associated register class. When the register allocator runs, it replaces
1d019706d866 LLVM10 anatofuz parents: diff changeset	277 virtual registers with a physical register in the set.
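
The relationship between register classes and allocation can be sketched as follows. This is a toy model, not a register allocator from LLVM: ``ToyRegisterClass`` and ``pickFreeRegister`` are hypothetical, and returning ``0`` for "nothing available" simply reuses the reserved flag value described earlier.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// A named set of physical registers that share the same properties.
struct ToyRegisterClass {
  std::string name;
  std::vector<uint32_t> regs;  // physical register numbers, all nonzero

  bool contains(uint32_t reg) const {
    for (uint32_t r : regs)
      if (r == reg)
        return true;
    return false;
  }
};

// Picks the first register of the class that is not in use. Returns 0 (the
// reserved "no register" value) when the class is exhausted; a real
// allocator would have to introduce spill code at that point.
uint32_t pickFreeRegister(const ToyRegisterClass &rc,
                          const std::set<uint32_t> &inUse) {
  for (uint32_t r : rc.regs)
    if (inUse.count(r) == 0)
      return r;
  return 0;
}
```

The key point is that the choice is constrained by the class attached to the virtual register: a register outside ``rc.regs`` is never a candidate.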
1d019706d866 LLVM10 anatofuz parents: diff changeset	278
1d019706d866 LLVM10 anatofuz parents: diff changeset	279 The target-specific implementations of these classes are auto-generated from a
1d019706d866 LLVM10 anatofuz parents: diff changeset	280 :doc:`TableGen/index` description of the register file.
1d019706d866 LLVM10 anatofuz parents: diff changeset	281
1d019706d866 LLVM10 anatofuz parents: diff changeset	282 .. _TargetInstrInfo:
1d019706d866 LLVM10 anatofuz parents: diff changeset	283
1d019706d866 LLVM10 anatofuz parents: diff changeset	284 The ``TargetInstrInfo`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	285 -----------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	286
1d019706d866 LLVM10 anatofuz parents: diff changeset	287 The ``TargetInstrInfo`` class is used to describe the machine instructions
1d019706d866 LLVM10 anatofuz parents: diff changeset	288 supported by the target. Descriptions define things like the mnemonic for
1d019706d866 LLVM10 anatofuz parents: diff changeset	289 the opcode, the number of operands, the list of implicit register uses and defs,
1d019706d866 LLVM10 anatofuz parents: diff changeset	290 whether the instruction has certain target-independent properties (accesses
1d019706d866 LLVM10 anatofuz parents: diff changeset	291 memory, is commutable, etc.), and hold any target-specific flags.
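
A per-instruction descriptor carrying the properties listed above could be modelled like this. The struct, the flag constants, the register number and the three-entry table are all invented for illustration; a real target gets its table generated from its TableGen description.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Target-independent property bits (hypothetical encoding).
constexpr uint32_t MayLoad = 1u << 0;
constexpr uint32_t MayStore = 1u << 1;
constexpr uint32_t Commutable = 1u << 2;

// Hypothetical physical register number for a flags register.
constexpr uint32_t FlagsReg = 1;

struct ToyInstrDesc {
  std::string mnemonic;
  unsigned numOperands;
  std::vector<uint32_t> implicitUses;  // registers read without an operand
  std::vector<uint32_t> implicitDefs;  // registers written without an operand
  uint32_t flags;                      // target-independent properties
  uint64_t targetFlags;                // opaque target-specific bits

  bool mayLoad() const { return (flags & MayLoad) != 0; }
  bool mayStore() const { return (flags & MayStore) != 0; }
  bool isCommutable() const { return (flags & Commutable) != 0; }
};

// A tiny made-up instruction table, indexed by opcode.
const std::vector<ToyInstrDesc> &toyInstrTable() {
  static const std::vector<ToyInstrDesc> table = {
      {"add", 3, {}, {FlagsReg}, Commutable, 0},
      {"sub", 3, {}, {FlagsReg}, 0, 0},
      {"load", 2, {}, {}, MayLoad, 0},
  };
  return table;
}
```

Generic passes can then ask questions such as "may this instruction read memory?" without knowing anything else about the target.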
1d019706d866 LLVM10 anatofuz parents: diff changeset	292
1d019706d866 LLVM10 anatofuz parents: diff changeset	293 The ``TargetFrameLowering`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	294 ---------------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	295
1d019706d866 LLVM10 anatofuz parents: diff changeset	296 The ``TargetFrameLowering`` class is used to provide information about the stack
1d019706d866 LLVM10 anatofuz parents: diff changeset	297 frame layout of the target. It holds the direction of stack growth, the known
1d019706d866 LLVM10 anatofuz parents: diff changeset	298 stack alignment on entry to each function, and the offset to the local area.
1d019706d866 LLVM10 anatofuz parents: diff changeset	299 The offset to the local area is the offset from the stack pointer on function
1d019706d866 LLVM10 anatofuz parents: diff changeset	300 entry to the first location where function data (local variables, spill
1d019706d866 LLVM10 anatofuz parents: diff changeset	301 locations) can be stored.
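
The three pieces of information above are enough to place local data relative to the stack pointer on function entry. The following sketch uses hypothetical names (``ToyFrameLowering``, ``slotOffset``) and deliberately ignores per-object alignment to stay short.

```cpp
#include <cstdint>

enum class StackDirection { GrowsUp, GrowsDown };

// Hypothetical model of the three properties described above.
struct ToyFrameLowering {
  StackDirection direction;
  unsigned stackAlignment;  // known alignment on function entry, in bytes
  int localAreaOffset;      // offset from the entry stack pointer
};

// Offset, relative to the stack pointer on function entry, of a slot of
// slotSize bytes placed after bytesAllocated bytes of earlier local data.
// Simplification: no padding is inserted to align individual slots.
int64_t slotOffset(const ToyFrameLowering &fl, int64_t bytesAllocated,
                   int64_t slotSize) {
  if (fl.direction == StackDirection::GrowsDown)
    return fl.localAreaOffset - bytesAllocated - slotSize;
  return fl.localAreaOffset + bytesAllocated;
}
```

For instance, with a downward-growing stack and an example local area offset of -8 (chosen here to resemble a target whose call instruction pushes an 8-byte return address), the first 8-byte slot lands at offset -16.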
1d019706d866 LLVM10 anatofuz parents: diff changeset	302
1d019706d866 LLVM10 anatofuz parents: diff changeset	303 The ``TargetSubtarget`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	304 -----------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	305
The ``TargetSubtarget`` class (``TargetSubtargetInfo`` in the source tree) is
used to provide information about the specific chip set being targeted. A
sub-target informs code generation of which instructions are supported, of
instruction latencies, and of the instruction execution itinerary, i.e., which
processing units are used, in what order, and for how long.
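
To make the notion of an itinerary concrete, the following sketch records which
unit an instruction occupies, in what order, and for how long. ``ToyStage``,
``ToyItinerary``, ``totalCycles``, and the unit names are hypothetical; LLVM's
own itinerary data is generated from TableGen descriptions and is not shown
here.

.. code-block:: c++

  #include <string>
  #include <vector>

  // One step of a hypothetical itinerary: a unit and how long it is occupied.
  struct ToyStage {
    std::string Unit;
    unsigned Cycles;
  };

  // Stages are listed in the order in which the instruction uses them.
  using ToyItinerary = std::vector<ToyStage>;

  // Total cycles if every stage runs strictly after the previous one.
  inline unsigned totalCycles(const ToyItinerary &Itin) {
    unsigned Sum = 0;
    for (const ToyStage &S : Itin)
      Sum += S.Cycles;
    return Sum;
  }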
1d019706d866 LLVM10 anatofuz parents: diff changeset	311
1d019706d866 LLVM10 anatofuz parents: diff changeset	312 The ``TargetJITInfo`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	313 ---------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	314
1d019706d866 LLVM10 anatofuz parents: diff changeset	315 The ``TargetJITInfo`` class exposes an abstract interface used by the
1d019706d866 LLVM10 anatofuz parents: diff changeset	316 Just-In-Time code generator to perform target-specific activities, such as
1d019706d866 LLVM10 anatofuz parents: diff changeset	317 emitting stubs. If a ``TargetMachine`` supports JIT code generation, it should
1d019706d866 LLVM10 anatofuz parents: diff changeset	318 provide one of these objects through the ``getJITInfo`` method.
1d019706d866 LLVM10 anatofuz parents: diff changeset	319
1d019706d866 LLVM10 anatofuz parents: diff changeset	320 .. _code being generated:
1d019706d866 LLVM10 anatofuz parents: diff changeset	321 .. _machine code representation:
1d019706d866 LLVM10 anatofuz parents: diff changeset	322
1d019706d866 LLVM10 anatofuz parents: diff changeset	323 Machine code description classes
1d019706d866 LLVM10 anatofuz parents: diff changeset	324 ================================
1d019706d866 LLVM10 anatofuz parents: diff changeset	325
At a high level, LLVM code is translated to a machine-specific representation
1d019706d866 LLVM10 anatofuz parents: diff changeset	327 formed out of :raw-html:`<tt>` `MachineFunction`_ :raw-html:`</tt>`,
1d019706d866 LLVM10 anatofuz parents: diff changeset	328 :raw-html:`<tt>` `MachineBasicBlock`_ :raw-html:`</tt>`, and :raw-html:`<tt>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	329 `MachineInstr`_ :raw-html:`</tt>` instances (defined in
1d019706d866 LLVM10 anatofuz parents: diff changeset	330 ``include/llvm/CodeGen``). This representation is completely target agnostic,
1d019706d866 LLVM10 anatofuz parents: diff changeset	331 representing instructions in their most abstract form: an opcode and a series of
1d019706d866 LLVM10 anatofuz parents: diff changeset	332 operands. This representation is designed to support both an SSA representation
1d019706d866 LLVM10 anatofuz parents: diff changeset	333 for machine code, as well as a register allocated, non-SSA form.
1d019706d866 LLVM10 anatofuz parents: diff changeset	334
1d019706d866 LLVM10 anatofuz parents: diff changeset	335 .. _MachineInstr:
1d019706d866 LLVM10 anatofuz parents: diff changeset	336
1d019706d866 LLVM10 anatofuz parents: diff changeset	337 The ``MachineInstr`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	338 --------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	339
1d019706d866 LLVM10 anatofuz parents: diff changeset	340 Target machine instructions are represented as instances of the ``MachineInstr``
1d019706d866 LLVM10 anatofuz parents: diff changeset	341 class. This class is an extremely abstract way of representing machine
1d019706d866 LLVM10 anatofuz parents: diff changeset	342 instructions. In particular, it only keeps track of an opcode number and a set
1d019706d866 LLVM10 anatofuz parents: diff changeset	343 of operands.
1d019706d866 LLVM10 anatofuz parents: diff changeset	344
1d019706d866 LLVM10 anatofuz parents: diff changeset	345 The opcode number is a simple unsigned integer that only has meaning to a
1d019706d866 LLVM10 anatofuz parents: diff changeset	346 specific backend. All of the instructions for a target should be defined in the
1d019706d866 LLVM10 anatofuz parents: diff changeset	347 ``*InstrInfo.td`` file for the target. The opcode enum values are auto-generated
1d019706d866 LLVM10 anatofuz parents: diff changeset	348 from this description. The ``MachineInstr`` class does not have any information
1d019706d866 LLVM10 anatofuz parents: diff changeset	349 about how to interpret the instruction (i.e., what the semantics of the
1d019706d866 LLVM10 anatofuz parents: diff changeset	350 instruction are); for that you must refer to the :raw-html:`<tt>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	351 `TargetInstrInfo`_ :raw-html:`</tt>` class.
1d019706d866 LLVM10 anatofuz parents: diff changeset	352
1d019706d866 LLVM10 anatofuz parents: diff changeset	353 The operands of a machine instruction can be of several different types: a
1d019706d866 LLVM10 anatofuz parents: diff changeset	354 register reference, a constant integer, a basic block reference, etc. In
1d019706d866 LLVM10 anatofuz parents: diff changeset	355 addition, a machine operand should be marked as a def or a use of the value
1d019706d866 LLVM10 anatofuz parents: diff changeset	356 (though only registers are allowed to be defs).
1d019706d866 LLVM10 anatofuz parents: diff changeset	357
1d019706d866 LLVM10 anatofuz parents: diff changeset	358 By convention, the LLVM code generator orders instruction operands so that all
1d019706d866 LLVM10 anatofuz parents: diff changeset	359 register definitions come before the register uses, even on architectures that
1d019706d866 LLVM10 anatofuz parents: diff changeset	360 are normally printed in other orders. For example, the SPARC add instruction:
"``add %i1, %i2, %i3``" adds the "%i1" and "%i2" registers and stores the
result into the "%i3" register. In the LLVM code generator, the operands should
be stored as "``%i3, %i1, %i2``", with the destination first.
1d019706d866 LLVM10 anatofuz parents: diff changeset	364
1d019706d866 LLVM10 anatofuz parents: diff changeset	365 Keeping destination (definition) operands at the beginning of the operand list
1d019706d866 LLVM10 anatofuz parents: diff changeset	366 has several advantages. In particular, the debugging printer will print the
1d019706d866 LLVM10 anatofuz parents: diff changeset	367 instruction like this:
1d019706d866 LLVM10 anatofuz parents: diff changeset	368
1d019706d866 LLVM10 anatofuz parents: diff changeset	369 .. code-block:: llvm
1d019706d866 LLVM10 anatofuz parents: diff changeset	370
  %i3 = add %i1, %i2
1d019706d866 LLVM10 anatofuz parents: diff changeset	372
Also, if the first operand is a def, it is easier to `create instructions`_ whose
only def is the first operand.
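
The operand-ordering convention can be sketched with a toy model in which an
instruction is an opcode number plus a list of operands, each marked as a def
or a use. ``ToyOperand``, ``ToyInstr``, and ``defsPrecedeUses`` are
hypothetical names for this example only; the real ``MachineInstr`` and
``MachineOperand`` classes are considerably richer.

.. code-block:: c++

  #include <string>
  #include <vector>

  // Hypothetical, heavily simplified operand: only registers can be defs.
  struct ToyOperand {
    enum Kind { Register, Immediate } K;
    std::string Reg; // register name, used when K == Register
    long Imm;        // value, used when K == Immediate
    bool IsDef;      // true for a definition, false for a use
  };

  // An instruction is just an opcode number plus a list of operands.
  struct ToyInstr {
    unsigned Opcode;
    std::vector<ToyOperand> Operands;
  };

  inline ToyOperand regDef(const std::string &R) {
    return {ToyOperand::Register, R, 0, true};
  }
  inline ToyOperand regUse(const std::string &R) {
    return {ToyOperand::Register, R, 0, false};
  }

  // Checks the convention: no register def may follow a register use.
  inline bool defsPrecedeUses(const ToyInstr &MI) {
    bool SeenUse = false;
    for (const ToyOperand &Op : MI.Operands) {
      if (Op.K != ToyOperand::Register)
        continue;
      if (!Op.IsDef)
        SeenUse = true;
      else if (SeenUse)
        return false;
    }
    return true;
  }

Under this model the SPARC add is stored as ``%i3, %i1, %i2`` and satisfies the
check, whereas the assembly-order list ``%i1, %i2, %i3`` does not.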
1d019706d866 LLVM10 anatofuz parents: diff changeset	375
1d019706d866 LLVM10 anatofuz parents: diff changeset	376 .. _create instructions:
1d019706d866 LLVM10 anatofuz parents: diff changeset	377
1d019706d866 LLVM10 anatofuz parents: diff changeset	378 Using the ``MachineInstrBuilder.h`` functions
1d019706d866 LLVM10 anatofuz parents: diff changeset	379 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	380
Machine instructions are created by using the ``BuildMI`` functions, located in
the ``include/llvm/CodeGen/MachineInstrBuilder.h`` file. The ``BuildMI``
functions make it easy to build arbitrary machine instructions. Usage of the
``BuildMI`` functions looks like this:
1d019706d866 LLVM10 anatofuz parents: diff changeset	385
1d019706d866 LLVM10 anatofuz parents: diff changeset	386 .. code-block:: c++
1d019706d866 LLVM10 anatofuz parents: diff changeset	387
  // Create a 'DestReg = mov 42' (rendered in X86 assembly as 'mov DestReg, 42')
  // instruction and insert it at the end of the given MachineBasicBlock.
  const TargetInstrInfo &TII = ...
  MachineBasicBlock &MBB = ...
  DebugLoc DL;
  MachineInstr *MI = BuildMI(MBB, DL, TII.get(X86::MOV32ri), DestReg).addImm(42);

  // Create the same instr, but insert it before a specified iterator point.
  MachineBasicBlock::iterator MBBI = ...
  BuildMI(MBB, MBBI, DL, TII.get(X86::MOV32ri), DestReg).addImm(42);

  // Create a 'cmp Reg, 42' instruction, no destination reg.
  MI = BuildMI(MBB, DL, TII.get(X86::CMP32ri8)).addReg(Reg).addImm(42);

  // Create an 'sahf' instruction which takes no operands and stores nothing.
  MI = BuildMI(MBB, DL, TII.get(X86::SAHF));

  // Create a self looping branch instruction.
  BuildMI(MBB, DL, TII.get(X86::JNE)).addMBB(&MBB);
1d019706d866 LLVM10 anatofuz parents: diff changeset	407
1d019706d866 LLVM10 anatofuz parents: diff changeset	408 If you need to add a definition operand (other than the optional destination
1d019706d866 LLVM10 anatofuz parents: diff changeset	409 register), you must explicitly mark it as such:
1d019706d866 LLVM10 anatofuz parents: diff changeset	410
1d019706d866 LLVM10 anatofuz parents: diff changeset	411 .. code-block:: c++
1d019706d866 LLVM10 anatofuz parents: diff changeset	412
  // MIB is the MachineInstrBuilder returned by BuildMI.
  MIB.addReg(Reg, RegState::Define);
1d019706d866 LLVM10 anatofuz parents: diff changeset	414
1d019706d866 LLVM10 anatofuz parents: diff changeset	415 Fixed (preassigned) registers
1d019706d866 LLVM10 anatofuz parents: diff changeset	416 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	417
1d019706d866 LLVM10 anatofuz parents: diff changeset	418 One important issue that the code generator needs to be aware of is the presence
1d019706d866 LLVM10 anatofuz parents: diff changeset	419 of fixed registers. In particular, there are often places in the instruction
1d019706d866 LLVM10 anatofuz parents: diff changeset	420 stream where the register allocator must arrange for a particular value to be
1d019706d866 LLVM10 anatofuz parents: diff changeset	421 in a particular register. This can occur due to limitations of the instruction
1d019706d866 LLVM10 anatofuz parents: diff changeset	422 set (e.g., the X86 can only do a 32-bit divide with the ``EAX``/``EDX``
1d019706d866 LLVM10 anatofuz parents: diff changeset	423 registers), or external factors like calling conventions. In any case, the
1d019706d866 LLVM10 anatofuz parents: diff changeset	424 instruction selector should emit code that copies a virtual register into or out
1d019706d866 LLVM10 anatofuz parents: diff changeset	425 of a physical register when needed.
1d019706d866 LLVM10 anatofuz parents: diff changeset	426
For example, consider this simple LLVM function:
1d019706d866 LLVM10 anatofuz parents: diff changeset	428
1d019706d866 LLVM10 anatofuz parents: diff changeset	429 .. code-block:: llvm
1d019706d866 LLVM10 anatofuz parents: diff changeset	430
  define i32 @test(i32 %X, i32 %Y) {
    %Z = sdiv i32 %X, %Y
    ret i32 %Z
  }
1d019706d866 LLVM10 anatofuz parents: diff changeset	435
1d019706d866 LLVM10 anatofuz parents: diff changeset	436 The X86 instruction selector might produce this machine code for the ``div`` and
1d019706d866 LLVM10 anatofuz parents: diff changeset	437 ``ret``:
1d019706d866 LLVM10 anatofuz parents: diff changeset	438
1d019706d866 LLVM10 anatofuz parents: diff changeset	439 .. code-block:: text
1d019706d866 LLVM10 anatofuz parents: diff changeset	440
  ;; Start of div
  %EAX = mov %reg1024           ;; Copy X (in reg1024) into EAX
  %reg1027 = sar %reg1024, 31
  %EDX = mov %reg1027           ;; Sign extend X into EDX
  idiv %reg1025                 ;; Divide by Y (in reg1025)
  %reg1026 = mov %EAX           ;; Read the result (Z) out of EAX

  ;; Start of ret
  %EAX = mov %reg1026           ;; 32-bit return value goes in EAX
  ret
1d019706d866 LLVM10 anatofuz parents: diff changeset	451
By the end of code generation, the register allocator would coalesce the
registers and delete the resultant identity moves, producing the following
code:
1d019706d866 LLVM10 anatofuz parents: diff changeset	455
1d019706d866 LLVM10 anatofuz parents: diff changeset	456 .. code-block:: text
1d019706d866 LLVM10 anatofuz parents: diff changeset	457
  ;; X is in EAX, Y is in ECX
  mov %EAX, %EDX
  sar %EDX, 31
  idiv %ECX
  ret
1d019706d866 LLVM10 anatofuz parents: diff changeset	463
1d019706d866 LLVM10 anatofuz parents: diff changeset	464 This approach is extremely general (if it can handle the X86 architecture, it
1d019706d866 LLVM10 anatofuz parents: diff changeset	465 can handle anything!) and allows all of the target specific knowledge about the
1d019706d866 LLVM10 anatofuz parents: diff changeset	466 instruction stream to be isolated in the instruction selector. Note that
1d019706d866 LLVM10 anatofuz parents: diff changeset	467 physical registers should have a short lifetime for good code generation, and
1d019706d866 LLVM10 anatofuz parents: diff changeset	468 all physical registers are assumed dead on entry to and exit from basic blocks
1d019706d866 LLVM10 anatofuz parents: diff changeset	469 (before register allocation). Thus, if you need a value to be live across basic
1d019706d866 LLVM10 anatofuz parents: diff changeset	470 block boundaries, it must live in a virtual register.
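
The deletion of identity moves described above can be sketched as follows,
assuming the allocator has already chosen a physical register for each virtual
register. ``ToyCopy`` and ``rewriteAndPrune`` are hypothetical names, and the
sketch handles only copy instructions; real coalescing also has to reason
about interference and live ranges.

.. code-block:: c++

  #include <map>
  #include <string>
  #include <vector>

  // Hypothetical copy instruction: 'Dst = mov Src'.
  struct ToyCopy {
    std::string Dst;
    std::string Src;
  };

  // Rewrites virtual registers using Assignment, then drops identity moves.
  inline std::vector<ToyCopy>
  rewriteAndPrune(const std::vector<ToyCopy> &Copies,
                  const std::map<std::string, std::string> &Assignment) {
    auto Lookup = [&](const std::string &R) {
      auto It = Assignment.find(R);
      // Registers without an entry (physical registers) map to themselves.
      return It == Assignment.end() ? R : It->second;
    };
    std::vector<ToyCopy> Result;
    for (const ToyCopy &C : Copies) {
      ToyCopy Rewritten{Lookup(C.Dst), Lookup(C.Src)};
      if (Rewritten.Dst != Rewritten.Src)
        Result.push_back(Rewritten);
    }
    return Result;
  }

If ``%reg1024`` and ``%reg1026`` are assigned to ``EAX`` and ``%reg1027`` to
``EDX``, all four ``mov`` copies in the selected code above become identity
moves and are removed.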
1d019706d866 LLVM10 anatofuz parents: diff changeset	471
1d019706d866 LLVM10 anatofuz parents: diff changeset	472 Call-clobbered registers
1d019706d866 LLVM10 anatofuz parents: diff changeset	473 ^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	474
1d019706d866 LLVM10 anatofuz parents: diff changeset	475 Some machine instructions, like calls, clobber a large number of physical
1d019706d866 LLVM10 anatofuz parents: diff changeset	476 registers. Rather than adding ``<def,dead>`` operands for all of them, it is
1d019706d866 LLVM10 anatofuz parents: diff changeset	477 possible to use an ``MO_RegisterMask`` operand instead. The register mask
1d019706d866 LLVM10 anatofuz parents: diff changeset	478 operand holds a bit mask of preserved registers, and everything else is
1d019706d866 LLVM10 anatofuz parents: diff changeset	479 considered to be clobbered by the instruction.
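
A minimal sketch of the idea, assuming physical registers are numbered from
zero: a set bit marks a preserved register, and a clear bit marks a clobbered
one. ``ToyRegMask`` and its members are hypothetical and only mirror the
description above; they are not the operand class used by LLVM.

.. code-block:: c++

  #include <cstdint>
  #include <vector>

  // Hypothetical register mask: bit N set means physical register N is
  // preserved across the instruction; a clear bit means it is clobbered.
  struct ToyRegMask {
    std::vector<uint32_t> Words;

    explicit ToyRegMask(unsigned NumRegs) : Words((NumRegs + 31) / 32, 0) {}

    void setPreserved(unsigned Reg) { Words[Reg / 32] |= 1u << (Reg % 32); }

    bool clobbers(unsigned Reg) const {
      return !(Words[Reg / 32] & (1u << (Reg % 32)));
    }
  };

Because the mask lists the preserved registers, a call that saves only a
handful of registers needs a few set bits rather than one operand per
clobbered register.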
1d019706d866 LLVM10 anatofuz parents: diff changeset	480
1d019706d866 LLVM10 anatofuz parents: diff changeset	481 Machine code in SSA form
1d019706d866 LLVM10 anatofuz parents: diff changeset	482 ^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	483
``MachineInstr`` instances are initially selected in SSA form, and are
maintained in SSA form until register allocation happens. For the most part,
this is trivially simple since LLVM is already in SSA form; LLVM PHI nodes become
machine code PHI nodes, and virtual registers are only allowed to have a single
definition.
1d019706d866 LLVM10 anatofuz parents: diff changeset	489
1d019706d866 LLVM10 anatofuz parents: diff changeset	490 After register allocation, machine code is no longer in SSA-form because there
1d019706d866 LLVM10 anatofuz parents: diff changeset	491 are no virtual registers left in the code.
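
The single-definition rule can be checked with a few lines of code. The sketch
below reduces each instruction to the virtual registers it defines; ``ToyDefs``
and ``isSSA`` are hypothetical names, and a real check would also have to
distinguish virtual from physical registers.

.. code-block:: c++

  #include <map>
  #include <string>
  #include <vector>

  // Hypothetical instruction reduced to the virtual registers it defines.
  struct ToyDefs {
    std::vector<std::string> DefinedVRegs;
  };

  // Returns true if no virtual register is defined more than once.
  inline bool isSSA(const std::vector<ToyDefs> &Instrs) {
    std::map<std::string, unsigned> DefCount;
    for (const ToyDefs &I : Instrs)
      for (const std::string &R : I.DefinedVRegs)
        if (++DefCount[R] > 1)
          return false;
    return true;
  }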
1d019706d866 LLVM10 anatofuz parents: diff changeset	492
1d019706d866 LLVM10 anatofuz parents: diff changeset	493 .. _MachineBasicBlock:
1d019706d866 LLVM10 anatofuz parents: diff changeset	494
1d019706d866 LLVM10 anatofuz parents: diff changeset	495 The ``MachineBasicBlock`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	496 -------------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	497
1d019706d866 LLVM10 anatofuz parents: diff changeset	498 The ``MachineBasicBlock`` class contains a list of machine instructions
1d019706d866 LLVM10 anatofuz parents: diff changeset	499 (:raw-html:`<tt>` `MachineInstr`_ :raw-html:`</tt>` instances). It roughly
1d019706d866 LLVM10 anatofuz parents: diff changeset	500 corresponds to the LLVM code input to the instruction selector, but there can be
1d019706d866 LLVM10 anatofuz parents: diff changeset	501 a one-to-many mapping (i.e. one LLVM basic block can map to multiple machine
1d019706d866 LLVM10 anatofuz parents: diff changeset	502 basic blocks). The ``MachineBasicBlock`` class has a "``getBasicBlock``" method,
1d019706d866 LLVM10 anatofuz parents: diff changeset	503 which returns the LLVM basic block that it comes from.
1d019706d866 LLVM10 anatofuz parents: diff changeset	504
1d019706d866 LLVM10 anatofuz parents: diff changeset	505 .. _MachineFunction:
1d019706d866 LLVM10 anatofuz parents: diff changeset	506
1d019706d866 LLVM10 anatofuz parents: diff changeset	507 The ``MachineFunction`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	508 -----------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	509
1d019706d866 LLVM10 anatofuz parents: diff changeset	510 The ``MachineFunction`` class contains a list of machine basic blocks
1d019706d866 LLVM10 anatofuz parents: diff changeset	511 (:raw-html:`<tt>` `MachineBasicBlock`_ :raw-html:`</tt>` instances). It
1d019706d866 LLVM10 anatofuz parents: diff changeset	512 corresponds one-to-one with the LLVM function input to the instruction selector.
In addition to a list of basic blocks, the ``MachineFunction`` contains a
1d019706d866 LLVM10 anatofuz parents: diff changeset	514 ``MachineConstantPool``, a ``MachineFrameInfo``, a ``MachineFunctionInfo``, and
1d019706d866 LLVM10 anatofuz parents: diff changeset	515 a ``MachineRegisterInfo``. See ``include/llvm/CodeGen/MachineFunction.h`` for
1d019706d866 LLVM10 anatofuz parents: diff changeset	516 more information.
1d019706d866 LLVM10 anatofuz parents: diff changeset	517
1d019706d866 LLVM10 anatofuz parents: diff changeset	518 ``MachineInstr Bundles``
1d019706d866 LLVM10 anatofuz parents: diff changeset	519 ------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	520
The LLVM code generator can model sequences of instructions as MachineInstr
bundles. An MI bundle can model a VLIW group / pack which contains an arbitrary
number of parallel instructions. It can also be used to model a sequential list
of instructions (potentially with data dependencies) that cannot be legally
separated (e.g. ARM Thumb2 IT blocks).
1d019706d866 LLVM10 anatofuz parents: diff changeset	526
Conceptually, an MI bundle is an MI with a number of other MIs nested within:
1d019706d866 LLVM10 anatofuz parents: diff changeset	528
1d019706d866 LLVM10 anatofuz parents: diff changeset	529 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	530
  --------------
  |   Bundle   | ---------
  --------------          \
         |           ----------------
         |           |      MI      |
         |           ----------------
         |                   |
         |           ----------------
         |           |      MI      |
         |           ----------------
         |                   |
         |           ----------------
         |           |      MI      |
         |           ----------------
         |
  --------------
  |   Bundle   | --------
  --------------         \
         |           ----------------
         |           |      MI      |
         |           ----------------
         |                   |
         |           ----------------
         |           |      MI      |
         |           ----------------
         |                   |
         |                  ...
         |
  --------------
  |   Bundle   | --------
  --------------         \
         |
        ...
1d019706d866 LLVM10 anatofuz parents: diff changeset	564
MI bundle support does not change the physical representations of
MachineBasicBlock and MachineInstr. All the MIs (including top level and nested
ones) are stored as a sequential list of MIs. The "bundled" MIs are marked with
the 'InsideBundle' flag. A top level MI with the special BUNDLE opcode is used
to represent the start of a bundle. It's legal to mix BUNDLE MIs with individual
MIs that are neither inside bundles nor represent bundles.
1d019706d866 LLVM10 anatofuz parents: diff changeset	571
MachineInstr passes should operate on an MI bundle as a single unit. Member
methods have been taught to correctly handle bundles and MIs inside bundles.
The MachineBasicBlock iterator has been modified to skip over bundled MIs to
enforce the bundle-as-a-single-unit concept. An alternative iterator,
instr_iterator, has been added to MachineBasicBlock to allow passes to iterate
over all of the MIs in a MachineBasicBlock, including those which are nested
inside bundles. The top level BUNDLE instruction must have the correct set of
register MachineOperands that represent the cumulative inputs and outputs of
the bundled MIs.
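
The flat storage scheme and the two styles of iteration can be sketched with a
toy model. ``ToyMI``, ``topLevel``, and ``allInstrs`` are hypothetical names
chosen for this example; they only imitate the behaviour described above and
are not the real iterator classes.

.. code-block:: c++

  #include <string>
  #include <vector>

  // Hypothetical flat instruction record: bundled instructions carry a flag,
  // and a BUNDLE header marks the start of each bundle.
  struct ToyMI {
    std::string Name;
    bool InsideBundle;
  };

  // Analogue of the default iterator: visits only top-level instructions.
  inline std::vector<std::string> topLevel(const std::vector<ToyMI> &Block) {
    std::vector<std::string> Names;
    for (const ToyMI &MI : Block)
      if (!MI.InsideBundle)
        Names.push_back(MI.Name);
    return Names;
  }

  // Analogue of instr_iterator: visits every instruction, nested or not.
  inline std::vector<std::string> allInstrs(const std::vector<ToyMI> &Block) {
    std::vector<std::string> Names;
    for (const ToyMI &MI : Block)
      Names.push_back(MI.Name);
    return Names;
  }

For a block holding a ``BUNDLE`` header, two bundled instructions, and a
trailing unbundled instruction, ``topLevel`` yields two entries while
``allInstrs`` yields all four.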
1d019706d866 LLVM10 anatofuz parents: diff changeset	581
Packing / bundling of MachineInstrs for VLIW architectures should
generally be done as part of the register allocation super-pass. More
specifically, the pass which determines what MIs should be bundled
together should run after the code generator exits SSA form
(i.e. after two-address pass, PHI elimination, and copy coalescing).
Such bundles should be finalized (i.e. adding BUNDLE MIs and input and
output register MachineOperands) after virtual registers have been
rewritten into physical registers. This eliminates the need to add
virtual register operands to BUNDLE instructions, which would
effectively double the virtual register def and use lists. Bundles may
use virtual registers and be formed in SSA form, but this may not be
appropriate for all use cases.
1d019706d866 LLVM10 anatofuz parents: diff changeset	594
1d019706d866 LLVM10 anatofuz parents: diff changeset	595 .. _MC Layer:
1d019706d866 LLVM10 anatofuz parents: diff changeset	596
1d019706d866 LLVM10 anatofuz parents: diff changeset	597 The "MC" Layer
1d019706d866 LLVM10 anatofuz parents: diff changeset	598 ==============
1d019706d866 LLVM10 anatofuz parents: diff changeset	599
1d019706d866 LLVM10 anatofuz parents: diff changeset	600 The MC Layer is used to represent and process code at the raw machine code
1d019706d866 LLVM10 anatofuz parents: diff changeset	601 level, devoid of "high level" information like "constant pools", "jump tables",
1d019706d866 LLVM10 anatofuz parents: diff changeset	602 "global variables" or anything like that. At this level, LLVM handles things
1d019706d866 LLVM10 anatofuz parents: diff changeset	603 like label names, machine instructions, and sections in the object file. The
1d019706d866 LLVM10 anatofuz parents: diff changeset	604 code in this layer is used for a number of important purposes: the tail end of
1d019706d866 LLVM10 anatofuz parents: diff changeset	605 the code generator uses it to write a .s or .o file, and it is also used by the
1d019706d866 LLVM10 anatofuz parents: diff changeset	606 llvm-mc tool to implement standalone machine code assemblers and disassemblers.
1d019706d866 LLVM10 anatofuz parents: diff changeset	607
This section describes some of the important classes. There are also a number
of important subsystems that interact at this layer; they are described later in
this manual.
1d019706d866 LLVM10 anatofuz parents: diff changeset	611
1d019706d866 LLVM10 anatofuz parents: diff changeset	612 .. _MCStreamer:
1d019706d866 LLVM10 anatofuz parents: diff changeset	613
1d019706d866 LLVM10 anatofuz parents: diff changeset	614 The ``MCStreamer`` API
1d019706d866 LLVM10 anatofuz parents: diff changeset	615 ----------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	616
MCStreamer is best thought of as an assembler API. It is an abstract API which
is implemented in different ways (e.g. to output a .s file, output an ELF .o
file, etc) but whose API corresponds directly to what you see in a .s file.
MCStreamer has one method per directive, such as EmitLabel, EmitSymbolAttribute,
SwitchSection, EmitValue (for .byte, .word), etc, which directly correspond to
assembly level directives. It also has an EmitInstruction method, which is used
to output an MCInst to the streamer.
1d019706d866 LLVM10 anatofuz parents: diff changeset	624
This API is most important for two clients. The llvm-mc stand-alone assembler is
effectively a parser that parses a line, then invokes a method on MCStreamer. In
the code generator, the `Code Emission`_ phase lowers higher level LLVM IR and
Machine* constructs down to the MC layer, emitting directives through
MCStreamer.
1d019706d866 LLVM10 anatofuz parents: diff changeset	630
1d019706d866 LLVM10 anatofuz parents: diff changeset	631 On the implementation side of MCStreamer, there are two major implementations:
1d019706d866 LLVM10 anatofuz parents: diff changeset	632 one for writing out a .s file (MCAsmStreamer), and one for writing out a .o
1d019706d866 LLVM10 anatofuz parents: diff changeset	633 file (MCObjectStreamer). MCAsmStreamer is a straightforward implementation
1d019706d866 LLVM10 anatofuz parents: diff changeset	634 that prints out a directive for each method (e.g. ``EmitValue -> .byte``), but
1d019706d866 LLVM10 anatofuz parents: diff changeset	635 MCObjectStreamer implements a full assembler.
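
The split between an abstract streamer and a textual implementation can be
sketched as follows. ``ToyStreamer``, ``ToyAsmStreamer``, ``emitByte``, and
``emitAnswer`` are hypothetical names for this example; the sketch is a
miniature of the design, not the real ``MCStreamer`` interface.

.. code-block:: c++

  #include <cstdint>
  #include <ostream>
  #include <sstream>
  #include <string>

  // Hypothetical miniature of the streamer idea: one virtual method per
  // directive.
  class ToyStreamer {
  public:
    virtual ~ToyStreamer() = default;
    virtual void emitLabel(const std::string &Name) = 0;
    virtual void emitByte(uint8_t Value) = 0;
  };

  // Textual implementation: prints one assembly line per call.
  class ToyAsmStreamer : public ToyStreamer {
    std::ostream &OS;

  public:
    explicit ToyAsmStreamer(std::ostream &OS) : OS(OS) {}
    void emitLabel(const std::string &Name) override { OS << Name << ":\n"; }
    void emitByte(uint8_t Value) override {
      OS << "\t.byte " << static_cast<unsigned>(Value) << "\n";
    }
  };

  // A client, such as a parser or code emitter, sees only the abstract API.
  inline void emitAnswer(ToyStreamer &S) {
    S.emitLabel("answer");
    S.emitByte(42);
  }

An object-file implementation would derive from the same base class and encode
bytes into a section instead of printing them, leaving ``emitAnswer``
unchanged.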
1d019706d866 LLVM10 anatofuz parents: diff changeset	636
1d019706d866 LLVM10 anatofuz parents: diff changeset	637 For target specific directives, the MCStreamer has a MCTargetStreamer instance.
1d019706d866 LLVM10 anatofuz parents: diff changeset	638 Each target that needs it defines a class that inherits from it and is a lot
1d019706d866 LLVM10 anatofuz parents: diff changeset	639 like MCStreamer itself: It has one method per directive and two classes that
1d019706d866 LLVM10 anatofuz parents: diff changeset	640 inherit from it, a target object streamer and a target asm streamer. The target
1d019706d866 LLVM10 anatofuz parents: diff changeset	641 asm streamer just prints it (``emitFnStart -> .fnstart``), and the object
1d019706d866 LLVM10 anatofuz parents: diff changeset	642 streamer implements the assembler logic for it.
1d019706d866 LLVM10 anatofuz parents: diff changeset	643
1d019706d866 LLVM10 anatofuz parents: diff changeset	644 To make LLVM use these classes, the target initialization must call
1d019706d866 LLVM10 anatofuz parents: diff changeset	645 TargetRegistry::RegisterAsmStreamer and TargetRegistry::RegisterMCObjectStreamer
1d019706d866 LLVM10 anatofuz parents: diff changeset	646 passing callbacks that allocate the corresponding target streamer and pass it
1d019706d866 LLVM10 anatofuz parents: diff changeset	647 to createAsmStreamer or to the appropriate object streamer constructor.
1d019706d866 LLVM10 anatofuz parents: diff changeset	648
1d019706d866 LLVM10 anatofuz parents: diff changeset	649 The ``MCContext`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	650 -----------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	651
1d019706d866 LLVM10 anatofuz parents: diff changeset	652 The MCContext class is the owner of a variety of uniqued data structures at the
1d019706d866 LLVM10 anatofuz parents: diff changeset	653 MC layer, including symbols, sections, etc. As such, this is the class that you
1d019706d866 LLVM10 anatofuz parents: diff changeset	654 interact with to create symbols and sections. This class cannot be subclassed.
1d019706d866 LLVM10 anatofuz parents: diff changeset	655
1d019706d866 LLVM10 anatofuz parents: diff changeset	656 The ``MCSymbol`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	657 ----------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	658
1d019706d866 LLVM10 anatofuz parents: diff changeset	659 The MCSymbol class represents a symbol (aka label) in the assembly file. There
1d019706d866 LLVM10 anatofuz parents: diff changeset	660 are two interesting kinds of symbols: assembler temporary symbols, and normal
1d019706d866 LLVM10 anatofuz parents: diff changeset	661 symbols. Assembler temporary symbols are used and processed by the assembler
1d019706d866 LLVM10 anatofuz parents: diff changeset	662 but are discarded when the object file is produced. The distinction is usually
1d019706d866 LLVM10 anatofuz parents: diff changeset	663 represented by adding a prefix to the label, for example "L" labels are
1d019706d866 LLVM10 anatofuz parents: diff changeset	664 assembler temporary labels in MachO.
1d019706d866 LLVM10 anatofuz parents: diff changeset	665
1d019706d866 LLVM10 anatofuz parents: diff changeset	666 MCSymbols are created by MCContext and uniqued there. This means that MCSymbols
1d019706d866 LLVM10 anatofuz parents: diff changeset	667 can be compared for pointer equivalence to find out if they are the same symbol.
1d019706d866 LLVM10 anatofuz parents: diff changeset	668 Note that pointer inequality does not guarantee the labels will end up at
1d019706d866 LLVM10 anatofuz parents: diff changeset	669 different addresses though. It's perfectly legal to output something like this
1d019706d866 LLVM10 anatofuz parents: diff changeset	670 to the .s file:
1d019706d866 LLVM10 anatofuz parents: diff changeset	671
1d019706d866 LLVM10 anatofuz parents: diff changeset	672 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	673
1d019706d866 LLVM10 anatofuz parents: diff changeset	674 foo:
1d019706d866 LLVM10 anatofuz parents: diff changeset	675 bar:
1d019706d866 LLVM10 anatofuz parents: diff changeset	676 .byte 4
1d019706d866 LLVM10 anatofuz parents: diff changeset	677
1d019706d866 LLVM10 anatofuz parents: diff changeset	678 In this case, both the foo and bar symbols will have the same address.
1d019706d866 LLVM10 anatofuz parents: diff changeset	679
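The uniquing behavior and the ``foo``/``bar`` case above can be sketched with a
toy model. ``ToySymbol``, ``ToyContext``, ``getOrCreateSymbol`` and
``labelsShareAddress`` are hypothetical names chosen for this illustration; the
real classes carry far more state.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Toy symbol: a name plus the address assigned once layout is known.
struct ToySymbol {
  std::string Name;
  std::uint64_t Address = 0;
};

// Toy context that owns and uniques symbols by name.
class ToyContext {
  std::map<std::string, std::unique_ptr<ToySymbol>> Symbols;

public:
  // Returns the single symbol object for Name, creating it on first use.
  ToySymbol *getOrCreateSymbol(const std::string &Name) {
    std::unique_ptr<ToySymbol> &Slot = Symbols[Name];
    if (!Slot) {
      Slot.reset(new ToySymbol());
      Slot->Name = Name;
    }
    return Slot.get();
  }
};

// Mirrors the foo/bar listing: both labels are bound at the same offset,
// so the two distinct symbol objects end up with equal addresses.
inline bool labelsShareAddress(ToyContext &Ctx) {
  ToySymbol *Foo = Ctx.getOrCreateSymbol("foo");
  ToySymbol *Bar = Ctx.getOrCreateSymbol("bar");
  const std::uint64_t Offset = 0; // position of the following ".byte 4"
  Foo->Address = Offset;          // foo:
  Bar->Address = Offset;          // bar:
  return Foo != Bar && Foo->Address == Bar->Address;
}
```

Asking the context for the same name twice returns the same pointer, which is
what makes pointer comparison a valid identity test.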
1d019706d866 LLVM10 anatofuz parents: diff changeset	680 The ``MCSection`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	681 -----------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	682
1d019706d866 LLVM10 anatofuz parents: diff changeset	683 The ``MCSection`` class represents an object-file specific section. It is
1d019706d866 LLVM10 anatofuz parents: diff changeset	684 subclassed by object file specific implementations (e.g. ``MCSectionMachO``,
1d019706d866 LLVM10 anatofuz parents: diff changeset	685 ``MCSectionCOFF``, ``MCSectionELF``) and these are created and uniqued by
1d019706d866 LLVM10 anatofuz parents: diff changeset	686 MCContext. The MCStreamer has a notion of the current section, which can be
1d019706d866 LLVM10 anatofuz parents: diff changeset	687 changed with the SwitchSection method (which corresponds to a ".section"
1d019706d866 LLVM10 anatofuz parents: diff changeset	688 directive in a .s file).
1d019706d866 LLVM10 anatofuz parents: diff changeset	689
1d019706d866 LLVM10 anatofuz parents: diff changeset	690 .. _MCInst:
1d019706d866 LLVM10 anatofuz parents: diff changeset	691
1d019706d866 LLVM10 anatofuz parents: diff changeset	692 The ``MCInst`` class
1d019706d866 LLVM10 anatofuz parents: diff changeset	693 --------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	694
1d019706d866 LLVM10 anatofuz parents: diff changeset	695 The ``MCInst`` class is a target-independent representation of an instruction.
1d019706d866 LLVM10 anatofuz parents: diff changeset	696 It is a simple class (much more so than `MachineInstr`_) that holds a
1d019706d866 LLVM10 anatofuz parents: diff changeset	697 target-specific opcode and a vector of MCOperands. MCOperand, in turn, is a
1d019706d866 LLVM10 anatofuz parents: diff changeset	698 simple discriminated union of three cases: 1) a simple immediate, 2) a target
1d019706d866 LLVM10 anatofuz parents: diff changeset	699 register ID, 3) a symbolic expression (e.g. "``Lfoo-Lbar+42``") as an MCExpr.
1d019706d866 LLVM10 anatofuz parents: diff changeset	700
1d019706d866 LLVM10 anatofuz parents: diff changeset	701 MCInst is the common currency used to represent machine instructions at the MC
1d019706d866 LLVM10 anatofuz parents: diff changeset	702 layer. It is the type used by the instruction encoder, the instruction printer,
1d019706d866 LLVM10 anatofuz parents: diff changeset	703 and the type generated by the assembly parser and disassembler.
1d019706d866 LLVM10 anatofuz parents: diff changeset	704
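The shape of an instruction at this layer can be sketched as an opcode plus a
list of tagged operands. ``ToyOperand``, ``ToyInst`` and ``makeExampleInst``
are hypothetical names, the symbolic expression is stored as a plain string
rather than an expression tree, and the opcode and register numbers are
made-up values.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy operand: a tagged value with the three cases named in the text.
// The symbolic expression is kept as a plain string here for brevity.
class ToyOperand {
public:
  enum Kind { Immediate, Register, Expression };

  static ToyOperand createImm(std::int64_t V) {
    ToyOperand Op(Immediate);
    Op.Imm = V;
    return Op;
  }
  static ToyOperand createReg(unsigned R) {
    ToyOperand Op(Register);
    Op.Reg = R;
    return Op;
  }
  static ToyOperand createExpr(const std::string &E) {
    ToyOperand Op(Expression);
    Op.Expr = E;
    return Op;
  }

  Kind getKind() const { return K; }
  std::int64_t getImm() const { assert(K == Immediate); return Imm; }
  unsigned getReg() const { assert(K == Register); return Reg; }
  const std::string &getExpr() const { assert(K == Expression); return Expr; }

private:
  explicit ToyOperand(Kind K) : K(K) {}
  Kind K;
  std::int64_t Imm = 0;
  unsigned Reg = 0;
  std::string Expr;
};

// Toy instruction: a target-specific opcode number plus its operands.
struct ToyInst {
  unsigned Opcode = 0;
  std::vector<ToyOperand> Operands;
};

// Builds "opcode 7, register 3, Lfoo-Lbar+42"; the numbers are made up and
// have no meaning on any real target.
inline ToyInst makeExampleInst() {
  ToyInst I;
  I.Opcode = 7;
  I.Operands.push_back(ToyOperand::createReg(3));
  I.Operands.push_back(ToyOperand::createExpr("Lfoo-Lbar+42"));
  return I;
}
```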
1d019706d866 LLVM10 anatofuz parents: diff changeset	705 .. _Target-independent algorithms:
1d019706d866 LLVM10 anatofuz parents: diff changeset	706 .. _code generation algorithm:
1d019706d866 LLVM10 anatofuz parents: diff changeset	707
1d019706d866 LLVM10 anatofuz parents: diff changeset	708 Target-independent code generation algorithms
1d019706d866 LLVM10 anatofuz parents: diff changeset	709 =============================================
1d019706d866 LLVM10 anatofuz parents: diff changeset	710
1d019706d866 LLVM10 anatofuz parents: diff changeset	711 This section documents the phases described in the `high-level design of the
1d019706d866 LLVM10 anatofuz parents: diff changeset	712 code generator`_. It explains how they work and some of the rationale behind
1d019706d866 LLVM10 anatofuz parents: diff changeset	713 their design.
1d019706d866 LLVM10 anatofuz parents: diff changeset	714
1d019706d866 LLVM10 anatofuz parents: diff changeset	715 .. _Instruction Selection:
1d019706d866 LLVM10 anatofuz parents: diff changeset	716 .. _instruction selection section:
1d019706d866 LLVM10 anatofuz parents: diff changeset	717
1d019706d866 LLVM10 anatofuz parents: diff changeset	718 Instruction Selection
1d019706d866 LLVM10 anatofuz parents: diff changeset	719 ---------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	720
1d019706d866 LLVM10 anatofuz parents: diff changeset	721 Instruction Selection is the process of translating LLVM code presented to the
1d019706d866 LLVM10 anatofuz parents: diff changeset	722 code generator into target-specific machine instructions. There are several
1d019706d866 LLVM10 anatofuz parents: diff changeset	723 well-known ways to do this in the literature. LLVM uses a SelectionDAG based
1d019706d866 LLVM10 anatofuz parents: diff changeset	724 instruction selector.
1d019706d866 LLVM10 anatofuz parents: diff changeset	725
1d019706d866 LLVM10 anatofuz parents: diff changeset	726 Portions of the DAG instruction selector are generated from the target
1d019706d866 LLVM10 anatofuz parents: diff changeset	727 description (``*.td``) files. Our goal is for the entire instruction selector
1d019706d866 LLVM10 anatofuz parents: diff changeset	728 to be generated from these ``.td`` files, though currently there are still
1d019706d866 LLVM10 anatofuz parents: diff changeset	729 things that require custom C++ code.
1d019706d866 LLVM10 anatofuz parents: diff changeset	730
223 5f17cb93ff66 LLVM13 (2021/7/18) Shinji KONO <kono@ie.u-ryukyu.ac.jp> parents: 221 diff changeset	731 `GlobalISel <https://llvm.org/docs/GlobalISel/index.html>`_ is another
5f17cb93ff66 LLVM13 (2021/7/18) Shinji KONO <kono@ie.u-ryukyu.ac.jp> parents: 221 diff changeset	732 instruction selection framework.
5f17cb93ff66 LLVM13 (2021/7/18) Shinji KONO <kono@ie.u-ryukyu.ac.jp> parents: 221 diff changeset	733
150 1d019706d866 LLVM10 anatofuz parents: diff changeset	734 .. _SelectionDAG:
1d019706d866 LLVM10 anatofuz parents: diff changeset	735
1d019706d866 LLVM10 anatofuz parents: diff changeset	736 Introduction to SelectionDAGs
1d019706d866 LLVM10 anatofuz parents: diff changeset	737 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	738
1d019706d866 LLVM10 anatofuz parents: diff changeset	739 The SelectionDAG provides an abstraction for code representation in a way that
1d019706d866 LLVM10 anatofuz parents: diff changeset	740 is amenable to instruction selection using automatic techniques
1d019706d866 LLVM10 anatofuz parents: diff changeset	741 (e.g. dynamic-programming based optimal pattern matching selectors). It is also
1d019706d866 LLVM10 anatofuz parents: diff changeset	742 well-suited to other phases of code generation; in particular, instruction
1d019706d866 LLVM10 anatofuz parents: diff changeset	743 scheduling (SelectionDAGs are very close to scheduling DAGs post-selection).
1d019706d866 LLVM10 anatofuz parents: diff changeset	744 Additionally, the SelectionDAG provides a host representation where a large
1d019706d866 LLVM10 anatofuz parents: diff changeset	745 variety of very-low-level (but target-independent) `optimizations`_ may be
1d019706d866 LLVM10 anatofuz parents: diff changeset	746 performed; ones which require extensive information about the instructions
1d019706d866 LLVM10 anatofuz parents: diff changeset	747 efficiently supported by the target.
1d019706d866 LLVM10 anatofuz parents: diff changeset	748
1d019706d866 LLVM10 anatofuz parents: diff changeset	749 The SelectionDAG is a Directed-Acyclic-Graph whose nodes are instances of the
1d019706d866 LLVM10 anatofuz parents: diff changeset	750 ``SDNode`` class. The primary payload of the ``SDNode`` is its operation code
1d019706d866 LLVM10 anatofuz parents: diff changeset	751 (Opcode) that indicates what operation the node performs and the operands to the
1d019706d866 LLVM10 anatofuz parents: diff changeset	752 operation. The various operation node types are described at the top of the
1d019706d866 LLVM10 anatofuz parents: diff changeset	753 ``include/llvm/CodeGen/ISDOpcodes.h`` file.
1d019706d866 LLVM10 anatofuz parents: diff changeset	754
1d019706d866 LLVM10 anatofuz parents: diff changeset	755 Although most operations define a single value, each node in the graph may
1d019706d866 LLVM10 anatofuz parents: diff changeset	756 define multiple values. For example, a combined div/rem operation will define
1d019706d866 LLVM10 anatofuz parents: diff changeset	757 both the quotient and the remainder. Many other situations require multiple
1d019706d866 LLVM10 anatofuz parents: diff changeset	758 values as well. Each node also has some number of operands, which are edges to
1d019706d866 LLVM10 anatofuz parents: diff changeset	759 the node defining the used value. Because nodes may define multiple values,
1d019706d866 LLVM10 anatofuz parents: diff changeset	760 edges are represented by instances of the ``SDValue`` class, which is a
1d019706d866 LLVM10 anatofuz parents: diff changeset	761 ``<SDNode, unsigned>`` pair, indicating the node and result value being used,
1d019706d866 LLVM10 anatofuz parents: diff changeset	762 respectively. Each value produced by an ``SDNode`` has an associated ``MVT``
1d019706d866 LLVM10 anatofuz parents: diff changeset	763 (Machine Value Type) indicating what the type of the value is.
1d019706d866 LLVM10 anatofuz parents: diff changeset	764
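The node/edge relationship can be sketched with two small structs. ``ToyNode``,
``ToyValue``, ``typeOf`` and ``demoRemainderType`` are hypothetical names, the
types are plain strings, and the convention that a combined div/rem node
returns the quotient as result 0 and the remainder as result 1 is an assumption
of this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct ToyNode;

// Toy edge: which node is used, and which of its results.
struct ToyValue {
  const ToyNode *Node;
  unsigned ResNo;
};

// Toy node: an opcode, one type per produced value, and operand edges.
struct ToyNode {
  std::string Opcode;
  std::vector<std::string> ValueTypes;
  std::vector<ToyValue> Operands;
};

// Type of the value that an edge refers to.
inline const std::string &typeOf(const ToyValue &V) {
  return V.Node->ValueTypes.at(V.ResNo);
}

// Builds a combined div/rem node and a user that consumes only result 1
// (the remainder in this sketch), then reports the type of that edge.
inline std::string demoRemainderType() {
  ToyNode A{"CopyFromReg", {"i32"}, {}};
  ToyNode B{"CopyFromReg", {"i32"}, {}};
  ToyNode DivRem{"divrem", {"i32", "i32"}, {{&A, 0}, {&B, 0}}};
  ToyNode User{"add", {"i32"}, {{&DivRem, 1}, {&A, 0}}};
  return typeOf(User.Operands[0]);
}
```

Because an edge names a result index as well as a node, two users can consume
different results of the same node without ambiguity.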
1d019706d866 LLVM10 anatofuz parents: diff changeset	765 SelectionDAGs contain two different kinds of values: those that represent data
1d019706d866 LLVM10 anatofuz parents: diff changeset	766 flow and those that represent control flow dependencies. Data values are simple
1d019706d866 LLVM10 anatofuz parents: diff changeset	767 edges with an integer or floating point value type. Control edges are
1d019706d866 LLVM10 anatofuz parents: diff changeset	768 represented as "chain" edges which are of type ``MVT::Other``. These edges
1d019706d866 LLVM10 anatofuz parents: diff changeset	769 provide an ordering between nodes that have side effects (such as loads, stores,
1d019706d866 LLVM10 anatofuz parents: diff changeset	770 calls, returns, etc). All nodes that have side effects should take a token
1d019706d866 LLVM10 anatofuz parents: diff changeset	771 chain as input and produce a new one as output. By convention, token chain
1d019706d866 LLVM10 anatofuz parents: diff changeset	772 inputs are always operand #0, and chain results are always the last value
1d019706d866 LLVM10 anatofuz parents: diff changeset	773 produced by an operation. However, after instruction selection, the
1d019706d866 LLVM10 anatofuz parents: diff changeset	774 machine nodes have their chain after the instruction's operands, and
1d019706d866 LLVM10 anatofuz parents: diff changeset	775 may be followed by glue nodes.
1d019706d866 LLVM10 anatofuz parents: diff changeset	776
1d019706d866 LLVM10 anatofuz parents: diff changeset	777 A SelectionDAG has designated "Entry" and "Root" nodes. The Entry node is
1d019706d866 LLVM10 anatofuz parents: diff changeset	778 always a marker node with an Opcode of ``ISD::EntryToken``. The Root node is
1d019706d866 LLVM10 anatofuz parents: diff changeset	779 the final side-effecting node in the token chain. For example, in a single basic
1d019706d866 LLVM10 anatofuz parents: diff changeset	780 block function it would be the return node.
1d019706d866 LLVM10 anatofuz parents: diff changeset	781
1d019706d866 LLVM10 anatofuz parents: diff changeset	782 One important concept for SelectionDAGs is the notion of a "legal" vs.
1d019706d866 LLVM10 anatofuz parents: diff changeset	783 "illegal" DAG. A legal DAG for a target is one that only uses supported
1d019706d866 LLVM10 anatofuz parents: diff changeset	784 operations and supported types. On a 32-bit PowerPC, for example, a DAG with a
1d019706d866 LLVM10 anatofuz parents: diff changeset	785 value of type i1, i8, i16, or i64 would be illegal, as would a DAG that uses a
1d019706d866 LLVM10 anatofuz parents: diff changeset	786 SREM or UREM operation. The `legalize types`_ and `legalize operations`_ phases
1d019706d866 LLVM10 anatofuz parents: diff changeset	787 are responsible for turning an illegal DAG into a legal DAG.
1d019706d866 LLVM10 anatofuz parents: diff changeset	788
1d019706d866 LLVM10 anatofuz parents: diff changeset	789 .. _SelectionDAG-Process:
1d019706d866 LLVM10 anatofuz parents: diff changeset	790
1d019706d866 LLVM10 anatofuz parents: diff changeset	791 SelectionDAG Instruction Selection Process
1d019706d866 LLVM10 anatofuz parents: diff changeset	792 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	793
1d019706d866 LLVM10 anatofuz parents: diff changeset	794 SelectionDAG-based instruction selection consists of the following steps:
1d019706d866 LLVM10 anatofuz parents: diff changeset	795
1d019706d866 LLVM10 anatofuz parents: diff changeset	796 #. `Build initial DAG`_ --- This stage performs a simple translation from the
1d019706d866 LLVM10 anatofuz parents: diff changeset	797 input LLVM code to an illegal SelectionDAG.
1d019706d866 LLVM10 anatofuz parents: diff changeset	798
1d019706d866 LLVM10 anatofuz parents: diff changeset	799 #. `Optimize SelectionDAG`_ --- This stage performs simple optimizations on the
1d019706d866 LLVM10 anatofuz parents: diff changeset	800 SelectionDAG to simplify it, and recognize meta instructions (like rotates
1d019706d866 LLVM10 anatofuz parents: diff changeset	801 and ``div``/``rem`` pairs) for targets that support these meta operations.
1d019706d866 LLVM10 anatofuz parents: diff changeset	802 This makes the resultant code more efficient and the `select instructions
1d019706d866 LLVM10 anatofuz parents: diff changeset	803 from DAG`_ phase (below) simpler.
1d019706d866 LLVM10 anatofuz parents: diff changeset	804
1d019706d866 LLVM10 anatofuz parents: diff changeset	805 #. `Legalize SelectionDAG Types`_ --- This stage transforms SelectionDAG nodes
1d019706d866 LLVM10 anatofuz parents: diff changeset	806 to eliminate any types that are unsupported on the target.
1d019706d866 LLVM10 anatofuz parents: diff changeset	807
1d019706d866 LLVM10 anatofuz parents: diff changeset	808 #. `Optimize SelectionDAG`_ --- The SelectionDAG optimizer is run to clean up
1d019706d866 LLVM10 anatofuz parents: diff changeset	809 redundancies exposed by type legalization.
1d019706d866 LLVM10 anatofuz parents: diff changeset	810
1d019706d866 LLVM10 anatofuz parents: diff changeset	811 #. `Legalize SelectionDAG Ops`_ --- This stage transforms SelectionDAG nodes to
1d019706d866 LLVM10 anatofuz parents: diff changeset	812 eliminate any operations that are unsupported on the target.
1d019706d866 LLVM10 anatofuz parents: diff changeset	813
1d019706d866 LLVM10 anatofuz parents: diff changeset	814 #. `Optimize SelectionDAG`_ --- The SelectionDAG optimizer is run to eliminate
1d019706d866 LLVM10 anatofuz parents: diff changeset	815 inefficiencies introduced by operation legalization.
1d019706d866 LLVM10 anatofuz parents: diff changeset	816
1d019706d866 LLVM10 anatofuz parents: diff changeset	817 #. `Select instructions from DAG`_ --- Finally, the target instruction selector
1d019706d866 LLVM10 anatofuz parents: diff changeset	818 matches the DAG operations to target instructions. This process translates
1d019706d866 LLVM10 anatofuz parents: diff changeset	819 the target-independent input DAG into another DAG of target instructions.
1d019706d866 LLVM10 anatofuz parents: diff changeset	820
1d019706d866 LLVM10 anatofuz parents: diff changeset	821 #. `SelectionDAG Scheduling and Formation`_ --- The last phase assigns a linear
1d019706d866 LLVM10 anatofuz parents: diff changeset	822 order to the instructions in the target-instruction DAG and emits them into
1d019706d866 LLVM10 anatofuz parents: diff changeset	823 the MachineFunction being compiled. This step uses traditional prepass
1d019706d866 LLVM10 anatofuz parents: diff changeset	824 scheduling techniques.
1d019706d866 LLVM10 anatofuz parents: diff changeset	825
1d019706d866 LLVM10 anatofuz parents: diff changeset	826 After all of these steps are complete, the SelectionDAG is destroyed and the
1d019706d866 LLVM10 anatofuz parents: diff changeset	827 rest of the code generation passes are run.
1d019706d866 LLVM10 anatofuz parents: diff changeset	828
1d019706d866 LLVM10 anatofuz parents: diff changeset	829 One great way to visualize what is going on here is to take advantage of a few
1d019706d866 LLVM10 anatofuz parents: diff changeset	830 LLC command line options. The following options pop up a window displaying the
1d019706d866 LLVM10 anatofuz parents: diff changeset	831 SelectionDAG at specific times (if you only get errors printed to the console
1d019706d866 LLVM10 anatofuz parents: diff changeset	832 while using this, you probably `need to configure your
1d019706d866 LLVM10 anatofuz parents: diff changeset	833 system <ProgrammersManual.html#viewing-graphs-while-debugging-code>`_ to add support for it).
1d019706d866 LLVM10 anatofuz parents: diff changeset	834
1d019706d866 LLVM10 anatofuz parents: diff changeset	835 * ``-view-dag-combine1-dags`` displays the DAG after being built, before the
1d019706d866 LLVM10 anatofuz parents: diff changeset	836 first optimization pass.
1d019706d866 LLVM10 anatofuz parents: diff changeset	837
1d019706d866 LLVM10 anatofuz parents: diff changeset	838 * ``-view-legalize-dags`` displays the DAG before Legalization.
1d019706d866 LLVM10 anatofuz parents: diff changeset	839
1d019706d866 LLVM10 anatofuz parents: diff changeset	840 * ``-view-dag-combine2-dags`` displays the DAG before the second optimization
1d019706d866 LLVM10 anatofuz parents: diff changeset	841 pass.
1d019706d866 LLVM10 anatofuz parents: diff changeset	842
1d019706d866 LLVM10 anatofuz parents: diff changeset	843 * ``-view-isel-dags`` displays the DAG before the Select phase.
1d019706d866 LLVM10 anatofuz parents: diff changeset	844
1d019706d866 LLVM10 anatofuz parents: diff changeset	845 * ``-view-sched-dags`` displays the DAG before Scheduling.
1d019706d866 LLVM10 anatofuz parents: diff changeset	846
1d019706d866 LLVM10 anatofuz parents: diff changeset	847 The ``-view-sunit-dags`` option displays the Scheduler's dependency graph. This graph
1d019706d866 LLVM10 anatofuz parents: diff changeset	848 is based on the final SelectionDAG, with nodes that must be scheduled together
1d019706d866 LLVM10 anatofuz parents: diff changeset	849 bundled into a single scheduling-unit node, and with immediate operands and
1d019706d866 LLVM10 anatofuz parents: diff changeset	850 other nodes that aren't relevant for scheduling omitted.
1d019706d866 LLVM10 anatofuz parents: diff changeset	851
1d019706d866 LLVM10 anatofuz parents: diff changeset	852 The ``-filter-view-dags`` option takes the name of the basic block that you
1d019706d866 LLVM10 anatofuz parents: diff changeset	853 want to visualize and restricts all of the preceding ``view-*-dags`` options
1d019706d866 LLVM10 anatofuz parents: diff changeset	854 to that block.
1d019706d866 LLVM10 anatofuz parents: diff changeset	855
1d019706d866 LLVM10 anatofuz parents: diff changeset	856 .. _Build initial DAG:
1d019706d866 LLVM10 anatofuz parents: diff changeset	857
1d019706d866 LLVM10 anatofuz parents: diff changeset	858 Initial SelectionDAG Construction
1d019706d866 LLVM10 anatofuz parents: diff changeset	859 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	860
1d019706d866 LLVM10 anatofuz parents: diff changeset	861 The initial SelectionDAG is naively peephole expanded from
1d019706d866 LLVM10 anatofuz parents: diff changeset	862 the LLVM input by the ``SelectionDAGBuilder`` class. The intent of this pass
1d019706d866 LLVM10 anatofuz parents: diff changeset	863 is to expose as many low-level, target-specific details to the SelectionDAG as
1d019706d866 LLVM10 anatofuz parents: diff changeset	864 possible. This pass is mostly hard-coded (e.g. an LLVM ``add`` turns into an
1d019706d866 LLVM10 anatofuz parents: diff changeset	865 ``SDNode add`` while a ``getelementptr`` is expanded into the obvious
1d019706d866 LLVM10 anatofuz parents: diff changeset	866 arithmetic). This pass requires target-specific hooks to lower calls, returns,
1d019706d866 LLVM10 anatofuz parents: diff changeset	867 varargs, etc. For these features, the :raw-html:`<tt>` `TargetLowering`_
1d019706d866 LLVM10 anatofuz parents: diff changeset	868 :raw-html:`</tt>` interface is used.
1d019706d866 LLVM10 anatofuz parents: diff changeset	869
1d019706d866 LLVM10 anatofuz parents: diff changeset	870 .. _legalize types:
1d019706d866 LLVM10 anatofuz parents: diff changeset	871 .. _Legalize SelectionDAG Types:
1d019706d866 LLVM10 anatofuz parents: diff changeset	872 .. _Legalize SelectionDAG Ops:
1d019706d866 LLVM10 anatofuz parents: diff changeset	873
1d019706d866 LLVM10 anatofuz parents: diff changeset	874 SelectionDAG LegalizeTypes Phase
1d019706d866 LLVM10 anatofuz parents: diff changeset	875 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	876
1d019706d866 LLVM10 anatofuz parents: diff changeset	877 The LegalizeTypes phase is in charge of converting a DAG to only use the types that
1d019706d866 LLVM10 anatofuz parents: diff changeset	878 are natively supported by the target.
1d019706d866 LLVM10 anatofuz parents: diff changeset	879
1d019706d866 LLVM10 anatofuz parents: diff changeset	880 There are two main ways of converting values of unsupported scalar types to
1d019706d866 LLVM10 anatofuz parents: diff changeset	881 values of supported types: converting small types to larger types ("promoting"),
1d019706d866 LLVM10 anatofuz parents: diff changeset	882 and breaking up large integer types into smaller ones ("expanding"). For
1d019706d866 LLVM10 anatofuz parents: diff changeset	883 example, a target might require that all f32 values are promoted to f64 and that
1d019706d866 LLVM10 anatofuz parents: diff changeset	884 all i1/i8/i16 values are promoted to i32. The same target might require that
1d019706d866 LLVM10 anatofuz parents: diff changeset	885 all i64 values be expanded into pairs of i32 values. These changes can insert
1d019706d866 LLVM10 anatofuz parents: diff changeset	886 sign and zero extensions as needed to make sure that the final code has the same
1d019706d866 LLVM10 anatofuz parents: diff changeset	887 behavior as the input.
1d019706d866 LLVM10 anatofuz parents: diff changeset	888
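As a worked illustration of the two scalar strategies, the sketch below adds
two "expanded" 64-bit integers using only 32-bit operations, and adds two
"promoted" 8-bit integers using a 32-bit operation. The names ``ExpandedI64``,
``split``, ``join``, ``addExpanded`` and ``addPromoted`` are hypothetical; this
models the arithmetic only and is not how the legalizer is implemented.

```cpp
#include <cassert>
#include <cstdint>

// An "expanded" i64: two i32 halves, as a 32-bit target would hold it.
struct ExpandedI64 {
  std::uint32_t Lo;
  std::uint32_t Hi;
};

inline ExpandedI64 split(std::uint64_t V) {
  return {static_cast<std::uint32_t>(V), static_cast<std::uint32_t>(V >> 32)};
}

inline std::uint64_t join(ExpandedI64 V) {
  return (static_cast<std::uint64_t>(V.Hi) << 32) | V.Lo;
}

// i64 add rewritten as two i32 adds; the carry out of the low half feeds the
// high half.
inline ExpandedI64 addExpanded(ExpandedI64 A, ExpandedI64 B) {
  std::uint32_t Lo = A.Lo + B.Lo;
  std::uint32_t Carry = Lo < A.Lo ? 1u : 0u; // unsigned wraparound means carry
  std::uint32_t Hi = A.Hi + B.Hi + Carry;
  return {Lo, Hi};
}

// i8 add "promoted" to i32: sign-extend the inputs, add in 32 bits, then
// truncate so the result matches 8-bit wraparound behavior.
inline std::int8_t addPromoted(std::int8_t A, std::int8_t B) {
  std::int32_t Wide =
      static_cast<std::int32_t>(A) + static_cast<std::int32_t>(B);
  return static_cast<std::int8_t>(static_cast<std::uint8_t>(Wide));
}
```

The final truncation in ``addPromoted`` plays the role of the extensions and
truncations that keep the legalized code equivalent to the input.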
1d019706d866 LLVM10 anatofuz parents: diff changeset	889 There are two main ways of converting values of unsupported vector types to
1d019706d866 LLVM10 anatofuz parents: diff changeset	890 values of supported types: splitting vector types, multiple times if necessary,
1d019706d866 LLVM10 anatofuz parents: diff changeset	891 until a legal type is found, and extending vector types by adding elements to
1d019706d866 LLVM10 anatofuz parents: diff changeset	892 the end to round them out to legal types ("widening"). If a vector gets split
1d019706d866 LLVM10 anatofuz parents: diff changeset	893 all the way down to single-element parts with no supported vector type being
1d019706d866 LLVM10 anatofuz parents: diff changeset	894 found, the elements are converted to scalars ("scalarizing").
1d019706d866 LLVM10 anatofuz parents: diff changeset	895
1d019706d866 LLVM10 anatofuz parents: diff changeset	896 A target implementation tells the legalizer which types are supported (and which
1d019706d866 LLVM10 anatofuz parents: diff changeset	897 register class to use for them) by calling the ``addRegisterClass`` method in
1d019706d866 LLVM10 anatofuz parents: diff changeset	898 its ``TargetLowering`` constructor.
1d019706d866 LLVM10 anatofuz parents: diff changeset	899
1d019706d866 LLVM10 anatofuz parents: diff changeset	900 .. _legalize operations:
1d019706d866 LLVM10 anatofuz parents: diff changeset	901 .. _Legalizer:
1d019706d866 LLVM10 anatofuz parents: diff changeset	902
1d019706d866 LLVM10 anatofuz parents: diff changeset	903 SelectionDAG Legalize Phase
1d019706d866 LLVM10 anatofuz parents: diff changeset	904 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	905
1d019706d866 LLVM10 anatofuz parents: diff changeset	906 The Legalize phase is in charge of converting a DAG to only use the operations
1d019706d866 LLVM10 anatofuz parents: diff changeset	907 that are natively supported by the target.
1d019706d866 LLVM10 anatofuz parents: diff changeset	908
1d019706d866 LLVM10 anatofuz parents: diff changeset	909 Targets often have weird constraints, such as not supporting every operation on
1d019706d866 LLVM10 anatofuz parents: diff changeset	910 every supported datatype (e.g. X86 does not support byte conditional moves and
1d019706d866 LLVM10 anatofuz parents: diff changeset	911 PowerPC does not support sign-extending loads from a 16-bit memory location).
1d019706d866 LLVM10 anatofuz parents: diff changeset	912 Legalize takes care of this by open-coding another sequence of operations to
1d019706d866 LLVM10 anatofuz parents: diff changeset	913 emulate the operation ("expansion"), by promoting one type to a larger type that
1d019706d866 LLVM10 anatofuz parents: diff changeset	914 supports the operation ("promotion"), or by using a target-specific hook to
1d019706d866 LLVM10 anatofuz parents: diff changeset	915 implement the legalization ("custom").
1d019706d866 LLVM10 anatofuz parents: diff changeset	916
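Two of these strategies can be modeled on plain integers. The sketch below
open-codes a signed remainder from divide, multiply and subtract ("expansion"),
and performs an 8-bit select on 32-bit values ("promotion"). ``expandedSRem``
and ``promotedSelect`` are hypothetical helpers; they illustrate the rewrite,
not the legalizer's code.

```cpp
#include <cassert>
#include <cstdint>

// "Expansion": srem open-coded from operations the target is assumed to
// support. Requires B != 0 and that A / B does not overflow, as srem does.
inline std::int32_t expandedSRem(std::int32_t A, std::int32_t B) {
  std::int32_t Quot = A / B;    // sdiv
  std::int32_t Prod = Quot * B; // mul
  return A - Prod;              // sub
}

// "Promotion": an i8 select carried out as an i32 select.
inline std::int8_t promotedSelect(bool Cond, std::int8_t T, std::int8_t F) {
  std::int32_t WideT = T;                   // extend both inputs to i32
  std::int32_t WideF = F;
  std::int32_t Wide = Cond ? WideT : WideF; // select on the wider type
  return static_cast<std::int8_t>(Wide);    // truncate back to i8
}
```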
1d019706d866 LLVM10 anatofuz parents: diff changeset	917 A target implementation tells the legalizer which operations are not supported
1d019706d866 LLVM10 anatofuz parents: diff changeset	918 (and which of the above three actions to take) by calling the
1d019706d866 LLVM10 anatofuz parents: diff changeset	919 ``setOperationAction`` method in its ``TargetLowering`` constructor.
1d019706d866 LLVM10 anatofuz parents: diff changeset	920
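The idea of a per-target action table can be sketched as a map from an
(operation, type) pair to an action. ``ToyLowering``, ``ToyAction`` and
``makeToyTarget`` are hypothetical, operations and types are plain strings, and
treating unlisted entries as legal is an assumption of this sketch; the real
``TargetLowering`` interface is considerably richer.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

enum class ToyAction { Legal, Promote, Expand, Custom };

// Toy per-target table: (operation, type) -> action. Anything not listed is
// treated as Legal in this sketch.
class ToyLowering {
  std::map<std::pair<std::string, std::string>, ToyAction> Actions;

public:
  void setOperationAction(const std::string &Op, const std::string &VT,
                          ToyAction A) {
    Actions[std::make_pair(Op, VT)] = A;
  }
  ToyAction getOperationAction(const std::string &Op,
                               const std::string &VT) const {
    auto It = Actions.find(std::make_pair(Op, VT));
    return It == Actions.end() ? ToyAction::Legal : It->second;
  }
};

// A made-up target, configured the way a lowering constructor might do it.
inline ToyLowering makeToyTarget() {
  ToyLowering TL;
  TL.setOperationAction("srem", "i32", ToyAction::Expand);
  TL.setOperationAction("select", "i8", ToyAction::Promote);
  TL.setOperationAction("shuffle_vector", "v4i32", ToyAction::Custom);
  return TL;
}
```

A legalizer built on such a table would look up each node's operation and type
and then apply the recorded action.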
1d019706d866 LLVM10 anatofuz parents: diff changeset	921 If a target has legal vector types, it is expected to produce efficient machine
1d019706d866 LLVM10 anatofuz parents: diff changeset	922 code for common forms of the shufflevector IR instruction using those types.
1d019706d866 LLVM10 anatofuz parents: diff changeset	923 This may require custom legalization for SelectionDAG vector operations that
are created from the shufflevector IR. The shufflevector forms that should be
handled include:

* Vector select --- Each element of the vector is chosen from either of the
  corresponding elements of the 2 input vectors. This operation may also be
  known as a "blend" or "bitwise select" in target assembly. This type of shuffle
  maps directly to the ``shuffle_vector`` SelectionDAG node.

* Insert subvector --- A vector is placed into a longer vector type starting
  at index 0. This type of shuffle maps directly to the ``insert_subvector``
  SelectionDAG node with the ``index`` operand set to 0.

* Extract subvector --- A vector is pulled from a longer vector type starting
  at index 0. This type of shuffle maps directly to the ``extract_subvector``
  SelectionDAG node with the ``index`` operand set to 0.

* Splat --- All elements of the vector have identical scalar elements. This
  operation may also be known as a "broadcast" or "duplicate" in target assembly.
  The shufflevector IR instruction may change the vector length, so this operation
  may map to multiple SelectionDAG nodes including ``shuffle_vector``,
  ``concat_vectors``, ``insert_subvector``, and ``extract_subvector``.

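The mask conditions behind these four forms can be sketched in standalone C++.
This is an illustration only, not LLVM code: ``classifyShuffle`` and its string
results are hypothetical names, a mask entry of ``-1`` is assumed to stand for
an undefined lane, and real lowering recognizes many more cases than this.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not an LLVM API. mask[i] indexes into the
// concatenation of two inputs that each have numInputElts elements;
// -1 is assumed to mean an undefined lane.
std::string classifyShuffle(const std::vector<int> &mask, int numInputElts) {
  assert(numInputElts > 0 && "inputs must be non-empty");
  const int numOut = static_cast<int>(mask.size());

  // Splat: every defined lane reads the same source element.
  bool isSplat = true;
  int splatIdx = -1;
  for (int m : mask) {
    if (m < 0) continue;
    if (splatIdx < 0) splatIdx = m;
    else if (m != splatIdx) isSplat = false;
  }
  if (isSplat && splatIdx >= 0) return "splat";

  // True if lanes 0..n-1 read element i of the first input (or are undefined).
  auto identityPrefix = [&](int n) {
    for (int i = 0; i < n; ++i)
      if (mask[i] >= 0 && mask[i] != i) return false;
    return true;
  };

  // Extract subvector: a shorter result taken from index 0.
  if (numOut < numInputElts && identityPrefix(numOut))
    return "extract_subvector";

  // Insert subvector: the input placed at index 0 of a longer, otherwise
  // undefined result.
  if (numOut > numInputElts && identityPrefix(numInputElts)) {
    bool restUndef = true;
    for (int i = numInputElts; i < numOut; ++i)
      if (mask[i] >= 0) restUndef = false;
    if (restUndef) return "insert_subvector";
  }

  // Vector select: lane i comes from lane i of either input.
  if (numOut == numInputElts) {
    bool isSelect = true;
    for (int i = 0; i < numOut; ++i)
      if (mask[i] >= 0 && mask[i] != i && mask[i] != i + numInputElts)
        isSelect = false;
    if (isSelect) return "select";
  }
  return "other";
}
```

For two 4-element inputs, the mask ``<0, 5, 2, 7>`` is classified as a select
and ``<0, 0, 0, 0>`` as a splat, while a full reversal such as ``<3, 2, 1, 0>``
falls through to ``other``.
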
Prior to the existence of the Legalize passes, we required that every target
`selector`_ support and handle every operator and type, even those that were
not natively supported. The introduction of the Legalize phases allows all of
the canonicalization patterns to be shared across targets, and makes it very
easy to optimize the canonicalized code because it is still in the form of a
DAG.

.. _optimizations:
.. _Optimize SelectionDAG:
.. _selector:

SelectionDAG Optimization Phase: the DAG Combiner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The SelectionDAG optimization phase is run multiple times for code generation,
immediately after the DAG is built and once after each legalization. The first
run of the pass allows the initial code to be cleaned up (e.g. performing
optimizations that depend on knowing that the operators have restricted type
inputs). Subsequent runs of the pass clean up the messy code generated by the
Legalize passes, which allows Legalize to be very simple (it can focus on making
code legal instead of focusing on generating good and legal code).

One important class of optimizations performed is optimizing inserted sign and
zero extension instructions. We currently use ad-hoc techniques, but could move
to more rigorous techniques in the future. Here are some good papers on the
subject:

 "`Widening integer arithmetic <http://www.eecs.harvard.edu/~nr/pubs/widen-abstract.html>`_" :raw-html:`<br>`
 Kevin Redwine and Norman Ramsey :raw-html:`<br>`
 International Conference on Compiler Construction (CC) 2004

 "`Effective sign extension elimination <http://portal.acm.org/citation.cfm?doid=512529.512552>`_" :raw-html:`<br>`
 Motohiro Kawahito, Hideaki Komatsu, and Toshio Nakatani :raw-html:`<br>`
 Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design
 and Implementation.

.. _Select instructions from DAG:

SelectionDAG Select Phase
^^^^^^^^^^^^^^^^^^^^^^^^^

The Select phase is the bulk of the target-specific code for instruction
selection. This phase takes a legal SelectionDAG as input, pattern matches the
instructions supported by the target to this DAG, and produces a new DAG of
target code. For example, consider the following LLVM fragment:

.. code-block:: llvm

  %t1 = fadd float %W, %X
  %t2 = fmul float %t1, %Y
  %t3 = fadd float %t2, %Z

This LLVM code corresponds to a SelectionDAG that looks basically like this:

.. code-block:: text

  (fadd:f32 (fmul:f32 (fadd:f32 W, X), Y), Z)

If a target supports floating point multiply-and-add (FMA) operations, one of
the adds can be merged with the multiply. On the PowerPC, for example, the
output of the instruction selector might look like this DAG:

::

  (FMADDS (FADDS W, X), Y, Z)

The ``FMADDS`` instruction is a ternary instruction that multiplies its first
two operands and adds the third (as single-precision floating-point numbers).
The ``FADDS`` instruction is a simple binary single-precision add instruction.
To perform this pattern match, the PowerPC backend includes the following
instruction definitions:

.. code-block:: text
  :emphasize-lines: 4-5,9

  def FMADDS : AForm_1<59, 29,
                      (ops F4RC:$FRT, F4RC:$FRA, F4RC:$FRC, F4RC:$FRB),
                      "fmadds $FRT, $FRA, $FRC, $FRB",
                      [(set F4RC:$FRT, (fadd (fmul F4RC:$FRA, F4RC:$FRC),
                                             F4RC:$FRB))]>;
  def FADDS : AForm_2<59, 21,
                      (ops F4RC:$FRT, F4RC:$FRA, F4RC:$FRB),
                      "fadds $FRT, $FRA, $FRB",
                      [(set F4RC:$FRT, (fadd F4RC:$FRA, F4RC:$FRB))]>;

The highlighted portion of the instruction definitions indicates the pattern
used to match the instructions. The DAG operators (like ``fmul``/``fadd``)
are defined in the ``include/llvm/Target/TargetSelectionDAG.td`` file.
"``F4RC``" is the register class of the input and result values.

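To make the effect of these two patterns concrete, here is a hand-written toy
matcher in standalone C++. It is only a sketch of the idea and not the code
that ``tblgen`` generates: ``Node``, ``make``, ``leaf``, ``selectNode`` and
``toString`` are hypothetical names, and only the ``fadd``/``fmul`` patterns
shown above are modelled. The ``fadd`` case tries both operand orders, which is
the commutativity that the generated matcher handles automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node;
using NodeP = std::shared_ptr<Node>;

// Toy expression node; a stand-in for an SDNode, not the LLVM class.
struct Node {
  std::string op;           // "fadd", "fmul", a target opcode, or a leaf name
  std::vector<NodeP> kids;  // operands; empty for leaves
};

NodeP leaf(const std::string &name) {
  return std::make_shared<Node>(Node{name, {}});
}

NodeP make(const std::string &op, std::vector<NodeP> kids) {
  return std::make_shared<Node>(Node{op, std::move(kids)});
}

// Matches the patterns top-down, then selects the operands recursively.
NodeP selectNode(const NodeP &n) {
  assert(n && "null node");
  if (n->kids.empty())
    return n;  // leaves (registers) are left untouched
  if (n->op == "fadd" && n->kids.size() == 2) {
    // FMADDS pattern: (fadd (fmul a, b), c), tried in both operand orders
    // because fadd is commutative.
    for (int swapped = 0; swapped < 2; ++swapped) {
      const NodeP &mul = n->kids[swapped];
      const NodeP &addend = n->kids[1 - swapped];
      if (mul->op == "fmul" && mul->kids.size() == 2)
        return make("FMADDS", {selectNode(mul->kids[0]),
                               selectNode(mul->kids[1]),
                               selectNode(addend)});
    }
    // FADDS pattern: a plain (fadd a, b).
    return make("FADDS", {selectNode(n->kids[0]), selectNode(n->kids[1])});
  }
  // Anything else keeps its opcode; only its operands are selected.
  std::vector<NodeP> kids;
  for (const NodeP &k : n->kids)
    kids.push_back(selectNode(k));
  return make(n->op, std::move(kids));
}

// Prints a node in the parenthesized notation used in this section.
std::string toString(const NodeP &n) {
  if (n->kids.empty())
    return n->op;
  std::string s = "(" + n->op + " ";
  for (std::size_t i = 0; i < n->kids.size(); ++i) {
    if (i) s += ", ";
    s += toString(n->kids[i]);
  }
  return s + ")";
}
```

Building the DAG ``(fadd (fmul (fadd W, X), Y), Z)`` with ``make`` and ``leaf``
and passing it to ``selectNode`` yields a tree that ``toString`` prints as
``(FMADDS (FADDS W, X), Y, Z)``, the selector output shown earlier.
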
The TableGen DAG instruction selector generator reads the instruction patterns
in the ``.td`` file and automatically builds parts of the pattern matching code
for your target. It has the following strengths:

* At compiler-compile time, it analyzes your instruction patterns and tells you
  if your patterns make sense or not.

* It can handle arbitrary constraints on operands for the pattern match. In
  particular, it is straightforward to say things like "match any immediate
  that is a 13-bit sign-extended value". For examples, see the ``immSExt16``
  and related ``tblgen`` classes in the PowerPC backend.

* It knows several important identities for the patterns defined. For example,
  it knows that addition is commutative, so it allows the ``FMADDS`` pattern
  above to match "``(fadd X, (fmul Y, Z))``" as well as "``(fadd (fmul X, Y),
  Z)``", without the target author having to specially handle this case.

* It has a full-featured type-inferencing system. In particular, you should
  rarely have to explicitly tell the system what type parts of your patterns
  are. In the ``FMADDS`` case above, we didn't have to tell ``tblgen`` that all
  of the nodes in the pattern are of type 'f32'. It was able to infer and
  propagate this knowledge from the fact that ``F4RC`` has type 'f32'.

* Targets can define their own (and rely on built-in) "pattern fragments".
  Pattern fragments are chunks of reusable patterns that get inlined into your
  patterns during compiler-compile time. For example, the integer "``(not
  x)``" operation is actually defined as a pattern fragment that expands as
  "``(xor x, -1)``", since the SelectionDAG does not have a native '``not``'
  operation. Targets can define their own short-hand fragments as they see fit.
  See the definition of '``not``' and '``ineg``' for examples.

* In addition to instructions, targets can specify arbitrary patterns that map
  to one or more instructions using the 'Pat' class. For example, the PowerPC
  has no way to load an arbitrary integer immediate into a register in one
  instruction. To tell ``tblgen`` how to do this, it defines:

  ::

    // Arbitrary immediate support. Implement in terms of LIS/ORI.
    def : Pat<(i32 imm:$imm),
              (ORI (LIS (HI16 imm:$imm)), (LO16 imm:$imm))>;

  If none of the single-instruction patterns for loading an immediate into a
  register match, this will be used. This rule says "match an arbitrary i32
  immediate, turning it into an ``ORI`` ('or a 16-bit immediate') and an ``LIS``
  ('load 16-bit immediate, where the immediate is shifted to the left 16 bits')
  instruction". To make this work, the ``LO16``/``HI16`` node transformations
  are used to manipulate the input immediate (in this case, take the high or low
  16 bits of the immediate).

* When using the 'Pat' class to map a pattern to an instruction that has one
  or more complex operands (e.g. the `X86 addressing mode`_), the pattern may
  either specify the operand as a whole using a ``ComplexPattern``, or else it
  may specify the components of the complex operand separately. The PowerPC
  back end does the latter for pre-increment instructions, for example:

  ::

    def STWU : DForm_1<37, (outs ptr_rc:$ea_res), (ins GPRC:$rS, memri:$dst),
                    "stwu $rS, $dst", LdStStoreUpd, []>,
                    RegConstraint<"$dst.reg = $ea_res">, NoEncode<"$ea_res">;

    def : Pat<(pre_store GPRC:$rS, ptr_rc:$ptrreg, iaddroff:$ptroff),
              (STWU GPRC:$rS, iaddroff:$ptroff, ptr_rc:$ptrreg)>;

  Here, the pair of ``ptroff`` and ``ptrreg`` operands is matched onto the
  complex operand ``dst`` of class ``memri`` in the ``STWU`` instruction.

* While the system does automate a lot, it still allows you to write custom C++
  code to match special cases if there is something that is hard to
  express.

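The arithmetic behind the ``LIS``/``ORI`` rule in the list above can be checked
with a few lines of standalone C++. This is a sketch under stated assumptions,
not PowerPC backend code: the functions below only model the 32-bit results
described in the text (``LIS`` places a 16-bit immediate in the upper halfword,
``ORI`` ORs in a zero-extended 16-bit immediate), and ``materialize`` is a
hypothetical name for the composed rule.

```cpp
#include <cassert>
#include <cstdint>

// Node transformations from the pattern: split a 32-bit immediate in two.
constexpr std::uint32_t HI16(std::uint32_t imm) { return imm >> 16; }
constexpr std::uint32_t LO16(std::uint32_t imm) { return imm & 0xFFFFu; }

// Models LIS: load a 16-bit immediate shifted left by 16 bits.
constexpr std::uint32_t LIS(std::uint32_t hi16) { return (hi16 & 0xFFFFu) << 16; }

// Models ORI: OR a register value with a zero-extended 16-bit immediate.
constexpr std::uint32_t ORI(std::uint32_t reg, std::uint32_t lo16) {
  return reg | (lo16 & 0xFFFFu);
}

// The rule (ORI (LIS (HI16 imm)), (LO16 imm)) applied to a concrete value.
constexpr std::uint32_t materialize(std::uint32_t imm) {
  return ORI(LIS(HI16(imm)), LO16(imm));
}

// Checked at compile time: the two-instruction sequence rebuilds the value.
static_assert(materialize(0x12345678u) == 0x12345678u,
              "LIS/ORI must reproduce the immediate");
```

Because the two halves do not overlap, the OR cannot disturb the bits written
by ``LIS``, so the sequence reproduces any 32-bit immediate.
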
While it has many strengths, the system currently has some limitations,
primarily because it is still a work in progress:

* Overall, there is no way to define or match SelectionDAG nodes that define
  multiple values (e.g. ``SMUL_LOHI``, ``LOAD``, ``CALL``, etc.). This is the
  biggest reason that you currently still have to write custom C++ code
  for your instruction selector.

* There is no great way to support matching complex addressing modes yet. In
  the future, we will extend pattern fragments to allow them to define multiple
  values (e.g. the four operands of the `X86 addressing mode`_, which are
  currently matched with custom C++ code). In addition, we'll extend fragments
  so that a fragment can match multiple different patterns.

* We don't automatically infer flags like ``isStore``/``isLoad`` yet.

* We don't automatically generate the set of supported registers and operations
  for the `Legalizer`_ yet.

* We don't have a way of tying in custom legalized nodes yet.

Despite these limitations, the instruction selector generator is still quite
useful for most of the binary and logical operations in typical instruction
sets. If you run into any problems or can't figure out how to do something,
please let Chris know!

.. _Scheduling and Formation:
.. _SelectionDAG Scheduling and Formation:

SelectionDAG Scheduling and Formation Phase
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The scheduling phase takes the DAG of target instructions from the selection
phase and assigns an order. The scheduler can pick an order depending on
various constraints of the machines (e.g. order for minimal register pressure or
try to cover instruction latencies). Once an order is established, the DAG is
converted to a list of :raw-html:`<tt>` `MachineInstr`_\s :raw-html:`</tt>` and
the SelectionDAG is destroyed.

Note that this phase is logically separate from the instruction selection phase,
but is tied to it closely in the code because it operates on SelectionDAGs.

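The one hard requirement on any order the scheduler picks is that an
instruction comes after the instructions that produce its operands. The
standalone C++ sketch below shows only that requirement, using a post-order
walk from the root of the DAG. It is not one of LLVM's schedulers and applies
no register-pressure or latency heuristics; ``SchedNode``,
``emitAfterOperands`` and ``linearize`` are hypothetical names, and leaf
registers are left out of the DAG.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy DAG node: an opcode plus the indices of the nodes it depends on.
struct SchedNode {
  std::string opcode;
  std::vector<std::size_t> operands;
};

// Post-order walk: emit every operand before the node that uses it.
// A node shared by several users is emitted only once.
static void emitAfterOperands(const std::vector<SchedNode> &dag, std::size_t n,
                              std::vector<bool> &emitted,
                              std::vector<std::string> &order) {
  if (emitted[n])
    return;
  emitted[n] = true;  // safe to mark early because the graph is acyclic
  for (std::size_t op : dag[n].operands)
    emitAfterOperands(dag, op, emitted, order);
  order.push_back(dag[n].opcode);
}

// Turns the DAG reachable from `root` into a linear instruction list.
std::vector<std::string> linearize(const std::vector<SchedNode> &dag,
                                   std::size_t root) {
  assert(root < dag.size() && "root must be a node of the DAG");
  std::vector<bool> emitted(dag.size(), false);
  std::vector<std::string> order;
  emitAfterOperands(dag, root, emitted, order);
  return order;
}
```

For the selected DAG ``(FMADDS (FADDS W, X), Y, Z)``, modelled as an ``FADDS``
node and an ``FMADDS`` node that depends on it, ``linearize`` places ``FADDS``
first. A real scheduler chooses among all the valid orders using the machine
constraints mentioned above.
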
Future directions for the SelectionDAG
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

#. Optional function-at-a-time selection.

#. Auto-generate entire selector from ``.td`` file.

.. _SSA-based Machine Code Optimizations:

SSA-based Machine Code Optimizations
------------------------------------

To Be Written

Live Intervals
--------------

Live Intervals are the ranges (intervals) where a variable is live. They are
used by some `register allocator`_ passes to determine if two or more virtual
registers which require the same physical register are live at the same point in
the program (i.e., they conflict). When this situation occurs, one virtual
register must be spilled.

Live Variable Analysis
^^^^^^^^^^^^^^^^^^^^^^

The first step in determining the live intervals of variables is to calculate
the set of registers that are immediately dead after the instruction (i.e., the
instruction calculates the value, but it is never used) and the set of registers
that are used by the instruction, but are never used after the instruction
(i.e., they are killed). Live variable information is computed for
each virtual register and register allocatable physical register
in the function. This is done in a very efficient manner because it uses SSA to
sparsely compute lifetime information for virtual registers (which are in SSA
form) and only has to track physical registers within a block. Before register
allocation, LLVM can assume that physical registers are only live within a
single basic block. This allows it to do a single, local analysis to resolve
physical register lifetimes within each basic block. If a physical register is
not register allocatable (e.g., a stack pointer or condition codes), it is not
tracked.

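The "dead" and "killed" sets described above can be illustrated for a single
basic block with a plain backward scan. Note that this sketch swaps in a
simpler technique than the sparse, SSA-based computation the text describes,
and that it is standalone C++ rather than LLVM code: ``BlockInstr``,
``LiveFlags`` and ``computeFlags`` are hypothetical names, registers are plain
integers, and the set of registers live out of the block is assumed to be
given.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy instruction: the registers it defines and the registers it reads.
struct BlockInstr {
  std::vector<int> defs;
  std::vector<int> uses;
};

// Per-instruction result of the analysis.
struct LiveFlags {
  std::set<int> dead;    // defined here and never read afterwards
  std::set<int> killed;  // read here for the last time
};

// Backward scan over one basic block. `live` starts as the set of registers
// that are still needed after the block and is updated as the scan moves up.
std::vector<LiveFlags> computeFlags(const std::vector<BlockInstr> &block,
                                    std::set<int> live) {
  std::vector<LiveFlags> flags(block.size());
  for (std::size_t i = block.size(); i-- > 0;) {
    const BlockInstr &mi = block[i];
    // A definition nobody reads later is dead; either way the register is
    // not live above its definition.
    for (int r : mi.defs) {
      if (!live.count(r))
        flags[i].dead.insert(r);
      live.erase(r);
    }
    // A use of a register that is not live below this point is its last use.
    for (int r : mi.uses)
      if (live.insert(r).second)
        flags[i].killed.insert(r);
  }
  return flags;
}
```

For a block in which register 1 is defined, read by two later instructions,
and the second reader defines a register 3 that nothing reads, the scan marks
register 1 as killed at the second reader and register 3 as dead there.
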
Physical registers may be live in to or out of a function. Live in values are
typically arguments in registers. Live out values are typically return values in
registers. Live in values are marked as such, and are given a dummy "defining"
instruction during live intervals analysis. If the last basic block of a
function is a ``return``, then it's marked as using all live out values in the
function.

``PHI`` nodes need to be handled specially, because the calculation of the live
variable information from a depth first traversal of the CFG of the function
won't guarantee that a virtual register used by the ``PHI`` node is defined
before it's used. When a ``PHI`` node is encountered, only the definition is
handled, because the uses will be handled in other basic blocks.

For each ``PHI`` node of the current basic block, we simulate an assignment at
the end of the current basic block and traverse the successor basic blocks. If a
successor basic block has a ``PHI`` node and one of the ``PHI`` node's operands
is coming from the current basic block, then the variable is marked as alive
within the current basic block and all of its predecessor basic blocks, until
the basic block with the defining instruction is encountered.

Live Intervals Analysis
^^^^^^^^^^^^^^^^^^^^^^^

We now have the information available to perform the live intervals analysis and
build the live intervals themselves. We start off by numbering the basic blocks
and machine instructions. We then handle the "live-in" values. These are in
physical registers, so the physical register is assumed to be killed by the end
of the basic block. Live intervals for virtual registers are computed for some
ordering of the machine instructions ``[1, N]``. A live interval is an interval
``[i, j)``, where ``1 <= i <= j < N``, for which a variable is live.

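A minimal standalone C++ sketch of building such half-open intervals for
straight-line code follows. It is not LLVM's ``LiveIntervals`` analysis, and
the names ``NumberedInstr``, ``Interval``, ``buildIntervals`` and ``overlaps``
are hypothetical. It assumes that instructions are numbered from 1 in the order
given, that each register is defined once, and that an interval runs from the
defining instruction up to, but not including, the last instruction that reads
the register.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Toy instruction: the registers it defines and the registers it reads.
struct NumberedInstr {
  std::vector<int> defs;
  std::vector<int> uses;
};

// Half-open interval [start, end) over instruction numbers.
struct Interval {
  int start;
  int end;
};

// Builds one interval per register for straight-line code numbered 1..N.
std::map<int, Interval> buildIntervals(const std::vector<NumberedInstr> &code) {
  std::map<int, Interval> intervals;
  for (std::size_t k = 0; k < code.size(); ++k) {
    const int num = static_cast<int>(k) + 1;
    // Each read extends the interval up to (not including) this instruction.
    for (int r : code[k].uses) {
      auto it = intervals.find(r);
      if (it != intervals.end())
        it->second.end = std::max(it->second.end, num);
    }
    // A definition opens an interval covering at least its own slot.
    for (int r : code[k].defs)
      intervals.emplace(r, Interval{num, num + 1});
  }
  return intervals;
}

// Two half-open intervals conflict when they share an instruction number.
bool overlaps(const Interval &a, const Interval &b) {
  return a.start < b.end && b.start < a.end;
}
```

With this convention, a register whose last read is at instruction 3 gets an
interval ending at 3, so it does not overlap a register defined by that same
instruction, and the two could be assigned the same physical register.
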
1d019706d866 LLVM10 anatofuz parents: diff changeset	1221 .. note::
1d019706d866 LLVM10 anatofuz parents: diff changeset	1222 More to come...
1d019706d866 LLVM10 anatofuz parents: diff changeset	1223
1d019706d866 LLVM10 anatofuz parents: diff changeset	1224 .. _Register Allocation:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1225 .. _register allocator:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1226
1d019706d866 LLVM10 anatofuz parents: diff changeset	1227 Register Allocation
1d019706d866 LLVM10 anatofuz parents: diff changeset	1228 -------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	1229
1d019706d866 LLVM10 anatofuz parents: diff changeset	1230 The Register Allocation problem consists in mapping a program
1d019706d866 LLVM10 anatofuz parents: diff changeset	1231 :raw-html:`<b><tt>` P\ :sub:`v`\ :raw-html:`</tt></b>`, that can use an unbounded
1d019706d866 LLVM10 anatofuz parents: diff changeset	1232 number of virtual registers, to a program :raw-html:`<b><tt>` P\ :sub:`p`\
1d019706d866 LLVM10 anatofuz parents: diff changeset	1233 :raw-html:`</tt></b>` that contains a finite (possibly small) number of physical
1d019706d866 LLVM10 anatofuz parents: diff changeset	1234 registers. Each target architecture has a different number of physical
1d019706d866 LLVM10 anatofuz parents: diff changeset	1235 registers. If the number of physical registers is not enough to accommodate all
1d019706d866 LLVM10 anatofuz parents: diff changeset	1236 the virtual registers, some of them will have to be mapped into memory. These
1d019706d866 LLVM10 anatofuz parents: diff changeset	1237 virtuals are called spilled virtuals.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1238
1d019706d866 LLVM10 anatofuz parents: diff changeset	1239 How registers are represented in LLVM
1d019706d866 LLVM10 anatofuz parents: diff changeset	1240 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	1241
1d019706d866 LLVM10 anatofuz parents: diff changeset	1242 In LLVM, physical registers are denoted by integer numbers that normally range
1d019706d866 LLVM10 anatofuz parents: diff changeset	1243 from 1 to 1023. To see how this numbering is defined for a particular
1d019706d866 LLVM10 anatofuz parents: diff changeset	1244 architecture, you can read the ``GenRegisterNames.inc`` file for that
1d019706d866 LLVM10 anatofuz parents: diff changeset	1245 architecture. For instance, by inspecting
1d019706d866 LLVM10 anatofuz parents: diff changeset	1246 ``lib/Target/X86/X86GenRegisterInfo.inc`` we see that the 32-bit register
1d019706d866 LLVM10 anatofuz parents: diff changeset	1247 ``EAX`` is denoted by 43, and the MMX register ``MM0`` is mapped to 65.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1248
1d019706d866 LLVM10 anatofuz parents: diff changeset	1249 Some architectures contain registers that share the same physical location. A
1d019706d866 LLVM10 anatofuz parents: diff changeset	1250 notable example is the X86 platform. For instance, in the X86 architecture, the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1251 registers ``EAX``, ``AX`` and ``AL`` share the low eight bits. These physical
1d019706d866 LLVM10 anatofuz parents: diff changeset	1252 registers are marked as aliased in LLVM. Given a particular architecture, you
1d019706d866 LLVM10 anatofuz parents: diff changeset	1253 can check which registers are aliased by inspecting its ``RegisterInfo.td``
1d019706d866 LLVM10 anatofuz parents: diff changeset	1254 file. Moreover, the class ``MCRegAliasIterator`` enumerates all the physical
1d019706d866 LLVM10 anatofuz parents: diff changeset	1255 registers aliased to a register.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1256
1d019706d866 LLVM10 anatofuz parents: diff changeset	1257 Physical registers, in LLVM, are grouped in Register Classes. Elements in the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1258 same register class are functionally equivalent, and can be interchangeably
1d019706d866 LLVM10 anatofuz parents: diff changeset	1259 used. Each virtual register can only be mapped to physical registers of a
1d019706d866 LLVM10 anatofuz parents: diff changeset	1260 particular class. For instance, in the X86 architecture, some virtuals can only
1d019706d866 LLVM10 anatofuz parents: diff changeset	1261 be allocated to 8-bit registers. A register class is described by
1d019706d866 LLVM10 anatofuz parents: diff changeset	1262 ``TargetRegisterClass`` objects. To discover if a virtual register is
1d019706d866 LLVM10 anatofuz parents: diff changeset	1263 compatible with a given physical register, this code can be used:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1264
1d019706d866 LLVM10 anatofuz parents: diff changeset	1265 .. code-block:: c++
1d019706d866 LLVM10 anatofuz parents: diff changeset	1266
1d019706d866 LLVM10 anatofuz parents: diff changeset	1267   bool RegMapping_Fer::compatible_class(MachineFunction &mf,
1d019706d866 LLVM10 anatofuz parents: diff changeset	1268                                         unsigned v_reg,
1d019706d866 LLVM10 anatofuz parents: diff changeset	1269                                         unsigned p_reg) {
1d019706d866 LLVM10 anatofuz parents: diff changeset	1270     assert(Register::isPhysicalRegister(p_reg) &&
1d019706d866 LLVM10 anatofuz parents: diff changeset	1271            "Target register must be physical");
1d019706d866 LLVM10 anatofuz parents: diff changeset	1272     const TargetRegisterClass *trc = mf.getRegInfo().getRegClass(v_reg);
1d019706d866 LLVM10 anatofuz parents: diff changeset	1273     return trc->contains(p_reg);
1d019706d866 LLVM10 anatofuz parents: diff changeset	1274   }
1d019706d866 LLVM10 anatofuz parents: diff changeset	1275
1d019706d866 LLVM10 anatofuz parents: diff changeset	1276 Sometimes, mostly for debugging purposes, it is useful to change the number of
1d019706d866 LLVM10 anatofuz parents: diff changeset	1277 physical registers available in the target architecture. This must be done
1d019706d866 LLVM10 anatofuz parents: diff changeset	1278 statically, inside the ``TargetRegisterInfo.td`` file. Just ``grep`` for
1d019706d866 LLVM10 anatofuz parents: diff changeset	1279 ``RegisterClass``, the last parameter of which is a list of registers. Just
1d019706d866 LLVM10 anatofuz parents: diff changeset	1280 commenting some out is one simple way to avoid them being used. A more polite
1d019706d866 LLVM10 anatofuz parents: diff changeset	1281 way is to explicitly exclude some registers from the allocation order. See the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1282 definition of the ``GR8`` register class in
1d019706d866 LLVM10 anatofuz parents: diff changeset	1283 ``lib/Target/X86/X86RegisterInfo.td`` for an example of this.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1284
1d019706d866 LLVM10 anatofuz parents: diff changeset	1285 Virtual registers are also denoted by integer numbers. Contrary to physical
1d019706d866 LLVM10 anatofuz parents: diff changeset	1286 registers, different virtual registers never share the same number. Whereas
1d019706d866 LLVM10 anatofuz parents: diff changeset	1287 physical registers are statically defined in a ``TargetRegisterInfo.td`` file
1d019706d866 LLVM10 anatofuz parents: diff changeset	1288 and cannot be created by the application developer, that is not the case with
1d019706d866 LLVM10 anatofuz parents: diff changeset	1289 virtual registers. In order to create new virtual registers, use the method
1d019706d866 LLVM10 anatofuz parents: diff changeset	1290 ``MachineRegisterInfo::createVirtualRegister()``. This method will return a new
1d019706d866 LLVM10 anatofuz parents: diff changeset	1291 virtual register. Use an ``IndexedMap<Foo, VirtReg2IndexFunctor>`` to hold
1d019706d866 LLVM10 anatofuz parents: diff changeset	1292 information per virtual register. If you need to enumerate all virtual
1d019706d866 LLVM10 anatofuz parents: diff changeset	1293 registers, use the function ``Register::index2VirtReg()`` to find the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1294 virtual register numbers:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1295
1d019706d866 LLVM10 anatofuz parents: diff changeset	1296 .. code-block:: c++
1d019706d866 LLVM10 anatofuz parents: diff changeset	1297
1d019706d866 LLVM10 anatofuz parents: diff changeset	1298   for (unsigned i = 0, e = MRI->getNumVirtRegs(); i != e; ++i) {
1d019706d866 LLVM10 anatofuz parents: diff changeset	1299     unsigned VirtReg = Register::index2VirtReg(i);
1d019706d866 LLVM10 anatofuz parents: diff changeset	1300     stuff(VirtReg);
1d019706d866 LLVM10 anatofuz parents: diff changeset	1301   }
1d019706d866 LLVM10 anatofuz parents: diff changeset	1302
1d019706d866 LLVM10 anatofuz parents: diff changeset	1303 Before register allocation, the operands of an instruction are mostly virtual
1d019706d866 LLVM10 anatofuz parents: diff changeset	1304 registers, although physical registers may also be used. In order to check if a
1d019706d866 LLVM10 anatofuz parents: diff changeset	1305 given machine operand is a register, use the boolean function
1d019706d866 LLVM10 anatofuz parents: diff changeset	1306 ``MachineOperand::isReg()``. To obtain the integer code of a register, use
1d019706d866 LLVM10 anatofuz parents: diff changeset	1307 ``MachineOperand::getReg()``. An instruction may define or use a register. For
1d019706d866 LLVM10 anatofuz parents: diff changeset	1308 instance, ``ADD reg:1026 := reg:1025 reg:1024`` defines the register 1026, and
1d019706d866 LLVM10 anatofuz parents: diff changeset	1309 uses registers 1025 and 1024. Given a register operand, the method
1d019706d866 LLVM10 anatofuz parents: diff changeset	1310 ``MachineOperand::isUse()`` informs if that register is being used by the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1311 instruction. The method ``MachineOperand::isDef()`` informs if that register is
1d019706d866 LLVM10 anatofuz parents: diff changeset	1312 being defined.
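
The def/use distinction can be illustrated with a small standalone model. This
is only a sketch: ``Operand``, ``makeAdd`` and ``collect`` are hypothetical
names invented for the illustration and are not part of LLVM; the real queries
are the ``MachineOperand`` methods named above.

.. code-block:: c++

  #include <cassert>
  #include <vector>

  // Hypothetical stand-in for a register operand (not llvm::MachineOperand).
  struct Operand {
    unsigned Reg; // integer code of the register
    bool IsDef;   // true if the instruction writes this register
    bool isDef() const { return IsDef; }
    bool isUse() const { return !IsDef; }
  };

  // Models "ADD reg:1026 := reg:1025 reg:1024".
  inline std::vector<Operand> makeAdd() {
    return {{1026, true}, {1025, false}, {1024, false}};
  }

  // Returns the registers that are defined (WantDefs) or used (!WantDefs).
  inline std::vector<unsigned> collect(const std::vector<Operand> &Ops,
                                       bool WantDefs) {
    std::vector<unsigned> Out;
    for (const Operand &Op : Ops)
      if (Op.isDef() == WantDefs)
        Out.push_back(Op.Reg);
    return Out;
  }

In this model, ``collect(makeAdd(), true)`` yields only register 1026, while
``collect(makeAdd(), false)`` yields registers 1025 and 1024.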
1d019706d866 LLVM10 anatofuz parents: diff changeset	1313
1d019706d866 LLVM10 anatofuz parents: diff changeset	1314 We will call physical registers present in the LLVM bitcode before register
1d019706d866 LLVM10 anatofuz parents: diff changeset	1315 allocation pre-colored registers. Pre-colored registers are used in many
1d019706d866 LLVM10 anatofuz parents: diff changeset	1316 different situations, for instance, to pass parameters of function calls, and
1d019706d866 LLVM10 anatofuz parents: diff changeset	1317 to store results of particular instructions. There are two types of pre-colored
1d019706d866 LLVM10 anatofuz parents: diff changeset	1318 registers: the ones implicitly defined, and those explicitly
1d019706d866 LLVM10 anatofuz parents: diff changeset	1319 defined. Explicitly defined registers are normal operands, and can be accessed
1d019706d866 LLVM10 anatofuz parents: diff changeset	1320 with ``MachineInstr::getOperand(int)::getReg()``. In order to check which
1d019706d866 LLVM10 anatofuz parents: diff changeset	1321 registers are implicitly defined by an instruction, use the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1322 ``TargetInstrInfo::get(opcode)::ImplicitDefs``, where ``opcode`` is the opcode
1d019706d866 LLVM10 anatofuz parents: diff changeset	1323 of the target instruction. One important difference between explicit and
1d019706d866 LLVM10 anatofuz parents: diff changeset	1324 implicit physical registers is that the latter are defined statically for each
1d019706d866 LLVM10 anatofuz parents: diff changeset	1325 instruction, whereas the former may vary depending on the program being
1d019706d866 LLVM10 anatofuz parents: diff changeset	1326 compiled. For example, an instruction that represents a function call will
1d019706d866 LLVM10 anatofuz parents: diff changeset	1327 always implicitly define or use the same set of physical registers. To read the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1328 registers implicitly used by an instruction, use
1d019706d866 LLVM10 anatofuz parents: diff changeset	1329 ``TargetInstrInfo::get(opcode)::ImplicitUses``. Pre-colored registers impose
1d019706d866 LLVM10 anatofuz parents: diff changeset	1330 constraints on any register allocation algorithm. The register allocator must
1d019706d866 LLVM10 anatofuz parents: diff changeset	1331 make sure that none of them are overwritten by the values of virtual registers
1d019706d866 LLVM10 anatofuz parents: diff changeset	1332 while still alive.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1333
1d019706d866 LLVM10 anatofuz parents: diff changeset	1334 Mapping virtual registers to physical registers
1d019706d866 LLVM10 anatofuz parents: diff changeset	1335 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	1336
1d019706d866 LLVM10 anatofuz parents: diff changeset	1337 There are two ways to map virtual registers to physical registers (or to memory
1d019706d866 LLVM10 anatofuz parents: diff changeset	1338 slots). The first way, which we will call direct mapping, is based on the use
1d019706d866 LLVM10 anatofuz parents: diff changeset	1339 of methods of the classes ``TargetRegisterInfo`` and ``MachineOperand``. The
1d019706d866 LLVM10 anatofuz parents: diff changeset	1340 second way, which we will call indirect mapping, relies on the ``VirtRegMap``
1d019706d866 LLVM10 anatofuz parents: diff changeset	1341 class in order to insert loads and stores sending and getting values to and from
1d019706d866 LLVM10 anatofuz parents: diff changeset	1342 memory.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1343
1d019706d866 LLVM10 anatofuz parents: diff changeset	1344 The direct mapping provides more flexibility to the developer of the register
1d019706d866 LLVM10 anatofuz parents: diff changeset	1345 allocator; however, it is more error prone, and demands more implementation
1d019706d866 LLVM10 anatofuz parents: diff changeset	1346 work. Basically, the programmer will have to specify where load and store
1d019706d866 LLVM10 anatofuz parents: diff changeset	1347 instructions should be inserted in the target function being compiled in order
1d019706d866 LLVM10 anatofuz parents: diff changeset	1348 to get and store values in memory. To assign a physical register to a virtual
1d019706d866 LLVM10 anatofuz parents: diff changeset	1349 register present in a given operand, use ``MachineOperand::setReg(p_reg)``. To
1d019706d866 LLVM10 anatofuz parents: diff changeset	1350 insert a store instruction, use ``TargetInstrInfo::storeRegToStackSlot(...)``,
1d019706d866 LLVM10 anatofuz parents: diff changeset	1351 and to insert a load instruction, use ``TargetInstrInfo::loadRegFromStackSlot``.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1352
1d019706d866 LLVM10 anatofuz parents: diff changeset	1353 The indirect mapping shields the application developer from the complexities of
1d019706d866 LLVM10 anatofuz parents: diff changeset	1354 inserting load and store instructions. In order to map a virtual register to a
1d019706d866 LLVM10 anatofuz parents: diff changeset	1355 physical one, use ``VirtRegMap::assignVirt2Phys(vreg, preg)``. In order to map
1d019706d866 LLVM10 anatofuz parents: diff changeset	1356 a certain virtual register to memory, use
1d019706d866 LLVM10 anatofuz parents: diff changeset	1357 ``VirtRegMap::assignVirt2StackSlot(vreg)``. This method will return the stack
1d019706d866 LLVM10 anatofuz parents: diff changeset	1358 slot where ``vreg``'s value will be located. If it is necessary to map another
1d019706d866 LLVM10 anatofuz parents: diff changeset	1359 virtual register to the same stack slot, use
1d019706d866 LLVM10 anatofuz parents: diff changeset	1360 ``VirtRegMap::assignVirt2StackSlot(vreg, stack_location)``. One important point
1d019706d866 LLVM10 anatofuz parents: diff changeset	1361 to consider when using the indirect mapping is that even if a virtual register
1d019706d866 LLVM10 anatofuz parents: diff changeset	1362 is mapped to memory, it still needs to be mapped to a physical register. This
1d019706d866 LLVM10 anatofuz parents: diff changeset	1363 physical register is the location where the virtual register is supposed to be
1d019706d866 LLVM10 anatofuz parents: diff changeset	1364 found before being stored or after being reloaded.
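
The bookkeeping behind the indirect mapping can be sketched with a toy class.
This is an illustration under stated assumptions, not the LLVM implementation:
``ToyVirtRegMap`` and its accessors ``isSpilled``, ``getStackSlot`` and
``getPhys`` are hypothetical, and the real ``VirtRegMap`` class has a different
and much larger interface. Only the two ``assignVirt2...`` method names are
taken from the text above.

.. code-block:: c++

  #include <cassert>
  #include <map>

  // Hypothetical miniature of the indirect mapping (not llvm::VirtRegMap).
  class ToyVirtRegMap {
    std::map<unsigned, unsigned> Virt2Phys; // virtual -> physical register
    std::map<unsigned, int> Virt2Slot;      // virtual -> stack slot
    int NextSlot = 0;

  public:
    void assignVirt2Phys(unsigned VReg, unsigned PReg) {
      assert(!Virt2Phys.count(VReg) && "virtual register already mapped");
      Virt2Phys[VReg] = PReg;
    }
    // Allocates a fresh stack slot for VReg and returns it.
    int assignVirt2StackSlot(unsigned VReg) {
      int Slot = NextSlot++;
      Virt2Slot[VReg] = Slot;
      return Slot;
    }
    // Maps another virtual register to an already allocated slot.
    void assignVirt2StackSlot(unsigned VReg, int Slot) {
      Virt2Slot[VReg] = Slot;
    }
    bool isSpilled(unsigned VReg) const { return Virt2Slot.count(VReg) != 0; }
    int getStackSlot(unsigned VReg) const { return Virt2Slot.at(VReg); }
    unsigned getPhys(unsigned VReg) const { return Virt2Phys.at(VReg); }
  };

Note that a spilled virtual register in this model still carries a physical
register assignment, mirroring the requirement described above.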
1d019706d866 LLVM10 anatofuz parents: diff changeset	1365
1d019706d866 LLVM10 anatofuz parents: diff changeset	1366 If the indirect strategy is used, after all the virtual registers have been
1d019706d866 LLVM10 anatofuz parents: diff changeset	1367 mapped to physical registers or stack slots, it is necessary to use a spiller
1d019706d866 LLVM10 anatofuz parents: diff changeset	1368 object to place load and store instructions in the code. Every virtual that has
1d019706d866 LLVM10 anatofuz parents: diff changeset	1369 been mapped to a stack slot will be stored to memory after being defined and will
1d019706d866 LLVM10 anatofuz parents: diff changeset	1370 be loaded before being used. The implementation of the spiller tries to recycle
1d019706d866 LLVM10 anatofuz parents: diff changeset	1371 load/store instructions, avoiding unnecessary instructions. For an example of
1d019706d866 LLVM10 anatofuz parents: diff changeset	1372 how to invoke the spiller, see ``RegAllocLinearScan::runOnMachineFunction`` in
1d019706d866 LLVM10 anatofuz parents: diff changeset	1373 ``lib/CodeGen/RegAllocLinearScan.cpp``.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1374
1d019706d866 LLVM10 anatofuz parents: diff changeset	1375 Handling two address instructions
1d019706d866 LLVM10 anatofuz parents: diff changeset	1376 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	1377
1d019706d866 LLVM10 anatofuz parents: diff changeset	1378 With very rare exceptions (e.g., function calls), the LLVM machine code
1d019706d866 LLVM10 anatofuz parents: diff changeset	1379 instructions are three address instructions. That is, each instruction is
1d019706d866 LLVM10 anatofuz parents: diff changeset	1380 expected to define at most one register, and to use at most two registers.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1381 However, some architectures use two address instructions. In this case, the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1382 defined register is also one of the used registers. For instance, an instruction
1d019706d866 LLVM10 anatofuz parents: diff changeset	1383 such as ``ADD %EAX, %EBX``, in X86 is actually equivalent to ``%EAX = %EAX +
1d019706d866 LLVM10 anatofuz parents: diff changeset	1384 %EBX``.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1385
1d019706d866 LLVM10 anatofuz parents: diff changeset	1386 In order to produce correct code, LLVM must convert three address instructions
1d019706d866 LLVM10 anatofuz parents: diff changeset	1387 that represent two address instructions into true two address instructions. LLVM
1d019706d866 LLVM10 anatofuz parents: diff changeset	1388 provides the pass ``TwoAddressInstructionPass`` for this specific purpose. It
1d019706d866 LLVM10 anatofuz parents: diff changeset	1389 must be run before register allocation takes place. After its execution, the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1390 resulting code may no longer be in SSA form. This happens, for instance, in
1d019706d866 LLVM10 anatofuz parents: diff changeset	1391 situations where an instruction such as ``%a = ADD %b %c`` is converted to two
1d019706d866 LLVM10 anatofuz parents: diff changeset	1392 instructions such as:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1393
1d019706d866 LLVM10 anatofuz parents: diff changeset	1394 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	1395
1d019706d866 LLVM10 anatofuz parents: diff changeset	1396 %a = MOVE %b
1d019706d866 LLVM10 anatofuz parents: diff changeset	1397 %a = ADD %a %c
1d019706d866 LLVM10 anatofuz parents: diff changeset	1398
1d019706d866 LLVM10 anatofuz parents: diff changeset	1399 Notice that, internally, the second instruction is represented as ``ADD
1d019706d866 LLVM10 anatofuz parents: diff changeset	1400 %a[def/use] %c``. I.e., the register operand ``%a`` is both used and defined by
1d019706d866 LLVM10 anatofuz parents: diff changeset	1401 the instruction.
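
The rewrite can be sketched on a toy instruction type. This is a minimal
illustration, not the logic of ``TwoAddressInstructionPass``: ``Instr`` and
``toTwoAddress`` are hypothetical names, and the sketch assumes the input is in
SSA form, so the destination never coincides with the second source.

.. code-block:: c++

  #include <cassert>
  #include <string>
  #include <vector>

  // Hypothetical three-address instruction: Dst = Op Src1 Src2.
  struct Instr {
    std::string Op, Dst, Src1, Src2;
  };

  // Ties Dst to the first source operand. A MOVE is emitted first unless
  // Dst already equals Src1.
  inline std::vector<Instr> toTwoAddress(const Instr &I) {
    // Assumption: SSA input, so the MOVE below cannot clobber Src2.
    assert(I.Dst != I.Src2 && "copy would overwrite the second source");
    std::vector<Instr> Out;
    if (I.Dst != I.Src1)
      Out.push_back({"MOVE", I.Dst, I.Src1, ""});
    Out.push_back({I.Op, I.Dst, I.Dst, I.Src2});
    return Out;
  }

Applied to ``%a = ADD %b %c``, the function returns ``%a = MOVE %b`` followed
by ``%a = ADD %a %c``, the two-instruction sequence shown above.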
1d019706d866 LLVM10 anatofuz parents: diff changeset	1402
1d019706d866 LLVM10 anatofuz parents: diff changeset	1403 The SSA deconstruction phase
1d019706d866 LLVM10 anatofuz parents: diff changeset	1404 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	1405
1d019706d866 LLVM10 anatofuz parents: diff changeset	1406 An important transformation that happens during register allocation is called
1d019706d866 LLVM10 anatofuz parents: diff changeset	1407 the SSA Deconstruction Phase. The SSA form simplifies many analyses that are
1d019706d866 LLVM10 anatofuz parents: diff changeset	1408 performed on the control flow graph of programs. However, traditional
1d019706d866 LLVM10 anatofuz parents: diff changeset	1409 instruction sets do not implement PHI instructions. Thus, in order to generate
1d019706d866 LLVM10 anatofuz parents: diff changeset	1410 executable code, compilers must replace PHI instructions with other instructions
1d019706d866 LLVM10 anatofuz parents: diff changeset	1411 that preserve their semantics.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1412
1d019706d866 LLVM10 anatofuz parents: diff changeset	1413 There are many ways in which PHI instructions can safely be removed from the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1414 target code. The most traditional PHI deconstruction algorithm replaces PHI
1d019706d866 LLVM10 anatofuz parents: diff changeset	1415 instructions with copy instructions. That is the strategy adopted by LLVM. The
1d019706d866 LLVM10 anatofuz parents: diff changeset	1416 SSA deconstruction algorithm is implemented in
1d019706d866 LLVM10 anatofuz parents: diff changeset	1417 ``lib/CodeGen/PHIElimination.cpp``. In order to invoke this pass, the identifier
1d019706d866 LLVM10 anatofuz parents: diff changeset	1418 ``PHIEliminationID`` must be marked as required in the code of the register
1d019706d866 LLVM10 anatofuz parents: diff changeset	1419 allocator.
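
The copy-based strategy can be sketched as follows. This is a simplified
illustration, not the code in ``PHIElimination.cpp``: ``Phi``, ``Copies`` and
``lowerPhi`` are hypothetical names, instructions are modelled as plain text,
and the sketch ignores where exactly a copy must be placed inside the
predecessor block and how interfering copies are handled.

.. code-block:: c++

  #include <cassert>
  #include <map>
  #include <string>
  #include <utility>
  #include <vector>

  // Hypothetical PHI node: Dst receives Incoming[i].first when control
  // arrives from the predecessor block named Incoming[i].second.
  struct Phi {
    std::string Dst;
    std::vector<std::pair<std::string, std::string>> Incoming;
  };

  // Predecessor block name -> copy instructions to append to that block.
  using Copies = std::map<std::string, std::vector<std::string>>;

  // Replaces one PHI by one copy per incoming edge.
  inline Copies lowerPhi(const Phi &P) {
    Copies Out;
    for (const auto &In : P.Incoming)
      Out[In.second].push_back(P.Dst + " = COPY " + In.first);
    return Out;
  }

For a PHI that merges ``%a`` from block ``BB1`` and ``%b`` from block ``BB2``
into ``%x``, the sketch places ``%x = COPY %a`` in ``BB1`` and
``%x = COPY %b`` in ``BB2``.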
1d019706d866 LLVM10 anatofuz parents: diff changeset	1420
1d019706d866 LLVM10 anatofuz parents: diff changeset	1421 Instruction folding
1d019706d866 LLVM10 anatofuz parents: diff changeset	1422 ^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	1423
1d019706d866 LLVM10 anatofuz parents: diff changeset	1424 Instruction folding is an optimization performed during register allocation
1d019706d866 LLVM10 anatofuz parents: diff changeset	1425 that removes unnecessary copy instructions. For instance, a sequence of
1d019706d866 LLVM10 anatofuz parents: diff changeset	1426 instructions such as:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1427
1d019706d866 LLVM10 anatofuz parents: diff changeset	1428 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	1429
1d019706d866 LLVM10 anatofuz parents: diff changeset	1430 %EBX = LOAD %mem_address
1d019706d866 LLVM10 anatofuz parents: diff changeset	1431 %EAX = COPY %EBX
1d019706d866 LLVM10 anatofuz parents: diff changeset	1432
1d019706d866 LLVM10 anatofuz parents: diff changeset	1433 can be safely substituted by the single instruction:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1434
1d019706d866 LLVM10 anatofuz parents: diff changeset	1435 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	1436
1d019706d866 LLVM10 anatofuz parents: diff changeset	1437 %EAX = LOAD %mem_address
1d019706d866 LLVM10 anatofuz parents: diff changeset	1438
1d019706d866 LLVM10 anatofuz parents: diff changeset	1439 Instructions can be folded with the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1440 ``TargetInstrInfo::foldMemoryOperand(...)`` method. Care must be taken when
1d019706d866 LLVM10 anatofuz parents: diff changeset	1441 folding instructions; a folded instruction can be quite different from the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1442 original instruction. See ``LiveIntervals::addIntervalsForSpills`` in
1d019706d866 LLVM10 anatofuz parents: diff changeset	1443 ``lib/CodeGen/LiveIntervalAnalysis.cpp`` for an example of its use.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1444
1d019706d866 LLVM10 anatofuz parents: diff changeset	1445 Built in register allocators
1d019706d866 LLVM10 anatofuz parents: diff changeset	1446 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	1447
1d019706d866 LLVM10 anatofuz parents: diff changeset	1448 The LLVM infrastructure provides the application developer with four different
1d019706d866 LLVM10 anatofuz parents: diff changeset	1449 register allocators:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1450
1d019706d866 LLVM10 anatofuz parents: diff changeset	1451 * Fast --- This register allocator is the default for debug builds. It
1d019706d866 LLVM10 anatofuz parents: diff changeset	1452 allocates registers on a basic block level, attempting to keep values in
1d019706d866 LLVM10 anatofuz parents: diff changeset	1453 registers and reusing registers as appropriate.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1454
1d019706d866 LLVM10 anatofuz parents: diff changeset	1455 * Basic --- This is an incremental approach to register allocation. Live
1d019706d866 LLVM10 anatofuz parents: diff changeset	1456 ranges are assigned to registers one at a time in an order that is driven by
1d019706d866 LLVM10 anatofuz parents: diff changeset	1457 heuristics. Since code can be rewritten on-the-fly during allocation, this
1d019706d866 LLVM10 anatofuz parents: diff changeset	1458 framework allows interesting allocators to be developed as extensions. It is
1d019706d866 LLVM10 anatofuz parents: diff changeset	1459 not itself a production register allocator but is a potentially useful
1d019706d866 LLVM10 anatofuz parents: diff changeset	1460 stand-alone mode for triaging bugs and as a performance baseline.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1461
1d019706d866 LLVM10 anatofuz parents: diff changeset	1462 * Greedy --- The default allocator. This is a highly tuned implementation of
1d019706d866 LLVM10 anatofuz parents: diff changeset	1463 the Basic allocator that incorporates global live range splitting. This
1d019706d866 LLVM10 anatofuz parents: diff changeset	1464 allocator works hard to minimize the cost of spill code.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1465
1d019706d866 LLVM10 anatofuz parents: diff changeset	1466 * PBQP --- A Partitioned Boolean Quadratic Programming (PBQP) based register
1d019706d866 LLVM10 anatofuz parents: diff changeset	1467 allocator. This allocator works by constructing a PBQP problem representing
1d019706d866 LLVM10 anatofuz parents: diff changeset	1468 the register allocation problem under consideration, solving this using a PBQP
1d019706d866 LLVM10 anatofuz parents: diff changeset	1469 solver, and mapping the solution back to a register assignment.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1470
1d019706d866 LLVM10 anatofuz parents: diff changeset	1471 The type of register allocator used in ``llc`` can be chosen with the command
1d019706d866 LLVM10 anatofuz parents: diff changeset	1472 line option ``-regalloc=...``:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1473
1d019706d866 LLVM10 anatofuz parents: diff changeset	1474 .. code-block:: bash
1d019706d866 LLVM10 anatofuz parents: diff changeset	1475
1d019706d866 LLVM10 anatofuz parents: diff changeset	1476   $ llc -regalloc=basic file.bc -o basic.s
1d019706d866 LLVM10 anatofuz parents: diff changeset	1477   $ llc -regalloc=fast file.bc -o fa.s
1d019706d866 LLVM10 anatofuz parents: diff changeset	1478   $ llc -regalloc=pbqp file.bc -o pbqp.s
1d019706d866 LLVM10 anatofuz parents: diff changeset	1479
1d019706d866 LLVM10 anatofuz parents: diff changeset	1480 .. _Prolog/Epilog Code Insertion:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1481
1d019706d866 LLVM10 anatofuz parents: diff changeset	1482 Prolog/Epilog Code Insertion
1d019706d866 LLVM10 anatofuz parents: diff changeset	1483 ----------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	1484
1d019706d866 LLVM10 anatofuz parents: diff changeset	1485 Compact Unwind
1d019706d866 LLVM10 anatofuz parents: diff changeset	1486
1d019706d866 LLVM10 anatofuz parents: diff changeset	1487 Throwing an exception requires unwinding out of a function. The information on
1d019706d866 LLVM10 anatofuz parents: diff changeset	1488 how to unwind a given function is traditionally expressed in DWARF unwind
1d019706d866 LLVM10 anatofuz parents: diff changeset	1489 (a.k.a. frame) info. But that format was originally developed for debuggers to
1d019706d866 LLVM10 anatofuz parents: diff changeset	1490 backtrace, and each Frame Description Entry (FDE) requires ~20-30 bytes per
1d019706d866 LLVM10 anatofuz parents: diff changeset	1491 function. There is also the cost of mapping from an address in a function to the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1492 corresponding FDE at runtime. An alternative unwind encoding is called *compact
1d019706d866 LLVM10 anatofuz parents: diff changeset	1493 unwind* and requires just 4 bytes per function.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1494
1d019706d866 LLVM10 anatofuz parents: diff changeset	1495 The compact unwind encoding is a 32-bit value, which is encoded in an
1d019706d866 LLVM10 anatofuz parents: diff changeset	1496 architecture-specific way. It specifies which registers to restore and from
1d019706d866 LLVM10 anatofuz parents: diff changeset	1497 where, and how to unwind out of the function. When the linker creates a final
1d019706d866 LLVM10 anatofuz parents: diff changeset	1498 linked image, it will create a ``__TEXT,__unwind_info`` section. This section is
1d019706d866 LLVM10 anatofuz parents: diff changeset	1499 a small and fast way for the runtime to access unwind info for any given
1d019706d866 LLVM10 anatofuz parents: diff changeset	1500 function. If we emit compact unwind info for the function, that compact unwind
1d019706d866 LLVM10 anatofuz parents: diff changeset	1501 info will be encoded in the ``__TEXT,__unwind_info`` section. If we emit DWARF
1d019706d866 LLVM10 anatofuz parents: diff changeset	1502 unwind info, the ``__TEXT,__unwind_info`` section will contain the offset of the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1503 FDE in the ``__TEXT,__eh_frame`` section in the final linked image.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1504
1d019706d866 LLVM10 anatofuz parents: diff changeset	1505 For X86, there are three modes for the compact unwind encoding:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1506
1d019706d866 LLVM10 anatofuz parents: diff changeset	1507 Function with a Frame Pointer (``EBP`` or ``RBP``)
1d019706d866 LLVM10 anatofuz parents: diff changeset	1508 ``EBP/RBP``-based frame, where ``EBP/RBP`` is pushed onto the stack
1d019706d866 LLVM10 anatofuz parents: diff changeset	1509 immediately after the return address, then ``ESP/RSP`` is moved to
1d019706d866 LLVM10 anatofuz parents: diff changeset	1510 ``EBP/RBP``. Thus to unwind, ``ESP/RSP`` is restored with the current
1d019706d866 LLVM10 anatofuz parents: diff changeset	1511 ``EBP/RBP`` value, then ``EBP/RBP`` is restored by popping the stack, and the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1512 return is done by popping the stack once more into the PC. All non-volatile
1d019706d866 LLVM10 anatofuz parents: diff changeset	1513 registers that need to be restored must have been saved in a small range on
1d019706d866 LLVM10 anatofuz parents: diff changeset	1514 the stack that extends from ``EBP-4`` to ``EBP-1020`` (``RBP-8`` to
1d019706d866 LLVM10 anatofuz parents: diff changeset	1515 ``RBP-2040``). The offset (divided by 4 in 32-bit mode and 8 in 64-bit mode)
1d019706d866 LLVM10 anatofuz parents: diff changeset	1516 is encoded in bits 16-23 (mask: ``0x00FF0000``). The registers saved are
1d019706d866 LLVM10 anatofuz parents: diff changeset	1517 encoded in bits 0-14 (mask: ``0x00007FFF``) as five 3-bit entries from the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1518 following table:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1519
1d019706d866 LLVM10 anatofuz parents: diff changeset	1520 ============== ============= ===============
1d019706d866 LLVM10 anatofuz parents: diff changeset	1521 Compact Number i386 Register x86-64 Register
1d019706d866 LLVM10 anatofuz parents: diff changeset	1522 ============== ============= ===============
1d019706d866 LLVM10 anatofuz parents: diff changeset	1523 1              ``EBX``       ``RBX``
1d019706d866 LLVM10 anatofuz parents: diff changeset	1524 2              ``ECX``       ``R12``
1d019706d866 LLVM10 anatofuz parents: diff changeset	1525 3              ``EDX``       ``R13``
1d019706d866 LLVM10 anatofuz parents: diff changeset	1526 4              ``EDI``       ``R14``
1d019706d866 LLVM10 anatofuz parents: diff changeset	1527 5              ``ESI``       ``R15``
1d019706d866 LLVM10 anatofuz parents: diff changeset	1528 6              ``EBP``       ``RBP``
1d019706d866 LLVM10 anatofuz parents: diff changeset	1529 ============== ============= ===============
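
The two fields described for the frame pointer mode can be packed as in the
following sketch. It is an illustration only: ``encodeFrameFields`` is a
hypothetical helper, the placement of entry ``i`` at bit position ``3*i`` is an
assumption, and any bits of the encoding that the text above does not describe
are left at zero.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <vector>

  // OffsetBytes: distance from EBP/RBP to the saved-register area.
  // WordSize:    4 in 32-bit mode, 8 in 64-bit mode.
  // Regs:        up to five compact register numbers (1-6) from the table.
  inline std::uint32_t encodeFrameFields(unsigned OffsetBytes,
                                         unsigned WordSize,
                                         const std::vector<unsigned> &Regs) {
    assert(WordSize == 4 || WordSize == 8);
    assert(OffsetBytes % WordSize == 0 && OffsetBytes / WordSize <= 0xFF);
    assert(Regs.size() <= 5);
    // Scaled offset goes into bits 16-23 (mask 0x00FF0000).
    std::uint32_t Enc =
        (std::uint32_t(OffsetBytes / WordSize) << 16) & 0x00FF0000u;
    // Saved registers go into bits 0-14 as 3-bit entries.
    // Assumption: entry I occupies bits 3*I .. 3*I+2.
    for (std::size_t I = 0; I < Regs.size(); ++I) {
      assert(Regs[I] >= 1 && Regs[I] <= 6);
      Enc |= std::uint32_t(Regs[I] & 0x7u) << (3 * I);
    }
    return Enc;
  }

For example, a 64-bit function that saves ``RBX`` and ``R12`` (compact numbers
1 and 2) at an offset of 16 bytes would produce ``0x00020011`` under these
assumptions.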
1d019706d866 LLVM10 anatofuz parents: diff changeset	1530
1d019706d866 LLVM10 anatofuz parents: diff changeset	1531 Frameless with a Small Constant Stack Size (``EBP`` or ``RBP`` is not used as a frame pointer)
1d019706d866 LLVM10 anatofuz parents: diff changeset	1532 To return, a constant (encoded in the compact unwind encoding) is added to the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1533 ``ESP/RSP``. Then the return is done by popping the stack into the PC. All
1d019706d866 LLVM10 anatofuz parents: diff changeset	1534 non-volatile registers that need to be restored must have been saved on the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1535 stack immediately after the return address. The stack size (divided by 4 in
1d019706d866 LLVM10 anatofuz parents: diff changeset	1536 32-bit mode and 8 in 64-bit mode) is encoded in bits 16-23 (mask:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1537 ``0x00FF0000``). There is a maximum stack size of 1024 bytes in 32-bit mode
1d019706d866 LLVM10 anatofuz parents: diff changeset	1538 and 2048 in 64-bit mode. The number of registers saved is encoded in bits 10-12
1d019706d866 LLVM10 anatofuz parents: diff changeset	1539 (mask: ``0x00001C00``). Bits 0-9 (mask: ``0x000003FF``) contain which
1d019706d866 LLVM10 anatofuz parents: diff changeset	1540 registers were saved and their order. (See the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1541 ``encodeCompactUnwindRegistersWithoutFrame()`` function in
1d019706d866 LLVM10 anatofuz parents: diff changeset	1542 ``lib/Target/X86/X86FrameLowering.cpp`` for the encoding algorithm.)
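
Reading the fields back out of a frameless encoding only needs the masks quoted
above. The following is a sketch, not code from LLVM or from an unwinder:
``FramelessFields`` and ``decodeFrameless`` are hypothetical names, and the
permutation value is returned undecoded because its algorithm lives in the
function referenced above.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Hypothetical container for the decoded fields.
  struct FramelessFields {
    unsigned StackSizeBytes; // stack size in bytes
    unsigned RegCount;       // number of saved registers
    unsigned Permutation;    // which registers were saved, and their order
  };

  // WordSize is 4 in 32-bit mode and 8 in 64-bit mode.
  inline FramelessFields decodeFrameless(std::uint32_t Enc, unsigned WordSize) {
    assert(WordSize == 4 || WordSize == 8);
    FramelessFields F;
    F.StackSizeBytes = ((Enc & 0x00FF0000u) >> 16) * WordSize;
    F.RegCount = (Enc & 0x00001C00u) >> 10;
    F.Permutation = Enc & 0x000003FFu;
    return F;
  }

The mask ``0x00001C00`` selects bits 10 through 12, which is why the register
count is shifted right by 10.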
1d019706d866 LLVM10 anatofuz parents: diff changeset	1543
Frameless with a Large Constant Stack Size (``EBP`` or ``RBP`` is not used as a frame pointer)
  This case is like the "Frameless with a Small Constant Stack Size" case, but
  the stack size is too large to encode in the compact unwind encoding. Instead
  it requires that the function contains "``subl $nnnnnn, %esp``" in its
  prolog. The compact encoding contains the offset to the ``$nnnnnn`` value in
  the function in bits 10-12 (mask: ``0x00001C00``).
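
As a rough illustration of the small-constant-stack layout described above, the
following sketch pulls the three fields out of a 32-bit encoding using the
masks quoted in the text. The names ``FramelessFields`` and ``decodeFrameless``
are made up for this example and are not part of LLVM. The permutation field is
left opaque here; it is not decoded into register numbers.

```cpp
#include <cassert>
#include <cstdint>

// Field masks quoted in the text for the frameless encodings.
constexpr std::uint32_t StackSizeMask   = 0x00FF0000; // bits 16-23
constexpr std::uint32_t RegCountMask    = 0x00001C00; // the three bits above bit 9
constexpr std::uint32_t PermutationMask = 0x000003FF; // bits 0-9

// Hypothetical holder for the decoded fields (not an LLVM type).
struct FramelessFields {
  std::uint32_t StackSizeBytes; // stack size scaled back to bytes
  std::uint32_t RegCount;       // number of saved registers
  std::uint32_t Permutation;    // opaque: which registers, and in what order
};

// WordSize is 4 in 32-bit mode and 8 in 64-bit mode, matching the divisor
// applied when the stack size was encoded.
inline FramelessFields decodeFrameless(std::uint32_t Encoding,
                                       std::uint32_t WordSize) {
  assert((WordSize == 4 || WordSize == 8) && "unexpected word size");
  FramelessFields F;
  F.StackSizeBytes = ((Encoding & StackSizeMask) >> 16) * WordSize;
  F.RegCount = (Encoding & RegCountMask) >> 10;
  F.Permutation = Encoding & PermutationMask;
  return F;
}
```

For instance, an encoding whose stack-size field holds 5 decodes to a 40-byte
stack in 64-bit mode and a 20-byte stack in 32-bit mode.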
1d019706d866 LLVM10 anatofuz parents: diff changeset	1550
1d019706d866 LLVM10 anatofuz parents: diff changeset	1551 .. _Late Machine Code Optimizations:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1552
1d019706d866 LLVM10 anatofuz parents: diff changeset	1553 Late Machine Code Optimizations
1d019706d866 LLVM10 anatofuz parents: diff changeset	1554 -------------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	1555
1d019706d866 LLVM10 anatofuz parents: diff changeset	1556 .. note::
1d019706d866 LLVM10 anatofuz parents: diff changeset	1557
   To Be Written
1d019706d866 LLVM10 anatofuz parents: diff changeset	1559
1d019706d866 LLVM10 anatofuz parents: diff changeset	1560 .. _Code Emission:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1561
1d019706d866 LLVM10 anatofuz parents: diff changeset	1562 Code Emission
1d019706d866 LLVM10 anatofuz parents: diff changeset	1563 -------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	1564
The code emission step of code generation is responsible for lowering from the
code generator abstractions (like `MachineFunction`_, `MachineInstr`_, etc.) down
to the abstractions used by the MC layer (`MCInst`_, `MCStreamer`_, etc.). This
is done with a combination of several different classes: the (misnamed)
target-independent AsmPrinter class, target-specific subclasses of AsmPrinter
(such as SparcAsmPrinter), and the TargetLoweringObjectFile class.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1571
Since the MC layer works at the level of abstraction of object files, it doesn't
have a notion of functions, global variables, etc. Instead, it thinks about
labels, directives, and instructions. A key class used at this level is the
MCStreamer class. This is an abstract API, effectively an "assembler API", that
is implemented in different ways (e.g. to output a .s file or to output an ELF
.o file). MCStreamer has one method per directive, such as EmitLabel,
EmitSymbolAttribute, and SwitchSection, each of which corresponds directly to
an assembly-level directive.
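
To make the "assembler API" idea concrete, here is a minimal, self-contained
sketch of the pattern. ``ToyStreamer``, ``ToyAsmStreamer``, and
``emitTinyFunction`` are hypothetical names invented for this example; they
only imitate the shape of the real MCStreamer interface and do not use any LLVM
headers.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for an MCStreamer-like interface; not LLVM's API.
// There is one virtual method per directive-like event.
class ToyStreamer {
public:
  virtual ~ToyStreamer() = default;
  virtual void switchSection(const std::string &Name) = 0;
  virtual void emitLabel(const std::string &Symbol) = 0;
  virtual void emitInstruction(const std::string &Text) = 0;
};

// One possible implementation: render every call as a line of assembly text.
class ToyAsmStreamer : public ToyStreamer {
  std::ostringstream OS;

public:
  void switchSection(const std::string &Name) override {
    OS << "\t.section\t" << Name << "\n";
  }
  void emitLabel(const std::string &Symbol) override {
    assert(!Symbol.empty() && "labels need a name");
    OS << Symbol << ":\n";
  }
  void emitInstruction(const std::string &Text) override {
    OS << "\t" << Text << "\n";
  }
  std::string str() const { return OS.str(); }
};

// The client sees only the abstract interface, so the same sequence of calls
// could just as well drive an implementation that writes object-file bytes.
inline void emitTinyFunction(ToyStreamer &S) {
  S.switchSection(".text");
  S.emitLabel("foo");
  S.emitInstruction("ret");
}
```

The point of the design is that the producer of labels, directives, and
instructions does not need to know whether the consumer is a textual printer or
an object-file writer.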
1d019706d866 LLVM10 anatofuz parents: diff changeset	1580
1d019706d866 LLVM10 anatofuz parents: diff changeset	1581 If you are interested in implementing a code generator for a target, there are
1d019706d866 LLVM10 anatofuz parents: diff changeset	1582 three important things that you have to implement for your target:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1583
#. First, you need a subclass of AsmPrinter for your target. This class
   implements the general lowering process converting MachineFunctions into MC
   label constructs. The AsmPrinter base class provides a number of useful
   methods and routines, and also allows you to override the lowering process in
   some important ways. You should get much of the lowering for free if you are
   implementing an ELF, COFF, or MachO target, because the
   TargetLoweringObjectFile class implements much of the common logic.

#. Second, you need to implement an instruction printer for your target. The
   instruction printer takes an `MCInst`_ and renders it to a raw_ostream as
   text. Most of this is automatically generated from the .td file (when you
   specify something like "``add $dst, $src1, $src2``" in the instructions), but
   you need to implement routines to print operands.

#. Third, you need to implement code that lowers a `MachineInstr`_ to an MCInst,
   usually implemented in "<target>MCInstLower.cpp". This lowering process is
   often target-specific, and is responsible for turning jump table entries,
   constant pool indices, global variable addresses, etc. into MCLabels as
   appropriate. This translation layer is also responsible for expanding pseudo
   ops used by the code generator into the actual machine instructions they
   correspond to. The MCInsts that are generated by this are fed into the
   instruction printer or the encoder.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1606
Finally, at your choosing, you can also implement a subclass of MCCodeEmitter
which lowers MCInsts into machine code bytes and relocations. This is
important if you want to support direct .o file emission, or would like to
implement an assembler for your target.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1611
1d019706d866 LLVM10 anatofuz parents: diff changeset	1612 Emitting function stack size information
1d019706d866 LLVM10 anatofuz parents: diff changeset	1613 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	1614
1d019706d866 LLVM10 anatofuz parents: diff changeset	1615 A section containing metadata on function stack sizes will be emitted when
1d019706d866 LLVM10 anatofuz parents: diff changeset	1616 ``TargetLoweringObjectFile::StackSizesSection`` is not null, and
1d019706d866 LLVM10 anatofuz parents: diff changeset	1617 ``TargetOptions::EmitStackSizeSection`` is set (-stack-size-section). The
1d019706d866 LLVM10 anatofuz parents: diff changeset	1618 section will contain an array of pairs of function symbol values (pointer size)
1d019706d866 LLVM10 anatofuz parents: diff changeset	1619 and stack sizes (unsigned LEB128). The stack size values only include the space
1d019706d866 LLVM10 anatofuz parents: diff changeset	1620 allocated in the function prologue. Functions with dynamic stack allocations are
1d019706d866 LLVM10 anatofuz parents: diff changeset	1621 not included.
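
The entry layout described above (a pointer-sized function symbol value
followed by an unsigned LEB128 stack size) can be sketched as follows. This is
only an illustration: ``appendULEB128`` and ``appendStackSizeEntry`` are
hypothetical helpers written for this example, the function address is assumed
to be already resolved to a number (in a real object file it would normally be
expressed through a relocation), and little-endian byte order is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append Value in unsigned LEB128: 7 payload bits per byte, least significant
// group first, with the high bit set on every byte except the last.
inline void appendULEB128(std::vector<std::uint8_t> &Out, std::uint64_t Value) {
  do {
    std::uint8_t Byte = Value & 0x7F;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
}

// Append one (function address, stack size) pair.
// Assumptions for this sketch: little-endian target, address already resolved.
inline void appendStackSizeEntry(std::vector<std::uint8_t> &Out,
                                 std::uint64_t FuncAddr, unsigned PointerSize,
                                 std::uint64_t StackSize) {
  assert((PointerSize == 4 || PointerSize == 8) && "unexpected pointer size");
  for (unsigned I = 0; I < PointerSize; ++I)
    Out.push_back(static_cast<std::uint8_t>(FuncAddr >> (8 * I)));
  appendULEB128(Out, StackSize);
}
```

Because the stack size is variable-length, a reader of the section has to walk
the entries in order rather than index into them directly.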
1d019706d866 LLVM10 anatofuz parents: diff changeset	1622
1d019706d866 LLVM10 anatofuz parents: diff changeset	1623 VLIW Packetizer
1d019706d866 LLVM10 anatofuz parents: diff changeset	1624 ---------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	1625
1d019706d866 LLVM10 anatofuz parents: diff changeset	1626 In a Very Long Instruction Word (VLIW) architecture, the compiler is responsible
1d019706d866 LLVM10 anatofuz parents: diff changeset	1627 for mapping instructions to functional-units available on the architecture. To
1d019706d866 LLVM10 anatofuz parents: diff changeset	1628 that end, the compiler creates groups of instructions called packets or
1d019706d866 LLVM10 anatofuz parents: diff changeset	1629 bundles. The VLIW packetizer in LLVM is a target-independent mechanism to
1d019706d866 LLVM10 anatofuz parents: diff changeset	1630 enable the packetization of machine instructions.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1631
1d019706d866 LLVM10 anatofuz parents: diff changeset	1632 Mapping from instructions to functional units
1d019706d866 LLVM10 anatofuz parents: diff changeset	1633 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	1634
1d019706d866 LLVM10 anatofuz parents: diff changeset	1635 Instructions in a VLIW target can typically be mapped to multiple functional
1d019706d866 LLVM10 anatofuz parents: diff changeset	1636 units. During the process of packetizing, the compiler must be able to reason
1d019706d866 LLVM10 anatofuz parents: diff changeset	1637 about whether an instruction can be added to a packet. This decision can be
1d019706d866 LLVM10 anatofuz parents: diff changeset	1638 complex since the compiler has to examine all possible mappings of instructions
to functional units. Therefore, to reduce compile-time complexity, the
VLIW packetizer parses the instruction classes of a target and generates tables
at compiler build time. These tables can then be queried by the provided
machine-independent API to determine if an instruction can be accommodated in a
packet.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1644
1d019706d866 LLVM10 anatofuz parents: diff changeset	1645 How the packetization tables are generated and used
1d019706d866 LLVM10 anatofuz parents: diff changeset	1646 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	1647
1d019706d866 LLVM10 anatofuz parents: diff changeset	1648 The packetizer reads instruction classes from a target's itineraries and creates
1d019706d866 LLVM10 anatofuz parents: diff changeset	1649 a deterministic finite automaton (DFA) to represent the state of a packet. A DFA
1d019706d866 LLVM10 anatofuz parents: diff changeset	1650 consists of three major elements: inputs, states, and transitions. The set of
1d019706d866 LLVM10 anatofuz parents: diff changeset	1651 inputs for the generated DFA represents the instruction being added to a
1d019706d866 LLVM10 anatofuz parents: diff changeset	1652 packet. The states represent the possible consumption of functional units by
1d019706d866 LLVM10 anatofuz parents: diff changeset	1653 instructions in a packet. In the DFA, transitions from one state to another
1d019706d866 LLVM10 anatofuz parents: diff changeset	1654 occur on the addition of an instruction to an existing packet. If there is a
1d019706d866 LLVM10 anatofuz parents: diff changeset	1655 legal mapping of functional units to instructions, then the DFA contains a
1d019706d866 LLVM10 anatofuz parents: diff changeset	1656 corresponding transition. The absence of a transition indicates that a legal
1d019706d866 LLVM10 anatofuz parents: diff changeset	1657 mapping does not exist and that the instruction cannot be added to the packet.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1658
To generate tables for a VLIW target, add *Target*\ GenDFAPacketizer.inc as a
target to the Makefile in the target directory. The exported API provides three
1d019706d866 LLVM10 anatofuz parents: diff changeset	1661 functions: ``DFAPacketizer::clearResources()``,
1d019706d866 LLVM10 anatofuz parents: diff changeset	1662 ``DFAPacketizer::reserveResources(MachineInstr *MI)``, and
1d019706d866 LLVM10 anatofuz parents: diff changeset	1663 ``DFAPacketizer::canReserveResources(MachineInstr *MI)``. These functions allow
1d019706d866 LLVM10 anatofuz parents: diff changeset	1664 a target packetizer to add an instruction to an existing packet and to check
1d019706d866 LLVM10 anatofuz parents: diff changeset	1665 whether an instruction can be added to a packet. See
1d019706d866 LLVM10 anatofuz parents: diff changeset	1666 ``llvm/CodeGen/DFAPacketizer.h`` for more information.
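
The following toy model may help illustrate what the states and transitions
mean. It is not ``llvm::DFAPacketizer``: the class ``ToyPacketizer`` is
hypothetical, it takes a bitmask of usable functional units instead of a
``MachineInstr``, and it computes transitions on the fly instead of reading
them from tables generated at compiler build time. The method names are chosen
only to mirror the three functions listed above.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical toy model; this is not llvm::DFAPacketizer.
// An instruction class is a bitmask of the functional units it may issue on.
using UnitMask = std::uint32_t;

class ToyPacketizer {
  // Current state: every set of occupied units that some legal mapping of the
  // packet's instructions could have produced.
  std::set<UnitMask> State;

  // Transition function: the state reached by adding an instruction of class
  // Units. An empty result means there is no transition (no legal mapping).
  std::set<UnitMask> next(UnitMask Units) const {
    std::set<UnitMask> Result;
    for (UnitMask Used : State)
      for (UnitMask Free = Units & ~Used; Free != 0; Free &= Free - 1)
        Result.insert(Used | (Free & (~Free + 1))); // claim one free unit
    return Result;
  }

public:
  ToyPacketizer() { clearResources(); }

  // Start a new, empty packet.
  void clearResources() {
    State.clear();
    State.insert(0);
  }

  bool canReserveResources(UnitMask Units) const {
    return !next(Units).empty();
  }

  void reserveResources(UnitMask Units) {
    std::set<UnitMask> Next = next(Units);
    assert(!Next.empty() && "instruction does not fit in the packet");
    State = Next;
  }
};
```

Tracking all possible mappings matters: if a flexible instruction (units 0 or
1) is added before an instruction restricted to unit 0, the second one still
fits, because the state remembers that the first could have used unit 1.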
1d019706d866 LLVM10 anatofuz parents: diff changeset	1667
1d019706d866 LLVM10 anatofuz parents: diff changeset	1668 Implementing a Native Assembler
1d019706d866 LLVM10 anatofuz parents: diff changeset	1669 ===============================
1d019706d866 LLVM10 anatofuz parents: diff changeset	1670
1d019706d866 LLVM10 anatofuz parents: diff changeset	1671 Though you're probably reading this because you want to write or maintain a
1d019706d866 LLVM10 anatofuz parents: diff changeset	1672 compiler backend, LLVM also fully supports building a native assembler.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1673 We've tried hard to automate the generation of the assembler from the .td files
1d019706d866 LLVM10 anatofuz parents: diff changeset	1674 (in particular the instruction syntax and encodings), which means that a large
1d019706d866 LLVM10 anatofuz parents: diff changeset	1675 part of the manual and repetitive data entry can be factored and shared with the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1676 compiler.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1677
1d019706d866 LLVM10 anatofuz parents: diff changeset	1678 Instruction Parsing
1d019706d866 LLVM10 anatofuz parents: diff changeset	1679 -------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	1680
1d019706d866 LLVM10 anatofuz parents: diff changeset	1681 .. note::
1d019706d866 LLVM10 anatofuz parents: diff changeset	1682
   To Be Written
1d019706d866 LLVM10 anatofuz parents: diff changeset	1684
1d019706d866 LLVM10 anatofuz parents: diff changeset	1685
1d019706d866 LLVM10 anatofuz parents: diff changeset	1686 Instruction Alias Processing
1d019706d866 LLVM10 anatofuz parents: diff changeset	1687 ----------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	1688
1d019706d866 LLVM10 anatofuz parents: diff changeset	1689 Once the instruction is parsed, it enters the MatchInstructionImpl function.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1690 The MatchInstructionImpl function performs alias processing and then does actual
1d019706d866 LLVM10 anatofuz parents: diff changeset	1691 matching.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1692
1d019706d866 LLVM10 anatofuz parents: diff changeset	1693 Alias processing is the phase that canonicalizes different lexical forms of the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1694 same instructions down to one representation. There are several different kinds
1d019706d866 LLVM10 anatofuz parents: diff changeset	1695 of alias that are possible to implement and they are listed below in the order
1d019706d866 LLVM10 anatofuz parents: diff changeset	1696 that they are processed (which is in order from simplest/weakest to most
1d019706d866 LLVM10 anatofuz parents: diff changeset	1697 complex/powerful). Generally you want to use the first alias mechanism that
1d019706d866 LLVM10 anatofuz parents: diff changeset	1698 meets the needs of your instruction, because it will allow a more concise
1d019706d866 LLVM10 anatofuz parents: diff changeset	1699 description.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1700
1d019706d866 LLVM10 anatofuz parents: diff changeset	1701 Mnemonic Aliases
1d019706d866 LLVM10 anatofuz parents: diff changeset	1702 ^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	1703
The first phase of alias processing is simple instruction mnemonic remapping for
classes of instructions which are allowed with two different mnemonics. This
phase is a simple and unconditional remapping from one input mnemonic to one
output mnemonic. It isn't possible for this form of alias to look at the
operands at all, so the remapping must apply for all forms of a given mnemonic.
Mnemonic aliases are defined simply; for example, X86 has:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1710
1d019706d866 LLVM10 anatofuz parents: diff changeset	1711 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	1712
  def : MnemonicAlias<"cbw", "cbtw">;
  def : MnemonicAlias<"smovq", "movsq">;
  def : MnemonicAlias<"fldcww", "fldcw">;
  def : MnemonicAlias<"fucompi", "fucomip">;
  def : MnemonicAlias<"ud2a", "ud2">;
1d019706d866 LLVM10 anatofuz parents: diff changeset	1718
... and many others. With a MnemonicAlias definition, the mnemonic is remapped
simply and directly. Though MnemonicAliases can't look at any aspect of the
instruction (such as the operands), they can depend on global modes (the same
ones supported by the matcher), through a Requires clause:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1723
1d019706d866 LLVM10 anatofuz parents: diff changeset	1724 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	1725
  def : MnemonicAlias<"pushf", "pushfq">, Requires<[In64BitMode]>;
  def : MnemonicAlias<"pushf", "pushfl">, Requires<[In32BitMode]>;
1d019706d866 LLVM10 anatofuz parents: diff changeset	1728
1d019706d866 LLVM10 anatofuz parents: diff changeset	1729 In this example, the mnemonic gets mapped into a different one depending on
1d019706d866 LLVM10 anatofuz parents: diff changeset	1730 the current instruction set.
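
Conceptually, this phase behaves like a table lookup keyed on the mnemonic and
the current mode. The sketch below models that behavior in plain C++; it is not
the code that TableGen generates, and ``Mode``, ``MnemonicAliasEntry``,
``aliasTable``, and ``applyMnemonicAlias`` are names invented for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Mode { Any, In32Bit, In64Bit };

// Hypothetical table row: one MnemonicAlias definition.
struct MnemonicAliasEntry {
  const char *From;
  const char *To;
  Mode Required; // Mode::Any models a definition without a Requires clause
};

// A few of the X86 definitions shown in the text, written as data.
inline const std::vector<MnemonicAliasEntry> &aliasTable() {
  static const std::vector<MnemonicAliasEntry> Table = {
      {"cbw", "cbtw", Mode::Any},
      {"ud2a", "ud2", Mode::Any},
      {"pushf", "pushfq", Mode::In64Bit},
      {"pushf", "pushfl", Mode::In32Bit},
  };
  return Table;
}

// Remap by mnemonic and mode alone; operands are never consulted.
inline std::string applyMnemonicAlias(const std::string &Mnemonic,
                                      Mode Current) {
  assert(Current != Mode::Any && "the assembler is in one concrete mode");
  for (const MnemonicAliasEntry &E : aliasTable())
    if (Mnemonic == E.From &&
        (E.Required == Mode::Any || E.Required == Current))
      return E.To;
  return Mnemonic; // no alias applies; keep the mnemonic unchanged
}
```

Since the lookup never sees operands, anything that must depend on operand
kinds has to be expressed with a more powerful alias mechanism.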
1d019706d866 LLVM10 anatofuz parents: diff changeset	1731
1d019706d866 LLVM10 anatofuz parents: diff changeset	1732 Instruction Aliases
1d019706d866 LLVM10 anatofuz parents: diff changeset	1733 ^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	1734
1d019706d866 LLVM10 anatofuz parents: diff changeset	1735 The most general phase of alias processing occurs while matching is happening:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1736 it provides new forms for the matcher to match along with a specific instruction
1d019706d866 LLVM10 anatofuz parents: diff changeset	1737 to generate. An instruction alias has two parts: the string to match and the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1738 instruction to generate. For example:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1739
1d019706d866 LLVM10 anatofuz parents: diff changeset	1740 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	1741
  def : InstAlias<"movsx $src, $dst", (MOVSX16rr8W GR16:$dst, GR8 :$src)>;
  def : InstAlias<"movsx $src, $dst", (MOVSX16rm8W GR16:$dst, i8mem:$src)>;
  def : InstAlias<"movsx $src, $dst", (MOVSX32rr8 GR32:$dst, GR8 :$src)>;
  def : InstAlias<"movsx $src, $dst", (MOVSX32rr16 GR32:$dst, GR16 :$src)>;
  def : InstAlias<"movsx $src, $dst", (MOVSX64rr8 GR64:$dst, GR8 :$src)>;
  def : InstAlias<"movsx $src, $dst", (MOVSX64rr16 GR64:$dst, GR16 :$src)>;
  def : InstAlias<"movsx $src, $dst", (MOVSX64rr32 GR64:$dst, GR32 :$src)>;
1d019706d866 LLVM10 anatofuz parents: diff changeset	1749
1d019706d866 LLVM10 anatofuz parents: diff changeset	1750 This shows a powerful example of the instruction aliases, matching the same
1d019706d866 LLVM10 anatofuz parents: diff changeset	1751 mnemonic in multiple different ways depending on what operands are present in
1d019706d866 LLVM10 anatofuz parents: diff changeset	1752 the assembly. The result of instruction aliases can include operands in a
1d019706d866 LLVM10 anatofuz parents: diff changeset	1753 different order than the destination instruction, and can use an input multiple
1d019706d866 LLVM10 anatofuz parents: diff changeset	1754 times, for example:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1755
1d019706d866 LLVM10 anatofuz parents: diff changeset	1756 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	1757
  def : InstAlias<"clrb $reg", (XOR8rr GR8 :$reg, GR8 :$reg)>;
  def : InstAlias<"clrw $reg", (XOR16rr GR16:$reg, GR16:$reg)>;
  def : InstAlias<"clrl $reg", (XOR32rr GR32:$reg, GR32:$reg)>;
  def : InstAlias<"clrq $reg", (XOR64rr GR64:$reg, GR64:$reg)>;
1d019706d866 LLVM10 anatofuz parents: diff changeset	1762
1d019706d866 LLVM10 anatofuz parents: diff changeset	1763 This example also shows that tied operands are only listed once. In the X86
1d019706d866 LLVM10 anatofuz parents: diff changeset	1764 backend, XOR8rr has two input GR8's and one output GR8 (where an input is tied
1d019706d866 LLVM10 anatofuz parents: diff changeset	1765 to the output). InstAliases take a flattened operand list without duplicates
1d019706d866 LLVM10 anatofuz parents: diff changeset	1766 for tied operands. The result of an instruction alias can also use immediates
1d019706d866 LLVM10 anatofuz parents: diff changeset	1767 and fixed physical registers which are added as simple immediate operands in the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1768 result, for example:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1769
1d019706d866 LLVM10 anatofuz parents: diff changeset	1770 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	1771
  // Fixed Immediate operand.
  def : InstAlias<"aad", (AAD8i8 10)>;

  // Fixed register operand.
  def : InstAlias<"fcomi", (COM_FIr ST1)>;

  // Simple alias.
  def : InstAlias<"fcomi $reg", (COM_FIr RST:$reg)>;
1d019706d866 LLVM10 anatofuz parents: diff changeset	1780
1d019706d866 LLVM10 anatofuz parents: diff changeset	1781 Instruction aliases can also have a Requires clause to make them subtarget
1d019706d866 LLVM10 anatofuz parents: diff changeset	1782 specific.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1783
1d019706d866 LLVM10 anatofuz parents: diff changeset	1784 If the back-end supports it, the instruction printer can automatically emit the
1d019706d866 LLVM10 anatofuz parents: diff changeset	1785 alias rather than what's being aliased. It typically leads to better, more
1d019706d866 LLVM10 anatofuz parents: diff changeset	1786 readable code. If it's better to print out what's being aliased, then pass a '0'
1d019706d866 LLVM10 anatofuz parents: diff changeset	1787 as the third parameter to the InstAlias definition.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1788
1d019706d866 LLVM10 anatofuz parents: diff changeset	1789 Instruction Matching
1d019706d866 LLVM10 anatofuz parents: diff changeset	1790 --------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	1791
1d019706d866 LLVM10 anatofuz parents: diff changeset	1792 .. note::
1d019706d866 LLVM10 anatofuz parents: diff changeset	1793
   To Be Written
1d019706d866 LLVM10 anatofuz parents: diff changeset	1795
1d019706d866 LLVM10 anatofuz parents: diff changeset	1796 .. _Implementations of the abstract target description interfaces:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1797 .. _implement the target description:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1798
1d019706d866 LLVM10 anatofuz parents: diff changeset	1799 Target-specific Implementation Notes
1d019706d866 LLVM10 anatofuz parents: diff changeset	1800 ====================================
1d019706d866 LLVM10 anatofuz parents: diff changeset	1801
1d019706d866 LLVM10 anatofuz parents: diff changeset	1802 This section of the document explains features or design decisions that are
1d019706d866 LLVM10 anatofuz parents: diff changeset	1803 specific to the code generator for a particular target. First we start with a
1d019706d866 LLVM10 anatofuz parents: diff changeset	1804 table that summarizes what features are supported by each target.
1d019706d866 LLVM10 anatofuz parents: diff changeset	1805
1d019706d866 LLVM10 anatofuz parents: diff changeset	1806 .. _target-feature-matrix:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1807
1d019706d866 LLVM10 anatofuz parents: diff changeset	1808 Target Feature Matrix
1d019706d866 LLVM10 anatofuz parents: diff changeset	1809 ---------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	1810
Note that this table does not list features that are not yet fully supported by
any target. It considers a feature to be supported if at least one subtarget
supports it. A feature being supported means that it is useful and works for
most cases; it does not indicate that there are zero known bugs in the
implementation. Here is the key:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1816
1d019706d866 LLVM10 anatofuz parents: diff changeset	1817 :raw-html:`<table border="1" cellspacing="0">`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1818 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1819 :raw-html:`<th>Unknown</th>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1820 :raw-html:`<th>Not Applicable</th>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1821 :raw-html:`<th>No support</th>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1822 :raw-html:`<th>Partial Support</th>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1823 :raw-html:`<th>Complete Support</th>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1824 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1825 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1826 :raw-html:`<td class="unknown"></td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1827 :raw-html:`<td class="na"></td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1828 :raw-html:`<td class="no"></td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1829 :raw-html:`<td class="partial"></td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1830 :raw-html:`<td class="yes"></td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1831 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1832 :raw-html:`</table>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1833
1d019706d866 LLVM10 anatofuz parents: diff changeset	1834 Here is the table:
1d019706d866 LLVM10 anatofuz parents: diff changeset	1835
1d019706d866 LLVM10 anatofuz parents: diff changeset	1836 :raw-html:`<table width="689" border="1" cellspacing="0">`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1837 :raw-html:`<tr><td></td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1838 :raw-html:`<td colspan="13" align="center" style="background-color:#ffc">Target</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1839 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	1840 :raw-html:`<tr>`
:raw-html:`<th>Feature</th>`
:raw-html:`<th>ARM</th>`
:raw-html:`<th>Hexagon</th>`
:raw-html:`<th>MSP430</th>`
:raw-html:`<th>Mips</th>`
:raw-html:`<th>NVPTX</th>`
:raw-html:`<th>PowerPC</th>`
:raw-html:`<th>Sparc</th>`
:raw-html:`<th>SystemZ</th>`
:raw-html:`<th>X86</th>`
:raw-html:`<th>XCore</th>`
:raw-html:`<th>eBPF</th>`
:raw-html:`</tr>`

:raw-html:`<tr>`
:raw-html:`<td><a href="#feat_reliable">is generally reliable</a></td>`
:raw-html:`<td class="yes"></td> <!-- ARM -->`
:raw-html:`<td class="yes"></td> <!-- Hexagon -->`
:raw-html:`<td class="unknown"></td> <!-- MSP430 -->`
:raw-html:`<td class="yes"></td> <!-- Mips -->`
:raw-html:`<td class="yes"></td> <!-- NVPTX -->`
:raw-html:`<td class="yes"></td> <!-- PowerPC -->`
:raw-html:`<td class="yes"></td> <!-- Sparc -->`
:raw-html:`<td class="yes"></td> <!-- SystemZ -->`
:raw-html:`<td class="yes"></td> <!-- X86 -->`
:raw-html:`<td class="yes"></td> <!-- XCore -->`
:raw-html:`<td class="yes"></td> <!-- eBPF -->`
:raw-html:`</tr>`

:raw-html:`<tr>`
:raw-html:`<td><a href="#feat_asmparser">assembly parser</a></td>`
:raw-html:`<td class="no"></td> <!-- ARM -->`
:raw-html:`<td class="no"></td> <!-- Hexagon -->`
:raw-html:`<td class="no"></td> <!-- MSP430 -->`
:raw-html:`<td class="yes"></td> <!-- Mips -->`
:raw-html:`<td class="no"></td> <!-- NVPTX -->`
:raw-html:`<td class="yes"></td> <!-- PowerPC -->`
:raw-html:`<td class="no"></td> <!-- Sparc -->`
:raw-html:`<td class="yes"></td> <!-- SystemZ -->`
:raw-html:`<td class="yes"></td> <!-- X86 -->`
:raw-html:`<td class="no"></td> <!-- XCore -->`
:raw-html:`<td class="no"></td> <!-- eBPF -->`
:raw-html:`</tr>`

:raw-html:`<tr>`
:raw-html:`<td><a href="#feat_disassembler">disassembler</a></td>`
:raw-html:`<td class="yes"></td> <!-- ARM -->`
:raw-html:`<td class="no"></td> <!-- Hexagon -->`
:raw-html:`<td class="no"></td> <!-- MSP430 -->`
:raw-html:`<td class="yes"></td> <!-- Mips -->`
:raw-html:`<td class="na"></td> <!-- NVPTX -->`
:raw-html:`<td class="yes"></td> <!-- PowerPC -->`
:raw-html:`<td class="yes"></td> <!-- Sparc -->`
:raw-html:`<td class="yes"></td> <!-- SystemZ -->`
:raw-html:`<td class="yes"></td> <!-- X86 -->`
:raw-html:`<td class="yes"></td> <!-- XCore -->`
:raw-html:`<td class="yes"></td> <!-- eBPF -->`
:raw-html:`</tr>`

:raw-html:`<tr>`
:raw-html:`<td><a href="#feat_inlineasm">inline asm</a></td>`
:raw-html:`<td class="yes"></td> <!-- ARM -->`
:raw-html:`<td class="yes"></td> <!-- Hexagon -->`
:raw-html:`<td class="unknown"></td> <!-- MSP430 -->`
:raw-html:`<td class="yes"></td> <!-- Mips -->`
:raw-html:`<td class="yes"></td> <!-- NVPTX -->`
:raw-html:`<td class="yes"></td> <!-- PowerPC -->`
:raw-html:`<td class="unknown"></td> <!-- Sparc -->`
:raw-html:`<td class="yes"></td> <!-- SystemZ -->`
:raw-html:`<td class="yes"></td> <!-- X86 -->`
:raw-html:`<td class="yes"></td> <!-- XCore -->`
:raw-html:`<td class="no"></td> <!-- eBPF -->`
:raw-html:`</tr>`

:raw-html:`<tr>`
:raw-html:`<td><a href="#feat_jit">jit</a></td>`
:raw-html:`<td class="partial"><a href="#feat_jit_arm">*</a></td> <!-- ARM -->`
:raw-html:`<td class="no"></td> <!-- Hexagon -->`
:raw-html:`<td class="unknown"></td> <!-- MSP430 -->`
:raw-html:`<td class="yes"></td> <!-- Mips -->`
:raw-html:`<td class="na"></td> <!-- NVPTX -->`
:raw-html:`<td class="yes"></td> <!-- PowerPC -->`
:raw-html:`<td class="unknown"></td> <!-- Sparc -->`
:raw-html:`<td class="yes"></td> <!-- SystemZ -->`
:raw-html:`<td class="yes"></td> <!-- X86 -->`
:raw-html:`<td class="no"></td> <!-- XCore -->`
:raw-html:`<td class="yes"></td> <!-- eBPF -->`
:raw-html:`</tr>`

:raw-html:`<tr>`
:raw-html:`<td><a href="#feat_objectwrite">.o file writing</a></td>`
:raw-html:`<td class="no"></td> <!-- ARM -->`
:raw-html:`<td class="no"></td> <!-- Hexagon -->`
:raw-html:`<td class="no"></td> <!-- MSP430 -->`
:raw-html:`<td class="yes"></td> <!-- Mips -->`
:raw-html:`<td class="na"></td> <!-- NVPTX -->`
:raw-html:`<td class="yes"></td> <!-- PowerPC -->`
:raw-html:`<td class="no"></td> <!-- Sparc -->`
:raw-html:`<td class="yes"></td> <!-- SystemZ -->`
:raw-html:`<td class="yes"></td> <!-- X86 -->`
:raw-html:`<td class="no"></td> <!-- XCore -->`
:raw-html:`<td class="yes"></td> <!-- eBPF -->`
:raw-html:`</tr>`

:raw-html:`<tr>`
:raw-html:`<td><a href="#feat_tailcall">tail calls</a></td>`
:raw-html:`<td class="yes"></td> <!-- ARM -->`
:raw-html:`<td class="yes"></td> <!-- Hexagon -->`
:raw-html:`<td class="unknown"></td> <!-- MSP430 -->`
:raw-html:`<td class="yes"></td> <!-- Mips -->`
:raw-html:`<td class="no"></td> <!-- NVPTX -->`
:raw-html:`<td class="yes"></td> <!-- PowerPC -->`
:raw-html:`<td class="unknown"></td> <!-- Sparc -->`
:raw-html:`<td class="no"></td> <!-- SystemZ -->`
:raw-html:`<td class="yes"></td> <!-- X86 -->`
:raw-html:`<td class="no"></td> <!-- XCore -->`
:raw-html:`<td class="no"></td> <!-- eBPF -->`
:raw-html:`</tr>`

:raw-html:`<tr>`
:raw-html:`<td><a href="#feat_segstacks">segmented stacks</a></td>`
:raw-html:`<td class="no"></td> <!-- ARM -->`
:raw-html:`<td class="no"></td> <!-- Hexagon -->`
:raw-html:`<td class="no"></td> <!-- MSP430 -->`
:raw-html:`<td class="no"></td> <!-- Mips -->`
:raw-html:`<td class="no"></td> <!-- NVPTX -->`
:raw-html:`<td class="no"></td> <!-- PowerPC -->`
:raw-html:`<td class="no"></td> <!-- Sparc -->`
:raw-html:`<td class="no"></td> <!-- SystemZ -->`
:raw-html:`<td class="partial"><a href="#feat_segstacks_x86">*</a></td> <!-- X86 -->`
:raw-html:`<td class="no"></td> <!-- XCore -->`
:raw-html:`<td class="no"></td> <!-- eBPF -->`
:raw-html:`</tr>`

:raw-html:`</table>`

.. _feat_reliable:

Is Generally Reliable
^^^^^^^^^^^^^^^^^^^^^

This box indicates whether the target is considered to be production quality:
that is, it has been used as a static compiler to compile large amounts of code
by a variety of different people and is in continuous use.

.. _feat_asmparser:

Assembly Parser
^^^^^^^^^^^^^^^

This box indicates whether the target supports parsing target-specific .s files
by implementing the MCAsmParser interface. This is required for llvm-mc to be
able to act as a native assembler and is required for inline assembly support in
the native .o file writer.

.. _feat_disassembler:

Disassembler
^^^^^^^^^^^^

This box indicates whether the target supports the MCDisassembler API for
disassembling machine opcode bytes into MCInsts.

.. _feat_inlineasm:

Inline Asm
^^^^^^^^^^

This box indicates whether the target supports the most popular inline assembly
constraints and modifiers.

.. _feat_jit:

JIT Support
^^^^^^^^^^^

This box indicates whether the target supports the JIT compiler through the
ExecutionEngine interface.

.. _feat_jit_arm:

The ARM backend has basic support for integer code in ARM codegen mode, but
lacks NEON and full Thumb support.

.. _feat_objectwrite:

.o File Writing
^^^^^^^^^^^^^^^

This box indicates whether the target supports writing .o files (e.g. MachO,
ELF, and/or COFF) directly from the target. Note that for full inline assembly
support in the .o writer, the target must also include an assembly parser and
general inline assembly support.

Targets that don't support this feature can still write out .o files; they
just rely on an external assembler to translate a .s file into a .o file (as is
the case for many C compilers).

.. _feat_tailcall:

Tail Calls
^^^^^^^^^^

This box indicates whether the target supports guaranteed tail calls. These are
calls marked "`tail <LangRef.html#i_call>`_" that use the fastcc calling
convention. Please see the `tail call section`_ for more details.

.. _feat_segstacks:

Segmented Stacks
^^^^^^^^^^^^^^^^

This box indicates whether the target supports segmented stacks. This replaces
the traditional large C stack with many linked segments. It is compatible with
the `gcc implementation <http://gcc.gnu.org/wiki/SplitStacks>`_ used by the Go
front end.

.. _feat_segstacks_x86:

Basic support exists on the X86 backend. Currently varargs do not work and the
object files are not marked the way the gold linker expects, but simple Go
programs can be built by dragonegg.

.. _tail call section:

Tail call optimization
----------------------

Tail call optimization, in which the callee reuses the stack of the caller, is
currently supported on x86/x86-64, PowerPC, AArch64, and WebAssembly. It is
performed on x86/x86-64, PowerPC, and AArch64 if:

* Caller and callee have the calling convention ``fastcc``, ``cc 10`` (GHC
  calling convention), ``cc 11`` (HiPE calling convention), ``tailcc``, or
  ``swifttailcc``.

* The call is a tail call, that is, it is in tail position (ret immediately
  follows call and ret uses value of call or is void).

* Option ``-tailcallopt`` is enabled or the calling convention is ``tailcc``.

* Platform-specific constraints are met.

x86/x86-64 constraints:

* No variable argument lists are used.

* On x86-64 when generating GOT/PIC code only module-local calls (visibility =
  hidden or protected) are supported.

PowerPC constraints:

* No variable argument lists are used.

* No byval parameters are used.

* On ppc32/64 GOT/PIC only module-local calls (visibility = hidden or protected)
  are supported.

WebAssembly constraints:

* No variable argument lists are used.

* The 'tail-call' target attribute is enabled.

* The caller and callee's return types must match. The caller cannot
  be void unless the callee is, too.

AArch64 constraints:

* No variable argument lists are used.

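The conditions above can be read as a single predicate. The following is a
minimal C++ sketch of that reading, not LLVM's implementation: ``TailCC``,
``TailCallSite``, and ``mayTailCallOptimize`` are hypothetical names, the sketch
assumes that caller and callee must each use one of the listed conventions, and
it models only the one platform constraint shared by all four targets (no
variable argument lists). The PIC/visibility, byval, and WebAssembly-specific
constraints are left out.

```cpp
// Hypothetical model of the tail call conditions; not LLVM's own types.
enum class TailCC { C, Fast, GHC, HiPE, Tail, SwiftTail };

struct TailCallSite {
  TailCC callerCC;
  TailCC calleeCC;
  bool inTailPosition; // ret immediately follows the call
  bool usesVarArgs;    // variable argument lists rule the optimization out
};

// True for the conventions named in the first bullet above.
constexpr bool isTailCallCapable(TailCC cc) {
  return cc == TailCC::Fast || cc == TailCC::GHC || cc == TailCC::HiPE ||
         cc == TailCC::Tail || cc == TailCC::SwiftTail;
}

// tailCallOptFlag stands for the -tailcallopt option.
constexpr bool mayTailCallOptimize(const TailCallSite &cs,
                                   bool tailCallOptFlag) {
  return isTailCallCapable(cs.callerCC) && isTailCallCapable(cs.calleeCC) &&
         cs.inTailPosition &&
         (tailCallOptFlag || cs.calleeCC == TailCC::Tail) &&
         !cs.usesVarArgs;
}

// fastcc needs -tailcallopt; tailcc does not.
static_assert(mayTailCallOptimize(
                  TailCallSite{TailCC::Fast, TailCC::Fast, true, false}, true),
              "fastcc with -tailcallopt qualifies");
static_assert(!mayTailCallOptimize(
                  TailCallSite{TailCC::Fast, TailCC::Fast, true, false}, false),
              "fastcc without -tailcallopt does not qualify");
static_assert(mayTailCallOptimize(
                  TailCallSite{TailCC::Tail, TailCC::Tail, true, false}, false),
              "tailcc qualifies without the option");
```

The ``static_assert`` lines only check the sketch against the bullets above;
they say nothing about what a particular backend will actually emit.
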
Example:

Call as ``llc -tailcallopt test.ll``.

.. code-block:: llvm

  declare fastcc i32 @tailcallee(i32 inreg %a1, i32 inreg %a2, i32 %a3, i32 %a4)

  define fastcc i32 @tailcaller(i32 %in1, i32 %in2) {
    %l1 = add i32 %in1, %in2
    ; The call is in tail position: the ret below returns its value directly.
    %tmp = tail call fastcc i32 @tailcallee(i32 inreg %in1, i32 inreg %in2, i32 %in1, i32 %l1)
    ret i32 %tmp
  }

Implications of ``-tailcallopt``:

To support tail call optimization in situations where the callee has more
arguments than the caller, a 'callee pops arguments' convention is used. This
currently causes each ``fastcc`` call that is not tail call optimized (because
one or more of the above constraints are not met) to be followed by a
readjustment of the stack, so performance might be worse in such cases.

Sibling call optimization
-------------------------

Sibling call optimization is a restricted form of tail call optimization.
Unlike the tail call optimization described in the previous section, it can be
performed automatically on any tail call when the ``-tailcallopt`` option is
not specified.

Sibling call optimization is currently performed on x86/x86-64 when the
following constraints are met:

* Caller and callee have the same calling convention. It can be either ``c`` or
  ``fastcc``.

* The call is a tail call, that is, it is in tail position (ret immediately
  follows call and ret uses value of call or is void).

* Caller and callee have matching return type or the callee result is not used.

* If any of the callee arguments are passed on the stack, they must be
  available in the caller's own incoming argument stack and the frame offsets
  must be the same.

Example:

.. code-block:: llvm

  declare i32 @bar(i32, i32)

  define i32 @foo(i32 %a, i32 %b, i32 %c) {
  entry:
    %0 = tail call i32 @bar(i32 %a, i32 %b)
    ret i32 %0
  }

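The four sibling call constraints can likewise be written as one predicate. This
is a minimal C++ sketch under stated assumptions, not the X86 backend's actual
check: ``SibCC``, ``SiblingCallSite``, and ``maySiblingCallOptimize`` are
hypothetical names, and the stack-argument condition is collapsed into a single
precomputed flag.

```cpp
// Hypothetical model of the sibling call constraints; not LLVM's own types.
enum class SibCC { C, Fast, Other };

struct SiblingCallSite {
  SibCC callerCC;
  SibCC calleeCC;
  bool inTailPosition;     // ret immediately follows the call
  bool returnTypesMatch;   // caller and callee return the same type
  bool calleeResultUnused; // the call's result is ignored
  // True when no argument is passed on the stack, or when every stack
  // argument already sits in the caller's incoming argument area at the
  // same frame offset.
  bool stackArgsAlreadyInPlace;
};

constexpr bool maySiblingCallOptimize(const SiblingCallSite &cs) {
  return cs.callerCC == cs.calleeCC &&
         (cs.callerCC == SibCC::C || cs.callerCC == SibCC::Fast) &&
         cs.inTailPosition &&
         (cs.returnTypesMatch || cs.calleeResultUnused) &&
         cs.stackArgsAlreadyInPlace;
}

// A C-convention caller and callee, both returning i32, with the call in
// tail position and no stack arguments that would need to move.
static_assert(maySiblingCallOptimize(SiblingCallSite{
                  SibCC::C, SibCC::C, true, true, false, true}),
              "matching conventions and return types qualify");
static_assert(!maySiblingCallOptimize(SiblingCallSite{
                  SibCC::C, SibCC::Fast, true, true, false, true}),
              "mismatched calling conventions do not qualify");
```
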
The X86 backend
---------------

The X86 code generator lives in the ``lib/Target/X86`` directory. This code
generator is capable of targeting a variety of x86-32 and x86-64 processors, and
includes support for ISA extensions such as MMX and SSE.

X86 Target Triples supported
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following are the known target triples that are supported by the X86
backend. This is not an exhaustive list, and it would be useful to add those
that people test.

* i686-pc-linux-gnu --- Linux

* i386-unknown-freebsd5.3 --- FreeBSD 5.3

* i686-pc-cygwin --- Cygwin on Win32

* i686-pc-mingw32 --- MinGW on Win32

* i386-pc-mingw32msvc --- MinGW cross-compiler on Linux

* i686-apple-darwin* --- Apple Darwin on X86

* x86_64-unknown-linux-gnu --- Linux

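Each triple listed above is a dash-separated string of the form
architecture-vendor-OS with an optional fourth environment component. As a
rough illustration only, the hypothetical helper ``splitTriple`` below breaks a
triple into those components. It is a sketch that performs no normalization or
validation; inside LLVM the ``llvm::Triple`` class is the proper way to handle
triples.

```cpp
#include <string>
#include <vector>

// Split a target triple into at most four components:
// architecture, vendor, OS, and an optional environment.
// Hypothetical helper for illustration; it does not validate its input.
std::vector<std::string> splitTriple(const std::string &triple) {
  std::vector<std::string> parts;
  std::string::size_type start = 0;
  // Peel off up to three dash-terminated components.
  while (parts.size() < 3) {
    const std::string::size_type dash = triple.find('-', start);
    if (dash == std::string::npos)
      break;
    parts.push_back(triple.substr(start, dash - start));
    start = dash + 1;
  }
  // Whatever remains is the final component (OS or environment).
  parts.push_back(triple.substr(start));
  return parts;
}
```

With this helper, ``i686-pc-linux-gnu`` splits into four components
(``i686``, ``pc``, ``linux``, ``gnu``), while ``i386-unknown-freebsd5.3`` splits
into three, the last being ``freebsd5.3``.
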
X86 Calling Conventions supported
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following target-specific calling conventions are known to the backend:

* x86_StdCall --- stdcall calling convention seen on Microsoft Windows
  platform (CC ID = 64).

* x86_FastCall --- fastcall calling convention seen on Microsoft Windows
  platform (CC ID = 65).

* x86_ThisCall --- Similar to X86_StdCall. Passes first argument in ECX,
  others via stack. Callee is responsible for stack cleaning. This convention is
  used by MSVC by default for methods in its ABI (CC ID = 70).

1d019706d866 LLVM10 anatofuz parents: diff changeset	2213 .. _X86 addressing mode:
1d019706d866 LLVM10 anatofuz parents: diff changeset	2214
1d019706d866 LLVM10 anatofuz parents: diff changeset	2215 Representing X86 addressing modes in MachineInstrs
1d019706d866 LLVM10 anatofuz parents: diff changeset	2216 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	2217
1d019706d866 LLVM10 anatofuz parents: diff changeset	2218 The x86 has a very flexible way of accessing memory. It is capable of forming
1d019706d866 LLVM10 anatofuz parents: diff changeset	2219 memory addresses of the following expression directly in integer instructions
1d019706d866 LLVM10 anatofuz parents: diff changeset	2220 (which use ModR/M addressing):
1d019706d866 LLVM10 anatofuz parents: diff changeset	2221
1d019706d866 LLVM10 anatofuz parents: diff changeset	2222 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	2223
1d019706d866 LLVM10 anatofuz parents: diff changeset	2224 SegmentReg: Base + [1,2,4,8] * IndexReg + Disp32
1d019706d866 LLVM10 anatofuz parents: diff changeset	2225
1d019706d866 LLVM10 anatofuz parents: diff changeset	2226 In order to represent this, LLVM tracks no less than 5 operands for each memory
1d019706d866 LLVM10 anatofuz parents: diff changeset	2227 operand of this form. This means that the "load" form of '``mov``' has the
1d019706d866 LLVM10 anatofuz parents: diff changeset	2228 following ``MachineOperand``\s in this order:
1d019706d866 LLVM10 anatofuz parents: diff changeset	2229
1d019706d866 LLVM10 anatofuz parents: diff changeset	2230 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	2231
1d019706d866 LLVM10 anatofuz parents: diff changeset	2232 Index:        0     |    1        2       3           4          5
1d019706d866 LLVM10 anatofuz parents: diff changeset	2233 Meaning:   DestReg, | BaseReg,  Scale, IndexReg, Displacement Segment
1d019706d866 LLVM10 anatofuz parents: diff changeset	2234 OperandTy: VirtReg, | VirtReg, UnsImm, VirtReg,   SignExtImm  PhysReg
1d019706d866 LLVM10 anatofuz parents: diff changeset	2235
1d019706d866 LLVM10 anatofuz parents: diff changeset	2236 Stores, and all other instructions, treat the five memory operands in the same
1d019706d866 LLVM10 anatofuz parents: diff changeset	2237 way and in the same order. If the segment register is unspecified (regno = 0),
1d019706d866 LLVM10 anatofuz parents: diff changeset	2238 then no segment override is generated. "Lea" operations do not have a segment
1d019706d866 LLVM10 anatofuz parents: diff changeset	2239 register specified, so they only have 4 operands for their memory reference.
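
As an illustration of the address expression above, the following is a minimal sketch that evaluates it from the five memory operand values. The function name ``effective_address`` and the ``segment_base`` and ``bits`` parameters are hypothetical and exist only for this example; it models a flat segment base and does not reflect how LLVM itself encodes or evaluates operands.

```python
def effective_address(base, scale, index, disp, segment_base=0, bits=32):
    """Evaluate SegmentBase + Base + Scale * Index + Disp, wrapped to `bits`.

    Hypothetical helper for illustration only. `segment_base` stands in for
    the base address of the selected segment (0 when no override is used).
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be one of 1, 2, 4, 8")
    mask = (1 << bits) - 1
    # The displacement is a signed immediate, so negative values are allowed.
    return (segment_base + base + scale * index + disp) & mask


if __name__ == "__main__":
    # Base 0x1000, index 3 scaled by 4, displacement -8.
    print(hex(effective_address(0x1000, 4, 3, -8)))
```

The scale check mirrors the ``[1,2,4,8]`` restriction in the expression; any other factor has to be computed with separate instructions.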
1d019706d866 LLVM10 anatofuz parents: diff changeset	2240
1d019706d866 LLVM10 anatofuz parents: diff changeset	2241 X86 address spaces supported
1d019706d866 LLVM10 anatofuz parents: diff changeset	2242 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	2243
1d019706d866 LLVM10 anatofuz parents: diff changeset	2244 x86 has a feature which provides the ability to perform loads and stores to
1d019706d866 LLVM10 anatofuz parents: diff changeset	2245 different address spaces via the x86 segment registers. A segment override
1d019706d866 LLVM10 anatofuz parents: diff changeset	2246 prefix byte on an instruction causes the instruction's memory access to go to
1d019706d866 LLVM10 anatofuz parents: diff changeset	2247 the specified segment. LLVM address space 0 is the default address space, which
1d019706d866 LLVM10 anatofuz parents: diff changeset	2248 includes the stack, and any unqualified memory accesses in a program. Address
1d019706d866 LLVM10 anatofuz parents: diff changeset	2249 spaces 1-255 are currently reserved for user-defined code. The GS-segment is
1d019706d866 LLVM10 anatofuz parents: diff changeset	2250 represented by address space 256, the FS-segment is represented by address space
1d019706d866 LLVM10 anatofuz parents: diff changeset	2251 257, and the SS-segment is represented by address space 258. Other x86 segments
1d019706d866 LLVM10 anatofuz parents: diff changeset	2252 have yet to be allocated address space numbers.
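
The numbering described above can be summarized in a small lookup. This is only a sketch of the mapping stated in this section; the names ``X86_SEGMENT_ADDRESS_SPACES`` and ``describe_address_space`` are hypothetical and are not LLVM APIs.

```python
# Address space numbers that this section assigns to x86 segments.
X86_SEGMENT_ADDRESS_SPACES = {256: "GS", 257: "FS", 258: "SS"}


def describe_address_space(number):
    """Classify an LLVM address space number as described in the text."""
    if number < 0:
        raise ValueError("address space numbers are non-negative")
    if number == 0:
        return "default"
    if 1 <= number <= 255:
        return "user-defined"
    # Anything else either maps to a segment or has no assignment yet.
    return X86_SEGMENT_ADDRESS_SPACES.get(number, "unallocated")


if __name__ == "__main__":
    for n in (0, 42, 256, 257, 258, 259):
        print(n, describe_address_space(n))
```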
1d019706d866 LLVM10 anatofuz parents: diff changeset	2253
1d019706d866 LLVM10 anatofuz parents: diff changeset	2254 While these address spaces may seem similar to TLS via the ``thread_local``
1d019706d866 LLVM10 anatofuz parents: diff changeset	2255 keyword, and often use the same underlying hardware, there are some fundamental
1d019706d866 LLVM10 anatofuz parents: diff changeset	2256 differences.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2257
1d019706d866 LLVM10 anatofuz parents: diff changeset	2258 The ``thread_local`` keyword applies to global variables and specifies that they
1d019706d866 LLVM10 anatofuz parents: diff changeset	2259 are to be allocated in thread-local memory. There are no type qualifiers
1d019706d866 LLVM10 anatofuz parents: diff changeset	2260 involved, and these variables can be pointed to with normal pointers and
1d019706d866 LLVM10 anatofuz parents: diff changeset	2261 accessed with normal loads and stores. The ``thread_local`` keyword is
1d019706d866 LLVM10 anatofuz parents: diff changeset	2262 target-independent at the LLVM IR level (though LLVM doesn't yet have
1d019706d866 LLVM10 anatofuz parents: diff changeset	2263 implementations of it for some configurations).
1d019706d866 LLVM10 anatofuz parents: diff changeset	2264
1d019706d866 LLVM10 anatofuz parents: diff changeset	2265 Special address spaces, in contrast, apply to static types. Every load and store
1d019706d866 LLVM10 anatofuz parents: diff changeset	2266 has a particular address space in its address operand type, and this is what
1d019706d866 LLVM10 anatofuz parents: diff changeset	2267 determines which address space is accessed. LLVM ignores these special address
1d019706d866 LLVM10 anatofuz parents: diff changeset	2268 space qualifiers on global variables, and does not provide a way to directly
1d019706d866 LLVM10 anatofuz parents: diff changeset	2269 allocate storage in them. At the LLVM IR level, the behavior of these special
1d019706d866 LLVM10 anatofuz parents: diff changeset	2270 address spaces depends in part on the underlying OS or runtime environment, and
1d019706d866 LLVM10 anatofuz parents: diff changeset	2271 they are specific to x86 (and LLVM doesn't yet handle them correctly in some
1d019706d866 LLVM10 anatofuz parents: diff changeset	2272 cases).
1d019706d866 LLVM10 anatofuz parents: diff changeset	2273
1d019706d866 LLVM10 anatofuz parents: diff changeset	2274 Some operating systems and runtime environments use (or may in the future use)
1d019706d866 LLVM10 anatofuz parents: diff changeset	2275 the FS/GS-segment registers for various low-level purposes, so care should be
1d019706d866 LLVM10 anatofuz parents: diff changeset	2276 taken when considering them.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2277
1d019706d866 LLVM10 anatofuz parents: diff changeset	2278 Instruction naming
1d019706d866 LLVM10 anatofuz parents: diff changeset	2279 ^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	2280
1d019706d866 LLVM10 anatofuz parents: diff changeset	2281 An instruction name consists of the base name, a default operand size, and a
1d019706d866 LLVM10 anatofuz parents: diff changeset	2282 character per operand with an optional special size. For example:
1d019706d866 LLVM10 anatofuz parents: diff changeset	2283
1d019706d866 LLVM10 anatofuz parents: diff changeset	2284 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	2285
1d019706d866 LLVM10 anatofuz parents: diff changeset	2286 ADD8rr -> add, 8-bit register, 8-bit register
1d019706d866 LLVM10 anatofuz parents: diff changeset	2287 IMUL16rmi -> imul, 16-bit register, 16-bit memory, 16-bit immediate
1d019706d866 LLVM10 anatofuz parents: diff changeset	2288 IMUL16rmi8 -> imul, 16-bit register, 16-bit memory, 8-bit immediate
1d019706d866 LLVM10 anatofuz parents: diff changeset	2289 MOVSX32rm16 -> movsx, 32-bit register, 16-bit memory
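
The naming scheme can be sketched as a small decoder. This is a minimal sketch that assumes names follow exactly the pattern of the four examples above (uppercase base name, default size, then one lowercase letter per operand with an optional size); real instruction names in the backend may carry suffixes or forms that it does not handle. The function name ``decode_instruction_name`` is hypothetical.

```python
import re

# Base name, default operand size, then one letter (plus optional size) per operand.
_NAME = re.compile(r"^([A-Z]+)(\d+)((?:[a-z]\d*)+)$")
_OPERAND = re.compile(r"([a-z])(\d*)")
_KINDS = {"r": "register", "m": "memory", "i": "immediate"}


def decode_instruction_name(name):
    """Split a name such as 'IMUL16rmi8' into a mnemonic and operand list."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError("name does not follow the simple pattern: " + name)
    base, default_size, operands = match.groups()
    decoded = []
    for kind, size in _OPERAND.findall(operands):
        # An operand without its own size uses the default operand size.
        bits = int(size) if size else int(default_size)
        decoded.append((_KINDS.get(kind, kind), bits))
    return base.lower(), decoded


if __name__ == "__main__":
    for example in ("ADD8rr", "IMUL16rmi", "IMUL16rmi8", "MOVSX32rm16"):
        print(example, "->", decode_instruction_name(example))
```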
1d019706d866 LLVM10 anatofuz parents: diff changeset	2290
1d019706d866 LLVM10 anatofuz parents: diff changeset	2291 The PowerPC backend
1d019706d866 LLVM10 anatofuz parents: diff changeset	2292 -------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	2293
1d019706d866 LLVM10 anatofuz parents: diff changeset	2294 The PowerPC code generator lives in the lib/Target/PowerPC directory. The code
1d019706d866 LLVM10 anatofuz parents: diff changeset	2295 generation is retargetable to several variations or subtargets of the PowerPC
1d019706d866 LLVM10 anatofuz parents: diff changeset	2296 ISA, including ppc32, ppc64 and altivec.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2297
1d019706d866 LLVM10 anatofuz parents: diff changeset	2298 LLVM PowerPC ABI
1d019706d866 LLVM10 anatofuz parents: diff changeset	2299 ^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	2300
1d019706d866 LLVM10 anatofuz parents: diff changeset	2301 LLVM follows the AIX PowerPC ABI, with two deviations. First, LLVM uses PC-relative
1d019706d866 LLVM10 anatofuz parents: diff changeset	2302 (PIC) or static addressing for accessing global values, so no TOC (r2) is
1d019706d866 LLVM10 anatofuz parents: diff changeset	2303 used. Second, r31 is used as a frame pointer to allow dynamic growth of a stack
1d019706d866 LLVM10 anatofuz parents: diff changeset	2304 frame. LLVM takes advantage of having no TOC to provide space to save the frame
1d019706d866 LLVM10 anatofuz parents: diff changeset	2305 pointer in the PowerPC linkage area of the caller frame. Other details of
1d019706d866 LLVM10 anatofuz parents: diff changeset	2306 PowerPC ABI can be found at `PowerPC ABI
1d019706d866 LLVM10 anatofuz parents: diff changeset	2307 <http://developer.apple.com/documentation/DeveloperTools/Conceptual/LowLevelABI/Articles/32bitPowerPC.html>`_\
1d019706d866 LLVM10 anatofuz parents: diff changeset	2308 . Note: This link describes the 32 bit ABI. The 64 bit ABI is similar, except that
1d019706d866 LLVM10 anatofuz parents: diff changeset	2309 the space for each GPR is 8 bytes wide (not 4) and r13 is reserved for system use.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2310
1d019706d866 LLVM10 anatofuz parents: diff changeset	2311 Frame Layout
1d019706d866 LLVM10 anatofuz parents: diff changeset	2312 ^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	2313
1d019706d866 LLVM10 anatofuz parents: diff changeset	2314 The size of a PowerPC frame is usually fixed for the duration of a function's
1d019706d866 LLVM10 anatofuz parents: diff changeset	2315 invocation. Since the frame is fixed size, all references into the frame can be
1d019706d866 LLVM10 anatofuz parents: diff changeset	2316 accessed via fixed offsets from the stack pointer. The exception is when
1d019706d866 LLVM10 anatofuz parents: diff changeset	2317 dynamic alloca or variable sized arrays are present; then a base pointer
1d019706d866 LLVM10 anatofuz parents: diff changeset	2318 (r31) is used as a proxy for the stack pointer and the stack pointer is free to grow
1d019706d866 LLVM10 anatofuz parents: diff changeset	2319 or shrink. A base pointer is also used if llvm-gcc is not passed the
1d019706d866 LLVM10 anatofuz parents: diff changeset	2320 -fomit-frame-pointer flag. The stack pointer is always aligned to 16 bytes, so
1d019706d866 LLVM10 anatofuz parents: diff changeset	2321 that space allocated for altivec vectors will be properly aligned.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2322
1d019706d866 LLVM10 anatofuz parents: diff changeset	2323 An invocation frame is laid out as follows (low memory at top):
1d019706d866 LLVM10 anatofuz parents: diff changeset	2324
1d019706d866 LLVM10 anatofuz parents: diff changeset	2325 :raw-html:`<table border="1" cellspacing="0">`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2326 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2327 :raw-html:`<td>Linkage<br><br></td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2328 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2329 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2330 :raw-html:`<td>Parameter area<br><br></td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2331 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2332 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2333 :raw-html:`<td>Dynamic area<br><br></td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2334 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2335 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2336 :raw-html:`<td>Locals area<br><br></td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2337 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2338 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2339 :raw-html:`<td>Saved registers area<br><br></td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2340 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2341 :raw-html:`<tr style="border-style: none hidden none hidden;">`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2342 :raw-html:`<td><br></td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2343 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2344 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2345 :raw-html:`<td>Previous Frame<br><br></td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2346 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2347 :raw-html:`</table>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2348
1d019706d866 LLVM10 anatofuz parents: diff changeset	2349 The linkage area is used by a callee to save special registers prior to
1d019706d866 LLVM10 anatofuz parents: diff changeset	2350 allocating its own frame. Only three entries are relevant to LLVM. The first
1d019706d866 LLVM10 anatofuz parents: diff changeset	2351 entry is the previous stack pointer (sp), aka link. This allows probing tools
1d019706d866 LLVM10 anatofuz parents: diff changeset	2352 like gdb or exception handlers to quickly scan the frames in the stack. A
1d019706d866 LLVM10 anatofuz parents: diff changeset	2353 function epilog can also use the link to pop the frame from the stack. The
1d019706d866 LLVM10 anatofuz parents: diff changeset	2354 third entry in the linkage area is used to save the return address from the lr
1d019706d866 LLVM10 anatofuz parents: diff changeset	2355 register. Finally, as mentioned above, the last entry is used to save the
1d019706d866 LLVM10 anatofuz parents: diff changeset	2356 previous frame pointer (r31). The entries in the linkage area are the size of a
1d019706d866 LLVM10 anatofuz parents: diff changeset	2357 GPR, thus the linkage area is 24 bytes long in 32 bit mode and 48 bytes in 64
1d019706d866 LLVM10 anatofuz parents: diff changeset	2358 bit mode.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2359
1d019706d866 LLVM10 anatofuz parents: diff changeset	2360 32 bit linkage area:
1d019706d866 LLVM10 anatofuz parents: diff changeset	2361
1d019706d866 LLVM10 anatofuz parents: diff changeset	2362 :raw-html:`<table border="1" cellspacing="0">`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2363 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2364 :raw-html:`<td>0</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2365 :raw-html:`<td>Saved SP (r1)</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2366 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2367 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2368 :raw-html:`<td>4</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2369 :raw-html:`<td>Saved CR</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2370 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2371 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2372 :raw-html:`<td>8</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2373 :raw-html:`<td>Saved LR</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2374 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2375 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2376 :raw-html:`<td>12</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2377 :raw-html:`<td>Reserved</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2378 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2379 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2380 :raw-html:`<td>16</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2381 :raw-html:`<td>Reserved</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2382 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2383 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2384 :raw-html:`<td>20</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2385 :raw-html:`<td>Saved FP (r31)</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2386 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2387 :raw-html:`</table>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2388
1d019706d866 LLVM10 anatofuz parents: diff changeset	2389 64 bit linkage area:
1d019706d866 LLVM10 anatofuz parents: diff changeset	2390
1d019706d866 LLVM10 anatofuz parents: diff changeset	2391 :raw-html:`<table border="1" cellspacing="0">`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2392 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2393 :raw-html:`<td>0</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2394 :raw-html:`<td>Saved SP (r1)</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2395 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2396 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2397 :raw-html:`<td>8</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2398 :raw-html:`<td>Saved CR</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2399 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2400 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2401 :raw-html:`<td>16</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2402 :raw-html:`<td>Saved LR</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2403 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2404 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2405 :raw-html:`<td>24</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2406 :raw-html:`<td>Reserved</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2407 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2408 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2409 :raw-html:`<td>32</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2410 :raw-html:`<td>Reserved</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2411 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2412 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2413 :raw-html:`<td>40</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2414 :raw-html:`<td>Saved FP (r31)</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2415 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2416 :raw-html:`</table>`
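
Both linkage area tables follow from the rule that every entry is the size of a GPR. A minimal sketch of that rule, using the slot names from the tables above (the names ``LINKAGE_SLOTS`` and ``linkage_area`` are hypothetical):

```python
# Slot names in the order shown in the linkage area tables.
LINKAGE_SLOTS = [
    "Saved SP (r1)",
    "Saved CR",
    "Saved LR",
    "Reserved",
    "Reserved",
    "Saved FP (r31)",
]


def linkage_area(gpr_bytes):
    """Return (offset, slot name) pairs for a given GPR size in bytes."""
    return [(i * gpr_bytes, name) for i, name in enumerate(LINKAGE_SLOTS)]


if __name__ == "__main__":
    for gpr_bytes in (4, 8):
        print("GPR size", gpr_bytes)
        for offset, name in linkage_area(gpr_bytes):
            print("  %2d  %s" % (offset, name))
```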
1d019706d866 LLVM10 anatofuz parents: diff changeset	2417
1d019706d866 LLVM10 anatofuz parents: diff changeset	2418 The parameter area is used to store arguments being passed to a callee
1d019706d866 LLVM10 anatofuz parents: diff changeset	2419 function. Following the PowerPC ABI, the first few arguments are actually
1d019706d866 LLVM10 anatofuz parents: diff changeset	2420 passed in registers, with the space in the parameter area unused. However, if
1d019706d866 LLVM10 anatofuz parents: diff changeset	2421 there are not enough registers or the callee is a thunk or vararg function,
1d019706d866 LLVM10 anatofuz parents: diff changeset	2422 these register arguments can be spilled into the parameter area. Thus, the
1d019706d866 LLVM10 anatofuz parents: diff changeset	2423 parameter area must be large enough to store all the parameters for the largest
1d019706d866 LLVM10 anatofuz parents: diff changeset	2424 call sequence made by the caller. The size must also be minimally large enough
1d019706d866 LLVM10 anatofuz parents: diff changeset	2425 to spill registers r3-r10. This allows callees blind to the call signature,
1d019706d866 LLVM10 anatofuz parents: diff changeset	2426 such as thunks and vararg functions, enough space to cache the argument
1d019706d866 LLVM10 anatofuz parents: diff changeset	2427 registers. Therefore, the parameter area is minimally 32 bytes (64 bytes in 64
1d019706d866 LLVM10 anatofuz parents: diff changeset	2428 bit mode). Also note that since the parameter area is a fixed offset from the
1d019706d866 LLVM10 anatofuz parents: diff changeset	2429 top of the frame, a callee can access its spilled arguments using fixed
1d019706d866 LLVM10 anatofuz parents: diff changeset	2430 offsets from the stack pointer (or base pointer).
1d019706d866 LLVM10 anatofuz parents: diff changeset	2431
1d019706d866 LLVM10 anatofuz parents: diff changeset	2432 Combining the information about the linkage area, parameter area and alignment, a
1d019706d866 LLVM10 anatofuz parents: diff changeset	2433 stack frame is minimally 64 bytes in 32 bit mode and 128 bytes in 64 bit mode.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2434
1d019706d866 LLVM10 anatofuz parents: diff changeset	2435 The dynamic area starts out as size zero. If a function uses dynamic alloca,
1d019706d866 LLVM10 anatofuz parents: diff changeset	2436 space is added to the stack, the linkage and parameter areas are shifted to the
1d019706d866 LLVM10 anatofuz parents: diff changeset	2437 top of the stack, and the new space is available immediately below the linkage and
1d019706d866 LLVM10 anatofuz parents: diff changeset	2438 parameter areas. The cost of shifting the linkage and parameter areas is minor
1d019706d866 LLVM10 anatofuz parents: diff changeset	2439 since only the link value needs to be copied. The link value can be easily
1d019706d866 LLVM10 anatofuz parents: diff changeset	2440 fetched by adding the original frame size to the base pointer. Note that
1d019706d866 LLVM10 anatofuz parents: diff changeset	2441 allocations in the dynamic space need to observe 16 byte alignment.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2442
1d019706d866 LLVM10 anatofuz parents: diff changeset	2443 The locals area is where the llvm compiler reserves space for local variables.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2444
1d019706d866 LLVM10 anatofuz parents: diff changeset	2445 The saved registers area is where the llvm compiler spills callee saved
1d019706d866 LLVM10 anatofuz parents: diff changeset	2446 registers on entry to the callee.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2447
1d019706d866 LLVM10 anatofuz parents: diff changeset	2448 Prolog/Epilog
1d019706d866 LLVM10 anatofuz parents: diff changeset	2449 ^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	2450
1d019706d866 LLVM10 anatofuz parents: diff changeset	2451 The llvm prolog and epilog are the same as described in the PowerPC ABI, with
1d019706d866 LLVM10 anatofuz parents: diff changeset	2452 the following exceptions. Callee saved registers are spilled after the frame is
1d019706d866 LLVM10 anatofuz parents: diff changeset	2453 created. This allows the llvm epilog/prolog support to be common with other
1d019706d866 LLVM10 anatofuz parents: diff changeset	2454 targets. The base pointer callee saved register r31 is saved in the TOC slot of
1d019706d866 LLVM10 anatofuz parents: diff changeset	2455 the linkage area. This simplifies allocation of space for the base pointer and
1d019706d866 LLVM10 anatofuz parents: diff changeset	2456 makes it convenient to locate programmatically and during debugging.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2457
1d019706d866 LLVM10 anatofuz parents: diff changeset	2458 Dynamic Allocation
1d019706d866 LLVM10 anatofuz parents: diff changeset	2459 ^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	2460
1d019706d866 LLVM10 anatofuz parents: diff changeset	2461 .. note::
1d019706d866 LLVM10 anatofuz parents: diff changeset	2462
1d019706d866 LLVM10 anatofuz parents: diff changeset	2463 TODO - More to come.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2464
1d019706d866 LLVM10 anatofuz parents: diff changeset	2465 The NVPTX backend
1d019706d866 LLVM10 anatofuz parents: diff changeset	2466 -----------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	2467
1d019706d866 LLVM10 anatofuz parents: diff changeset	2468 The NVPTX code generator under lib/Target/NVPTX is an open-source version of
1d019706d866 LLVM10 anatofuz parents: diff changeset	2469 the NVIDIA NVPTX code generator for LLVM. It is contributed by NVIDIA and is
1d019706d866 LLVM10 anatofuz parents: diff changeset	2470 a port of the code generator used in the CUDA compiler (nvcc). It targets the
1d019706d866 LLVM10 anatofuz parents: diff changeset	2471 PTX 3.0/3.1 ISA and can target any compute capability greater than or equal to
1d019706d866 LLVM10 anatofuz parents: diff changeset	2472 2.0 (Fermi).
1d019706d866 LLVM10 anatofuz parents: diff changeset	2473
1d019706d866 LLVM10 anatofuz parents: diff changeset	2474 This target is of production quality and should be completely compatible with
1d019706d866 LLVM10 anatofuz parents: diff changeset	2475 the official NVIDIA toolchain.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2476
1d019706d866 LLVM10 anatofuz parents: diff changeset	2477 Code Generator Options:
1d019706d866 LLVM10 anatofuz parents: diff changeset	2478
1d019706d866 LLVM10 anatofuz parents: diff changeset	2479 :raw-html:`<table border="1" cellspacing="0">`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2480 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2481 :raw-html:`<th>Option</th>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2482 :raw-html:`<th>Description</th>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2483 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2484 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2485 :raw-html:`<td>sm_20</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2486 :raw-html:`<td align="left">Set shader model/compute capability to 2.0</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2487 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2488 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2489 :raw-html:`<td>sm_21</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2490 :raw-html:`<td align="left">Set shader model/compute capability to 2.1</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2491 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2492 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2493 :raw-html:`<td>sm_30</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2494 :raw-html:`<td align="left">Set shader model/compute capability to 3.0</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2495 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2496 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2497 :raw-html:`<td>sm_35</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2498 :raw-html:`<td align="left">Set shader model/compute capability to 3.5</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2499 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2500 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2501 :raw-html:`<td>ptx30</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2502 :raw-html:`<td align="left">Target PTX 3.0</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2503 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2504 :raw-html:`<tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2505 :raw-html:`<td>ptx31</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2506 :raw-html:`<td align="left">Target PTX 3.1</td>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2507 :raw-html:`</tr>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2508 :raw-html:`</table>`
1d019706d866 LLVM10 anatofuz parents: diff changeset	2509
1d019706d866 LLVM10 anatofuz parents: diff changeset	2510 The extended Berkeley Packet Filter (eBPF) backend
1d019706d866 LLVM10 anatofuz parents: diff changeset	2511 --------------------------------------------------
1d019706d866 LLVM10 anatofuz parents: diff changeset	2512
1d019706d866 LLVM10 anatofuz parents: diff changeset	2513 Extended BPF (or eBPF) is similar to the original ("classic") BPF (cBPF) used
1d019706d866 LLVM10 anatofuz parents: diff changeset	2514 to filter network packets. The
1d019706d866 LLVM10 anatofuz parents: diff changeset	2515 `bpf() system call <http://man7.org/linux/man-pages/man2/bpf.2.html>`_
1d019706d866 LLVM10 anatofuz parents: diff changeset	2516 performs a range of operations related to eBPF. For both cBPF and eBPF
1d019706d866 LLVM10 anatofuz parents: diff changeset	2517 programs, the Linux kernel statically analyzes the programs before loading
1d019706d866 LLVM10 anatofuz parents: diff changeset	2518 them, in order to ensure that they cannot harm the running system. eBPF is
1d019706d866 LLVM10 anatofuz parents: diff changeset	2519 a 64-bit RISC instruction set designed for one to one mapping to 64-bit CPUs.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2520 Opcodes are 8-bit encoded, and 87 instructions are defined. There are 10
1d019706d866 LLVM10 anatofuz parents: diff changeset	2521 general-purpose registers and a read-only frame pointer, grouped by function as outlined below.
1d019706d866 LLVM10 anatofuz parents: diff changeset	2522
1d019706d866 LLVM10 anatofuz parents: diff changeset	2523 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	2524
1d019706d866 LLVM10 anatofuz parents: diff changeset	2525 R0 return value from in-kernel functions; exit value for eBPF program
1d019706d866 LLVM10 anatofuz parents: diff changeset	2526 R1 - R5 function call arguments to in-kernel functions
1d019706d866 LLVM10 anatofuz parents: diff changeset	2527 R6 - R9 callee-saved registers preserved by in-kernel functions
1d019706d866 LLVM10 anatofuz parents: diff changeset	2528 R10 stack frame pointer (read only)
1d019706d866 LLVM10 anatofuz parents: diff changeset	2529
1d019706d866 LLVM10 anatofuz parents: diff changeset	2530 Instruction encoding (arithmetic and jump)
1d019706d866 LLVM10 anatofuz parents: diff changeset	2531 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1d019706d866 LLVM10 anatofuz parents: diff changeset	2532 eBPF reuses most of the opcode encoding from classic BPF to simplify conversion
1d019706d866 LLVM10 anatofuz parents: diff changeset	2533 of classic BPF to eBPF. For arithmetic and jump instructions the 8-bit 'code'
1d019706d866 LLVM10 anatofuz parents: diff changeset	2534 field is divided into three parts:
1d019706d866 LLVM10 anatofuz parents: diff changeset	2535
1d019706d866 LLVM10 anatofuz parents: diff changeset	2536 ::
1d019706d866 LLVM10 anatofuz parents: diff changeset	2537
1d019706d866 LLVM10 anatofuz parents: diff changeset	2538 +----------------+--------+--------------------+
1d019706d866 LLVM10 anatofuz parents: diff changeset	2539 |   4 bits       |  1 bit |   3 bits           |
1d019706d866 LLVM10 anatofuz parents: diff changeset	2540 | operation code | source | instruction class  |
1d019706d866 LLVM10 anatofuz parents: diff changeset	2541 +----------------+--------+--------------------+
1d019706d866 LLVM10 anatofuz parents: diff changeset	2542 (MSB)                                      (LSB)
1d019706d866 LLVM10 anatofuz parents: diff changeset	2543
The three least significant bits store the instruction class, which is one of:

::

  BPF_LD    0x0
  BPF_LDX   0x1
  BPF_ST    0x2
  BPF_STX   0x3
  BPF_ALU   0x4
  BPF_JMP   0x5
  (unused)  0x6
  BPF_ALU64 0x7

When BPF_CLASS(code) == BPF_ALU or BPF_ALU64 or BPF_JMP,
the 4th bit encodes the source operand:

::

  BPF_X     0x1  use src_reg register as source operand
  BPF_K     0x0  use 32 bit immediate as source operand

The four most significant bits store the operation code.
If BPF_CLASS(code) == BPF_ALU or BPF_ALU64, BPF_OP(code) is one of:

::

  BPF_ADD   0x0  add
  BPF_SUB   0x1  subtract
  BPF_MUL   0x2  multiply
  BPF_DIV   0x3  divide
  BPF_OR    0x4  bitwise logical OR
  BPF_AND   0x5  bitwise logical AND
  BPF_LSH   0x6  left shift
  BPF_RSH   0x7  right shift (zero extended)
  BPF_NEG   0x8  arithmetic negation
  BPF_MOD   0x9  modulo
  BPF_XOR   0xa  bitwise logical XOR
  BPF_MOV   0xb  move register to register
  BPF_ARSH  0xc  right shift (sign extended)
  BPF_END   0xd  endianness conversion

If BPF_CLASS(code) == BPF_JMP, BPF_OP(code) is one of:

::

  BPF_JA    0x0  unconditional jump
  BPF_JEQ   0x1  jump ==
  BPF_JGT   0x2  jump >
  BPF_JGE   0x3  jump >=
  BPF_JSET  0x4  jump if (DST & SRC)
  BPF_JNE   0x5  jump !=
  BPF_JSGT  0x6  jump signed >
  BPF_JSGE  0x7  jump signed >=
  BPF_CALL  0x8  function call
  BPF_EXIT  0x9  function return

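The field layout for arithmetic and jump instructions can be sketched in C as
follows. This is a minimal illustration, not the kernel's or LLVM's
implementation: the helper names ``code_class``, ``code_source``, ``code_op``
and ``code_pack`` are hypothetical, and each helper works with the field
values exactly as they are listed in the tables of this section.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: split an ALU/JMP 'code' byte into its three fields.
 * Each returns the field value as listed in the tables of this section. */
static inline uint8_t code_class(uint8_t code)  { return code & 0x07; }        /* 3 LSBs  */
static inline uint8_t code_source(uint8_t code) { return (code >> 3) & 0x01; } /* 4th bit */
static inline uint8_t code_op(uint8_t code)     { return (uint8_t)(code >> 4); } /* 4 MSBs */

/* Inverse operation: pack the three fields back into one 'code' byte. */
static inline uint8_t code_pack(uint8_t op, uint8_t source, uint8_t cls)
{
    assert(op <= 0x0f && source <= 0x01 && cls <= 0x07);
    return (uint8_t)((op << 4) | (source << 3) | cls);
}
```

For example, BPF_MOV (0xb) with a BPF_X source (0x1) in class BPF_ALU64 (0x7)
packs to ``0xbf``, and BPF_EXIT (0x9) in class BPF_JMP (0x5) packs to ``0x95``.
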
Instruction encoding (load, store)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For load and store instructions the 8-bit 'code' field is divided as:

::

  +--------+--------+-------------------+
  | 3 bits | 2 bits |   3 bits          |
  |  mode  |  size  | instruction class |
  +--------+--------+-------------------+
  (MSB)                             (LSB)

Size modifier is one of

::

  BPF_W       0x0  word
  BPF_H       0x1  half word
  BPF_B       0x2  byte
  BPF_DW      0x3  double word

Mode modifier is one of

::

  BPF_IMM     0x0  immediate
  BPF_ABS     0x1  used to access packet data
  BPF_IND     0x2  used to access packet data
  BPF_MEM     0x3  memory
  (reserved)  0x4
  (reserved)  0x5
  BPF_XADD    0x6  exclusive add

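The load/store layout can be decoded the same way. The sketch below is only
an illustration under stated assumptions: the helper names ``ldst_class``,
``ldst_size``, ``ldst_mode`` and ``size_in_bytes`` are hypothetical, and the
byte widths assume the usual eBPF meaning of the size modifiers (a word is
32 bits).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: split a load/store 'code' byte into its fields. */
static inline uint8_t ldst_class(uint8_t code) { return code & 0x07; }          /* 3 LSBs      */
static inline uint8_t ldst_size(uint8_t code)  { return (code >> 3) & 0x03; }   /* middle 2    */
static inline uint8_t ldst_mode(uint8_t code)  { return (uint8_t)(code >> 5); } /* 3 MSBs      */

/* Access width in bytes for a size modifier, or 0 for an invalid value.
 * Assumes a word is 32 bits wide. */
static inline unsigned size_in_bytes(uint8_t size)
{
    switch (size) {
    case 0x0: return 4; /* BPF_W  */
    case 0x1: return 2; /* BPF_H  */
    case 0x2: return 1; /* BPF_B  */
    case 0x3: return 8; /* BPF_DW */
    default:  return 0;
    }
}
```

For example, the byte ``0x79`` decodes to mode BPF_MEM (0x3), size BPF_DW (0x3)
and class BPF_LDX (0x1), that is, an 8-byte load from memory.
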
Packet data access (BPF_ABS, BPF_IND)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

eBPF has two non-generic instructions, (BPF_ABS | <size> | BPF_LD) and
(BPF_IND | <size> | BPF_LD), which are used to access packet data.
Register R6 is an implicit input that must contain a pointer to sk_buff.
Register R0 is an implicit output which contains the data fetched
from the packet. Registers R1-R5 are scratch registers and must not
be used to store data across BPF_ABS | BPF_LD or BPF_IND | BPF_LD
instructions. These instructions have an implicit program exit condition
as well: when an eBPF program tries to access data beyond
the packet boundary, the interpreter aborts the execution of the program.

BPF_IND | BPF_W | BPF_LD is equivalent to:
  R0 = ntohl(\*(u32 \*) (((struct sk_buff \*) R6)->data + src_reg + imm32))

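The semantics of BPF_IND | BPF_W | BPF_LD, including the implicit exit
condition, can be modelled in user space roughly as follows. This is a sketch
under assumptions, not kernel code: the packet is represented by a plain byte
buffer instead of an sk_buff, the function name ``pkt_load_ind_w`` is
hypothetical, and a ``false`` return value stands in for the interpreter
aborting the program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of BPF_IND | BPF_W | BPF_LD on a flat packet buffer.
 * On success, stores the 32-bit value at data + src_reg + imm32, converted
 * from network (big-endian) byte order, into *r0 and returns true.
 * Returns false if the access would cross the packet boundary. */
static inline bool pkt_load_ind_w(const uint8_t *data, size_t len,
                                  uint32_t src_reg, int32_t imm32,
                                  uint64_t *r0)
{
    int64_t off = (int64_t)src_reg + imm32;

    if (off < 0 || (uint64_t)off + 4 > len)
        return false; /* models the implicit program exit */

    const uint8_t *p = data + off;
    /* Assemble the bytes most significant first: same result as ntohl(). */
    *r0 = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
          ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return true;
}
```

Reading the bytes most significant first avoids depending on the host byte
order, which is what ``ntohl`` achieves in the equivalence above.
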
eBPF maps
^^^^^^^^^

eBPF maps are provided for sharing data between kernel and user-space.
Currently implemented types are hash and array, with potential extension to
support bloom filters, radix trees, etc. A map is defined by its type,
maximum number of elements, key size and value size in bytes. The eBPF syscall
supports create, update, find and delete functions on maps.

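To make the map parameters and the four operations concrete, here is a toy
in-memory model. It is purely illustrative and does not use or imitate the
real syscall interface: every name (``toy_map``, ``toy_map_create`` and so on)
is hypothetical, the map type is omitted, and lookups are a simple linear scan
rather than a hash or array implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy map: fixed capacity, fixed key and value sizes in bytes. */
struct toy_map {
    uint32_t max_entries, key_size, value_size;
    uint8_t *used;   /* one flag per slot              */
    uint8_t *keys;   /* max_entries * key_size bytes   */
    uint8_t *values; /* max_entries * value_size bytes */
};

static inline void toy_map_free(struct toy_map *m)
{
    if (!m)
        return;
    free(m->used); free(m->keys); free(m->values); free(m);
}

static inline struct toy_map *toy_map_create(uint32_t max_entries,
                                             uint32_t key_size,
                                             uint32_t value_size)
{
    if (max_entries == 0 || key_size == 0 || value_size == 0)
        return NULL;
    struct toy_map *m = calloc(1, sizeof *m);
    if (!m)
        return NULL;
    m->max_entries = max_entries;
    m->key_size = key_size;
    m->value_size = value_size;
    m->used = calloc(max_entries, 1);
    m->keys = calloc(max_entries, key_size);
    m->values = calloc(max_entries, value_size);
    if (!m->used || !m->keys || !m->values) {
        toy_map_free(m);
        return NULL;
    }
    return m;
}

/* Linear scan for a key; returns the slot index or -1 if absent. */
static inline int64_t toy_map_slot(const struct toy_map *m, const void *key)
{
    for (uint32_t i = 0; i < m->max_entries; i++)
        if (m->used[i] &&
            memcmp(m->keys + (size_t)i * m->key_size, key, m->key_size) == 0)
            return i;
    return -1;
}

/* Find: pointer to the stored value, or NULL if the key is absent. */
static inline void *toy_map_lookup(const struct toy_map *m, const void *key)
{
    int64_t i = toy_map_slot(m, key);
    return i < 0 ? NULL : (void *)(m->values + (size_t)i * m->value_size);
}

/* Update: overwrite an existing entry or claim a free slot; -1 if full. */
static inline int toy_map_update(struct toy_map *m, const void *key,
                                 const void *value)
{
    int64_t i = toy_map_slot(m, key);
    if (i < 0) {
        for (uint32_t j = 0; j < m->max_entries && i < 0; j++)
            if (!m->used[j])
                i = j;
        if (i < 0)
            return -1; /* maximum number of elements reached */
        m->used[i] = 1;
        memcpy(m->keys + (size_t)i * m->key_size, key, m->key_size);
    }
    memcpy(m->values + (size_t)i * m->value_size, value, m->value_size);
    return 0;
}

/* Delete: release the slot holding the key; -1 if the key is absent. */
static inline int toy_map_delete(struct toy_map *m, const void *key)
{
    int64_t i = toy_map_slot(m, key);
    if (i < 0)
        return -1;
    m->used[i] = 0;
    return 0;
}
```

The point of the model is only that capacity, key size and value size are
fixed when the map is created, so every later operation deals in opaque byte
strings of known length.
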
Function calls
^^^^^^^^^^^^^^

Function call arguments are passed using up to five registers (R1 - R5).
The return value is passed in a dedicated register (R0). Four additional
registers (R6 - R9) are callee-saved, and the values in these registers
are preserved within kernel functions. R0 - R5 are scratch registers within
kernel functions, and eBPF programs must therefore store/restore values in
these registers if needed across function calls. The stack can be accessed
using the read-only frame pointer R10. eBPF registers map 1:1 to hardware
registers on x86_64 and other 64-bit architectures. For example, the x86_64
in-kernel JIT maps them as

::

  R0  - rax
  R1  - rdi
  R2  - rsi
  R3  - rdx
  R4  - rcx
  R5  - r8
  R6  - rbx
  R7  - r13
  R8  - r14
  R9  - r15
  R10 - rbp

since the x86_64 ABI mandates rdi, rsi, rdx, rcx, r8, r9 for argument passing
and rbx, r12 - r15 are callee saved.

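The register roles and the x86_64 mapping listed in this section can be
captured in a small lookup table. The sketch below only restates that list;
the names ``bpf_to_x86_64``, ``x86_64_reg_for``, ``is_argument_reg`` and
``is_callee_saved_reg`` are hypothetical and do not come from any JIT source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* eBPF register number (index) to x86_64 register name, as listed above. */
static const char *const bpf_to_x86_64[11] = {
    "rax",                             /* R0: return value       */
    "rdi", "rsi", "rdx", "rcx", "r8",  /* R1 - R5: arguments     */
    "rbx", "r13", "r14", "r15",        /* R6 - R9: callee-saved  */
    "rbp",                             /* R10: frame pointer     */
};

/* Name of the x86_64 register backing an eBPF register, or NULL. */
static inline const char *x86_64_reg_for(unsigned bpf_reg)
{
    return bpf_reg <= 10 ? bpf_to_x86_64[bpf_reg] : NULL;
}

/* R1 - R5 carry function call arguments. */
static inline int is_argument_reg(unsigned bpf_reg)
{
    return bpf_reg >= 1 && bpf_reg <= 5;
}

/* R6 - R9 are preserved across calls to kernel functions. */
static inline int is_callee_saved_reg(unsigned bpf_reg)
{
    return bpf_reg >= 6 && bpf_reg <= 9;
}
```

Because each eBPF argument register is backed by an x86_64 argument register
in the same position, a call to a kernel function needs no register shuffling
in this mapping.
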
Program start
^^^^^^^^^^^^^

An eBPF program receives a single argument and contains
a single eBPF main routine; the program does not contain eBPF functions.
Function calls are limited to a predefined set of kernel functions. The size
of a program is limited to 4K instructions: this ensures fast termination and
a limited number of kernel function calls. Prior to running an eBPF program,
a verifier performs static analysis to prevent loops in the code and
to ensure valid register usage and operand types.

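Two of the checks described above, the size limit and the absence of loops,
can be illustrated with a deliberately simplified sketch. It is not the
kernel verifier, which is far more thorough: here every backward jump is
rejected as a conservative stand-in for loop detection, register and operand
type checks are left out, the ``toy_insn`` structure and ``toy_verify``
function are hypothetical, and jump offsets are assumed to be relative to the
instruction that follows the jump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_INSNS 4096 /* the 4K instruction limit */

/* Hypothetical, reduced instruction: only the fields this check needs. */
struct toy_insn {
    uint8_t code; /* 8-bit 'code' field            */
    int16_t off;  /* jump offset, in instructions  */
};

/* Accept a program only if it fits the size limit and every jump goes
 * strictly forward to a target inside the program. */
static inline bool toy_verify(const struct toy_insn *prog, size_t n)
{
    if (n == 0 || n > TOY_MAX_INSNS)
        return false;

    for (size_t i = 0; i < n; i++) {
        uint8_t cls = prog[i].code & 0x07;
        uint8_t op = (uint8_t)(prog[i].code >> 4);

        if (cls != 0x5)               /* not BPF_JMP */
            continue;
        if (op == 0x8 || op == 0x9)   /* BPF_CALL, BPF_EXIT: no jump target */
            continue;

        /* Assumed: target is relative to the next instruction. */
        int64_t target = (int64_t)i + 1 + prog[i].off;
        if (target <= (int64_t)i || target >= (int64_t)n)
            return false;             /* backward, self or out of range */
    }
    return true;
}
```

With forward-only jumps the program counter strictly increases, so a program
of at most 4096 instructions terminates after at most 4096 steps.
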
The AMDGPU backend
------------------

The AMDGPU code generator lives in the ``lib/Target/AMDGPU``
directory. This code generator is capable of targeting a variety of
AMD GPU processors. Refer to :doc:`AMDGPUUsage` for more information.
