diff docs/CodeGenerator.rst @ 148:63bd29f05246

merged
author Shinji KONO <kono@ie.u-ryukyu.ac.jp>
date Wed, 14 Aug 2019 19:46:37 +0900
parents c2174574ed3a
children
--- a/docs/CodeGenerator.rst	Sun Dec 23 19:23:36 2018 +0900
+++ b/docs/CodeGenerator.rst	Wed Aug 14 19:46:37 2019 +0900
@@ -566,7 +566,7 @@
 MachineBasicBlock and MachineInstr. All the MIs (including top level and nested
 ones) are stored as sequential list of MIs. The "bundled" MIs are marked with
 the 'InsideBundle' flag. A top level MI with the special BUNDLE opcode is used
-to represent the start of a bundle. It's legal to mix BUNDLE MIs with indiviual
+to represent the start of a bundle. It's legal to mix BUNDLE MIs with individual
 MIs that are not inside bundles nor represent bundles.
 
 MachineInstr passes should operate on a MI bundle as a single unit. Member
@@ -579,15 +579,18 @@
 register MachineOperand's that represent the cumulative inputs and outputs of
 the bundled MIs.
 
-Packing / bundling of MachineInstr's should be done as part of the register
-allocation super-pass. More specifically, the pass which determines what MIs
-should be bundled together must be done after code generator exits SSA form
-(i.e. after two-address pass, PHI elimination, and copy coalescing).  Bundles
-should only be finalized (i.e. adding BUNDLE MIs and input and output register
-MachineOperands) after virtual registers have been rewritten into physical
-registers. This requirement eliminates the need to add virtual register operands
-to BUNDLE instructions which would effectively double the virtual register def
-and use lists.
+Packing / bundling of MachineInstrs for VLIW architectures should
+generally be done as part of the register allocation super-pass. More
+specifically, the pass which determines what MIs should be bundled
+together should be done after code generator exits SSA form
+(i.e. after two-address pass, PHI elimination, and copy coalescing).
+Such bundles should be finalized (i.e. adding BUNDLE MIs and input and
+output register MachineOperands) after virtual registers have been
+rewritten into physical registers. This eliminates the need to add
+virtual register operands to BUNDLE instructions which would
+effectively double the virtual register def and use lists. Bundles may
+use virtual registers and be formed in SSA form, but may not be
+appropriate for all use cases.
 
 .. _MC Layer:
 
@@ -912,6 +915,31 @@
 (and which of the above three actions to take) by calling the
 ``setOperationAction`` method in its ``TargetLowering`` constructor.
 
+If a target has legal vector types, it is expected to produce efficient machine
+code for common forms of the shufflevector IR instruction using those types.
+This may require custom legalization for SelectionDAG vector operations that
+are created from the shufflevector IR. The shufflevector forms that should be
+handled include:
+
+* Vector select --- Each element of the vector is chosen from either of the
+  corresponding elements of the 2 input vectors. This operation may also be
+  known as a "blend" or "bitwise select" in target assembly. This type of shuffle
+  maps directly to the ``shuffle_vector`` SelectionDAG node.
+
+* Insert subvector --- A vector is placed into a longer vector type starting
+  at index 0. This type of shuffle maps directly to the ``insert_subvector``
+  SelectionDAG node with the ``index`` operand set to 0.
+
+* Extract subvector --- A vector is pulled from a longer vector type starting
+  at index 0. This type of shuffle maps directly to the ``extract_subvector``
+  SelectionDAG node with the ``index`` operand set to 0.
+
+* Splat --- All elements of the vector have identical scalar elements. This
+  operation may also be known as a "broadcast" or "duplicate" in target assembly.
+  The shufflevector IR instruction may change the vector length, so this operation
+  may map to multiple SelectionDAG nodes including ``shuffle_vector``,
+  ``concat_vectors``, ``insert_subvector``, and ``extract_subvector``.
+
 Prior to the existence of the Legalize passes, we required that every target
 `selector`_ supported and handled every operator and type even if they are not
 natively supported.  The introduction of the Legalize phases allows all of the
@@ -2036,7 +2064,8 @@
 ----------------------
 
 Tail call optimization, callee reusing the stack of the caller, is currently
-supported on x86/x86-64 and PowerPC. It is performed if:
+supported on x86/x86-64, PowerPC, and WebAssembly. It is performed on x86/x86-64
+and PowerPC if:
 
 * Caller and callee have the calling convention ``fastcc``, ``cc 10`` (GHC
   calling convention) or ``cc 11`` (HiPE calling convention).
@@ -2064,6 +2093,15 @@
 * On ppc32/64 GOT/PIC only module-local calls (visibility = hidden or protected)
   are supported.
 
+WebAssembly constraints:
+
+* No variable argument lists are used
+
+* The 'tail-call' target attribute is enabled.
+
+* The caller and callee's return types must match. The caller cannot
+  be void unless the callee is, too.
+
 Example:
 
 Call as ``llc -tailcallopt test.ll``.
@@ -2513,8 +2551,8 @@
 
 ::
 
-  BPF_X     0x0  use src_reg register as source operand
-  BPF_K     0x1  use 32 bit immediate as source operand
+  BPF_X     0x1  use src_reg register as source operand
+  BPF_K     0x0  use 32 bit immediate as source operand
 
 and four MSB bits store operation code