diff llvm/docs/TableGen/ProgRef.rst @ 207:2e18cbf3894f
LLVM12
author | Shinji KONO <kono@ie.u-ryukyu.ac.jp> |
---|---|
date | Tue, 08 Jun 2021 06:07:14 +0900 |
parents | |
children | c4bab56944e8 |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/llvm/docs/TableGen/ProgRef.rst Tue Jun 08 06:07:14 2021 +0900 @@ -0,0 +1,2009 @@ +=============================== +TableGen Programmer's Reference +=============================== + +.. sectnum:: + +.. contents:: + :local: + +Introduction +============ + +The purpose of TableGen is to generate complex output files based on +information from source files that are significantly easier to code than the +output files would be, and also easier to maintain and modify over time. The +information is coded in a declarative style involving classes and records, +which are then processed by TableGen. The internalized records are passed on +to various *backends*, which extract information from a subset of the records +and generate one or more output files. These output files are typically +``.inc`` files for C++, but may be any type of file that the backend +developer needs. + +This document describes the LLVM TableGen facility in detail. It is intended +for the programmer who is using TableGen to produce code for a project. If +you are looking for a simple overview, check out the :doc:`TableGen Overview +<./index>`. The various ``*-tblgen`` commands used to invoke TableGen are +described in :doc:`tblgen Family - Description to C++ +Code<../CommandGuide/tblgen>`. + +An example of a backend is ``RegisterInfo``, which generates the register +file information for a particular target machine, for use by the LLVM +target-independent code generator. See :doc:`TableGen Backends <./BackEnds>` +for a description of the LLVM TableGen backends, and :doc:`TableGen +Backend Developer's Guide <./BackGuide>` for a guide to writing a new +backend. + +Here are a few of the things backends can do. + +* Generate the register file information for a particular target machine. + +* Generate the instruction definitions for a target. 
+ +* Generate the patterns that the code generator uses to match instructions + to intermediate representation (IR) nodes. + +* Generate semantic attribute identifiers for Clang. + +* Generate abstract syntax tree (AST) declaration node definitions for Clang. + +* Generate AST statement node definitions for Clang. + + +Concepts +-------- + +TableGen source files contain two primary items: *abstract records* and +*concrete records*. In this and other TableGen documents, abstract records +are called *classes.* (These classes are different from C++ classes and do +not map onto them.) In addition, concrete records are usually just called +records, although sometimes the term *record* refers to both classes and +concrete records. The distinction should be clear in context. + +Classes and concrete records have a unique *name*, either chosen by +the programmer or generated by TableGen. Associated with that name +is a list of *fields* with values and an optional list of *parent classes* +(sometimes called base or super classes). The fields are the primary data that +backends will process. Note that TableGen assigns no meanings to fields; the +meanings are entirely up to the backends and the programs that incorporate +the output of those backends. + +.. note:: + + The term "parent class" can refer to a class that is a parent of another + class, and also to a class from which a concrete record inherits. This + nonstandard use of the term arises because TableGen treats classes and + concrete records similarly. + +A backend processes some subset of the concrete records built by the +TableGen parser and emits the output files. These files are usually C++ +``.inc`` files that are included by the programs that require the data in +those records. However, a backend can produce any type of output files. For +example, it could produce a data file containing messages tagged with +identifiers and substitution parameters. 
In a complex use case such as the +LLVM code generator, there can be many concrete records and some of them can +have an unexpectedly large number of fields, resulting in large output files. + +In order to reduce the complexity of TableGen files, classes are used to +abstract out groups of record fields. For example, a few classes may +abstract the concept of a machine register file, while other classes may +abstract the instruction formats, and still others may abstract the +individual instructions. TableGen allows an arbitrary hierarchy of classes, +so that the abstract classes for two concepts can share a third superclass that +abstracts common "sub-concepts" from the two original concepts. + +In order to make classes more useful, a concrete record (or another class) +can request a class as a parent class and pass *template arguments* to it. +These template arguments can be used in the fields of the parent class to +initialize them in a custom manner. That is, record or class ``A`` can +request parent class ``S`` with one set of template arguments, while record or class +``B`` can request ``S`` with a different set of arguments. Without template +arguments, many more classes would be required, one for each combination of +the template arguments. + +Both classes and concrete records can include fields that are uninitialized. +The uninitialized "value" is represented by a question mark (``?``). Classes +often have uninitialized fields that are expected to be filled in when those +classes are inherited by concrete records. Even so, some fields of concrete +records may remain uninitialized. + +TableGen provides *multiclasses* to collect a group of record definitions in +one place. A multiclass is a sort of macro that can be "invoked" to define +multiple concrete records all at once. A multiclass can inherit from other +multiclasses, which means that the multiclass inherits all the definitions +from its parent multiclasses. 
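
The class-related concepts above can be made concrete with a minimal sketch.
All names here (``Reg``, ``R0``, ``R1``, and the field names) are hypothetical
and are not taken from any LLVM target; the sketch only illustrates template
arguments and uninitialized fields.

.. code-block:: text

  // Abstract record (class) with two template arguments.
  class Reg <string n, int enc> {
    string Name = n;       // initialized from a template argument
    int Encoding = enc;
    string Alias = ?;      // left uninitialized in the class
  }

  // Concrete records requesting Reg as a parent class with different arguments.
  def R0 : Reg<"r0", 0>;   // Alias remains uninitialized
  def R1 : Reg<"r1", 1> {
    let Alias = "sp";      // filled in by the concrete record
  }

Without the template arguments, each register would need its own class or
would have to set every field by hand.
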
+ +`Appendix C: Sample Record`_ illustrates a complex record in the Intel X86 +target and the simple way in which it is defined. + +Source Files +============ + +TableGen source files are plain ASCII text files. The files can contain +statements, comments, and blank lines (see `Lexical Analysis`_). The standard file +extension for TableGen files is ``.td``. + +TableGen files can grow quite large, so there is an include mechanism that +allows one file to include the content of another file (see `Include +Files`_). This allows large files to be broken up into smaller ones, and +also provides a simple library mechanism where multiple source files can +include the same library file. + +TableGen supports a simple preprocessor that can be used to conditionalize +portions of ``.td`` files. See `Preprocessing Facilities`_ for more +information. + +Lexical Analysis +================ + +The lexical and syntax notation used here is intended to imitate +`Python's`_ notation. In particular, for lexical definitions, the productions +operate at the character level and there is no implied whitespace between +elements. The syntax definitions operate at the token level, so there is +implied whitespace between tokens. + +.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation + +TableGen supports BCPL-style comments (``// ...``) and nestable C-style +comments (``/* ... */``). +TableGen also provides simple `Preprocessing Facilities`_. + +Formfeed characters may be used freely in files to produce page breaks when +the file is printed for review. + +The following are the basic punctuation tokens:: + + - + [ ] { } ( ) < > : ; . ... = ? # + +Literals +-------- + +Numeric literals take one of the following forms: + +.. 
productionlist:: + TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger` + DecimalInteger: ["+" | "-"] ("0"..."9")+ + HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+ + BinInteger: "0b" ("0" | "1")+ + +Observe that the :token:`DecimalInteger` token includes the optional ``+`` +or ``-`` sign, unlike most languages where the sign would be treated as a +unary operator. + +TableGen has two kinds of string literals: + +.. productionlist:: + TokString: '"' (non-'"' characters and escapes) '"' + TokCode: "[{" (shortest text not containing "}]") "}]" + +A :token:`TokCode` is nothing more than a multi-line string literal +delimited by ``[{`` and ``}]``. It can break across lines and the +line breaks are retained in the string. + +The current implementation accepts the following escape sequences:: + + \\ \' \" \t \n + +Identifiers +----------- + +TableGen has name- and identifier-like tokens, which are case-sensitive. + +.. productionlist:: + ualpha: "a"..."z" | "A"..."Z" | "_" + TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")* + TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")* + +Note that, unlike most languages, TableGen allows :token:`TokIdentifier` to +begin with an integer. In case of ambiguity, a token is interpreted as a +numeric literal rather than an identifier. + +TableGen has the following reserved keywords, which cannot be used as +identifiers:: + + assert bit bits class code + dag def else false foreach + defm defset defvar field if + in include int let list + multiclass string then true + +.. warning:: + The ``field`` reserved word is deprecated. + +Bang operators +-------------- + +TableGen provides "bang operators" that have a wide variety of uses: + +.. 
productionlist:: + BangOperator: one of + : !add !and !cast !con !dag + : !empty !eq !filter !find !foldl + : !foreach !ge !getdagop !gt !head + : !if !interleave !isa !le !listconcat + : !listsplat !lt !mul !ne !not + : !or !setdagop !shl !size !sra + : !srl !strconcat !sub !subst !substr + : !tail !xor + +The ``!cond`` operator has a slightly different +syntax compared to other bang operators, so it is defined separately: + +.. productionlist:: + CondOperator: !cond + +See `Appendix A: Bang Operators`_ for a description of each bang operator. + +Include files +------------- + +TableGen has an include mechanism. The content of the included file +lexically replaces the ``include`` directive and is then parsed as if it was +originally in the main file. + +.. productionlist:: + IncludeDirective: "include" `TokString` + +Portions of the main file and included files can be conditionalized using +preprocessor directives. + +.. productionlist:: + PreprocessorDirective: "#define" | "#ifdef" | "#ifndef" + +Types +===== + +The TableGen language is statically typed, using a simple but complete type +system. Types are used to check for errors, to perform implicit conversions, +and to help interface designers constrain the allowed input. Every value is +required to have an associated type. + +TableGen supports a mixture of low-level types (e.g., ``bit``) and +high-level types (e.g., ``dag``). This flexibility allows you to describe a +wide range of records conveniently and compactly. + +.. productionlist:: + Type: "bit" | "int" | "string" | "dag" + :| "bits" "<" `TokInteger` ">" + :| "list" "<" `Type` ">" + :| `ClassID` + ClassID: `TokIdentifier` + +``bit`` + A ``bit`` is a boolean value that can be 0 or 1. + +``int`` + The ``int`` type represents a simple 64-bit integer value, such as 5 or + -42. + +``string`` + The ``string`` type represents an ordered sequence of characters of arbitrary + length. 
+ +``bits<``\ *n*\ ``>`` + The ``bits`` type is a fixed-sized integer of arbitrary length *n* that + is treated as separate bits. These bits can be accessed individually. + A field of this type is useful for representing an instruction operation + code, register number, or address mode/register/displacement. The bits of + the field can be set individually or as subfields. For example, in an + instruction address, the addressing mode, base register number, and + displacement can be set separately. + +``list<``\ *type*\ ``>`` + This type represents a list whose elements are of the *type* specified in + angle brackets. The element type is arbitrary; it can even be another + list type. List elements are indexed from 0. + +``dag`` + This type represents a nestable directed acyclic graph (DAG) of nodes. + Each node has an *operator* and zero or more *arguments* (or *operands*). + An argument can be + another ``dag`` object, allowing an arbitrary tree of nodes and edges. + As an example, DAGs are used to represent code patterns for use by + the code generator instruction selection algorithms. See `Directed + acyclic graphs (DAGs)`_ for more details. + +:token:`ClassID` + Specifying a class name in a type context indicates + that the type of the defined value must + be a subclass of the specified class. This is useful in conjunction with + the ``list`` type; for example, to constrain the elements of the list to a + common base class (e.g., a ``list<Register>`` can only contain definitions + derived from the ``Register`` class). + The :token:`ClassID` must name a class that has been previously + declared or defined. + + +Values and Expressions +====================== + +There are many contexts in TableGen statements where a value is required. A +common example is in the definition of a record, where each field is +specified by a name and an optional value. TableGen allows for a reasonable +number of different forms when building up value expressions. 
These forms +allow the TableGen file to be written in a syntax that is natural for the +application. + +Note that all of the values have rules for converting them from one type to +another. For example, these rules allow you to assign a value like ``7`` +to an entity of type ``bits<4>``. + +.. productionlist:: + Value: `SimpleValue` `ValueSuffix`* + :| `Value` "#" `Value` + ValueSuffix: "{" `RangeList` "}" + :| "[" `RangeList` "]" + :| "." `TokIdentifier` + RangeList: `RangePiece` ("," `RangePiece`)* + RangePiece: `TokInteger` + :| `TokInteger` "..." `TokInteger` + :| `TokInteger` "-" `TokInteger` + :| `TokInteger` `TokInteger` + +.. warning:: + The peculiar last form of :token:`RangePiece` is due to the fact that the + "``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as + two consecutive tokens, with values ``1`` and ``-5``, instead of "1", "-", + and "5". The use of hyphen as the range punctuation is deprecated. + +Simple values +------------- + +The :token:`SimpleValue` has a number of forms. + +.. productionlist:: + SimpleValue: `TokInteger` | `TokString`+ | `TokCode` + +A value can be an integer literal, a string literal, or a code literal. +Multiple adjacent string literals are concatenated as in C/C++; the simple +value is the concatenation of the strings. Code literals become strings and +are then indistinguishable from them. + +.. productionlist:: + SimpleValue2: "true" | "false" + +The ``true`` and ``false`` literals are essentially syntactic sugar for the +integer values 1 and 0. They improve the readability of TableGen files when +boolean values are used in field initializations, bit sequences, ``if`` +statements, etc. When parsed, these literals are converted to integers. + +.. note:: + + Although ``true`` and ``false`` are literal names for 1 and 0, we + recommend as a stylistic rule that you use them for boolean + values only. + +.. productionlist:: + SimpleValue3: "?" + +A question mark represents an uninitialized value. 
+ +.. productionlist:: + SimpleValue4: "{" [`ValueList`] "}" + ValueList: `ValueListNE` + ValueListNE: `Value` ("," `Value`)* + +This value represents a sequence of bits, which can be used to initialize a +``bits<``\ *n*\ ``>`` field (note the braces). When doing so, the values +must represent a total of *n* bits. + +.. productionlist:: + SimpleValue5: "[" `ValueList` "]" ["<" `Type` ">"] + +This value is a list initializer (note the brackets). The values in brackets +are the elements of the list. The optional :token:`Type` can be used to +indicate a specific element type; otherwise the element type is inferred +from the given values. TableGen can usually infer the type, although +sometimes not when the value is the empty list (``[]``). + +.. productionlist:: + SimpleValue6: "(" `DagArg` [`DagArgList`] ")" + DagArgList: `DagArg` ("," `DagArg`)* + DagArg: `Value` [":" `TokVarName`] | `TokVarName` + +This represents a DAG initializer (note the parentheses). The first +:token:`DagArg` is called the "operator" of the DAG and must be a record. +See `Directed acyclic graphs (DAGs)`_ for more details. + +.. productionlist:: + SimpleValue7: `TokIdentifier` + +The resulting value is the value of the entity named by the identifier. The +possible identifiers are described here, but the descriptions will make more +sense after reading the remainder of this guide. + +.. The code for this is exceptionally abstruse. These examples are a + best-effort attempt. + +* A template argument of a ``class``, such as the use of ``Bar`` in:: + + class Foo <int Bar> { + int Baz = Bar; + } + +* The implicit template argument ``NAME`` in a ``class`` or ``multiclass`` + definition (see `NAME`_). 
+ +* A field local to a ``class``, such as the use of ``Bar`` in:: + + class Foo { + int Bar = 5; + int Baz = Bar; + } + +* The name of a record definition, such as the use of ``Bar`` in the + definition of ``Foo``:: + + def Bar : SomeClass { + int X = 5; + } + + def Foo { + SomeClass Baz = Bar; + } + +* A field local to a record definition, such as the use of ``Bar`` in:: + + def Foo { + int Bar = 5; + int Baz = Bar; + } + + Fields inherited from the record's parent classes can be accessed the same way. + +* A template argument of a ``multiclass``, such as the use of ``Bar`` in:: + + multiclass Foo <int Bar> { + def : SomeClass<Bar>; + } + +* A variable defined with the ``defvar`` or ``defset`` statements. + +* The iteration variable of a ``foreach``, such as the use of ``i`` in:: + + foreach i = 0...5 in + def Foo#i; + +.. productionlist:: + SimpleValue8: `ClassID` "<" `ValueListNE` ">" + +This form creates a new anonymous record definition (as would be created by an +unnamed ``def`` inheriting from the given class with the given template +arguments; see `def`_) and the value is that record. A field of the record can be +obtained using a suffix; see `Suffixed Values`_. + +Invoking a class in this manner can provide a simple subroutine facility. +See `Using Classes as Subroutines`_ for more information. + +.. productionlist:: + SimpleValue9: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")" + :| `CondOperator` "(" `CondClause` ("," `CondClause`)* ")" + CondClause: `Value` ":" `Value` + +The bang operators provide functions that are not available with the other +simple values. Except in the case of ``!cond``, a bang operator takes a list +of arguments enclosed in parentheses and performs some function on those +arguments, producing a value for that bang operator. The ``!cond`` operator +takes a list of pairs of arguments separated by colons. See `Appendix A: +Bang Operators`_ for a description of each bang operator. 
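
As a brief illustration of the bang operator forms described above, the
following sketch uses a few operators from the list given earlier. The class
``Describe``, the record ``D``, and the field names are hypothetical; only the
operators themselves come from this document.

.. code-block:: text

  class Describe <int x> {
    // Ordinary bang operators take a parenthesized argument list.
    int Next = !add(x, 1);
    bit IsSmall = !lt(x, 16);
    string Size = !if(!lt(x, 16), "small", "large");

    // !cond takes condition:value pairs separated by colons.
    string Sign = !cond(!lt(x, 0) : "negative",
                        !eq(x, 0) : "zero",
                        true      : "positive");
  }

  def D : Describe<5>;

The final ``true`` clause in ``!cond`` acts as a catch-all so that some clause
always matches.
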
+ + +Suffixed values +--------------- + +The :token:`SimpleValue` values described above can be specified with +certain suffixes. The purpose of a suffix is to obtain a subvalue of the +primary value. Here are the possible suffixes for some primary *value*. + +*value*\ ``{17}`` + The final value is bit 17 of the integer *value* (note the braces). + +*value*\ ``{8...15}`` + The final value is bits 8--15 of the integer *value*. The order of the + bits can be reversed by specifying ``{15...8}``. + +*value*\ ``[4]`` + The final value is element 4 of the list *value* (note the brackets). + In other words, the brackets act as a subscripting operator on the list. + This is the case only when a single element is specified. + +*value*\ ``[4...7,17,2...3,4]`` + The final value is a new list that is a slice of the list *value*. + The new list contains elements 4, 5, 6, 7, 17, 2, 3, and 4. + Elements may be included multiple times and in any order. This is the result + only when more than one element is specified. + +*value*\ ``.``\ *field* + The final value is the value of the specified *field* in the specified + record *value*. + +The paste operator +------------------ + +The paste operator (``#``) is the only infix operator available in TableGen +expressions. It allows you to concatenate strings or lists, but has a few +unusual features. + +The paste operator can be used when specifying the record name in a +:token:`Def` or :token:`Defm` statement, in which case it must construct a +string. If an operand is an undefined name (:token:`TokIdentifier`) or the +name of a global :token:`Defvar` or :token:`Defset`, it is treated as a +verbatim string of characters. The value of a global name is not used. + +The paste operator can be used in all other value expressions, in which case +it can construct a string or a list. 
Rather oddly, but consistent with the +previous case, if the *right-hand-side* operand is an undefined name or a +global name, it is treated as a verbatim string of characters. The +left-hand-side operand is treated normally. + +`Appendix B: Paste Operator Examples`_ presents examples of the behavior of +the paste operator. + +Statements +========== + +The following statements may appear at the top level of TableGen source +files. + +.. productionlist:: + TableGenFile: `Statement`* + Statement: `Assert` | `Class` | `Def` | `Defm` | `Defset` | `Defvar` + :| `Foreach` | `If` | `Let` | `MultiClass` + +The following sections describe each of these top-level statements. + + +``class`` --- define an abstract record class +--------------------------------------------- + +A ``class`` statement defines an abstract record class from which other +classes and records can inherit. + +.. productionlist:: + Class: "class" `ClassID` [`TemplateArgList`] `RecordBody` + TemplateArgList: "<" `TemplateArgDecl` ("," `TemplateArgDecl`)* ">" + TemplateArgDecl: `Type` `TokIdentifier` ["=" `Value`] + +A class can be parameterized by a list of "template arguments," whose values +can be used in the class's record body. These template arguments are +specified each time the class is inherited by another class or record. + +If a template argument is not assigned a default value with ``=``, it is +uninitialized (has the "value" ``?``) and must be specified in the template +argument list when the class is inherited (required argument). If an +argument is assigned a default value, then it need not be specified in the +argument list (optional argument). In the declaration, all required template +arguments must precede any optional arguments. The template argument default +values are evaluated from left to right. + +The :token:`RecordBody` is defined below. It can include a list of +parent classes from which the current class inherits, along with field +definitions and other statements. 
When a class ``C`` inherits from another +class ``D``, the fields of ``D`` are effectively merged into the fields of +``C``. + +A given class can only be defined once. A ``class`` statement is +considered to define the class if *any* of the following are true (the +:token:`RecordBody` elements are described below). + +* The :token:`TemplateArgList` is present, or +* The :token:`ParentClassList` in the :token:`RecordBody` is present, or +* The :token:`Body` in the :token:`RecordBody` is present and not empty. + +You can declare an empty class by specifying an empty :token:`TemplateArgList` +and an empty :token:`RecordBody`. This can serve as a restricted form of +forward declaration. Note that records derived from a forward-declared +class will inherit no fields from it, because those records are built when +their declarations are parsed, and thus before the class is finally defined. + +.. _NAME: + +Every class has an implicit template argument named ``NAME`` (uppercase), +which is bound to the name of the :token:`Def` or :token:`Defm` inheriting +from the class. If the class is inherited by an anonymous record, the name +is unspecified but globally unique. + +See `Examples: classes and records`_ for examples. + +Record Bodies +````````````` + +Record bodies appear in both class and record definitions. A record body can +include a parent class list, which specifies the classes from which the +current class or record inherits fields. Such classes are called the +parent classes of the class or record. The record body also +includes the main body of the definition, which contains the specification +of the fields of the class or record. + +.. 
productionlist:: + RecordBody: `ParentClassList` `Body` + ParentClassList: [":" `ParentClassListNE`] + ParentClassListNE: `ClassRef` ("," `ClassRef`)* + ClassRef: (`ClassID` | `MultiClassID`) ["<" [`ValueList`] ">"] + +A :token:`ParentClassList` containing a :token:`MultiClassID` is valid only +in the class list of a ``defm`` statement. In that case, the ID must be the +name of a multiclass. + +.. productionlist:: + Body: ";" | "{" `BodyItem`* "}" + BodyItem: (`Type` | "code") `TokIdentifier` ["=" `Value`] ";" + :| "let" `TokIdentifier` ["{" `RangeList` "}"] "=" `Value` ";" + :| "defvar" `TokIdentifier` "=" `Value` ";" + :| `Assert` + +A field definition in the body specifies a field to be included in the class +or record. If no initial value is specified, then the field's value is +uninitialized. The type must be specified; TableGen will not infer it from +the value. The keyword ``code`` may be used to emphasize that the field +has a string value that is code. + +The ``let`` form is used to reset a field to a new value. This can be done +for fields defined directly in the body or fields inherited from parent +classes. A :token:`RangeList` can be specified to reset certain bits in a +``bits<n>`` field. + +The ``defvar`` form defines a variable whose value can be used in other +value expressions within the body. The variable is not a field: it does not +become a field of the class or record being defined. Variables are provided +to hold temporary values while processing the body. See `Defvar in a Record +Body`_ for more details. + +When class ``C2`` inherits from class ``C1``, it acquires all the field +definitions of ``C1``. As those definitions are merged into class ``C2``, any +template arguments passed to ``C1`` by ``C2`` are substituted into the +definitions. In other words, the abstract record fields defined by ``C1`` are +expanded with the template arguments before being merged into ``C2``. + + +.. 
_def: + +``def`` --- define a concrete record +------------------------------------ + +A ``def`` statement defines a new concrete record. + +.. productionlist:: + Def: "def" [`NameValue`] `RecordBody` + NameValue: `Value` (parsed in a special mode) + +The name value is optional. If specified, it is parsed in a special mode +where undefined (unrecognized) identifiers are interpreted as literal +strings. In particular, global identifiers are considered unrecognized. +These include global variables defined by ``defvar`` and ``defset``. A +record name can be the null string. + +If no name value is given, the record is *anonymous*. The final name of an +anonymous record is unspecified but globally unique. + +Special handling occurs if a ``def`` appears inside a ``multiclass`` +statement. See the ``multiclass`` section below for details. + +A record can inherit from one or more classes by specifying the +:token:`ParentClassList` clause at the beginning of its record body. All of +the fields in the parent classes are added to the record. If two or more +parent classes provide the same field, the record ends up with the field value +of the last parent class. + +As a special case, the name of a record can be passed as a template argument +to that record's parent classes. For example: + +.. code-block:: text + + class A <dag d> { + dag the_dag = d; + } + + def rec1 : A<(ops rec1)> + +The DAG ``(ops rec1)`` is passed as a template argument to class ``A``. Notice +that the DAG includes ``rec1``, the record being defined. + +The steps taken to create a new record are somewhat complex. See `How +records are built`_. + +See `Examples: classes and records`_ for examples. + + +Examples: classes and records +----------------------------- + +Here is a simple TableGen file with one class and two record definitions. + +.. 
code-block:: text + + class C { + bit V = true; + } + + def X : C; + def Y : C { + let V = false; + string Greeting = "Hello!"; + } + +First, the abstract class ``C`` is defined. It has one field named ``V`` +that is a bit initialized to true. + +Next, two records are defined, derived from class ``C``; that is, with ``C`` +as their parent class. Thus they both inherit the ``V`` field. Record ``Y`` +also defines another string field, ``Greeting``, which is initialized to +``"Hello!"``. In addition, ``Y`` overrides the inherited ``V`` field, +setting it to false. + +A class is useful for isolating the common features of multiple records in +one place. A class can initialize common fields to default values, but +records inheriting from that class can override the defaults. + +TableGen supports the definition of parameterized classes as well as +nonparameterized ones. Parameterized classes specify a list of variable +declarations, which may optionally have defaults, that are bound when the +class is specified as a parent class of another class or record. + +.. code-block:: text + + class FPFormat <bits<3> val> { + bits<3> Value = val; + } + + def NotFP : FPFormat<0>; + def ZeroArgFP : FPFormat<1>; + def OneArgFP : FPFormat<2>; + def OneArgFPRW : FPFormat<3>; + def TwoArgFP : FPFormat<4>; + def CompareFP : FPFormat<5>; + def CondMovFP : FPFormat<6>; + def SpecialFP : FPFormat<7>; + +The purpose of the ``FPFormat`` class is to act as a sort of enumerated +type. It provides a single field, ``Value``, which holds a 3-bit number. Its +template argument, ``val``, is used to set the ``Value`` field. Each of the +eight records is defined with ``FPFormat`` as its parent class. The +enumeration value is passed in angle brackets as the template argument. Each +record will inherit the ``Value`` field with the appropriate enumeration +value. + +Here is a more complex example of classes with template arguments. First, we +define a class similar to the ``FPFormat`` class above. 
It takes a template +argument and uses it to initialize a field named ``Value``. Then we define +four records that inherit the ``Value`` field with its four different +integer values. + +.. code-block:: text + + class ModRefVal <bits<2> val> { + bits<2> Value = val; + } + + def None : ModRefVal<0>; + def Mod : ModRefVal<1>; + def Ref : ModRefVal<2>; + def ModRef : ModRefVal<3>; + +This is somewhat contrived, but let's say we would like to examine the two +bits of the ``Value`` field independently. We can define a class that +accepts a ``ModRefVal`` record as a template argument and splits up its +value into two fields, one bit each. Then we can define records that inherit from +``ModRefBits`` and so acquire two fields from it, one for each bit in the +``ModRefVal`` record passed as the template argument. + +.. code-block:: text + + class ModRefBits <ModRefVal mrv> { + // Break the value up into its bits, which can provide a nice + // interface to the ModRefVal values. + bit isMod = mrv.Value{0}; + bit isRef = mrv.Value{1}; + } + + // Example uses. + def foo : ModRefBits<Mod>; + def bar : ModRefBits<Ref>; + def snork : ModRefBits<ModRef>; + +This illustrates how one class can be defined to reorganize the +fields in another class, thus hiding the internal representation of that +other class. + +Running ``llvm-tblgen`` on the example prints the following definitions: + +.. code-block:: text + + def bar { // Value + bit isMod = 0; + bit isRef = 1; + } + def foo { // Value + bit isMod = 1; + bit isRef = 0; + } + def snork { // Value + bit isMod = 1; + bit isRef = 1; + } + +``let`` --- override fields in classes or records +------------------------------------------------- + +A ``let`` statement collects a set of field values (sometimes called +*bindings*) and applies them to all the classes and records defined by +statements within the scope of the ``let``. + +.. 
productionlist:: + Let: "let" `LetList` "in" "{" `Statement`* "}" + :| "let" `LetList` "in" `Statement` + LetList: `LetItem` ("," `LetItem`)* + LetItem: `TokIdentifier` ["<" `RangeList` ">"] "=" `Value` + +The ``let`` statement establishes a scope, which is a sequence of statements +in braces or a single statement with no braces. The bindings in the +:token:`LetList` apply to the statements in that scope. + +The field names in the :token:`LetList` must name fields in classes inherited by +the classes and records defined in the statements. The field values are +applied to the classes and records *after* the records inherit all the fields from +their parent classes. So the ``let`` acts to override inherited field +values. A ``let`` cannot override the value of a template argument. + +Top-level ``let`` statements are often useful when a few fields need to be +overridden in several records. Here are two examples. Note that ``let`` +statements can be nested. + +.. code-block:: text + + let isTerminator = true, isReturn = true, isBarrier = true, hasCtrlDep = true in + def RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>; + + let isCall = true in + // All calls clobber the non-callee saved registers... + let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0, + MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, XMM0, XMM1, XMM2, + XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] in { + def CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst, variable_ops), + "call\t${dst:call}", []>; + def CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops), + "call\t{*}$dst", [(X86call GR32:$dst)]>; + def CALL32m : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops), + "call\t{*}$dst", []>; + } + +Note that a top-level ``let`` will not override fields defined in the classes or records +themselves. 
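
As a smaller, self-contained illustration of a top-level ``let``, here is a
minimal sketch. The ``Inst`` class and the record names are hypothetical and
exist only for this example. The ``let`` overrides the inherited ``isCall``
field in the two records inside its scope, while ``NOP``, which is defined
outside that scope, should keep the class default.

.. code-block:: text

  // Hypothetical class used only for this sketch.
  class Inst {
    bit isCall = false;
    bit isTerminator = false;
  }

  // isCall is inherited from Inst, so the top-level let can override it.
  let isCall = true in {
    def CALL_A : Inst;
    def CALL_B : Inst;
  }

  // Outside the scope of the let, the class default applies.
  def NOP : Inst;
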
+ + +``multiclass`` --- define multiple records +------------------------------------------ + +While classes with template arguments are a good way to factor out commonality +between multiple records, multiclasses allow a convenient method for +defining many records at once. For example, consider a 3-address +instruction architecture whose instructions come in two formats: ``reg = reg +op reg`` and ``reg = reg op imm`` (e.g., SPARC). We would like to specify in +one place that these two common formats exist, then in a separate place +specify what all the operations are. The ``multiclass`` and ``defm`` +statements accomplish this goal. You can think of a multiclass as a macro or +template that expands into multiple records. + +.. productionlist:: + MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`] + : [":" `ParentMultiClassList`] + : "{" `MultiClassStatement`+ "}" + ParentMultiClassList: `MultiClassID` ("," `MultiClassID`)* + MultiClassID: `TokIdentifier` + MultiClassStatement: `Assert` | `Def` | `Defm` | `Defvar` | `Foreach` | `If` | `Let` + +As with regular classes, the multiclass has a name and can accept template +arguments. A multiclass can inherit from other multiclasses, which causes +the other multiclasses to be expanded and contribute to the record +definitions in the inheriting multiclass. The body of the multiclass +contains a series of statements that define records, using :token:`Def` and +:token:`Defm`. In addition, :token:`Defvar`, :token:`Foreach`, and +:token:`Let` statements can be used to factor out even more common elements. +The :token:`If` and :token:`Assert` statements can also be used. + +Also as with regular classes, the multiclass has the implicit template +argument ``NAME`` (see NAME_). When a named (non-anonymous) record is +defined in a multiclass and the record's name does not include a use of the +template argument ``NAME``, such a use is automatically *prepended* +to the name. 
That is, the following are equivalent inside a multiclass:: + + def Foo ... + def NAME # Foo ... + +The records defined in a multiclass are created when the multiclass is +"instantiated" or "invoked" by a ``defm`` statement outside the multiclass +definition. Each ``def`` statement in the multiclass produces a record. As +with top-level ``def`` statements, these definitions can inherit from +multiple parent classes. + +See `Examples: multiclasses and defms`_ for examples. + + +``defm`` --- invoke multiclasses to define multiple records +----------------------------------------------------------- + +Once multiclasses have been defined, you use the ``defm`` statement to +"invoke" them and process the multiple record definitions in those +multiclasses. Those record definitions are specified by ``def`` +statements in the multiclasses, and indirectly by ``defm`` statements. + +.. productionlist:: + Defm: "defm" [`NameValue`] `ParentClassList` ";" + +The optional :token:`NameValue` is formed in the same way as the name of a +``def``. The :token:`ParentClassList` is a colon followed by a list of at +least one multiclass and any number of regular classes. The multiclasses +must precede the regular classes. Note that the ``defm`` does not have a +body. + +This statement instantiates all the records defined in all the specified +multiclasses, either directly by ``def`` statements or indirectly by +``defm`` statements. These records also receive the fields defined in any +regular classes included in the parent class list. This is useful for adding +a common set of fields to all the records created by the ``defm``. + +The name is parsed in the same special mode used by ``def``. If the name is +not included, an unspecified but globally unique name is provided. That is, +the following examples end up with different names:: + + defm : SomeMultiClass<...>; // A globally unique name. + defm "" : SomeMultiClass<...>; // An empty name. 
+ +The ``defm`` statement can be used in a multiclass body. When this occurs, +the second variant is equivalent to:: + + defm NAME : SomeMultiClass<...>; + +More generally, when ``defm`` occurs in a multiclass and its name does not +include a use of the implicit template argument ``NAME``, then ``NAME`` will +be prepended automatically. That is, the following are equivalent inside a +multiclass:: + + defm Foo : SomeMultiClass<...>; + defm NAME # Foo : SomeMultiClass<...>; + +See `Examples: multiclasses and defms`_ for examples. + +Examples: multiclasses and defms +-------------------------------- + +Here is a simple example using ``multiclass`` and ``defm``. Consider a +3-address instruction architecture whose instructions come in two formats: +``reg = reg op reg`` and ``reg = reg op imm`` (immediate). The SPARC is an +example of such an architecture. + +.. code-block:: text + + def ops; + def GPR; + def Imm; + class inst <int opc, string asmstr, dag operandlist>; + + multiclass ri_inst <int opc, string asmstr> { + def _rr : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), + (ops GPR:$dst, GPR:$src1, GPR:$src2)>; + def _ri : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), + (ops GPR:$dst, GPR:$src1, Imm:$src2)>; + } + + // Define records for each instruction in the RR and RI formats. + defm ADD : ri_inst<0b111, "add">; + defm SUB : ri_inst<0b101, "sub">; + defm MUL : ri_inst<0b100, "mul">; + +Each use of the ``ri_inst`` multiclass defines two records, one with the +``_rr`` suffix and one with ``_ri``. Recall that the name of the ``defm`` +that uses a multiclass is prepended to the names of the records defined in +that multiclass. So the resulting definitions are named:: + + ADD_rr, ADD_ri + SUB_rr, SUB_ri + MUL_rr, MUL_ri + +Without the ``multiclass`` feature, the instructions would have to be +defined as follows. + +.. 
code-block:: text + + def ops; + def GPR; + def Imm; + class inst <int opc, string asmstr, dag operandlist>; + + class rrinst <int opc, string asmstr> + : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), + (ops GPR:$dst, GPR:$src1, GPR:$src2)>; + + class riinst <int opc, string asmstr> + : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), + (ops GPR:$dst, GPR:$src1, Imm:$src2)>; + + // Define records for each instruction in the RR and RI formats. + def ADD_rr : rrinst<0b111, "add">; + def ADD_ri : riinst<0b111, "add">; + def SUB_rr : rrinst<0b101, "sub">; + def SUB_ri : riinst<0b101, "sub">; + def MUL_rr : rrinst<0b100, "mul">; + def MUL_ri : riinst<0b100, "mul">; + +A ``defm`` can be used in a multiclass to "invoke" other multiclasses and +create the records defined in those multiclasses in addition to the records +defined in the current multiclass. In the following example, the ``basic_s`` +and ``basic_p`` multiclasses contain ``defm`` statements that refer to the +``basic_r`` multiclass. The ``basic_r`` multiclass contains only ``def`` +statements. + +.. code-block:: text + + class Instruction <bits<4> opc, string Name> { + bits<4> opcode = opc; + string name = Name; + } + + multiclass basic_r <bits<4> opc> { + def rr : Instruction<opc, "rr">; + def rm : Instruction<opc, "rm">; + } + + multiclass basic_s <bits<4> opc> { + defm SS : basic_r<opc>; + defm SD : basic_r<opc>; + def X : Instruction<opc, "x">; + } + + multiclass basic_p <bits<4> opc> { + defm PS : basic_r<opc>; + defm PD : basic_r<opc>; + def Y : Instruction<opc, "y">; + } + + defm ADD : basic_s<0xf>, basic_p<0xf>; + +The final ``defm`` creates the following records, five from the ``basic_s`` +multiclass and five from the ``basic_p`` multiclass:: + + ADDSSrr, ADDSSrm + ADDSDrr, ADDSDrm + ADDX + ADDPSrr, ADDPSrm + ADDPDrr, ADDPDrm + ADDY + +A ``defm`` statement, both at top level and in a multiclass, can inherit +from regular classes in addition to multiclasses. 
The rule is that the +regular classes must be listed after the multiclasses, and there must be at least +one multiclass. + +.. code-block:: text + + class XD { + bits<4> Prefix = 11; + } + class XS { + bits<4> Prefix = 12; + } + class I <bits<4> op> { + bits<4> opcode = op; + } + + multiclass R { + def rr : I<4>; + def rm : I<2>; + } + + multiclass Y { + defm SS : R, XD; // First multiclass R, then regular class XD. + defm SD : R, XS; + } + + defm Instr : Y; + +This example will create four records, shown here in alphabetical order with +their fields. + +.. code-block:: text + + def InstrSDrm { + bits<4> opcode = { 0, 0, 1, 0 }; + bits<4> Prefix = { 1, 1, 0, 0 }; + } + + def InstrSDrr { + bits<4> opcode = { 0, 1, 0, 0 }; + bits<4> Prefix = { 1, 1, 0, 0 }; + } + + def InstrSSrm { + bits<4> opcode = { 0, 0, 1, 0 }; + bits<4> Prefix = { 1, 0, 1, 1 }; + } + + def InstrSSrr { + bits<4> opcode = { 0, 1, 0, 0 }; + bits<4> Prefix = { 1, 0, 1, 1 }; + } + +It's also possible to use ``let`` statements inside multiclasses, providing +another way to factor out commonality from the records, especially when +using several levels of multiclass instantiations. + +.. code-block:: text + + multiclass basic_r <bits<4> opc> { + let Predicates = [HasSSE2] in { + def rr : Instruction<opc, "rr">; + def rm : Instruction<opc, "rm">; + } + let Predicates = [HasSSE3] in + def rx : Instruction<opc, "rx">; + } + + multiclass basic_ss <bits<4> opc> { + let IsDouble = false in + defm SS : basic_r<opc>; + + let IsDouble = true in + defm SD : basic_r<opc>; + } + + defm ADD : basic_ss<0xf>; + + +``defset`` --- create a definition set +-------------------------------------- + +The ``defset`` statement is used to collect a set of records into a global +list of records. + +.. 
productionlist:: + Defset: "defset" `Type` `TokIdentifier` "=" "{" `Statement`* "}" + +All records defined inside the braces via ``def`` and ``defm`` are defined +as usual, and they are also collected in a global list of the given name +(:token:`TokIdentifier`). + +The specified type must be ``list<``\ *class*\ ``>``, where *class* is some +record class. The ``defset`` statement establishes a scope for its +statements. It is an error to define a record in the scope of the +``defset`` that is not of type *class*. + +The ``defset`` statement can be nested. The inner ``defset`` adds the +records to its own set, and all those records are also added to the outer +set. + +Anonymous records created inside initialization expressions using the +``ClassID<...>`` syntax are not collected in the set. + + +``defvar`` --- define a variable +-------------------------------- + +A ``defvar`` statement defines a global variable. Its value can be used +throughout the statements that follow the definition. + +.. productionlist:: + Defvar: "defvar" `TokIdentifier` "=" `Value` ";" + +The identifier on the left of the ``=`` is defined to be a global variable +whose value is given by the value expression on the right of the ``=``. The +type of the variable is automatically inferred. + +Once a variable has been defined, it cannot be set to another value. + +Variables defined in a top-level ``foreach`` go out of scope at the end of +each loop iteration, so their value in one iteration is not available in +the next iteration. The following ``defvar`` will not work:: + + defvar i = !add(i, 1) + +Variables can also be defined with ``defvar`` in a record body. See +`Defvar in a Record Body`_ for more details. + +``foreach`` --- iterate over a sequence of statements +----------------------------------------------------- + +The ``foreach`` statement iterates over a series of statements, varying a +variable over a sequence of values. + +.. 
productionlist:: + Foreach: "foreach" `ForeachIterator` "in" "{" `Statement`* "}" + :| "foreach" `ForeachIterator` "in" `Statement` + ForeachIterator: `TokIdentifier` "=" ("{" `RangeList` "}" | `RangePiece` | `Value`) + +The body of the ``foreach`` is a series of statements in braces or a +single statement with no braces. The statements are re-evaluated once for +each value in the range list, range piece, or single value. On each +iteration, the :token:`TokIdentifier` variable is set to the value and can +be used in the statements. + +The statement list establishes an inner scope. Variables local to a +``foreach`` go out of scope at the end of each loop iteration, so their +values do not carry over from one iteration to the next. Foreach loops may +be nested. + +The ``foreach`` statement can also be used in a record :token:`Body`. + +.. Note that the productions involving RangeList and RangePiece have precedence + over the more generic value parsing based on the first token. + +.. code-block:: text + + foreach i = [0, 1, 2, 3] in { + def R#i : Register<...>; + def F#i : Register<...>; + } + +This loop defines records named ``R0``, ``R1``, ``R2``, and ``R3``, along +with ``F0``, ``F1``, ``F2``, and ``F3``. + + +``if`` --- select statements based on a test +-------------------------------------------- + +The ``if`` statement allows one of two statement groups to be selected based +on the value of an expression. + +.. productionlist:: + If: "if" `Value` "then" `IfBody` + :| "if" `Value` "then" `IfBody` "else" `IfBody` + IfBody: "{" `Statement`* "}" | `Statement` + +The value expression is evaluated. If it evaluates to true (in the same +sense used by the bang operators), then the statements following the +``then`` reserved word are processed. Otherwise, if there is an ``else`` +reserved word, the statements following the ``else`` are processed. If the +value is false and there is no ``else`` arm, no statements are processed. 
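
The following is a minimal sketch of a top-level ``if``. The ``Reg`` class and
the ``HasFP`` variable are hypothetical names introduced only for
illustration. Because ``HasFP`` is true, only the statements in the ``then``
arm should be processed, so ``F0`` is defined and ``R0`` is not.

.. code-block:: text

  // Hypothetical class used only for this sketch.
  class Reg<string n> {
    string Name = n;
  }

  // Hypothetical global variable that selects which arm is processed.
  defvar HasFP = true;

  if HasFP then {
    def F0 : Reg<"f0">;
  } else {
    def R0 : Reg<"r0">;
  }
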
+ +Because the braces around the ``then`` statements are optional, this grammar rule +has the usual ambiguity with "dangling else" clauses, and it is resolved in +the usual way: in a case like ``if v1 then if v2 then {...} else {...}``, the +``else`` associates with the inner ``if`` rather than the outer one. + +The :token:`IfBody` of the then and else arms of the ``if`` establish an +inner scope. Any ``defvar`` variables defined in the bodies go out of scope +when the bodies are finished (see `Defvar in a Record Body`_ for more details). + +The ``if`` statement can also be used in a record :token:`Body`. + + +``assert`` --- check that a condition is true +--------------------------------------------- + +The ``assert`` statement checks a boolean condition to be sure that it is true +and prints an error message if it is not. + +.. productionlist:: + Assert: "assert" `condition` "," `message` ";" + +If the boolean condition is true, the statement does nothing. If the +condition is false, it prints a nonfatal error message. The **message**, which +can be an arbitrary string expression, is included in the error message as a +note. The exact behavior of the ``assert`` statement depends on its +placement. + +* At top level, the assertion is checked immediately. + +* In a record definition, the statement is saved and all assertions are + checked after the record is completely built. + +* In a class definition, the assertions are saved and inherited by all + the subclasses and records that inherit from the class. The assertions are + then checked when the records are completely built. + +* In a multiclass definition, the assertions are saved with the other + components of the multiclass and then checked each time the multiclass + is instantiated with ``defm``. + +Using assertions in TableGen files can simplify record checking in TableGen +backends. Here is an example of an ``assert`` in two class definitions. + +.. 
code-block:: text + + class PersonName<string name> { + assert !le(!size(name), 32), "person name is too long: " # name; + string Name = name; + } + + class Person<string name, int age> : PersonName<name> { + assert !and(!ge(age, 1), !le(age, 120)), "person age is invalid: " # age; + int Age = age; + } + + def Rec20 : Person<"Donald Knuth", 60> { + ... + } + + +Additional Details +================== + +Directed acyclic graphs (DAGs) +------------------------------ + +A directed acyclic graph can be represented directly in TableGen using the +``dag`` datatype. A DAG node consists of an operator and zero or more +arguments (or operands). Each argument can be of any desired type. By using +another DAG node as an argument, an arbitrary graph of DAG nodes can be +built. + +The syntax of a ``dag`` instance is: + + ``(`` *operator* *argument1*\ ``,`` *argument2*\ ``,`` ... ``)`` + +The operator must be present and must be a record. There can be zero or more +arguments, separated by commas. The operator and arguments can have three +formats. + +====================== ============================================= +Format Meaning +====================== ============================================= +*value* argument value +*value*\ ``:``\ *name* argument value and associated name +*name* argument name with unset (uninitialized) value +====================== ============================================= + +The *value* can be any TableGen value. The *name*, if present, must be a +:token:`TokVarName`, which starts with a dollar sign (``$``). The purpose of +a name is to tag an operator or argument in a DAG with a particular meaning, +or to associate an argument in one DAG with a like-named argument in another +DAG. + +The following bang operators are useful for working with DAGs: +``!con``, ``!dag``, ``!empty``, ``!foreach``, ``!getdagop``, ``!setdagop``, ``!size``. 
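
As a small sketch of the syntax and two of these operators, consider the
following record. The records ``ops`` and ``GPR`` and the field names are
placeholders introduced only for illustration. ``!con`` joins two DAG nodes
that share the operator ``ops``, and ``!size`` counts the arguments of the
result, which should be 3 here.

.. code-block:: text

  // Placeholder records: ops serves as the DAG operator, GPR as an argument.
  def ops;
  def GPR;

  def Example {
    // A DAG node with operator ops and two named arguments.
    dag Operands = (ops GPR:$dst, GPR:$src1);
    // Concatenate with a second node that has the same operator.
    dag AllOperands = !con(Operands, (ops GPR:$src2));
    // The operator itself is not counted.
    int NumOperands = !size(AllOperands);
  }
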
+ +Defvar in a record body +----------------------- + +In addition to defining global variables, the ``defvar`` statement can +be used inside the :token:`Body` of a class or record definition to define +local variables. The scope of the variable extends from the ``defvar`` +statement to the end of the body. It cannot be set to a different value +within its scope. The ``defvar`` statement can also be used in the statement +list of a ``foreach``, which establishes a scope. + +A variable named ``V`` in an inner scope shadows (hides) any variables ``V`` +in outer scopes. In particular, ``V`` in a record body shadows a global +``V``, and ``V`` in a ``foreach`` statement list shadows any ``V`` in +surrounding record or global scopes. + +Variables defined in a ``foreach`` go out of scope at the end of +each loop iteration, so their value in one iteration is not available in +the next iteration. The following ``defvar`` will not work:: + + defvar i = !add(i, 1) + +How records are built +--------------------- + +The following steps are taken by TableGen when a record is built. Classes are simply +abstract records and so go through the same steps. + +1. Build the record name (:token:`NameValue`) and create an empty record. + +2. Parse the parent classes in the :token:`ParentClassList` from left to + right, visiting each parent class's ancestor classes from top to bottom. + + a. Add the fields from the parent class to the record. + b. Substitute the template arguments into those fields. + c. Add the parent class to the record's list of inherited classes. + +3. Apply any top-level ``let`` bindings to the record. Recall that top-level + bindings only apply to inherited fields. + +4. Parse the body of the record. + + * Add any fields to the record. + * Modify the values of fields according to local ``let`` statements. + * Define any ``defvar`` variables. + +5. Make a pass over all the fields to resolve any inter-field references. + +6. Add the record to the master record list. 
+ +Because references between fields are resolved (step 5) after ``let`` bindings are +applied (step 3), the ``let`` statement has unusual power. For example: + +.. code-block:: text + + class C <int x> { + int Y = x; + int Yplus1 = !add(Y, 1); + int xplus1 = !add(x, 1); + } + + let Y = 10 in { + def rec1 : C<5> { + } + } + + def rec2 : C<5> { + let Y = 10; + } + +In both cases, one where a top-level ``let`` is used to bind ``Y`` and one +where a local ``let`` does the same thing, the results are: + +.. code-block:: text + + def rec1 { // C + int Y = 10; + int Yplus1 = 11; + int xplus1 = 6; + } + def rec2 { // C + int Y = 10; + int Yplus1 = 11; + int xplus1 = 6; + } + +``Yplus1`` is 11 because the ``let Y`` is performed before the ``!add(Y, +1)`` is resolved. Use this power wisely. + + +Using Classes as Subroutines +============================ + +As described in `Simple values`_, a class can be invoked in an expression +and passed template arguments. This causes TableGen to create a new anonymous +record inheriting from that class. As usual, the record receives all the +fields defined in the class. + +This feature can be employed as a simple subroutine facility. The class can +use the template arguments to define various variables and fields, which end +up in the anonymous record. Those fields can then be retrieved in the +expression invoking the class as follows. Assume that the field ``ret`` +contains the final value of the subroutine. + +.. code-block:: text + + int Result = ... CalcValue<arg>.ret ...; + +The ``CalcValue`` class is invoked with the template argument ``arg``. It +calculates a value for the ``ret`` field, which is then retrieved at the +"point of call" in the initialization for the Result field. The anonymous +record created in this example serves no other purpose than to carry the +result value. + +Here is a practical example. The class ``isValidSize`` determines whether a +specified number of bytes represents a valid data size. 
The bit ``ret`` is +set appropriately. The field ``ValidSize`` obtains its initial value by +invoking ``isValidSize`` with the data size and retrieving the ``ret`` field +from the resulting anonymous record. + +.. code-block:: text + + class isValidSize<int size> { + bit ret = !cond(!eq(size, 1): 1, + !eq(size, 2): 1, + !eq(size, 4): 1, + !eq(size, 8): 1, + !eq(size, 16): 1, + true: 0); + } + + def Data1 { + int Size = ...; + bit ValidSize = isValidSize<Size>.ret; + } + +Preprocessing Facilities +======================== + +The preprocessor embedded in TableGen is intended only for simple +conditional compilation. It supports the following directives, which are +specified somewhat informally. + +.. productionlist:: + LineBegin: beginning of line + LineEnd: newline | return | EOF + WhiteSpace: space | tab + CComment: "/*" ... "*/" + BCPLComment: "//" ... `LineEnd` + WhiteSpaceOrCComment: `WhiteSpace` | `CComment` + WhiteSpaceOrAnyComment: `WhiteSpace` | `CComment` | `BCPLComment` + MacroName: `ualpha` (`ualpha` | "0"..."9")* + PreDefine: `LineBegin` (`WhiteSpaceOrCComment`)* + : "#define" (`WhiteSpace`)+ `MacroName` + : (`WhiteSpaceOrAnyComment`)* `LineEnd` + PreIfdef: `LineBegin` (`WhiteSpaceOrCComment`)* + : ("#ifdef" | "#ifndef") (`WhiteSpace`)+ `MacroName` + : (`WhiteSpaceOrAnyComment`)* `LineEnd` + PreElse: `LineBegin` (`WhiteSpaceOrCComment`)* + : "#else" (`WhiteSpaceOrAnyComment`)* `LineEnd` + PreEndif: `LineBegin` (`WhiteSpaceOrCComment`)* + : "#endif" (`WhiteSpaceOrAnyComment`)* `LineEnd` + +.. + PreRegContentException: `PreIfdef` | `PreElse` | `PreEndif` | EOF + PreRegion: .* - `PreRegContentException` + :| `PreIfdef` + : (`PreRegion`)* + : [`PreElse`] + : (`PreRegion`)* + : `PreEndif` + +A :token:`MacroName` can be defined anywhere in a TableGen file. The name has +no value; it can only be tested to see whether it is defined. + +A macro test region begins with an ``#ifdef`` or ``#ifndef`` directive. 
If +the macro name is defined (``#ifdef``) or undefined (``#ifndef``), then the +source code between the directive and the corresponding ``#else`` or +``#endif`` is processed. If the test fails but there is an ``#else`` +clause, the source code between the ``#else`` and the ``#endif`` is +processed. If the test fails and there is no ``#else`` clause, then no +source code in the test region is processed. + +Test regions may be nested, but they must be properly nested. A region +started in a file must end in that file; that is, must have its +``#endif`` in the same file. + +A :token:`MacroName` may be defined externally using the ``-D`` option on the +``*-tblgen`` command line:: + + llvm-tblgen self-reference.td -Dmacro1 -Dmacro3 + +Appendix A: Bang Operators +========================== + +Bang operators act as functions in value expressions. A bang operator takes +one or more arguments, operates on them, and produces a result. If the +operator produces a boolean result, the result value will be 1 for true or 0 +for false. When an operator tests a boolean argument, it interprets 0 as false +and non-0 as true. + +.. warning:: + The ``!getop`` and ``!setop`` bang operators are deprecated in favor of + ``!getdagop`` and ``!setdagop``. + +``!add(``\ *a*\ ``,`` *b*\ ``, ...)`` + This operator adds *a*, *b*, etc., and produces the sum. + +``!and(``\ *a*\ ``,`` *b*\ ``, ...)`` + This operator does a bitwise AND on *a*, *b*, etc., and produces the + result. A logical AND can be performed if all the arguments are either + 0 or 1. + +``!cast<``\ *type*\ ``>(``\ *a*\ ``)`` + This operator performs a cast on *a* and produces the result. + If *a* is not a string, then a straightforward cast is performed, say + between an ``int`` and a ``bit``, or between record types. This allows + casting a record to a class. If a record is cast to ``string``, the + record's name is produced. 
+ + If *a* is a string, then it is treated as a record name and looked up in + the list of all defined records. The resulting record is expected to be of + the specified *type*. + + For example, if ``!cast<``\ *type*\ ``>(``\ *name*\ ``)`` + appears in a multiclass definition, or in a + class instantiated inside a multiclass definition, and the *name* does not + reference any template arguments of the multiclass, then a record by + that name must have been instantiated earlier + in the source file. If *name* does reference + a template argument, then the lookup is delayed until ``defm`` statements + instantiating the multiclass (or later, if the ``defm`` occurs in another + multiclass and template arguments of the inner multiclass that are + referenced by *name* are substituted by values that themselves contain + references to template arguments of the outer multiclass). + + If the type of *a* does not match *type*, TableGen raises an error. + +``!con(``\ *a*\ ``,`` *b*\ ``, ...)`` + This operator concatenates the DAG nodes *a*, *b*, etc. Their operators + must be equal. + + ``!con((op a1:$name1, a2:$name2), (op b1:$name3))`` + + results in the DAG node ``(op a1:$name1, a2:$name2, b1:$name3)``. + +``!cond(``\ *cond1* ``:`` *val1*\ ``,`` *cond2* ``:`` *val2*\ ``, ...,`` *condn* ``:`` *valn*\ ``)`` + This operator tests *cond1* and returns *val1* if the result is true. + If false, the operator tests *cond2* and returns *val2* if the result is + true. And so forth. An error is reported if no conditions are true. + + This example produces the sign word for an integer:: + + !cond(!lt(x, 0) : "negative", !eq(x, 0) : "zero", true : "positive") + +``!dag(``\ *op*\ ``,`` *arguments*\ ``,`` *names*\ ``)`` + This operator creates a DAG node with the given operator and + arguments. The *arguments* and *names* arguments must be lists + of equal length or uninitialized (``?``). The *names* argument + must be of type ``list<string>``. 
+ + Due to limitations of the type system, *arguments* must be a list of items + of a common type. In practice, this means that they should either have the + same type or be records with a common parent class. Mixing ``dag`` and + non-``dag`` items is not possible. However, ``?`` can be used. + + Example: ``!dag(op, [a1, a2, ?], ["name1", "name2", "name3"])`` results in + ``(op a1-value:$name1, a2-value:$name2, ?:$name3)``. + +``!empty(``\ *a*\ ``)`` + This operator produces 1 if the string, list, or DAG *a* is empty; 0 otherwise. + A DAG is empty if it has no arguments; the operator does not count. + +``!eq(``\ *a*\ ``,`` *b*\ ``)`` + This operator produces 1 if *a* is equal to *b*; 0 otherwise. + The arguments must be ``bit``, ``bits``, ``int``, ``string``, or + record values. Use ``!cast<string>`` to compare other types of objects. + +``!filter(``\ *var*\ ``,`` *list*\ ``,`` *predicate*\ ``)`` + This operator creates a new ``list`` by filtering the elements in + *list*. To perform the filtering, TableGen binds the variable *var* to each + element and then evaluates the *predicate* expression, which presumably + refers to *var*. The predicate must + produce a boolean value (``bit``, ``bits``, or ``int``). The value is + interpreted as with ``!if``: + if the value is 0, the element is not included in the new list. If the value + is anything else, the element is included. + +``!find(``\ *string1*\ ``,`` *string2*\ [``,`` *start*]\ ``)`` + This operator searches for *string2* in *string1* and produces its + position. The starting position of the search may be specified by *start*, + which can range between 0 and the length of *string1*; the default is 0. + If the string is not found, the result is -1. + +``!foldl(``\ *init*\ ``,`` *list*\ ``,`` *acc*\ ``,`` *var*\ ``,`` *expr*\ ``)`` + This operator performs a left-fold over the items in *list*. The + variable *acc* acts as the accumulator and is initialized to *init*. 
+ The variable *var* is bound to each element in the *list*. The + expression is evaluated for each element and presumably uses *acc* and + *var* to calculate the accumulated value, which ``!foldl`` stores back in + *acc*. The type of *acc* is the same as *init*; the type of *var* is the + same as the elements of *list*; *expr* must have the same type as *init*. + + The following example computes the total of the ``Number`` field in the + list of records in ``RecList``:: + + int x = !foldl(0, RecList, total, rec, !add(total, rec.Number)); + + If your goal is to filter the list and produce a new list that includes only + some of the elements, see ``!filter``. + +``!foreach(``\ *var*\ ``,`` *sequence*\ ``,`` *expr*\ ``)`` + This operator creates a new ``list``/``dag`` in which each element is a + function of the corresponding element in the *sequence* ``list``/``dag``. + To perform the function, TableGen binds the variable *var* to an element + and then evaluates the expression. The expression presumably refers + to the variable *var* and calculates the result value. + + If you simply want to create a list of a certain length containing + the same value repeated multiple times, see ``!listsplat``. + +``!ge(``\ *a*\ ``,`` *b*\ ``)`` + This operator produces 1 if *a* is greater than or equal to *b*; 0 otherwise. + The arguments must be ``bit``, ``bits``, ``int``, or ``string`` values. + +``!getdagop(``\ *dag*\ ``)`` --or-- ``!getdagop<``\ *type*\ ``>(``\ *dag*\ ``)`` + This operator produces the operator of the given *dag* node. + Example: ``!getdagop((foo 1, 2))`` results in ``foo``. Recall that + DAG operators are always records. + + The result of ``!getdagop`` can be used directly in a context where + any record class at all is acceptable (typically placing it into + another dag value). But in other contexts, it must be explicitly + cast to a particular class. The ``<``\ *type*\ ``>`` syntax is + provided to make this easy. 
+ + For example, to assign the result to a value of type ``BaseClass``, you + could write either of these:: + + BaseClass b = !getdagop<BaseClass>(someDag); + BaseClass b = !cast<BaseClass>(!getdagop(someDag)); + + But to create a new DAG node that reuses the operator from another, no + cast is necessary:: + + dag d = !dag(!getdagop(someDag), args, names); + +``!gt(``\ *a*\ ``,`` *b*\ ``)`` + This operator produces 1 if *a* is greater than *b*; 0 otherwise. + The arguments must be ``bit``, ``bits``, ``int``, or ``string`` values. + +``!head(``\ *a*\ ``)`` + This operator produces the zeroth element of the list *a*. + (See also ``!tail``.) + +``!if(``\ *test*\ ``,`` *then*\ ``,`` *else*\ ``)`` + This operator evaluates the *test*, which must produce a ``bit`` or + ``int``. If the result is not 0, the *then* expression is produced; otherwise + the *else* expression is produced. + +``!interleave(``\ *list*\ ``,`` *delim*\ ``)`` + This operator concatenates the items in the *list*, interleaving the + *delim* string between each pair, and produces the resulting string. + The list can be a list of string, int, bits, or bit. An empty list + results in an empty string. The delimiter can be the empty string. + +``!isa<``\ *type*\ ``>(``\ *a*\ ``)`` + This operator produces 1 if the type of *a* is a subtype of the given *type*; 0 + otherwise. + +``!le(``\ *a*\ ``,`` *b*\ ``)`` + This operator produces 1 if *a* is less than or equal to *b*; 0 otherwise. + The arguments must be ``bit``, ``bits``, ``int``, or ``string`` values. + +``!listconcat(``\ *list1*\ ``,`` *list2*\ ``, ...)`` + This operator concatenates the list arguments *list1*, *list2*, etc., and + produces the resulting list. The lists must have the same element type. + +``!listsplat(``\ *value*\ ``,`` *count*\ ``)`` + This operator produces a list of length *count* whose elements are all + equal to the *value*. For example, ``!listsplat(42, 3)`` results in + ``[42, 42, 42]``.
+ +``!lt(``\ *a*\ ``,`` *b*\ ``)`` + This operator produces 1 if *a* is less than *b*; 0 otherwise. + The arguments must be ``bit``, ``bits``, ``int``, or ``string`` values. + +``!mul(``\ *a*\ ``,`` *b*\ ``, ...)`` + This operator multiplies *a*, *b*, etc., and produces the product. + +``!ne(``\ *a*\ ``,`` *b*\ ``)`` + This operator produces 1 if *a* is not equal to *b*; 0 otherwise. + The arguments must be ``bit``, ``bits``, ``int``, ``string``, + or record values. Use ``!cast<string>`` to compare other types of objects. + +``!not(``\ *a*\ ``)`` + This operator performs a logical NOT on *a*, which must be + an integer. The argument 0 results in 1 (true); any other + argument results in 0 (false). + +``!or(``\ *a*\ ``,`` *b*\ ``, ...)`` + This operator does a bitwise OR on *a*, *b*, etc., and produces the + result. A logical OR can be performed if all the arguments are either + 0 or 1. + +``!setdagop(``\ *dag*\ ``,`` *op*\ ``)`` + This operator produces a DAG node with the same arguments as *dag*, but with its + operator replaced with *op*. + + Example: ``!setdagop((foo 1, 2), bar)`` results in ``(bar 1, 2)``. + +``!shl(``\ *a*\ ``,`` *count*\ ``)`` + This operator shifts *a* left logically by *count* bits and produces the resulting + value. The operation is performed on a 64-bit integer; the result + is undefined for shift counts outside 0...63. + +``!size(``\ *a*\ ``)`` + This operator produces the size of the string, list, or dag *a*. + The size of a DAG is the number of arguments; the operator does not count. + +``!sra(``\ *a*\ ``,`` *count*\ ``)`` + This operator shifts *a* right arithmetically by *count* bits and produces the resulting + value. The operation is performed on a 64-bit integer; the result + is undefined for shift counts outside 0...63. + +``!srl(``\ *a*\ ``,`` *count*\ ``)`` + This operator shifts *a* right logically by *count* bits and produces the resulting + value.
The operation is performed on a 64-bit integer; the result + is undefined for shift counts outside 0...63. + +``!strconcat(``\ *str1*\ ``,`` *str2*\ ``, ...)`` + This operator concatenates the string arguments *str1*, *str2*, etc., and + produces the resulting string. + +``!sub(``\ *a*\ ``,`` *b*\ ``)`` + This operator subtracts *b* from *a* and produces the arithmetic difference. + +``!subst(``\ *target*\ ``,`` *repl*\ ``,`` *value*\ ``)`` + This operator replaces all occurrences of the *target* in the *value* with + the *repl* and produces the resulting value. The *value* can + be a string, in which case substring substitution is performed. + + The *value* can be a record name, in which case the operator produces the *repl* + record if the *target* record name equals the *value* record name; otherwise it + produces the *value*. + +``!substr(``\ *string*\ ``,`` *start*\ [``,`` *length*]\ ``)`` + This operator extracts a substring of the given *string*. The starting + position of the substring is specified by *start*, which can range + between 0 and the length of the string. The length of the substring + is specified by *length*; if not specified, the rest of the string is + extracted. The *start* and *length* arguments must be integers. + +``!tail(``\ *a*\ ``)`` + This operator produces a new list with all the elements + of the list *a* except for the zeroth one. (See also ``!head``.) + +``!xor(``\ *a*\ ``,`` *b*\ ``, ...)`` + This operator does a bitwise EXCLUSIVE OR on *a*, *b*, etc., and produces + the result. A logical XOR can be performed if all the arguments are either + 0 or 1. + +Appendix B: Paste Operator Examples +=================================== + +Here is an example illustrating the use of the paste operator in record names. + +.. 
code-block:: text + + defvar suffix = "_suffstring"; + defvar some_ints = [0, 1, 2, 3]; + + def name # suffix { + } + + foreach i = [1, 2] in { + def rec # i { + } + } + +The first ``def`` does not use the value of the ``suffix`` variable. The +second ``def`` does use the value of the ``i`` iterator variable, because it is not a +global name. The following records are produced. + +.. code-block:: text + + def namesuffix { + } + def rec1 { + } + def rec2 { + } + +Here is a second example illustrating the paste operator in field value expressions. + +.. code-block:: text + + def test { + string strings = suffix # suffix; + list<int> integers = some_ints # [4, 5, 6]; + } + +The ``strings`` field expression uses ``suffix`` on both sides of the paste +operator. It is evaluated normally on the left hand side, but taken verbatim +on the right hand side. The ``integers`` field expression uses the value of +the ``some_ints`` variable and a literal list. The following record is +produced. + +.. code-block:: text + + def test { + string strings = "_suffstringsuffix"; + list<int> integers = [0, 1, 2, 3, 4, 5, 6]; + } + + +Appendix C: Sample Record +========================= + +One target machine supported by LLVM is the Intel x86. The following output +from TableGen shows the record that is created to represent the 32-bit +register-to-register ADD instruction. + +.. 
code-block:: text + + def ADD32rr { // InstructionEncoding Instruction X86Inst I ITy Sched BinOpRR BinOpRR_RF + int Size = 0; + string DecoderNamespace = ""; + list<Predicate> Predicates = []; + string DecoderMethod = ""; + bit hasCompleteDecoder = 1; + string Namespace = "X86"; + dag OutOperandList = (outs GR32:$dst); + dag InOperandList = (ins GR32:$src1, GR32:$src2); + string AsmString = "add{l} {$src2, $src1|$src1, $src2}"; + EncodingByHwMode EncodingInfos = ?; + list<dag> Pattern = [(set GR32:$dst, EFLAGS, (X86add_flag GR32:$src1, GR32:$src2))]; + list<Register> Uses = []; + list<Register> Defs = [EFLAGS]; + int CodeSize = 3; + int AddedComplexity = 0; + bit isPreISelOpcode = 0; + bit isReturn = 0; + bit isBranch = 0; + bit isEHScopeReturn = 0; + bit isIndirectBranch = 0; + bit isCompare = 0; + bit isMoveImm = 0; + bit isMoveReg = 0; + bit isBitcast = 0; + bit isSelect = 0; + bit isBarrier = 0; + bit isCall = 0; + bit isAdd = 0; + bit isTrap = 0; + bit canFoldAsLoad = 0; + bit mayLoad = ?; + bit mayStore = ?; + bit mayRaiseFPException = 0; + bit isConvertibleToThreeAddress = 1; + bit isCommutable = 1; + bit isTerminator = 0; + bit isReMaterializable = 0; + bit isPredicable = 0; + bit isUnpredicable = 0; + bit hasDelaySlot = 0; + bit usesCustomInserter = 0; + bit hasPostISelHook = 0; + bit hasCtrlDep = 0; + bit isNotDuplicable = 0; + bit isConvergent = 0; + bit isAuthenticated = 0; + bit isAsCheapAsAMove = 0; + bit hasExtraSrcRegAllocReq = 0; + bit hasExtraDefRegAllocReq = 0; + bit isRegSequence = 0; + bit isPseudo = 0; + bit isExtractSubreg = 0; + bit isInsertSubreg = 0; + bit variadicOpsAreDefs = 0; + bit hasSideEffects = ?; + bit isCodeGenOnly = 0; + bit isAsmParserOnly = 0; + bit hasNoSchedulingInfo = 0; + InstrItinClass Itinerary = NoItinerary; + list<SchedReadWrite> SchedRW = [WriteALU]; + string Constraints = "$src1 = $dst"; + string DisableEncoding = ""; + string PostEncoderMethod = ""; + bits<64> TSFlags = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0 }; + string AsmMatchConverter = ""; + string TwoOperandAliasConstraint = ""; + string AsmVariantName = ""; + bit UseNamedOperandTable = 0; + bit FastISelShouldIgnore = 0; + bits<8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 }; + Format Form = MRMDestReg; + bits<7> FormBits = { 0, 1, 0, 1, 0, 0, 0 }; + ImmType ImmT = NoImm; + bit ForceDisassemble = 0; + OperandSize OpSize = OpSize32; + bits<2> OpSizeBits = { 1, 0 }; + AddressSize AdSize = AdSizeX; + bits<2> AdSizeBits = { 0, 0 }; + Prefix OpPrefix = NoPrfx; + bits<3> OpPrefixBits = { 0, 0, 0 }; + Map OpMap = OB; + bits<3> OpMapBits = { 0, 0, 0 }; + bit hasREX_WPrefix = 0; + FPFormat FPForm = NotFP; + bit hasLockPrefix = 0; + Domain ExeDomain = GenericDomain; + bit hasREPPrefix = 0; + Encoding OpEnc = EncNormal; + bits<2> OpEncBits = { 0, 0 }; + bit HasVEX_W = 0; + bit IgnoresVEX_W = 0; + bit EVEX_W1_VEX_W0 = 0; + bit hasVEX_4V = 0; + bit hasVEX_L = 0; + bit ignoresVEX_L = 0; + bit hasEVEX_K = 0; + bit hasEVEX_Z = 0; + bit hasEVEX_L2 = 0; + bit hasEVEX_B = 0; + bits<3> CD8_Form = { 0, 0, 0 }; + int CD8_EltSize = 0; + bit hasEVEX_RC = 0; + bit hasNoTrackPrefix = 0; + bits<7> VectSize = { 0, 0, 1, 0, 0, 0, 0 }; + bits<7> CD8_Scale = { 0, 0, 0, 0, 0, 0, 0 }; + string FoldGenRegForm = ?; + string EVEX2VEXOverride = ?; + bit isMemoryFoldable = 1; + bit notEVEX2VEXConvertible = 0; + } + +On the first line of the record, you can see that the ``ADD32rr`` record +inherited from eight classes. Although the inheritance hierarchy is complex, +using parent classes is much simpler than specifying the 109 individual +fields for each instruction. + +Here is the code fragment used to define ``ADD32rr`` and multiple other +``ADD`` instructions: + +.. 
code-block:: text + + defm ADD : ArithBinOp_RF<0x00, 0x02, 0x04, "add", MRM0r, MRM0m, + X86add_flag, add, 1, 1, 1>; + +The ``defm`` statement tells TableGen that ``ArithBinOp_RF`` is a +multiclass, which contains multiple concrete record definitions that inherit +from ``BinOpRR_RF``. That class, in turn, inherits from ``BinOpRR``, which +inherits from ``ITy`` and ``Sched``, and so forth. The fields are inherited +from all the parent classes; for example, ``isIndirectBranch`` is inherited +from the ``Instruction`` class.
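
The inheritance chain described above can be sketched with a much smaller,
hypothetical set of classes. None of the names below (``SketchInst``,
``SketchBinOp``, ``SketchArith``) come from the X86 target, and the opcode
values are arbitrary; they are assumptions chosen only to illustrate how
``defm`` instantiates each ``def`` inside a multiclass and how fields reach
the final records through parent classes.

.. code-block:: text

  // Hypothetical base class; stands in for Instruction.
  class SketchInst<bits<8> opcode, string mnemonic> {
    bits<8> Opcode = opcode;
    string AsmString = mnemonic;
    bit isIndirectBranch = 0;  // Inherited by every record defined below.
  }

  // Hypothetical intermediate class; stands in for BinOpRR.
  class SketchBinOp<bits<8> opcode, string mnemonic>
      : SketchInst<opcode, mnemonic> {
    int NumSources = 2;
  }

  // Each def in the multiclass yields one record per defm that uses it.
  multiclass SketchArith<bits<8> rrOpcode, bits<8> riOpcode, string mnemonic> {
    def _rr : SketchBinOp<rrOpcode, mnemonic>;
    def _ri : SketchBinOp<riOpcode, mnemonic>;
  }

  // The defm name is prepended to each def name: ADD_rr and ADD_ri.
  defm ADD : SketchArith<0x01, 0x81, "add">;

Neither ``def`` in the sketch mentions ``isIndirectBranch``, yet both
resulting records should carry that field, because it is inherited from
``SketchInst`` through ``SketchBinOp``. The real ``ADD32rr`` record acquires
its many fields in the same way, only through a deeper hierarchy.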