comparison docs/LangRef.rst @ 148:63bd29f05246
merged
author | Shinji KONO <kono@ie.u-ryukyu.ac.jp> |
---|---|
date | Wed, 14 Aug 2019 19:46:37 +0900 |
parents | c2174574ed3a |
children |
comparison
146:3fc4d5c3e21e | 148:63bd29f05246 |
---|---|
78 '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other | 78 '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other |
79 characters in their names can be surrounded with quotes. Special | 79 characters in their names can be surrounded with quotes. Special |
80 characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII | 80 characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII |
81 code for the character in hexadecimal. In this way, any character can | 81 code for the character in hexadecimal. In this way, any character can |
82 be used in a name value, even quotes themselves. The ``"\01"`` prefix | 82 be used in a name value, even quotes themselves. The ``"\01"`` prefix |
83 can be used on global variables to suppress mangling. | 83 can be used on global values to suppress mangling. |
84 #. Unnamed values are represented as an unsigned numeric value with | 84 #. Unnamed values are represented as an unsigned numeric value with |
85 their prefix. For example, ``%12``, ``@2``, ``%44``. | 85 their prefix. For example, ``%12``, ``@2``, ``%44``. |
86 #. Constants, which are described in the section Constants_ below. | 86 #. Constants, which are described in the section Constants_ below. |
87 | 87 |
88 LLVM requires that values start with a prefix for two reasons: Compilers | 88 LLVM requires that values start with a prefix for two reasons: Compilers |
322 used when implementing functional programming languages. At the | 322 used when implementing functional programming languages. At the |
323 moment only X86 supports this convention and it has the following | 323 moment only X86 supports this convention and it has the following |
324 limitations: | 324 limitations: |
325 | 325 |
326 - On *X86-32* only supports up to 4 bit type parameters. No | 326 - On *X86-32* only supports up to 4 bit type parameters. No |
327 floating point types are supported. | 327 floating-point types are supported. |
328 - On *X86-64* only supports up to 10 bit type parameters and 6 | 328 - On *X86-64* only supports up to 10 bit type parameters and 6 |
329 floating point parameters. | 329 floating-point parameters. |
330 | 330 |
331 This calling convention supports `tail call | 331 This calling convention supports `tail call |
332 optimization <CodeGenerator.html#id80>`_ but requires both the | 332 optimization <CodeGenerator.html#id80>`_ but requires both the |
333 caller and callee are using it. | 333 caller and callee are using it. |
334 "``cc 11``" - The HiPE calling convention | 334 "``cc 11``" - The HiPE calling convention |
419 a few TLS IR variables, each access will be lowered to a platform-specific | 419 a few TLS IR variables, each access will be lowered to a platform-specific |
420 sequence. | 420 sequence. |
421 | 421 |
422 This calling convention aims to minimize overhead in the caller by | 422 This calling convention aims to minimize overhead in the caller by |
423 preserving as many registers as possible (all the registers that are | 423 preserving as many registers as possible (all the registers that are |
424 perserved on the fast path, composed of the entry and exit blocks). | 424 preserved on the fast path, composed of the entry and exit blocks). |
425 | 425 |
426 This calling convention behaves identical to the `C` calling convention on | 426 This calling convention behaves identical to the `C` calling convention on |
427 how arguments and return values are passed, but it uses a different set of | 427 how arguments and return values are passed, but it uses a different set of |
428 caller/callee-saved registers. | 428 caller/callee-saved registers. |
429 | 429 |
673 an optional list of attached :ref:`metadata <metadata>`. | 673 an optional list of attached :ref:`metadata <metadata>`. |
674 | 674 |
675 Variables and aliases can have a | 675 Variables and aliases can have a |
676 :ref:`Thread Local Storage Model <tls_model>`. | 676 :ref:`Thread Local Storage Model <tls_model>`. |
677 | 677 |
678 :ref:`Scalable vectors <t_vector>` cannot be global variables or members of | |
679 structs or arrays because their size is unknown at compile time. | |
680 | |
678 Syntax:: | 681 Syntax:: |
679 | 682 |
680 @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility] | 683 @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility] |
681 [DLLStorageClass] [ThreadLocal] | 684 [DLLStorageClass] [ThreadLocal] |
682 [(unnamed_addr|local_unnamed_addr)] [AddrSpace] | 685 [(unnamed_addr|local_unnamed_addr)] [AddrSpace] |
717 an optional :ref:`calling convention <callingconv>`, | 720 an optional :ref:`calling convention <callingconv>`, |
718 an optional ``unnamed_addr`` attribute, a return type, an optional | 721 an optional ``unnamed_addr`` attribute, a return type, an optional |
719 :ref:`parameter attribute <paramattrs>` for the return type, a function | 722 :ref:`parameter attribute <paramattrs>` for the return type, a function |
720 name, a (possibly empty) argument list (each with optional :ref:`parameter | 723 name, a (possibly empty) argument list (each with optional :ref:`parameter |
721 attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`, | 724 attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`, |
722 an optional section, an optional alignment, | 725 an optional address space, an optional section, an optional alignment, |
723 an optional :ref:`comdat <langref_comdats>`, | 726 an optional :ref:`comdat <langref_comdats>`, |
724 an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`, | 727 an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`, |
725 an optional :ref:`prologue <prologuedata>`, | 728 an optional :ref:`prologue <prologuedata>`, |
726 an optional :ref:`personality <personalityfn>`, | 729 an optional :ref:`personality <personalityfn>`, |
727 an optional list of attached :ref:`metadata <metadata>`, | 730 an optional list of attached :ref:`metadata <metadata>`, |
729 | 732 |
730 LLVM function declarations consist of the "``declare``" keyword, an | 733 LLVM function declarations consist of the "``declare``" keyword, an |
731 optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style | 734 optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style |
732 <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an | 735 <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an |
733 optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr`` | 736 optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr`` |
734 or ``local_unnamed_addr`` attribute, a return type, an optional :ref:`parameter | 737 or ``local_unnamed_addr`` attribute, an optional address space, a return type, |
735 attribute <paramattrs>` for the return type, a function name, a possibly | 738 an optional :ref:`parameter attribute <paramattrs>` for the return type, a function name, a possibly |
736 empty list of arguments, an optional alignment, an optional :ref:`garbage | 739 empty list of arguments, an optional alignment, an optional :ref:`garbage |
737 collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional | 740 collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional |
738 :ref:`prologue <prologuedata>`. | 741 :ref:`prologue <prologuedata>`. |
739 | 742 |
740 A function definition contains a list of basic blocks, forming the CFG (Control | 743 A function definition contains a list of basic blocks, forming the CFG (Control |
741 Flow Graph) for the function. Each basic block may optionally start with a label | 744 Flow Graph) for the function. Each basic block may optionally start with a label |
742 (giving the basic block a symbol table entry), contains a list of instructions, | 745 (giving the basic block a symbol table entry), contains a list of instructions, |
743 and ends with a :ref:`terminator <terminators>` instruction (such as a branch or | 746 and ends with a :ref:`terminator <terminators>` instruction (such as a branch or |
744 function return). If an explicit label is not provided, a block is assigned an | 747 function return). If an explicit label name is not provided, a block is assigned |
745 implicit numbered label, using the next value from the same counter as used for | 748 an implicit numbered label, using the next value from the same counter as used |
746 unnamed temporaries (:ref:`see above<identifiers>`). For example, if a function | 749 for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a |
747 entry block does not have an explicit label, it will be assigned label "%0", | 750 function entry block does not have an explicit label, it will be assigned label |
748 then the first unnamed temporary in that block will be "%1", etc. | 751 "%0", then the first unnamed temporary in that block will be "%1", etc. If a |
752 numeric label is explicitly specified, it must match the numeric label that | |
753 would be used implicitly. | |
749 | 754 |
750 The first basic block in a function is special in two ways: it is | 755 The first basic block in a function is special in two ways: it is |
751 immediately executed on entrance to the function, and it is not allowed | 756 immediately executed on entrance to the function, and it is not allowed |
752 to have predecessor basic blocks (i.e. there can not be any branches to | 757 to have predecessor basic blocks (i.e. there can not be any branches to |
753 the entry block of a function). Because the block can have no | 758 the entry block of a function). Because the block can have no |
767 be significant and two identical functions can be merged. | 772 be significant and two identical functions can be merged. |
768 | 773 |
769 If the ``local_unnamed_addr`` attribute is given, the address is known to | 774 If the ``local_unnamed_addr`` attribute is given, the address is known to |
770 not be significant within the module. | 775 not be significant within the module. |
771 | 776 |
777 If an explicit address space is not given, it will default to the program | |
778 address space from the :ref:`datalayout string<langref_datalayout>`. | |
779 | |
772 Syntax:: | 780 Syntax:: |
773 | 781 |
774 define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass] | 782 define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass] |
775 [cconv] [ret attrs] | 783 [cconv] [ret attrs] |
776 <ResultType> @<FunctionName> ([argument list]) | 784 <ResultType> @<FunctionName> ([argument list]) |
777 [(unnamed_addr|local_unnamed_addr)] [fn Attrs] [section "name"] | 785 [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs] |
778 [comdat [($name)]] [align N] [gc] [prefix Constant] | 786 [section "name"] [comdat [($name)]] [align N] [gc] [prefix Constant] |
779 [prologue Constant] [personality Constant] (!name !N)* { ... } | 787 [prologue Constant] [personality Constant] (!name !N)* { ... } |
780 | 788 |
781 The argument list is a comma separated sequence of arguments where each | 789 The argument list is a comma separated sequence of arguments where each |
782 argument is of the following form: | 790 argument is of the following form: |
783 | 791 |
1007 in a special target-dependent fashion while emitting code for | 1015 in a special target-dependent fashion while emitting code for |
1008 a function call or return (usually, by putting it in a register as | 1016 a function call or return (usually, by putting it in a register as |
1009 opposed to memory, though some targets use it to distinguish between | 1017 opposed to memory, though some targets use it to distinguish between |
1010 two different kinds of registers). Use of this attribute is | 1018 two different kinds of registers). Use of this attribute is |
1011 target-specific. | 1019 target-specific. |
1012 ``byval`` | 1020 ``byval`` or ``byval(<ty>)`` |
1013 This indicates that the pointer parameter should really be passed by | 1021 This indicates that the pointer parameter should really be passed by |
1014 value to the function. The attribute implies that a hidden copy of | 1022 value to the function. The attribute implies that a hidden copy of |
1015 the pointee is made between the caller and the callee, so the callee | 1023 the pointee is made between the caller and the callee, so the callee |
1016 is unable to modify the value in the caller. This attribute is only | 1024 is unable to modify the value in the caller. This attribute is only |
1017 valid on LLVM pointer arguments. It is generally used to pass | 1025 valid on LLVM pointer arguments. It is generally used to pass |
1019 scalars. The copy is considered to belong to the caller not the | 1027 scalars. The copy is considered to belong to the caller not the |
1020 callee (for example, ``readonly`` functions should not write to | 1028 callee (for example, ``readonly`` functions should not write to |
1021 ``byval`` parameters). This is not a valid attribute for return | 1029 ``byval`` parameters). This is not a valid attribute for return |
1022 values. | 1030 values. |
1023 | 1031 |
1032 The byval attribute also supports an optional type argument, which must be | |
1033 the same as the pointee type of the argument. | |
1034 | |
1024 The byval attribute also supports specifying an alignment with the | 1035 The byval attribute also supports specifying an alignment with the |
1025 align attribute. It indicates the alignment of the stack slot to | 1036 align attribute. It indicates the alignment of the stack slot to |
1026 form and the known alignment of the pointer specified to the call | 1037 form and the known alignment of the pointer specified to the call |
1027 site. If the alignment is not specified, then the code generator | 1038 site. If the alignment is not specified, then the code generator |
1028 makes a target-specific assumption. | 1039 makes a target-specific assumption. |
1046 large aggregate return values, which means that frontend authors | 1057 large aggregate return values, which means that frontend authors |
1047 must lower them with ``sret`` pointers. | 1058 must lower them with ``sret`` pointers. |
1048 | 1059 |
1049 When the call site is reached, the argument allocation must have | 1060 When the call site is reached, the argument allocation must have |
1050 been the most recent stack allocation that is still live, or the | 1061 been the most recent stack allocation that is still live, or the |
1051 results are undefined. It is possible to allocate additional stack | 1062 behavior is undefined. It is possible to allocate additional stack |
1052 space after an argument allocation and before its call site, but it | 1063 space after an argument allocation and before its call site, but it |
1053 must be cleared off with :ref:`llvm.stackrestore | 1064 must be cleared off with :ref:`llvm.stackrestore |
1054 <int_stackrestore>`. | 1065 <int_stackrestore>`. |
1055 | 1066 |
1056 See :doc:`InAlloca` for more information on how to use this | 1067 See :doc:`InAlloca` for more information on how to use this |
1066 | 1077 |
1067 .. _attr_align: | 1078 .. _attr_align: |
1068 | 1079 |
1069 ``align <n>`` | 1080 ``align <n>`` |
1070 This indicates that the pointer value may be assumed by the optimizer to | 1081 This indicates that the pointer value may be assumed by the optimizer to |
1071 have the specified alignment. | 1082 have the specified alignment. If the pointer value does not have the |
1083 specified alignment, behavior is undefined. | |
1072 | 1084 |
1073 Note that this attribute has additional semantics when combined with the | 1085 Note that this attribute has additional semantics when combined with the |
1074 ``byval`` attribute. | 1086 ``byval`` attribute, which are documented there. |
1075 | 1087 |
1076 .. _noalias: | 1088 .. _noalias: |
1077 | 1089 |
1078 ``noalias`` | 1090 ``noalias`` |
1079 This indicates that objects accessed via pointer values | 1091 This indicates that objects accessed via pointer values |
1120 return values and can only be applied to one parameter. | 1132 return values and can only be applied to one parameter. |
1121 | 1133 |
1122 ``nonnull`` | 1134 ``nonnull`` |
1123 This indicates that the parameter or return pointer is not null. This | 1135 This indicates that the parameter or return pointer is not null. This |
1124 attribute may only be applied to pointer typed parameters. This is not | 1136 attribute may only be applied to pointer typed parameters. This is not |
1125 checked or enforced by LLVM, the caller must ensure that the pointer | 1137 checked or enforced by LLVM; if the parameter or return pointer is null, |
1126 passed in is non-null, or the callee must ensure that the returned pointer | 1138 the behavior is undefined. |
1127 is non-null. | |
1128 | 1139 |
1129 ``dereferenceable(<n>)`` | 1140 ``dereferenceable(<n>)`` |
1130 This indicates that the parameter or return pointer is dereferenceable. This | 1141 This indicates that the parameter or return pointer is dereferenceable. This |
1131 attribute may only be applied to pointer typed parameters. A pointer that | 1142 attribute may only be applied to pointer typed parameters. A pointer that |
1132 is dereferenceable can be loaded from speculatively without a risk of | 1143 is dereferenceable can be loaded from speculatively without a risk of |
1172 on a parameter is not ABI-compatible with one which does not. | 1183 on a parameter is not ABI-compatible with one which does not. |
1173 | 1184 |
1174 These constraints also allow LLVM to assume that a ``swifterror`` argument | 1185 These constraints also allow LLVM to assume that a ``swifterror`` argument |
1175 does not alias any other memory visible within a function and that a | 1186 does not alias any other memory visible within a function and that a |
1176 ``swifterror`` alloca passed as an argument does not escape. | 1187 ``swifterror`` alloca passed as an argument does not escape. |
1188 | |
1189 ``immarg`` | |
1190 This indicates the parameter is required to be an immediate | |
1191 value. This must be a trivial immediate integer or floating-point | |
1192 constant. Undef or constant expressions are not valid. This is | |
1193 only valid on intrinsic declarations and cannot be applied to a | |
1194 call site or arbitrary function. | |
1177 | 1195 |
1178 .. _gc: | 1196 .. _gc: |
1179 | 1197 |
1180 Garbage Collector Strategy Names | 1198 Garbage Collector Strategy Names |
1181 -------------------------------- | 1199 -------------------------------- |
1385 Similarly, the optimizer may remove ``convergent`` on calls/invokes when it | 1403 Similarly, the optimizer may remove ``convergent`` on calls/invokes when it |
1386 can prove that the call/invoke cannot call a convergent function. | 1404 can prove that the call/invoke cannot call a convergent function. |
1387 ``inaccessiblememonly`` | 1405 ``inaccessiblememonly`` |
1388 This attribute indicates that the function may only access memory that | 1406 This attribute indicates that the function may only access memory that |
1389 is not accessible by the module being compiled. This is a weaker form | 1407 is not accessible by the module being compiled. This is a weaker form |
1390 of ``readnone``. | 1408 of ``readnone``. If the function reads or writes other memory, the |
1409 behavior is undefined. | |
1391 ``inaccessiblemem_or_argmemonly`` | 1410 ``inaccessiblemem_or_argmemonly`` |
1392 This attribute indicates that the function may only access memory that is | 1411 This attribute indicates that the function may only access memory that is |
1393 either not accessible by the module being compiled, or is pointed to | 1412 either not accessible by the module being compiled, or is pointed to |
1394 by its pointer arguments. This is a weaker form of ``argmemonly`` | 1413 by its pointer arguments. This is a weaker form of ``argmemonly``. If the |
1414 function reads or writes other memory, the behavior is undefined. | |
1395 ``inlinehint`` | 1415 ``inlinehint`` |
1396 This attribute indicates that the source code contained a hint that | 1416 This attribute indicates that the source code contained a hint that |
1397 inlining this function is desirable (such as the "inline" keyword in | 1417 inlining this function is desirable (such as the "inline" keyword in |
1398 C/C++). It is just a hint; it imposes no requirements on the | 1418 C/C++). It is just a hint; it imposes no requirements on the |
1399 inliner. | 1419 inliner. |
1431 A function containing a ``noduplicate`` call may still | 1451 A function containing a ``noduplicate`` call may still |
1432 be an inlining candidate, provided that the call is not | 1452 be an inlining candidate, provided that the call is not |
1433 duplicated by inlining. That implies that the function has | 1453 duplicated by inlining. That implies that the function has |
1434 internal linkage and only has one call site, so the original | 1454 internal linkage and only has one call site, so the original |
1435 call is dead after inlining. | 1455 call is dead after inlining. |
1456 ``nofree`` | |
1457 This function attribute indicates that the function does not, directly or | |
1458 indirectly, call a memory-deallocation function (free, for example). As a | |
1459 result, uncaptured pointers that are known to be dereferenceable prior to a | |
1460 call to a function with the ``nofree`` attribute are still known to be | |
1461 dereferenceable after the call (the capturing condition is necessary in | |
1462 environments where the function might communicate the pointer to another thread | |
1463 which then deallocates the memory). | |
1436 ``noimplicitfloat`` | 1464 ``noimplicitfloat`` |
1437 This attributes disables implicit floating point instructions. | 1465 This attributes disables implicit floating-point instructions. |
1438 ``noinline`` | 1466 ``noinline`` |
1439 This attribute indicates that the inliner should never inline this | 1467 This attribute indicates that the inliner should never inline this |
1440 function in any situation. This attribute may not be used together | 1468 function in any situation. This attribute may not be used together |
1441 with the ``alwaysinline`` attribute. | 1469 with the ``alwaysinline`` attribute. |
1442 ``nonlazybind`` | 1470 ``nonlazybind`` |
1444 may make calls to the function faster, at the cost of extra program | 1472 may make calls to the function faster, at the cost of extra program |
1445 startup time if the function is not called during program startup. | 1473 startup time if the function is not called during program startup. |
1446 ``noredzone`` | 1474 ``noredzone`` |
1447 This attribute indicates that the code generator should not use a | 1475 This attribute indicates that the code generator should not use a |
1448 red zone, even if the target-specific ABI normally permits it. | 1476 red zone, even if the target-specific ABI normally permits it. |
1477 ``indirect-tls-seg-refs`` | |
1478 This attribute indicates that the code generator should not use | |
1479 direct TLS access through segment registers, even if the | |
1480 target-specific ABI normally permits it. | |
1449 ``noreturn`` | 1481 ``noreturn`` |
1450 This function attribute indicates that the function never returns | 1482 This function attribute indicates that the function never returns |
1451 normally. This produces undefined behavior at runtime if the | 1483 normally, hence through a return instruction. This produces undefined |
1452 function ever does dynamically return. | 1484 behavior at runtime if the function ever does dynamically return. Annotated |
1485 functions may still raise an exception, i.a., ``nounwind`` is not implied. | |
1453 ``norecurse`` | 1486 ``norecurse`` |
1454 This function attribute indicates that the function does not call itself | 1487 This function attribute indicates that the function does not call itself |
1455 either directly or indirectly down any possible call path. This produces | 1488 either directly or indirectly down any possible call path. This produces |
1456 undefined behavior at runtime if the function ever does recurse. | 1489 undefined behavior at runtime if the function ever does recurse. |
1490 ``willreturn`` | |
1491 This function attribute indicates that a call of this function will | |
1492 either exhibit undefined behavior or comes back and continues execution | |
1493 at a point in the existing call stack that includes the current invocation. | |
1494 Annotated functions may still raise an exception, i.a., ``nounwind`` is not implied. | |
1495 If an invocation of an annotated function does not return control back | |
1496 to a point in the call stack, the behavior is undefined. | |
1497 ``nosync`` | |
1498 This function attribute indicates that the function does not communicate | |
1499 (synchronize) with another thread through memory or other well-defined means. | |
1500 Synchronization is considered possible in the presence of `atomic` accesses | |
1501 that enforce an order, thus not "unordered" and "monotonic", `volatile` accesses, | |
1502 as well as `convergent` function calls. Note that through `convergent` function calls | |
1503 non-memory communication, e.g., cross-lane operations, are possible and are also | |
1504 considered synchronization. However `convergent` does not contradict `nosync`. | |
1505 If an annotated function does ever synchronize with another thread, | |
1506 the behavior is undefined. | |
1457 ``nounwind`` | 1507 ``nounwind`` |
1458 This function attribute indicates that the function never raises an | 1508 This function attribute indicates that the function never raises an |
1459 exception. If the function does raise an exception, its runtime | 1509 exception. If the function does raise an exception, its runtime |
1460 behavior is undefined. However, functions marked nounwind may still | 1510 behavior is undefined. However, functions marked nounwind may still |
1461 trap or generate asynchronous exceptions. Exception handling schemes | 1511 trap or generate asynchronous exceptions. Exception handling schemes |
1462 that are recognized by LLVM to handle asynchronous exceptions, such | 1512 that are recognized by LLVM to handle asynchronous exceptions, such |
1463 as SEH, will still provide their implementation defined semantics. | 1513 as SEH, will still provide their implementation defined semantics. |
1514 ``"null-pointer-is-valid"`` | |
1515 If ``"null-pointer-is-valid"`` is set to ``"true"``, then ``null`` address | |
1516 in address-space 0 is considered to be a valid address for memory loads and | |
1517 stores. Any analysis or optimization should not treat dereferencing a | |
1518 pointer to ``null`` as undefined behavior in this function. | |
1519 Note: Comparing address of a global variable to ``null`` may still | |
1520 evaluate to false because of a limitation in querying this attribute inside | |
1521 constant expressions. | |
1522 ``optforfuzzing`` | |
1523 This attribute indicates that this function should be optimized | |
1524 for maximum fuzzing signal. | |
1464 ``optnone`` | 1525 ``optnone`` |
1465 This function attribute indicates that most optimization passes will skip | 1526 This function attribute indicates that most optimization passes will skip |
1466 this function, with the exception of interprocedural optimization passes. | 1527 this function, with the exception of interprocedural optimization passes. |
1467 Code generation defaults to the "fast" instruction selector. | 1528 Code generation defaults to the "fast" instruction selector. |
1468 This attribute cannot be used together with the ``alwaysinline`` | 1529 This attribute cannot be used together with the ``alwaysinline`` |
1529 visible memory. | 1590 visible memory. |
1530 | 1591 |
1531 On an argument, this attribute indicates that the function does not | 1592 On an argument, this attribute indicates that the function does not |
1532 dereference that pointer argument, even though it may read or write the | 1593 dereference that pointer argument, even though it may read or write the |
1533 memory that the pointer points to if accessed through other pointers. | 1594 memory that the pointer points to if accessed through other pointers. |
1595 | |
1596 If a readnone function reads or writes memory visible to the program, or | |
1597 has other side-effects, the behavior is undefined. If a function reads from | |
1598 or writes to a readnone pointer argument, the behavior is undefined. | |
1534 ``readonly`` | 1599 ``readonly`` |
1535 On a function, this attribute indicates that the function does not write | 1600 On a function, this attribute indicates that the function does not write |
1536 through any pointer arguments (including ``byval`` arguments) or otherwise | 1601 through any pointer arguments (including ``byval`` arguments) or otherwise |
1537 modify any state (e.g. memory, control registers, etc) visible to | 1602 modify any state (e.g. memory, control registers, etc) visible to |
1538 caller functions. It may dereference pointer arguments and read | 1603 caller functions. It may dereference pointer arguments and read |
1544 exceptions without writing to LLVM visible memory. | 1609 exceptions without writing to LLVM visible memory. |
1545 | 1610 |
1546 On an argument, this attribute indicates that the function does not write | 1611 On an argument, this attribute indicates that the function does not write |
1547 through this pointer argument, even though it may write to the memory that | 1612 through this pointer argument, even though it may write to the memory that |
1548 the pointer points to. | 1613 the pointer points to. |
1614 | |
1615 If a readonly function writes memory visible to the program, or | |
1616 has other side-effects, the behavior is undefined. If a function writes to | |
1617 a readonly pointer argument, the behavior is undefined. | |
1549 ``"stack-probe-size"`` | 1618 ``"stack-probe-size"`` |
1550 This attribute controls the behavior of stack probes: either | 1619 This attribute controls the behavior of stack probes: either |
1551 the ``"probe-stack"`` attribute, or ABI-required stack probes, if any. | 1620 the ``"probe-stack"`` attribute, or ABI-required stack probes, if any. |
1552 It defines the size of the guard region. It ensures that if the function | 1621 It defines the size of the guard region. It ensures that if the function |
1553 may use more stack space than the size of the guard region, stack probing | 1622 may use more stack space than the size of the guard region, stack probing |
1559 function has the ``"stack-probe-size"`` attribute that has the lower | 1628 function has the ``"stack-probe-size"`` attribute that has the lower |
1560 numeric value. If a function that has a ``"stack-probe-size"`` attribute is | 1629 numeric value. If a function that has a ``"stack-probe-size"`` attribute is |
1561 inlined into a function that has no ``"stack-probe-size"`` attribute | 1630 inlined into a function that has no ``"stack-probe-size"`` attribute |
1562 at all, the resulting function has the ``"stack-probe-size"`` attribute | 1631 at all, the resulting function has the ``"stack-probe-size"`` attribute |
1563 of the callee. | 1632 of the callee. |
1633 ``"no-stack-arg-probe"`` | |
1634 This attribute disables ABI-required stack probes, if any. | |
1564 ``writeonly`` | 1635 ``writeonly`` |
1565 On a function, this attribute indicates that the function may write to but | 1636 On a function, this attribute indicates that the function may write to but |
1566 does not read from memory. | 1637 does not read from memory. |
1567 | 1638 |
1568 On an argument, this attribute indicates that the function may write to but | 1639 On an argument, this attribute indicates that the function may write to but |
1569 does not read through this pointer argument (even though it may read from | 1640 does not read through this pointer argument (even though it may read from |
1570 the memory that the pointer points to). | 1641 the memory that the pointer points to). |
1642 | |
1643 If a writeonly function reads memory visible to the program, or | |
1644 has other side-effects, the behavior is undefined. If a function reads | |
1645 from a writeonly pointer argument, the behavior is undefined. | |
1571 ``argmemonly`` | 1646 ``argmemonly`` |
1572 This attribute indicates that the only memory accesses inside the function are | 1647 This attribute indicates that the only memory accesses inside the function are |
1573 loads and stores from objects pointed to by its pointer-typed arguments, | 1648 loads and stores from objects pointed to by its pointer-typed arguments, |
1574 with arbitrary offsets. Or in other words, all memory operations in the | 1649 with arbitrary offsets. Or in other words, all memory operations in the |
1575 function can refer to memory only using pointers based on its function | 1650 function can refer to memory only using pointers based on its function |
1576 arguments. | 1651 arguments. |
1652 | |
1577 Note that ``argmemonly`` can be used together with the ``readonly`` attribute | 1653 Note that ``argmemonly`` can be used together with the ``readonly`` attribute |
1578 in order to specify that the function reads only from its arguments. | 1654 in order to specify that the function reads only from its arguments. |
1655 | |
1656 If an argmemonly function reads or writes memory other than the memory | |
1657 pointed to by its pointer arguments, or has other side-effects, the behavior is undefined. | |
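As a hedged illustration of the three memory attributes described above, the declarations below combine them. The function names ``@sum_bytes`` and ``@fill_bytes`` are hypothetical and exist only for this sketch:

.. code-block:: llvm

    ; Hypothetical: only reads memory, and only through its pointer argument.
    declare i32 @sum_bytes(i8* readonly, i64) argmemonly readonly

    ; Hypothetical: only writes memory, and only through its pointer argument.
    declare void @fill_bytes(i8* writeonly, i8, i64) argmemonly writeonly

Under the rules above, a definition of ``@sum_bytes`` that stored to a global variable would have undefined behavior, because the store violates both ``readonly`` and ``argmemonly``.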
1579 ``returns_twice`` | 1658 ``returns_twice`` |
1580 This attribute indicates that this function can return twice. The C | 1659 This attribute indicates that this function can return twice. The C |
1581 ``setjmp`` is an example of such a function. The compiler disables | 1660 ``setjmp`` is an example of such a function. The compiler disables |
1582 some optimizations (like tail calls) in the caller of these | 1661 some optimizations (like tail calls) in the caller of these |
1583 functions. | 1662 functions. |
1601 (dynamic thread safety analysis) are enabled for this function. | 1680 (dynamic thread safety analysis) are enabled for this function. |
1602 ``sanitize_hwaddress`` | 1681 ``sanitize_hwaddress`` |
1603 This attribute indicates that HWAddressSanitizer checks | 1682 This attribute indicates that HWAddressSanitizer checks |
1604 (dynamic address safety analysis based on tagged pointers) are enabled for | 1683 (dynamic address safety analysis based on tagged pointers) are enabled for |
1605 this function. | 1684 this function. |
1685 ``sanitize_memtag`` | |
1686 This attribute indicates that MemTagSanitizer checks | |
1687 (dynamic address safety analysis based on Armv8 MTE) are enabled for | |
1688 this function. | |
1689 ``speculative_load_hardening`` | |
1690 This attribute indicates that | |
1691 `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_ | |
1692 should be enabled for the function body. | |
1693 | |
1694 Speculative Load Hardening is a best-effort mitigation against | |
1695 information leak attacks that make use of control flow | |
1696 misspeculation - specifically misspeculation of whether a branch | |
1697 is taken or not. Typically, vulnerabilities enabling such attacks are | |
1698 classified as "Spectre variant #1". Notably, this does not attempt to | |
1699 mitigate against misspeculation of the branch target, classified as | |
1700 "Spectre variant #2" vulnerabilities. | |
1701 | |
1702 When inlining, the attribute is sticky. Inlining a function that carries | |
1703 this attribute will cause the caller to gain the attribute. This is intended | |
1704 to provide a maximally conservative model where the code in a function | |
1705 annotated with this attribute will always (even after inlining) end up | |
1706 hardened. | |
1606 ``speculatable`` | 1707 ``speculatable`` |
1607 This function attribute indicates that the function does not have any | 1708 This function attribute indicates that the function does not have any |
1608 effects besides calculating its result and does not have undefined behavior. | 1709 effects besides calculating its result and does not have undefined behavior. |
1609 Note that ``speculatable`` is not enough to conclude that along any | 1710 Note that ``speculatable`` is not enough to conclude that along any |
1610 particular execution path the number of calls to this function will not be | 1711 particular execution path the number of calls to this function will not be |
1680 If a function that has an ``sspstrong`` attribute is inlined into a | 1781 If a function that has an ``sspstrong`` attribute is inlined into a |
1681 function that doesn't have an ``sspstrong`` attribute, then the | 1782 function that doesn't have an ``sspstrong`` attribute, then the |
1682 resulting function will have an ``sspstrong`` attribute. | 1783 resulting function will have an ``sspstrong`` attribute. |
1683 ``strictfp`` | 1784 ``strictfp`` |
1684 This attribute indicates that the function was called from a scope that | 1785 This attribute indicates that the function was called from a scope that |
1685 requires strict floating point semantics. LLVM will not attempt any | 1786 requires strict floating-point semantics. LLVM will not attempt any |
1686 optimizations that require assumptions about the floating point rounding | 1787 optimizations that require assumptions about the floating-point rounding |
1687 mode or that might alter the state of floating point status flags that | 1788 mode or that might alter the state of floating-point status flags that |
1688 might otherwise be set or cleared by calling this function. | 1789 might otherwise be set or cleared by calling this function. |
1689 ``"thunk"`` | 1790 ``"thunk"`` |
1690 This attribute indicates that the function will delegate to some other | 1791 This attribute indicates that the function will delegate to some other |
1691 function with a tail call. The prototype of a thunk should not be used for | 1792 function with a tail call. The prototype of a thunk should not be used for |
1692 optimization purposes. The caller is expected to cast the thunk prototype to | 1793 optimization purposes. The caller is expected to cast the thunk prototype to |
1695 This attribute indicates that the ABI being targeted requires that | 1796 This attribute indicates that the ABI being targeted requires that |
1696 an unwind table entry be produced for this function even if we can | 1797 an unwind table entry be produced for this function even if we can |
1697 show that no exception passes by it. This is normally the case for | 1798 show that no exception passes by it. This is normally the case for |
1698 the ELF x86-64 ABI, but it can be disabled for some compilation | 1799 the ELF x86-64 ABI, but it can be disabled for some compilation |
1699 units. | 1800 units. |
1801 ``nocf_check`` | |
1802 This attribute indicates that no control-flow check will be performed on | |
1803 the attributed entity. It disables -fcf-protection=<> for a specific | |
1804 entity to fine-tune the hardware control-flow protection mechanism. The flag | |
1805 is target independent and currently applies to a function or function | |
1806 pointer. | |
1807 ``shadowcallstack`` | |
1808 This attribute indicates that the ShadowCallStack checks are enabled for | |
1809 the function. The instrumentation checks that the return address for the | |
1810 function has not changed between the function prolog and epilog. It is | |
1811 currently x86_64-specific. | |
1700 | 1812 |
1701 .. _glattrs: | 1813 .. _glattrs: |
1702 | 1814 |
1703 Global Attributes | 1815 Global Attributes |
1704 ----------------- | 1816 ----------------- |
1903 promotion of stack variables is limited to the natural stack | 2015 promotion of stack variables is limited to the natural stack |
1904 alignment to avoid dynamic stack realignment. The stack alignment | 2016 alignment to avoid dynamic stack realignment. The stack alignment |
1905 must be a multiple of 8 bits. If omitted, the natural stack | 2017 must be a multiple of 8 bits. If omitted, the natural stack |
1906 alignment defaults to "unspecified", which does not prevent any | 2018 alignment defaults to "unspecified", which does not prevent any |
1907 alignment promotions. | 2019 alignment promotions. |
2020 ``P<address space>`` | |
2021 Specifies the address space that corresponds to program memory. | |
2022 Harvard architectures can use this to specify what space LLVM | |
2023 should place things such as functions into. If omitted, the | |
2024 program memory space defaults to the default address space of 0, | |
2025 which corresponds to a Von Neumann architecture that has code | |
2026 and data in the same space. | |
1908 ``A<address space>`` | 2027 ``A<address space>`` |
1909 Specifies the address space of objects created by '``alloca``'. | 2028 Specifies the address space of objects created by '``alloca``'. |
1910 Defaults to the default address space of 0. | 2029 Defaults to the default address space of 0. |
1911 ``p[n]:<size>:<abi>:<pref>:<idx>`` | 2030 ``p[n]:<size>:<abi>:<pref>:<idx>`` |
1912 This specifies the *size* of a pointer and its ``<abi>`` and | 2031 This specifies the *size* of a pointer and its ``<abi>`` and |
1913 ``<pref>``\erred alignments for address space ``n``. The fourth parameter | 2032 ``<pref>``\erred alignments for address space ``n``. The fourth parameter |
1914 ``<idx>`` is the size of the index used for address calculation. If not | 2033 ``<idx>`` is the size of the index used for address calculation. If not |
1921 ``<size>``. The value of ``<size>`` must be in the range [1,2^23). | 2040 ``<size>``. The value of ``<size>`` must be in the range [1,2^23). |
1922 ``v<size>:<abi>:<pref>`` | 2041 ``v<size>:<abi>:<pref>`` |
1923 This specifies the alignment for a vector type of a given bit | 2042 This specifies the alignment for a vector type of a given bit |
1924 ``<size>``. | 2043 ``<size>``. |
1925 ``f<size>:<abi>:<pref>`` | 2044 ``f<size>:<abi>:<pref>`` |
1926 This specifies the alignment for a floating point type of a given bit | 2045 This specifies the alignment for a floating-point type of a given bit |
1927 ``<size>``. Only values of ``<size>`` that are supported by the target | 2046 ``<size>``. Only values of ``<size>`` that are supported by the target |
1928 will work. 32 (float) and 64 (double) are supported on all targets; 80 | 2047 will work. 32 (float) and 64 (double) are supported on all targets; 80 |
1929 or 128 (different flavors of long double) are also supported on some | 2048 or 128 (different flavors of long double) are also supported on some |
1930 targets. | 2049 targets. |
1931 ``a:<abi>:<pref>`` | 2050 ``a:<abi>:<pref>`` |
1932 This specifies the alignment for an object of aggregate type. | 2051 This specifies the alignment for an object of aggregate type. |
2052 ``F<type><abi>`` | |
2053 This specifies the alignment for function pointers. | |
2054 The options for ``<type>`` are: | |
2055 | |
2056 * ``i``: The alignment of function pointers is independent of the alignment | |
2057 of functions, and is a multiple of ``<abi>``. | |
2058 * ``n``: The alignment of function pointers is a multiple of the explicit | |
2059 alignment specified on the function, and is a multiple of ``<abi>``. | |
1933 ``m:<mangling>`` | 2060 ``m:<mangling>`` |
1934 If present, specifies that LLVM names are mangled in the output. The | 2061 If present, specifies that LLVM names are mangled in the output. Symbols |
2062 prefixed with the mangling escape character ``\01`` are passed through | |
2063 directly to the assembler without the escape character. The mangling style | |
1935 options are | 2064 options are |
1936 | 2065 |
1937 * ``e``: ELF mangling: Private symbols get a ``.L`` prefix. | 2066 * ``e``: ELF mangling: Private symbols get a ``.L`` prefix. |
1938 * ``m``: Mips mangling: Private symbols get a ``$`` prefix. | 2067 * ``m``: Mips mangling: Private symbols get a ``$`` prefix. |
1939 * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other | 2068 * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other |
1940 symbols get a ``_`` prefix. | 2069 symbols get a ``_`` prefix. |
1941 * ``w``: Windows COFF prefix: Similar to Mach-O, but stdcall and fastcall | 2070 * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix. |
1942 functions also get a suffix based on the frame size. | 2071 Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``, |
1943 * ``x``: Windows x86 COFF prefix: Similar to Windows COFF, but use a ``_`` | 2072 ``__fastcall``, and ``__vectorcall`` have custom mangling that appends |
1944 prefix for ``__cdecl`` functions. | 2073 ``@N`` where N is the number of bytes used to pass parameters. C++ symbols |
2074 starting with ``?`` are not mangled in any way. | |
2075 * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C | |
2076 symbols do not receive a ``_`` prefix. | |
1945 ``n<size1>:<size2>:<size3>...`` | 2077 ``n<size1>:<size2>:<size3>...`` |
1946 This specifies a set of native integer widths for the target CPU in | 2078 This specifies a set of native integer widths for the target CPU in |
1947 bits. For example, it might contain ``n32`` for 32-bit PowerPC, | 2079 bits. For example, it might contain ``n32`` for 32-bit PowerPC, |
1948 ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of | 2080 ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of |
1949 this set are considered to support most general arithmetic operations | 2081 this set are considered to support most general arithmetic operations |
2052 of the variable's storage. | 2184 of the variable's storage. |
2053 - The result value of an allocation instruction is associated with the | 2185 - The result value of an allocation instruction is associated with the |
2054 address range of the allocated storage. | 2186 address range of the allocated storage. |
2055 - A null pointer in the default address-space is associated with no | 2187 - A null pointer in the default address-space is associated with no |
2056 address. | 2188 address. |
2189 - An :ref:`undef value <undefvalues>` in *any* address-space is | |
2190 associated with no address. | |
2057 - An integer constant other than zero or a pointer value returned from | 2191 - An integer constant other than zero or a pointer value returned from |
2058 a function not defined within LLVM may be associated with address | 2192 a function not defined within LLVM may be associated with address |
2059 ranges allocated through mechanisms other than those provided by | 2193 ranges allocated through mechanisms other than those provided by |
2060 LLVM. Such ranges shall not overlap with any ranges of addresses | 2194 LLVM. Such ranges shall not overlap with any ranges of addresses |
2061 allocated by mechanisms provided by LLVM. | 2195 allocated by mechanisms provided by LLVM. |
2100 marked ``volatile``. The optimizers must not change the number of | 2234 marked ``volatile``. The optimizers must not change the number of |
2101 volatile operations or change their order of execution relative to other | 2235 volatile operations or change their order of execution relative to other |
2102 volatile operations. The optimizers *may* change the order of volatile | 2236 volatile operations. The optimizers *may* change the order of volatile |
2103 operations relative to non-volatile operations. This is not Java's | 2237 operations relative to non-volatile operations. This is not Java's |
2104 "volatile" and has no cross-thread synchronization behavior. | 2238 "volatile" and has no cross-thread synchronization behavior. |
2239 | |
2240 A volatile load or store may have additional target-specific semantics. | |
2241 Any volatile operation can have side effects, and any volatile operation | |
2242 can read and/or modify state which is not accessible via a regular load | |
2243 or store in this module. Volatile operations may use addresses which do | |
2244 not point to memory (like MMIO registers). This means the compiler may | |
2245 not use a volatile operation to prove a non-volatile access to that | |
2246 address has defined behavior. | |
2247 | |
2248 The allowed side-effects for volatile accesses are limited. If a | |
2249 non-volatile store to a given address would be legal, a volatile | |
2250 operation may modify the memory at that address. A volatile operation | |
2251 may not modify any other memory accessible by the module being compiled. | |
2252 A volatile operation may not call any code in the current module. | |
2253 | |
2254 The compiler may assume execution will continue after a volatile operation, | |
2255 so operations which modify memory or may have undefined behavior can be | |
2256 hoisted past a volatile operation. | |
2105 | 2257 |
2106 IR-level volatile loads and stores cannot safely be optimized into | 2258 IR-level volatile loads and stores cannot safely be optimized into |
2107 llvm.memcpy or llvm.memmove intrinsics even when those intrinsics are | 2259 llvm.memcpy or llvm.memmove intrinsics even when those intrinsics are |
2108 flagged volatile. Likewise, the backend should never split or merge | 2260 flagged volatile. Likewise, the backend should never split or merge |
2109 target-legal volatile load/store instructions. | 2261 target-legal volatile load/store instructions. |
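A minimal sketch of volatile accesses to a memory-mapped register follows. The register address (``4096``) and the function name ``@read_status`` are assumptions made up for illustration:

.. code-block:: llvm

    define i32 @read_status() {
      ; Assumed address; it need not point to ordinary memory.
      %reg = inttoptr i64 4096 to i32*
      ; Two volatile loads: the optimizer may neither merge nor drop them,
      ; and must keep them in this order relative to each other.
      %first = load volatile i32, i32* %reg, align 4
      %second = load volatile i32, i32* %reg, align 4
      %sum = add i32 %first, %second
      ret i32 %sum
    }

Without ``volatile``, the second load could legally be replaced by the result of the first.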
2273 Otherwise, an atomic operation that is not marked ``syncscope("singlethread")`` | 2425 Otherwise, an atomic operation that is not marked ``syncscope("singlethread")`` |
2274 or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the | 2426 or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the |
2275 seq\_cst total orderings of other operations that are not marked | 2427 seq\_cst total orderings of other operations that are not marked |
2276 ``syncscope("singlethread")`` or ``syncscope("<target-scope>")``. | 2428 ``syncscope("singlethread")`` or ``syncscope("<target-scope>")``. |
2277 | 2429 |
2430 .. _floatenv: | |
2431 | |
2432 Floating-Point Environment | |
2433 -------------------------- | |
2434 | |
2435 The default LLVM floating-point environment assumes that floating-point | |
2436 instructions do not have side effects. Results assume the round-to-nearest | |
2437 rounding mode. No floating-point exception state is maintained in this | |
2438 environment. Therefore, there is no attempt to create or preserve invalid | |
2439 operation (SNaN) or division-by-zero exceptions. | |
2440 | |
2441 The benefit of this exception-free assumption is that floating-point | |
2442 operations may be speculated freely without any other fast-math relaxations | |
2443 to the floating-point model. | |
2444 | |
2445 Code that requires different behavior than this should use the | |
2446 :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`. | |
2447 | |
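As a sketch of the distinction drawn above, the first function below performs an ordinary ``fadd`` under the default environment, while the second calls a constrained intrinsic with explicit rounding-mode and exception-behavior operands. The function names are hypothetical, and the ``strictfp`` marking on the second function is an assumption about typical usage rather than something this section requires:

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

    ; Default environment: round-to-nearest, no exception state tracked.
    define double @add_default(double %a, double %b) {
      %r = fadd double %a, %b
      ret double %r
    }

    ; Constrained form: dynamic rounding mode, strict exception semantics.
    define double @add_strict(double %a, double %b) strictfp {
      %r = call double @llvm.experimental.constrained.fadd.f64(
               double %a, double %b,
               metadata !"round.dynamic", metadata !"fpexcept.strict")
      ret double %r
    }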
2278 .. _fastmath: | 2448 .. _fastmath: |
2279 | 2449 |
2280 Fast-Math Flags | 2450 Fast-Math Flags |
2281 --------------- | 2451 --------------- |
2282 | 2452 |
2286 may use the following flags to enable otherwise unsafe | 2456 may use the following flags to enable otherwise unsafe |
2287 floating-point transformations. | 2457 floating-point transformations. |
2288 | 2458 |
2289 ``nnan`` | 2459 ``nnan`` |
2290 No NaNs - Allow optimizations to assume the arguments and result are not | 2460 No NaNs - Allow optimizations to assume the arguments and result are not |
2291 NaN. Such optimizations are required to retain defined behavior over | 2461 NaN. If an argument is a NaN, or the result would be a NaN, it produces |
2292 NaNs, but the value of the result is undefined. | 2462 a :ref:`poison value <poisonvalues>` instead. |
2293 | 2463 |
2294 ``ninf`` | 2464 ``ninf`` |
2295 No Infs - Allow optimizations to assume the arguments and result are not | 2465 No Infs - Allow optimizations to assume the arguments and result are not |
2296 +/-Inf. Such optimizations are required to retain defined behavior over | 2466 +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it |
2297 +/-Inf, but the value of the result is undefined. | 2467 produces a :ref:`poison value <poisonvalues>` instead. |
2298 | 2468 |
2299 ``nsz`` | 2469 ``nsz`` |
2300 No Signed Zeros - Allow optimizations to treat the sign of a zero | 2470 No Signed Zeros - Allow optimizations to treat the sign of a zero |
2301 argument or result as insignificant. | 2471 argument or result as insignificant. |
2302 | 2472 |
2313 functions (sin, log, sqrt, etc). See floating-point intrinsic definitions | 2483 functions (sin, log, sqrt, etc). See floating-point intrinsic definitions |
2314 for places where this can apply to LLVM's intrinsic math functions. | 2484 for places where this can apply to LLVM's intrinsic math functions. |
2315 | 2485 |
2316 ``reassoc`` | 2486 ``reassoc`` |
2317 Allow reassociation transformations for floating-point instructions. | 2487 Allow reassociation transformations for floating-point instructions. |
2318 This may dramatically change results in floating point. | 2488 This may dramatically change results in floating-point. |
2319 | 2489 |
2320 ``fast`` | 2490 ``fast`` |
2321 This flag implies all of the others. | 2491 This flag implies all of the others. |
2322 | 2492 |
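The flags are written between the opcode and the type. The following sketch, with a hypothetical function name, shows individual flags and the ``fast`` shorthand:

.. code-block:: llvm

    define float @sum3(float %a, float %b, float %c) {
      ; nnan/ninf: a NaN or infinite operand or result yields poison.
      %t = fadd nnan ninf float %a, %b
      ; fast implies every other fast-math flag, including reassoc.
      %r = fadd fast float %t, %c
      ret float %r
    }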
2323 .. _uselistorder: | 2493 .. _uselistorder: |
2502 | ``i1942652`` | a really big integer of over 1 million bits. | | 2672 | ``i1942652`` | a really big integer of over 1 million bits. | |
2503 +----------------+------------------------------------------------+ | 2673 +----------------+------------------------------------------------+ |
2504 | 2674 |
2505 .. _t_floating: | 2675 .. _t_floating: |
2506 | 2676 |
2507 Floating Point Types | 2677 Floating-Point Types |
2508 """""""""""""""""""" | 2678 """""""""""""""""""" |
2509 | 2679 |
2510 .. list-table:: | 2680 .. list-table:: |
2511 :header-rows: 1 | 2681 :header-rows: 1 |
2512 | 2682 |
2513 * - Type | 2683 * - Type |
2514 - Description | 2684 - Description |
2515 | 2685 |
2516 * - ``half`` | 2686 * - ``half`` |
2517 - 16-bit floating point value | 2687 - 16-bit floating-point value |
2518 | 2688 |
2519 * - ``float`` | 2689 * - ``float`` |
2520 - 32-bit floating point value | 2690 - 32-bit floating-point value |
2521 | 2691 |
2522 * - ``double`` | 2692 * - ``double`` |
2523 - 64-bit floating point value | 2693 - 64-bit floating-point value |
2524 | 2694 |
2525 * - ``fp128`` | 2695 * - ``fp128`` |
2526 - 128-bit floating point value (112-bit mantissa) | 2696 - 128-bit floating-point value (112-bit mantissa) |
2527 | 2697 |
2528 * - ``x86_fp80`` | 2698 * - ``x86_fp80`` |
2529 - 80-bit floating point value (X87) | 2699 - 80-bit floating-point value (X87) |
2530 | 2700 |
2531 * - ``ppc_fp128`` | 2701 * - ``ppc_fp128`` |
2532 - 128-bit floating point value (two 64-bits) | 2702 - 128-bit floating-point value (two 64-bits) |
2703 | |
2704 The binary format of half, float, double, and fp128 corresponds to the | |
2705 IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128 | |
2706 respectively. | |
2533 | 2707 |
2534 X86_mmx Type | 2708 X86_mmx Type |
2535 """""""""""" | 2709 """""""""""" |
2536 | 2710 |
2537 :Overview: | 2711 :Overview: |
2592 :Overview: | 2766 :Overview: |
2593 | 2767 |
2594 A vector type is a simple derived type that represents a vector of | 2768 A vector type is a simple derived type that represents a vector of |
2595 elements. Vector types are used when multiple primitive data are | 2769 elements. Vector types are used when multiple primitive data are |
2596 operated on in parallel using a single instruction (SIMD). A vector type | 2770 operated on in parallel using a single instruction (SIMD). A vector type |
2597 requires a size (number of elements) and an underlying primitive data | 2771 requires a size (number of elements), an underlying primitive data type, |
2598 type. Vector types are considered :ref:`first class <t_firstclass>`. | 2772 and a scalable property to represent vectors where the exact hardware |
2773 vector length is unknown at compile time. Vector types are considered | |
2774 :ref:`first class <t_firstclass>`. | |
2599 | 2775 |
2600 :Syntax: | 2776 :Syntax: |
2601 | 2777 |
2602 :: | 2778 :: |
2603 | 2779 |
2604 < <# elements> x <elementtype> > | 2780 < <# elements> x <elementtype> > ; Fixed-length vector |
2781 < vscale x <# elements> x <elementtype> > ; Scalable vector | |
2605 | 2782 |
2606 The number of elements is a constant integer value larger than 0; | 2783 The number of elements is a constant integer value larger than 0; |
2607 elementtype may be any integer, floating point or pointer type. Vectors | 2784 elementtype may be any integer, floating-point or pointer type. Vectors |
2608 of size zero are not allowed. | 2785 of size zero are not allowed. For scalable vectors, the total number of |
2786 elements is a constant multiple (called vscale) of the specified number | |
2787 of elements; vscale is a positive integer that is unknown at compile time | |
2788 and the same hardware-dependent constant for all scalable vectors at run | |
2789 time. The size of a specific scalable vector type is thus constant within | |
2790 IR, even if the exact size in bytes cannot be determined until run time. | |
2609 | 2791 |
2610 :Examples: | 2792 :Examples: |
2611 | 2793 |
2612 +-------------------+--------------------------------------------------+ | 2794 +------------------------+----------------------------------------------------+ |
2613 | ``<4 x i32>`` | Vector of 4 32-bit integer values. | | 2795 | ``<4 x i32>`` | Vector of 4 32-bit integer values. | |
2614 +-------------------+--------------------------------------------------+ | 2796 +------------------------+----------------------------------------------------+ |
2615 | ``<8 x float>`` | Vector of 8 32-bit floating-point values. | | 2797 | ``<8 x float>`` | Vector of 8 32-bit floating-point values. | |
2616 +-------------------+--------------------------------------------------+ | 2798 +------------------------+----------------------------------------------------+ |
2617 | ``<2 x i64>`` | Vector of 2 64-bit integer values. | | 2799 | ``<2 x i64>`` | Vector of 2 64-bit integer values. | |
2618 +-------------------+--------------------------------------------------+ | 2800 +------------------------+----------------------------------------------------+ |
2619 | ``<4 x i64*>`` | Vector of 4 pointers to 64-bit integer values. | | 2801 | ``<4 x i64*>`` | Vector of 4 pointers to 64-bit integer values. | |
2620 +-------------------+--------------------------------------------------+ | 2802 +------------------------+----------------------------------------------------+ |
2803 | ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. | | |
2804 +------------------------+----------------------------------------------------+ | |
2621 | 2805 |
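As a brief sketch, fixed-length and scalable vectors are used with the same instructions. Only the type differs. The function names below are hypothetical:

.. code-block:: llvm

    ; Adds exactly four 32-bit lanes.
    define <4 x i32> @add_fixed(<4 x i32> %a, <4 x i32> %b) {
      %r = add <4 x i32> %a, %b
      ret <4 x i32> %r
    }

    ; Adds (vscale x 4) 32-bit lanes; vscale is fixed by the hardware at run time.
    define <vscale x 4 x i32> @add_scalable(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
      %r = add <vscale x 4 x i32> %a, %b
      ret <vscale x 4 x i32> %r
    }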
2622 .. _t_label: | 2806 .. _t_label: |
2623 | 2807 |
2624 Label Type | 2808 Label Type |
2625 ^^^^^^^^^^ | 2809 ^^^^^^^^^^ |
2713 Here are some examples of multidimensional arrays: | 2897 Here are some examples of multidimensional arrays: |
2714 | 2898 |
2715 +-----------------------------+----------------------------------------------------------+ | 2899 +-----------------------------+----------------------------------------------------------+ |
2716 | ``[3 x [4 x i32]]`` | 3x4 array of 32-bit integer values. | | 2900 | ``[3 x [4 x i32]]`` | 3x4 array of 32-bit integer values. | |
2717 +-----------------------------+----------------------------------------------------------+ | 2901 +-----------------------------+----------------------------------------------------------+ |
2718 | ``[12 x [10 x float]]`` | 12x10 array of single precision floating point values. | | 2902 | ``[12 x [10 x float]]`` | 12x10 array of single precision floating-point values. | |
2719 +-----------------------------+----------------------------------------------------------+ | 2903 +-----------------------------+----------------------------------------------------------+ |
2720 | ``[2 x [3 x [4 x i16]]]`` | 2x3x4 array of 16-bit integer values. | | 2904 | ``[2 x [3 x [4 x i16]]]`` | 2x3x4 array of 16-bit integer values. | |
2721 +-----------------------------+----------------------------------------------------------+ | 2905 +-----------------------------+----------------------------------------------------------+ |
2722 | 2906 |
2723 There is no restriction on indexing beyond the end of the array implied | 2907 There is no restriction on indexing beyond the end of the array implied |
2814 of the ``i1`` type. | 2998 of the ``i1`` type. |
2815 **Integer constants** | 2999 **Integer constants** |
2816 Standard integers (such as '4') are constants of the | 3000 Standard integers (such as '4') are constants of the |
2817 :ref:`integer <t_integer>` type. Negative numbers may be used with | 3001 :ref:`integer <t_integer>` type. Negative numbers may be used with |
2818 integer types. | 3002 integer types. |
2819 **Floating point constants** | 3003 **Floating-point constants** |
2820 Floating point constants use standard decimal notation (e.g. | 3004 Floating-point constants use standard decimal notation (e.g. |
2821 123.421), exponential notation (e.g. 1.23421e+2), or a more precise | 3005 123.421), exponential notation (e.g. 1.23421e+2), or a more precise |
2822 hexadecimal notation (see below). The assembler requires the exact | 3006 hexadecimal notation (see below). The assembler requires the exact |
2823 decimal value of a floating-point constant. For example, the | 3007 decimal value of a floating-point constant. For example, the |
2824 assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating | 3008 assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating |
2825 decimal in binary. Floating point constants must have a :ref:`floating | 3009 decimal in binary. Floating-point constants must have a |
2826 point <t_floating>` type. | 3010 :ref:`floating-point <t_floating>` type. |
2827 **Null pointer constants** | 3011 **Null pointer constants** |
2828 The identifier '``null``' is recognized as a null pointer constant | 3012 The identifier '``null``' is recognized as a null pointer constant |
2829 and must be of :ref:`pointer type <t_pointer>`. | 3013 and must be of :ref:`pointer type <t_pointer>`. |
2830 **Token constants** | 3014 **Token constants** |
2831 The identifier '``none``' is recognized as an empty token constant | 3015 The identifier '``none``' is recognized as an empty token constant |
2832 and must be of :ref:`token type <t_token>`. | 3016 and must be of :ref:`token type <t_token>`. |
2833 | 3017 |
2834 The one non-intuitive notation for constants is the hexadecimal form of | 3018 The one non-intuitive notation for constants is the hexadecimal form of |
2835 floating point constants. For example, the form | 3019 floating-point constants. For example, the form |
2836 '``double 0x432ff973cafa8000``' is equivalent to (but harder to read | 3020 '``double 0x432ff973cafa8000``' is equivalent to (but harder to read |
2837 than) '``double 4.5e+15``'. The only time hexadecimal floating point | 3021 than) '``double 4.5e+15``'. The only time hexadecimal floating-point |
2838 constants are required (and the only time that they are generated by the | 3022 constants are required (and the only time that they are generated by the |
2839 disassembler) is when a floating point constant must be emitted but it | 3023 disassembler) is when a floating-point constant must be emitted but it |
2840 cannot be represented as a decimal floating point number in a reasonable | 3024 cannot be represented as a decimal floating-point number in a reasonable |
2841 number of digits. For example, NaNs, infinities, and other special | 3025 number of digits. For example, NaNs, infinities, and other special |
2842 values are represented in their IEEE hexadecimal format so that assembly | 3026 values are represented in their IEEE hexadecimal format so that assembly |
2843 and disassembly do not cause any bits to change in the constants. | 3027 and disassembly do not cause any bits to change in the constants. |
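The relationship between the hexadecimal and decimal forms described above can be checked outside of LLVM. The following is a minimal Python sketch, assuming the hexadecimal form of a ``double`` is simply the 64-bit IEEE-754 bit pattern written as 16 hex digits; the helper names ``ir_hex_to_double`` and ``double_to_ir_hex`` are hypothetical and are not part of any LLVM tool.

```python
import struct


def ir_hex_to_double(text):
    """Decode a 16-hex-digit IEEE-754 bit pattern (hypothetical helper)."""
    digits = text[2:] if text.lower().startswith("0x") else text
    if len(digits) != 16:
        raise ValueError("expected exactly 16 hex digits for a double")
    # Interpret the digits as a big-endian 64-bit IEEE-754 value.
    return struct.unpack(">d", bytes.fromhex(digits))[0]


def double_to_ir_hex(value):
    """Encode a Python float as its 64-bit IEEE-754 bit pattern in hex."""
    return "0x" + struct.pack(">d", value).hex()


if __name__ == "__main__":
    # The pair of spellings used in the text above.
    print(ir_hex_to_double("0x432ff973cafa8000"))
    # Special values have no exact decimal spelling, so only the
    # bit pattern round-trips without change.
    print(double_to_ir_hex(float("inf")))
```

Because the conversion only reinterprets bits, encoding and then decoding a value returns exactly the same ``double``, which is the property the disassembler relies on for special values.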
2844 | 3028 |
2845 When using the hexadecimal form, constants of types half, float, and | 3029 When using the hexadecimal form, constants of types half, float, and |
3026 ``%C`` need to have the same semantics or the core LLVM "replace all | 3210 ``%C`` need to have the same semantics or the core LLVM "replace all |
3027 uses with" concept would not hold. | 3211 uses with" concept would not hold. |
3028 | 3212 |
3029 .. code-block:: llvm | 3213 .. code-block:: llvm |
3030 | 3214 |
3031 %A = fdiv undef, %X | 3215 %A = sdiv undef, %X |
3032 %B = fdiv %X, undef | 3216 %B = sdiv %X, undef |
3033 Safe: | 3217 Safe: |
3034 %A = undef | 3218 %A = 0 |
3035 b: unreachable | 3219 b: unreachable |
3036 | 3220 |
3037 These examples show the crucial difference between an *undefined value* | 3221 These examples show the crucial difference between an *undefined value* |
3038 and *undefined behavior*. An undefined value (like '``undef``') is | 3222 and *undefined behavior*. An undefined value (like '``undef``') is |
3039 allowed to have an arbitrary bit-pattern. This means that the ``%A`` | 3223 allowed to have an arbitrary bit-pattern. This means that the ``%A`` |
3040 operation can be constant folded to '``undef``', because the '``undef``' | 3224 operation can be constant folded to '``0``', because the '``undef``' |
3041 could be an SNaN, and ``fdiv`` is not (currently) defined on SNaN's. | 3225 could be zero, and zero divided by any value is zero. |
3042 However, in the second example, we can make a more aggressive | 3226 However, in the second example, we can make a more aggressive |
3043 assumption: because the ``undef`` is allowed to be an arbitrary value, | 3227 assumption: because the ``undef`` is allowed to be an arbitrary value, |
3044 we are allowed to assume that it could be zero. Since a divide by zero | 3228 we are allowed to assume that it could be zero. Since a divide by zero |
3045 has *undefined behavior*, we are allowed to assume that the operation | 3229 has *undefined behavior*, we are allowed to assume that the operation |
3046 does not execute at all. This allows us to delete the divide and all | 3230 does not execute at all. This allows us to delete the divide and all |
3053 b: store %X -> undef | 3237 b: store %X -> undef |
3054 Safe: | 3238 Safe: |
3055 a: <deleted> | 3239 a: <deleted> |
3056 b: unreachable | 3240 b: unreachable |
3057 | 3241 |
3058 These examples reiterate the ``fdiv`` example: a store *of* an undefined | 3242 A store *of* an undefined value can be assumed to not have any effect; |
3059 value can be assumed to not have any effect; we can assume that the | 3243 we can assume that the value is overwritten with bits that happen to |
3060 value is overwritten with bits that happen to match what was already | 3244 match what was already there. However, a store *to* an undefined |
3061 there. However, a store *to* an undefined location could clobber | 3245 location could clobber arbitrary memory; therefore, it has undefined |
3062 arbitrary memory; therefore, it has undefined behavior. | 3246 behavior. |
3063 | 3247 |
3064 .. _poisonvalues: | 3248 .. _poisonvalues: |
3065 | 3249 |
3066 Poison Values | 3250 Poison Values |
3067 ------------- | 3251 ------------- |
3068 | 3252 |
3069 Poison values are similar to :ref:`undef values <undefvalues>`, however | 3253 In order to facilitate speculative execution, many instructions do not |
3070 they also represent the fact that an instruction or constant expression | 3254 invoke immediate undefined behavior when provided with illegal operands, |
3071 that cannot evoke side effects has nevertheless detected a condition | 3255 and return a poison value instead. |
3072 that results in undefined behavior. | |
3073 | 3256 |
3074 There is currently no way of representing a poison value in the IR; they | 3257 There is currently no way of representing a poison value in the IR; they |
3075 only exist when produced by operations such as :ref:`add <i_add>` with | 3258 only exist when produced by operations such as :ref:`add <i_add>` with |
3076 the ``nsw`` flag. | 3259 the ``nsw`` flag. |
3077 | 3260 |
3104 instruction if the set of instructions it otherwise depends on would | 3287 instruction if the set of instructions it otherwise depends on would |
3105 be different if the terminator had transferred control to a different | 3288 be different if the terminator had transferred control to a different |
3106 successor. | 3289 successor. |
3107 - Dependence is transitive. | 3290 - Dependence is transitive. |
3108 | 3291 |
3109 Poison values have the same behavior as :ref:`undef values <undefvalues>`, | 3292 An instruction that *depends* on a poison value produces a poison value |
3110 with the additional effect that any instruction that has a *dependence* | 3293 itself. A poison value may be relaxed into an |
3111 on a poison value has undefined behavior. | 3294 :ref:`undef value <undefvalues>`, which takes an arbitrary bit-pattern. |
3295 | |
3296 This means that immediate undefined behavior occurs if a poison value is | |
3297 used as an instruction operand that has any values that trigger undefined | |
3298 behavior. Notably this includes (but is not limited to): | |
3299 | |
3300 - The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or | |
3301 any other pointer dereferencing instruction (independent of address | |
3302 space). | |
3303 - The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem`` | |
3304 instruction. | |
3305 | |
3306 Additionally, undefined behavior occurs if a side effect *depends* on poison. | |
3307 This includes side effects that are control dependent on a poisoned branch. | |
3112 | 3308 |
3113 Here are some examples: | 3309 Here are some examples: |
3114 | 3310 |
3115 .. code-block:: llvm | 3311 .. code-block:: llvm |
3116 | 3312 |
3117 entry: | 3313 entry: |
3118 %poison = sub nuw i32 0, 1 ; Results in a poison value. | 3314 %poison = sub nuw i32 0, 1 ; Results in a poison value. |
3119 %still_poison = and i32 %poison, 0 ; 0, but also poison. | 3315 %still_poison = and i32 %poison, 0 ; 0, but also poison. |
3120 %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison | 3316 %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison |
3121 store i32 0, i32* %poison_yet_again ; memory at @h[0] is poisoned | 3317 store i32 0, i32* %poison_yet_again ; Undefined behavior due to |
3318 ; store to poison. | |
3122 | 3319 |
3123 store i32 %poison, i32* @g ; Poison value stored to memory. | 3320 store i32 %poison, i32* @g ; Poison value stored to memory. |
3124 %poison2 = load i32, i32* @g ; Poison value loaded back from memory. | 3321 %poison2 = load i32, i32* @g ; Poison value loaded back from memory. |
3125 | |
3126 store volatile i32 %poison, i32* @g ; External observation; undefined behavior. | |
3127 | 3322 |
3128 %narrowaddr = bitcast i32* @g to i16* | 3323 %narrowaddr = bitcast i32* @g to i16* |
3129 %wideaddr = bitcast i32* @g to i64* | 3324 %wideaddr = bitcast i32* @g to i64* |
3130 %poison3 = load i16, i16* %narrowaddr ; Returns a poison value. | 3325 %poison3 = load i16, i16* %narrowaddr ; Returns a poison value. |
3131 %poison4 = load i64, i64* %wideaddr ; Returns a poison value. | 3326 %poison4 = load i64, i64* %wideaddr ; Returns a poison value. |
3173 The '``blockaddress``' constant computes the address of the specified | 3368 The '``blockaddress``' constant computes the address of the specified |
3174 basic block in the specified function, and always has an ``i8*`` type. | 3369 basic block in the specified function, and always has an ``i8*`` type. |
3175 Taking the address of the entry block is illegal. | 3370 Taking the address of the entry block is illegal. |
3176 | 3371 |
3177 This value only has defined behavior when used as an operand to the | 3372 This value only has defined behavior when used as an operand to the |
3178 ':ref:`indirectbr <i_indirectbr>`' instruction, or for comparisons | 3373 ':ref:`indirectbr <i_indirectbr>`' or ':ref:`callbr <i_callbr>`' instruction, or |
3179 against null. Pointer equality tests between label addresses result in | 3374 for comparisons against null. Pointer equality tests between label addresses |
3180 undefined behavior --- though, again, comparison against null is ok, and | 3375 result in undefined behavior --- though, again, comparison against null is ok, |
3181 no label is equal to the null pointer. This may be passed around as an | 3376 and no label is equal to the null pointer. This may be passed around as an |
3182 opaque pointer sized value as long as the bits are not inspected. This | 3377 opaque pointer sized value as long as the bits are not inspected. This |
3183 allows ``ptrtoint`` and arithmetic to be performed on these values so | 3378 allows ``ptrtoint`` and arithmetic to be performed on these values so |
3184 long as the original value is reconstituted before the ``indirectbr`` | 3379 long as the original value is reconstituted before the ``indirectbr`` or |
3185 instruction. | 3380 ``callbr`` instruction. |
3186 | 3381 |
3187 Finally, some targets may provide defined semantics when using the value | 3382 Finally, some targets may provide defined semantics when using the value |
3188 as the operand to an inline assembly, but that is target specific. | 3383 as the operand to an inline assembly, but that is target specific. |
3189 | 3384 |
3190 .. _constantexprs: | 3385 .. _constantexprs: |
3203 ``zext (CST to TYPE)`` | 3398 ``zext (CST to TYPE)`` |
3204 Perform the :ref:`zext operation <i_zext>` on constants. | 3399 Perform the :ref:`zext operation <i_zext>` on constants. |
3205 ``sext (CST to TYPE)`` | 3400 ``sext (CST to TYPE)`` |
3206 Perform the :ref:`sext operation <i_sext>` on constants. | 3401 Perform the :ref:`sext operation <i_sext>` on constants. |
3207 ``fptrunc (CST to TYPE)`` | 3402 ``fptrunc (CST to TYPE)`` |
3208 Truncate a floating point constant to another floating point type. | 3403 Truncate a floating-point constant to another floating-point type. |
3209 The size of CST must be larger than the size of TYPE. Both types | 3404 The size of CST must be larger than the size of TYPE. Both types |
3210 must be floating point. | 3405 must be floating-point. |
3211 ``fpext (CST to TYPE)`` | 3406 ``fpext (CST to TYPE)`` |
3212 Floating point extend a constant to another type. The size of CST | 3407 Floating-point extend a constant to another type. The size of CST |
3213 must be smaller than or equal to the size of TYPE. Both types must be | 3408 must be smaller than or equal to the size of TYPE. Both types must be |
3214 floating point. | 3409 floating-point. |
3215 ``fptoui (CST to TYPE)`` | 3410 ``fptoui (CST to TYPE)`` |
3216 Convert a floating point constant to the corresponding unsigned | 3411 Convert a floating-point constant to the corresponding unsigned |
3217 integer constant. TYPE must be a scalar or vector integer type. CST | 3412 integer constant. TYPE must be a scalar or vector integer type. CST |
3218 must be of scalar or vector floating point type. Both CST and TYPE | 3413 must be of scalar or vector floating-point type. Both CST and TYPE |
3219 must be scalars, or vectors of the same number of elements. If the | 3414 must be scalars, or vectors of the same number of elements. If the |
3220 value won't fit in the integer type, the results are undefined. | 3415 value won't fit in the integer type, the result is a |
3416 :ref:`poison value <poisonvalues>`. | |
3221 ``fptosi (CST to TYPE)`` | 3417 ``fptosi (CST to TYPE)`` |
3222 Convert a floating point constant to the corresponding signed | 3418 Convert a floating-point constant to the corresponding signed |
3223 integer constant. TYPE must be a scalar or vector integer type. CST | 3419 integer constant. TYPE must be a scalar or vector integer type. CST |
3224 must be of scalar or vector floating point type. Both CST and TYPE | 3420 must be of scalar or vector floating-point type. Both CST and TYPE |
3225 must be scalars, or vectors of the same number of elements. If the | 3421 must be scalars, or vectors of the same number of elements. If the |
3226 value won't fit in the integer type, the results are undefined. | 3422 value won't fit in the integer type, the result is a |
3423 :ref:`poison value <poisonvalues>`. | |
3227 ``uitofp (CST to TYPE)`` | 3424 ``uitofp (CST to TYPE)`` |
3228 Convert an unsigned integer constant to the corresponding floating | 3425 Convert an unsigned integer constant to the corresponding |
3229 point constant. TYPE must be a scalar or vector floating point type. | 3426 floating-point constant. TYPE must be a scalar or vector floating-point |
3427 type. CST must be of scalar or vector integer type. Both CST and TYPE must | |
3428 be scalars, or vectors of the same number of elements. | |
3429 ``sitofp (CST to TYPE)`` | |
3430 Convert a signed integer constant to the corresponding floating-point | |
3431 constant. TYPE must be a scalar or vector floating-point type. | |
3230 CST must be of scalar or vector integer type. Both CST and TYPE must | 3432 CST must be of scalar or vector integer type. Both CST and TYPE must |
3231 be scalars, or vectors of the same number of elements. If the value | 3433 be scalars, or vectors of the same number of elements. |
3232 won't fit in the floating point type, the results are undefined. | |
3233 ``sitofp (CST to TYPE)`` | |
3234 Convert a signed integer constant to the corresponding floating | |
3235 point constant. TYPE must be a scalar or vector floating point type. | |
3236 CST must be of scalar or vector integer type. Both CST and TYPE must | |
3237 be scalars, or vectors of the same number of elements. If the value | |
3238 won't fit in the floating point type, the results are undefined. | |
3239 ``ptrtoint (CST to TYPE)`` | 3434 ``ptrtoint (CST to TYPE)`` |
3240 Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants. | 3435 Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants. |
3241 ``inttoptr (CST to TYPE)`` | 3436 ``inttoptr (CST to TYPE)`` |
3242 Perform the :ref:`inttoptr operation <i_inttoptr>` on constants. | 3437 Perform the :ref:`inttoptr operation <i_inttoptr>` on constants. |
3243 This one is *really* dangerous! | 3438 This one is *really* dangerous! |
3282 ``OPCODE (LHS, RHS)`` | 3477 ``OPCODE (LHS, RHS)`` |
3283 Perform the specified operation of the LHS and RHS constants. OPCODE | 3478 Perform the specified operation of the LHS and RHS constants. OPCODE |
3284 may be any of the :ref:`binary <binaryops>` or :ref:`bitwise | 3479 may be any of the :ref:`binary <binaryops>` or :ref:`bitwise |
3285 binary <bitwiseops>` operations. The constraints on operands are | 3480 binary <bitwiseops>` operations. The constraints on operands are |
3286 the same as those for the corresponding instruction (e.g. no bitwise | 3481 the same as those for the corresponding instruction (e.g. no bitwise |
3287 operations on floating point values are allowed). | 3482 operations on floating-point values are allowed). |
3288 | 3483 |
3289 Other Values | 3484 Other Values |
3290 ============ | 3485 ============ |
3291 | 3486 |
3292 .. _inlineasmexprs: | 3487 .. _inlineasmexprs: |
3628 | 3823 |
3629 All ARM modes: | 3824 All ARM modes: |
3630 | 3825 |
3631 - ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address | 3826 - ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address |
3632 operand. Treated the same as operand ``m``, at the moment. | 3827 operand. Treated the same as operand ``m``, at the moment. |
3828 - ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14`` | |
3829 - ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11`` | |
3633 | 3830 |
3634 ARM and ARM's Thumb2 mode: | 3831 ARM and ARM's Thumb2 mode: |
3635 | 3832 |
3636 - ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``) | 3833 - ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``) |
3637 - ``I``: An immediate integer valid for a data-processing instruction. | 3834 - ``I``: An immediate integer valid for a data-processing instruction. |
3750 | 3947 |
3751 - ``y``: Condition register (``CR0-CR7``). | 3948 - ``y``: Condition register (``CR0-CR7``). |
3752 - ``wc``: An individual CR bit in a CR register. | 3949 - ``wc``: An individual CR bit in a CR register. |
3753 - ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX | 3950 - ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX |
3754 register set (overlapping both the floating-point and vector register files). | 3951 register set (overlapping both the floating-point and vector register files). |
3755 - ``ws``: A 32 or 64-bit floating point register, from the full VSX register | 3952 - ``ws``: A 32 or 64-bit floating-point register, from the full VSX register |
3756 set. | 3953 set. |
3954 | |
3955 RISC-V: | |
3956 | |
3957 - ``A``: An address operand (using a general-purpose register, without an | |
3958 offset). | |
3959 - ``I``: A 12-bit signed integer immediate operand. | |
3960 - ``J``: A zero integer immediate operand. | |
3961 - ``K``: A 5-bit unsigned integer immediate operand. | |
3962 - ``f``: A 32- or 64-bit floating-point register (requires F or D extension). | |
3963 - ``r``: A 32- or 64-bit general-purpose register (depending on the platform | |
3964 ``XLEN``). | |
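The RISC-V immediate constraints listed above reduce to simple range checks. The following Python sketch illustrates them under the assumption that "12-bit signed" means a two's-complement range and "5-bit unsigned" means a range starting at zero. The function ``fits_riscv_immediate`` is a hypothetical helper for illustration, not an LLVM API.

```python
def fits_riscv_immediate(constraint, value):
    """Return True if `value` satisfies the given immediate constraint.

    Hypothetical helper covering only the immediate constraints
    ``I``, ``J`` and ``K`` from the RISC-V list.
    """
    if constraint == "I":
        # 12-bit signed integer: -2048 .. 2047
        return -(1 << 11) <= value < (1 << 11)
    if constraint == "J":
        # Zero integer immediate only.
        return value == 0
    if constraint == "K":
        # 5-bit unsigned integer: 0 .. 31
        return 0 <= value < (1 << 5)
    raise ValueError("not an immediate constraint: %r" % (constraint,))


if __name__ == "__main__":
    for constraint, value in [("I", 2047), ("I", 2048), ("J", 0), ("K", 32)]:
        print(constraint, value, fits_riscv_immediate(constraint, value))
```

The register and address constraints (``A``, ``f``, ``r``) are deliberately left out, since they depend on register allocation rather than on a value range.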
3757 | 3965 |
3758 Sparc: | 3966 Sparc: |
3759 | 3967 |
3760 - ``I``: An immediate 13-bit signed integer. | 3968 - ``I``: An immediate 13-bit signed integer. |
3761 - ``r``: A 32-bit integer register. | 3969 - ``r``: A 32-bit integer register. |
3762 - ``f``: Any floating-point register on SparcV8, or a floating point | 3970 - ``f``: Any floating-point register on SparcV8, or a floating-point |
3763 register in the "low" half of the registers on SparcV9. | 3971 register in the "low" half of the registers on SparcV9. |
3764 - ``e``: Any floating point register. (Same as ``f`` on SparcV8.) | 3972 - ``e``: Any floating-point register. (Same as ``f`` on SparcV8.) |
3765 | 3973 |
3766 SystemZ: | 3974 SystemZ: |
3767 | 3975 |
3768 - ``I``: An immediate unsigned 8-bit integer. | 3976 - ``I``: An immediate unsigned 8-bit integer. |
3769 - ``J``: An immediate unsigned 12-bit integer. | 3977 - ``J``: An immediate unsigned 12-bit integer. |
3781 - ``r`` or ``d``: A 32, 64, or 128-bit integer register. | 3989 - ``r`` or ``d``: A 32, 64, or 128-bit integer register. |
3782 - ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an | 3990 - ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an |
3783 address context evaluates as zero). | 3991 address context evaluates as zero). |
3784 - ``h``: A 32-bit value in the high part of a 64-bit data register | 3992 - ``h``: A 32-bit value in the high part of a 64-bit data register |
3785 (LLVM-specific) | 3993 (LLVM-specific) |
3786 - ``f``: A 32, 64, or 128-bit floating point register. | 3994 - ``f``: A 32, 64, or 128-bit floating-point register. |
3787 | 3995 |
3788 X86: | 3996 X86: |
3789 | 3997 |
3790 - ``I``: An immediate integer between 0 and 31. | 3998 - ``I``: An immediate integer between 0 and 31. |
3791 - ``J``: An immediate integer between 0 and 64. | 3999 - ``J``: An immediate integer between 0 and 64. |
4315 | 4523 |
4316 - ``count: -1`` indicates an empty array. | 4524 - ``count: -1`` indicates an empty array. |
4317 - ``count: !9`` describes the count with a :ref:`DILocalVariable`. | 4525 - ``count: !9`` describes the count with a :ref:`DILocalVariable`. |
4318 - ``count: !11`` describes the count with a :ref:`DIGlobalVariable`. | 4526 - ``count: !11`` describes the count with a :ref:`DIGlobalVariable`. |
4319 | 4527 |
4320 .. code-block:: llvm | 4528 .. code-block:: text |
4321 | 4529 |
4322 !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0 | 4530 !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0 |
4323 !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1 | 4531 !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1 |
4324 !2 = !DISubrange(count: -1) ; empty array. | 4532 !2 = !DISubrange(count: -1) ; empty array. |
4325 | 4533 |
4326 ; Scopes used in rest of example | 4534 ; Scopes used in rest of example |
4327 !6 = !DIFile(filename: "vla.c", directory: "/path/to/file") | 4535 !6 = !DIFile(filename: "vla.c", directory: "/path/to/file") |
4328 !7 = distinct !DICompileUnit(language: DW_LANG_C99, ... | 4536 !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6) |
4329 !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5, ... | 4537 !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5) |
4330 | 4538 |
4331 ; Use of local variable as count value | 4539 ; Use of local variable as count value |
4332 !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) | 4540 !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) |
4333 !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9) | 4541 !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9) |
4334 !11 = !DISubrange(count !10, lowerBound: 0) | 4542 !11 = !DISubrange(count: !10, lowerBound: 0) |
4335 | 4543 |
4336 ; Use of global variable as count value | 4544 ; Use of global variable as count value |
4337 !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9) | 4545 !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9) |
4338 !13 = !DISubrange(count !12, lowerBound: 0) | 4546 !13 = !DISubrange(count: !12, lowerBound: 0) |
4339 | 4547 |
4340 .. _DIEnumerator: | 4548 .. _DIEnumerator: |
4341 | 4549 |
4342 DIEnumerator | 4550 DIEnumerator |
4343 """""""""""" | 4551 """""""""""" |
4344 | 4552 |
4345 ``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type`` | 4553 ``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type`` |
4346 variants of :ref:`DICompositeType`. | 4554 variants of :ref:`DICompositeType`. |
4347 | 4555 |
4348 .. code-block:: llvm | 4556 .. code-block:: text |
4349 | 4557 |
4350 !0 = !DIEnumerator(name: "SixKind", value: 7) | 4558 !0 = !DIEnumerator(name: "SixKind", value: 7) |
4351 !1 = !DIEnumerator(name: "SevenKind", value: 7) | 4559 !1 = !DIEnumerator(name: "SevenKind", value: 7) |
4352 !2 = !DIEnumerator(name: "NegEightKind", value: -8) | 4560 !2 = !DIEnumerator(name: "NegEightKind", value: -8) |
4353 | 4561 |
4356 | 4564 |
4357 ``DITemplateTypeParameter`` nodes represent type parameters to generic source | 4565 ``DITemplateTypeParameter`` nodes represent type parameters to generic source |
4358 language constructs. They are used (optionally) in :ref:`DICompositeType` and | 4566 language constructs. They are used (optionally) in :ref:`DICompositeType` and |
4359 :ref:`DISubprogram` ``templateParams:`` fields. | 4567 :ref:`DISubprogram` ``templateParams:`` fields. |
4360 | 4568 |
4361 .. code-block:: llvm | 4569 .. code-block:: text |
4362 | 4570 |
4363 !0 = !DITemplateTypeParameter(name: "Ty", type: !1) | 4571 !0 = !DITemplateTypeParameter(name: "Ty", type: !1) |
4364 | 4572 |
4365 DITemplateValueParameter | 4573 DITemplateValueParameter |
4366 """""""""""""""""""""""" | 4574 """""""""""""""""""""""" |
4369 language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``, | 4577 language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``, |
4370 but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or | 4578 but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or |
4371 ``DW_TAG_GNU_template_param_pack``. They are used (optionally) in | 4579 ``DW_TAG_GNU_template_param_pack``. They are used (optionally) in |
4372 :ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields. | 4580 :ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields. |
4373 | 4581 |
4374 .. code-block:: llvm | 4582 .. code-block:: text |
4375 | 4583 |
4376 !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7) | 4584 !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7) |
4377 | 4585 |
4378 DINamespace | 4586 DINamespace |
4379 """"""""""" | 4587 """"""""""" |
4380 | 4588 |
4381 ``DINamespace`` nodes represent namespaces in the source language. | 4589 ``DINamespace`` nodes represent namespaces in the source language. |
4382 | 4590 |
4383 .. code-block:: llvm | 4591 .. code-block:: text |
4384 | 4592 |
4385 !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7) | 4593 !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7) |
4386 | 4594 |
4387 .. _DIGlobalVariable: | 4595 .. _DIGlobalVariable: |
4388 | 4596 |
4389 DIGlobalVariable | 4597 DIGlobalVariable |
4390 """""""""""""""" | 4598 """""""""""""""" |
4391 | 4599 |
4392 ``DIGlobalVariable`` nodes represent global variables in the source language. | 4600 ``DIGlobalVariable`` nodes represent global variables in the source language. |
4393 | 4601 |
4394 .. code-block:: llvm | 4602 .. code-block:: text |
4395 | 4603 |
4396 !0 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !1, | 4604 @foo = global i32, !dbg !0 |
4397 file: !2, line: 7, type: !3, isLocal: true, | 4605 !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression()) |
4398 isDefinition: false, variable: i32* @foo, | 4606 !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2, |
4399 declaration: !4) | 4607 file: !3, line: 7, type: !4, isLocal: true, |
4400 | 4608 isDefinition: false, declaration: !5) |
4401 All global variables should be referenced by the ``globals:`` field of a | 4609 |
4402 :ref:`compile unit <DICompileUnit>`. | 4610 |
4611 DIGlobalVariableExpression | |
4612 """""""""""""""""""""""""" | |
4613 | |
4614 ``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together | |
4615 with a :ref:`DIExpression`. | |
4616 | |
4617 .. code-block:: text | |
4618 | |
4619 @lower = global i32, !dbg !0 | |
4620 @upper = global i32, !dbg !1 | |
4621 !0 = !DIGlobalVariableExpression( | |
4622 var: !2, | |
4623 expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32) | |
4624 ) | |
4625 !1 = !DIGlobalVariableExpression( | |
4626 var: !2, | |
4627 expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32) | |
4628 ) | |
4629 !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3, | |
4630 file: !4, line: 8, type: !5, declaration: !6) | |
4631 | |
4632 All global variable expressions should be referenced by the ``globals:`` field of | |
4633 a :ref:`compile unit <DICompileUnit>`. | |
4403 | 4634 |
4404 .. _DISubprogram: | 4635 .. _DISubprogram: |
4405 | 4636 |
4406 DISubprogram | 4637 DISubprogram |
4407 """""""""""" | 4638 """""""""""" |
4408 | 4639 |
4409 ``DISubprogram`` nodes represent functions from the source language. A | 4640 ``DISubprogram`` nodes represent functions from the source language. A |
4410 ``DISubprogram`` may be attached to a function definition using ``!dbg`` | 4641 distinct ``DISubprogram`` may be attached to a function definition using |
4411 metadata. The ``variables:`` field points at :ref:`variables <DILocalVariable>` | 4642 ``!dbg`` metadata. A unique ``DISubprogram`` may be attached to a function |
4412 that must be retained, even if their IR counterparts are optimized out of | 4643 declaration used for call site debug info. The ``variables:`` field points at |
4413 the IR. The ``type:`` field must point at a :ref:`DISubroutineType`. | 4644 :ref:`variables <DILocalVariable>` that must be retained, even if their IR |
4645 counterparts are optimized out of the IR. The ``type:`` field must point at a | |
4646 :ref:`DISubroutineType`. | |
4414 | 4647 |
4415 .. _DISubprogramDeclaration: | 4648 .. _DISubprogramDeclaration: |
4416 | 4649 |
4417 When ``isDefinition: false``, subprograms describe a declaration in the type | 4650 When ``isDefinition: false``, subprograms describe a declaration in the type |
4418 tree as opposed to a definition of a function. If the scope is a composite | 4651 tree as opposed to a definition of a function. If the scope is a composite |
4460 ``DILexicalBlockFile`` nodes are used to discriminate between sections of a | 4693 ``DILexicalBlockFile`` nodes are used to discriminate between sections of a |
4461 :ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to | 4694 :ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to |
4462 indicate textual inclusion, or the ``discriminator:`` field can be used to | 4695 indicate textual inclusion, or the ``discriminator:`` field can be used to |
4463 discriminate between control flow within a single block in the source language. | 4696 discriminate between control flow within a single block in the source language. |
4464 | 4697 |
4465 .. code-block:: llvm | 4698 .. code-block:: text |
4466 | 4699 |
4467 !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35) | 4700 !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35) |
4468 !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0) | 4701 !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0) |
4469 !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1) | 4702 !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1) |
4470 | 4703 |
4475 | 4708 |
4476 ``DILocation`` nodes represent source debug locations. The ``scope:`` field is | 4709 ``DILocation`` nodes represent source debug locations. The ``scope:`` field is |
4477 mandatory, and points at a :ref:`DILexicalBlockFile`, a | 4710 mandatory, and points at a :ref:`DILexicalBlockFile`, a |
4478 :ref:`DILexicalBlock`, or a :ref:`DISubprogram`. | 4711 :ref:`DILexicalBlock`, or a :ref:`DISubprogram`. |
4479 | 4712 |
4480 .. code-block:: llvm | 4713 .. code-block:: text |
4481 | 4714 |
4482 !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2) | 4715 !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2) |
4483 | 4716 |
4484 .. _DILocalVariable: | 4717 .. _DILocalVariable: |
4485 | 4718 |
4497 type: !3, flags: DIFlagArtificial) | 4730 type: !3, flags: DIFlagArtificial) |
4498 !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7, | 4731 !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7, |
4499 type: !3) | 4732 type: !3) |
4500 !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3) | 4733 !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3) |
4501 | 4734 |
4735 .. _DIExpression: | |
4736 | |
4502 DIExpression | 4737 DIExpression |
4503 """""""""""" | 4738 """""""""""" |
4504 | 4739 |
4505 ``DIExpression`` nodes represent expressions that are inspired by the DWARF | 4740 ``DIExpression`` nodes represent expressions that are inspired by the DWARF |
4506 expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>` | 4741 expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>` |
4507 (such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the | 4742 (such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the |
4508 referenced LLVM variable relates to the source language variable. | 4743 referenced LLVM variable relates to the source language variable. Debug |
4509 | 4744 intrinsics are interpreted left-to-right: start by pushing the value/address |
4510 The current supported vocabulary is limited: | 4745 operand of the intrinsic onto a stack, then repeatedly push and evaluate |
4746 opcodes from the DIExpression until the final variable description is produced. | |
4747 | |
4748 The currently supported opcode vocabulary is limited: | |
4511 | 4749 |
4512 - ``DW_OP_deref`` dereferences the top of the expression stack. | 4750 - ``DW_OP_deref`` dereferences the top of the expression stack. |
4513 - ``DW_OP_plus`` pops the last two entries from the expression stack, adds | 4751 - ``DW_OP_plus`` pops the last two entries from the expression stack, adds |
4514 them together and appends the result to the expression stack. | 4752 them together and appends the result to the expression stack. |
4515 - ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts | 4753 - ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts |
4518 - ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression. | 4756 - ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression. |
4519 - ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8`` | 4757 - ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8`` |
4520 here, respectively) of the variable fragment from the working expression. Note | 4758 here, respectively) of the variable fragment from the working expression. Note |
4521 that contrary to DW_OP_bit_piece, the offset is describing the location | 4759 that contrary to DW_OP_bit_piece, the offset is describing the location |
4522 within the described source variable. | 4760 within the described source variable. |
4761 - ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding | |
4762 (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the | |
4763 expression stack is to be converted. Maps into a ``DW_OP_convert`` operation | |
4764 that references a base type constructed from the supplied values. | |
4765 - ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be | |
4766 optionally applied to the pointer. The memory tag is derived from the | |
4767 given tag offset in an implementation-defined manner. | |
4523 - ``DW_OP_swap`` swaps top two stack entries. | 4768 - ``DW_OP_swap`` swaps top two stack entries. |
4524 - ``DW_OP_xderef`` provides extended dereference mechanism. The entry at the top | 4769 - ``DW_OP_xderef`` provides extended dereference mechanism. The entry at the top |
4525 of the stack is treated as an address. The second stack entry is treated as an | 4770 of the stack is treated as an address. The second stack entry is treated as an |
4526 address space identifier. | 4771 address space identifier. |
4527 - ``DW_OP_stack_value`` marks a constant value. | 4772 - ``DW_OP_stack_value`` marks a constant value. |
4773 - If an expression is marked with ``DW_OP_entry_value`` all register and | |
4774 memory read operations refer to the respective value at the function entry. | |
4775 The first operand of ``DW_OP_entry_value`` is the size of the following | |
4776 DWARF expression. | |
4777 ``DW_OP_entry_value`` may appear after the ``LiveDebugValues`` pass. | |
4778 LLVM only supports entry values for function parameters | |
4779 that are unmodified throughout a function and that are described as | |
4780 simple register location descriptions. | |
4781 ``DW_OP_entry_value`` may also appear after the ``AsmPrinter`` pass when | |
4782 a call site parameter value (``DW_AT_call_site_parameter_value``) | |
4783 is represented as the entry value of the parameter. | |
4784 - ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the contents at the provided | |
4785 signed offset from the specified register. The opcode is only generated by the | |
4786 ``AsmPrinter`` pass to describe a call site parameter value that requires an | |
4787 expression over two registers. | |
4528 | 4788 |
4529 DWARF specifies three kinds of simple location descriptions: Register, memory, | 4789 DWARF specifies three kinds of simple location descriptions: Register, memory, |
4530 and implicit location descriptions. Register and memory location descriptions | 4790 and implicit location descriptions. Note that a location description is |
4531 describe the *location* of a source variable (in the sense that a debugger might | 4791 defined over certain ranges of a program, i.e. the location of a variable may |
4532 modify its value), whereas implicit locations describe merely the *value* of a | 4792 change over the course of the program. Register and memory location |
4533 source variable. DIExpressions also follow this model: A DIExpression that | 4793 descriptions describe the *concrete location* of a source variable (in the |
4534 doesn't have a trailing ``DW_OP_stack_value`` will describe an *address* when | 4794 sense that a debugger might modify its value), whereas *implicit locations* |
4535 combined with a concrete location. | 4795 describe merely the actual *value* of a source variable which might not exist |
4796 in registers or in memory (see ``DW_OP_stack_value``). | |
4797 | |
4798 A ``llvm.dbg.addr`` or ``llvm.dbg.declare`` intrinsic describes an indirect | |
4799 value (the address) of a source variable. The first operand of the intrinsic | |
4800 must be an address of some kind. A DIExpression attached to the intrinsic | |
4801 refines this address to produce a concrete location for the source variable. | |
4802 | |
4803 A ``llvm.dbg.value`` intrinsic describes the direct value of a source variable. | |
4804 The first operand of the intrinsic may be a direct or indirect value. A | |
4805 DIExpression attached to the intrinsic refines the first operand to produce a | |
4806 direct value. For example, if the first operand is an indirect value, it may be | |
4807 necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a | |
4808 valid debug intrinsic. | |
4809 | |
4810 .. note:: | |
4811 | |
4812 A DIExpression is interpreted in the same way regardless of which kind of | |
4813 debug intrinsic it's attached to. | |
4536 | 4814 |
4537 .. code-block:: text | 4815 .. code-block:: text |
4538 | 4816 |
4539 !0 = !DIExpression(DW_OP_deref) | 4817 !0 = !DIExpression(DW_OP_deref) |
4540 !1 = !DIExpression(DW_OP_plus_uconst, 3) | 4818 !1 = !DIExpression(DW_OP_plus_uconst, 3) |
4542 !2 = !DIExpression(DW_OP_bit_piece, 3, 7) | 4820 !2 = !DIExpression(DW_OP_bit_piece, 3, 7) |
4543 !3 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7) | 4821 !3 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7) |
4544 !4 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef) | 4822 !4 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef) |
4545 !5 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value) | 4823 !5 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value) |
4546 | 4824 |
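The left-to-right stack evaluation described above can be sketched in a few lines of Python. This is a toy model for illustration only, not LLVM's implementation: the function name ``evaluate``, the ``memory`` dictionary standing in for target memory, and the returned ``is_value`` flag are all assumptions of the sketch, and only a handful of the opcodes listed above are modeled.

```python
def evaluate(operand, expr, memory=None):
    """Toy DIExpression evaluator (hypothetical helper, not an LLVM API).

    operand: the value/address operand of the debug intrinsic.
    expr:    flat list of opcode names and their integer arguments.
    memory:  dict mapping addresses to stored values, used by DW_OP_deref.
    Returns (top_of_stack, is_value); is_value is True when the expression
    contains DW_OP_stack_value, i.e. it describes a value, not a location.
    """
    memory = memory or {}
    stack = [operand]          # start by pushing the intrinsic's operand
    is_value = False
    i = 0
    while i < len(expr):
        op = expr[i]
        i += 1
        if op == "DW_OP_constu":
            stack.append(expr[i])      # push an unsigned constant
            i += 1
        elif op == "DW_OP_plus_uconst":
            stack[-1] += expr[i]       # add a constant to the working entry
            i += 1
        elif op == "DW_OP_plus":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "DW_OP_minus":
            b = stack.pop()            # last entry
            a = stack.pop()            # second-to-last entry
            stack.append(a - b)
        elif op == "DW_OP_swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "DW_OP_deref":
            stack.append(memory[stack.pop()])
        elif op == "DW_OP_stack_value":
            is_value = True
        else:
            raise ValueError(f"opcode not modeled in this sketch: {op}")
    return stack[-1], is_value


# Mirrors !1 above: an address refined by a constant offset.
print(evaluate(0x1000, ["DW_OP_plus_uconst", 3]))
# Mirrors !5 above: a constant marked as an implicit value.
print(evaluate(0, ["DW_OP_constu", 42, "DW_OP_stack_value"]))
```

The sketch interprets an expression identically no matter which intrinsic it would be attached to; whether the result is treated as an address or as a value is left to the caller, in line with the note above.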
4825 DIFlags | |
4826 """"""""""""""" | |
4827 | |
4828 These flags encode various properties of DINodes. | |
4829 | |
4830 The ``ArgumentNotModified`` flag marks a function argument whose value | |
4831 is not modified throughout a function. This flag is used to decide | |
4832 whether a ``DW_OP_entry_value`` can be used in a location description | |
4833 after the function prologue. The language frontend is expected to compute | |
4834 this property for each DILocalVariable. The flag should be used | |
4835 only in optimized code. | |
4836 | |
4547 DIObjCProperty | 4837 DIObjCProperty |
4548 """""""""""""" | 4838 """""""""""""" |
4549 | 4839 |
4550 ``DIObjCProperty`` nodes represent Objective-C property nodes. | 4840 ``DIObjCProperty`` nodes represent Objective-C property nodes. |
4551 | 4841 |
4552 .. code-block:: llvm | 4842 .. code-block:: text |
4553 | 4843 |
4554 !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo", | 4844 !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo", |
4555 getter: "getFoo", attributes: 7, type: !2) | 4845 getter: "getFoo", attributes: 7, type: !2) |
4556 | 4846 |
4557 DIImportedEntity | 4847 DIImportedEntity |
4682 }; | 4972 }; |
4683 | 4973 |
4684 void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) { | 4974 void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) { |
4685 outer->f = 0; // tag0: (OuterStructTy, FloatScalarTy, 0) | 4975 outer->f = 0; // tag0: (OuterStructTy, FloatScalarTy, 0) |
4686 outer->inner_a.i = 0; // tag1: (OuterStructTy, IntScalarTy, 12) | 4976 outer->inner_a.i = 0; // tag1: (OuterStructTy, IntScalarTy, 12) |
4687 outer->inner_a.f = 0.0; // tag2: (OuterStructTy, IntScalarTy, 16) | 4977 outer->inner_a.f = 0.0; // tag2: (OuterStructTy, FloatScalarTy, 16) |
4688 *f = 0.0; // tag3: (FloatScalarTy, FloatScalarTy, 0) | 4978 *f = 0.0; // tag3: (FloatScalarTy, FloatScalarTy, 0) |
4689 } | 4979 } |
4690 | 4980 |
4691 is (note that in C and C++, ``char`` can be used to access any arbitrary | 4981 is (note that in C and C++, ``char`` can be used to access any arbitrary |
4692 type): | 4982 type): |
4847 store float %0, float* %arrayidx.i, align 4, !noalias !7 | 5137 store float %0, float* %arrayidx.i, align 4, !noalias !7 |
4848 | 5138 |
4849 '``fpmath``' Metadata | 5139 '``fpmath``' Metadata |
4850 ^^^^^^^^^^^^^^^^^^^^^ | 5140 ^^^^^^^^^^^^^^^^^^^^^ |
4851 | 5141 |
4852 ``fpmath`` metadata may be attached to any instruction of floating point | 5142 ``fpmath`` metadata may be attached to any instruction of floating-point |
4853 type. It can be used to express the maximum acceptable error in the | 5143 type. It can be used to express the maximum acceptable error in the |
4854 result of that instruction, in ULPs, thus potentially allowing the | 5144 result of that instruction, in ULPs, thus potentially allowing the |
4855 compiler to use a more efficient but less accurate method of computing | 5145 compiler to use a more efficient but less accurate method of computing |
4856 it. ULP is defined as follows: | 5146 it. ULP is defined as follows: |
4857 | 5147 |
4873 '``range``' Metadata | 5163 '``range``' Metadata |
4874 ^^^^^^^^^^^^^^^^^^^^ | 5164 ^^^^^^^^^^^^^^^^^^^^ |
4875 | 5165 |
4876 ``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of | 5166 ``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of |
4877 integer types. It expresses the possible ranges the loaded value or the value | 5167 integer types. It expresses the possible ranges the loaded value or the value |
4878 returned by the called function at this call site is in. The ranges are | 5168 returned by the called function at this call site is in. If the loaded or |
4879 represented with a flattened list of integers. The loaded value or the value | 5169 returned value is not in the specified range, the behavior is undefined. The |
4880 returned is known to be in the union of the ranges defined by each consecutive | 5170 ranges are represented with a flattened list of integers. The loaded value or |
4881 pair. Each pair has the following properties: | 5171 the value returned is known to be in the union of the ranges defined by each |
5172 consecutive pair. Each pair has the following properties: | |
4882 | 5173 |
4883 - The type must match the type loaded by the instruction. | 5174 - The type must match the type loaded by the instruction. |
4884 - The pair ``a,b`` represents the range ``[a,b)``. | 5175 - The pair ``a,b`` represents the range ``[a,b)``. |
4885 - Both ``a`` and ``b`` are constants. | 5176 - Both ``a`` and ``b`` are constants. |
4886 - The range is allowed to wrap. | 5177 - The range is allowed to wrap. |
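As a rough illustration of how the flattened list and the wrapping rule combine, the following Python sketch tests whether a value lies in the union of the half-open ranges. The helper name ``in_range_metadata`` and its ``bits`` parameter are hypothetical; values are compared as unsigned integers of the given width, and the sketch rejects pairs with ``a == b`` rather than assigning them a meaning.

```python
def in_range_metadata(value, flat, bits):
    """Return True if value lies in the union of the [a, b) pairs in flat.

    flat: flattened list of range bounds, e.g. [0, 2, 10, 12].
    bits: width of the integer type; all comparisons are done modulo 2**bits.
    (Hypothetical helper for illustration, not part of LLVM.)
    """
    if len(flat) % 2 != 0:
        raise ValueError("bounds must come in pairs")
    mask = (1 << bits) - 1
    v = value & mask
    for a, b in zip(flat[0::2], flat[1::2]):
        a &= mask
        b &= mask
        if a == b:
            raise ValueError("a == b is not modeled by this sketch")
        if a < b:
            # Ordinary range: a <= v < b.
            if a <= v < b:
                return True
        elif v >= a or v < b:
            # Wrapped range: from a up to the maximum value, then 0 up to b.
            return True
    return False


# i8 value known to be 0 or 1.
print(in_range_metadata(1, [0, 2], bits=8))
# Wrapped i8 range [255, 2) contains 255, 0 and 1.
print(in_range_metadata(0, [255, 2], bits=8))
```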
4942 %result = call i64 %binop(i64 %x, i64 %y), !callees !0 | 5233 %result = call i64 %binop(i64 %x, i64 %y), !callees !0 |
4943 | 5234 |
4944 ... | 5235 ... |
4945 !0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub} | 5236 !0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub} |
4946 | 5237 |
5238 '``callback``' Metadata | |
5239 ^^^^^^^^^^^^^^^^^^^^^^^ | |
5240 | |
5241 ``callback`` metadata may be attached to a function declaration or definition. | |
5242 (Call sites are excluded only due to the lack of a use case.) For ease of | |
5243 exposition, we'll refer to the function annotated with the metadata as a broker | |
5244 function. The metadata describes how the arguments of a call to the broker are | |
5245 in turn passed to the callback function specified by the metadata. Thus, the | |
5246 ``callback`` metadata provides a partial description of a call site inside the | |
5247 broker function with regard to the arguments of a call to the broker. The only | |
5248 semantic restriction on the broker function itself is that it is not allowed to | |
5249 inspect or modify arguments referenced in the ``callback`` metadata as | |
5250 pass-through to the callback function. | |
5251 | |
5252 The broker is not required to actually invoke the callback function at runtime. | |
5253 However, the assumptions about not inspecting or modifying arguments that would | |
5254 be passed to the specified callback function still hold, even if the callback | |
5255 function is not dynamically invoked. The broker is allowed to invoke the | |
5256 callback function more than once per invocation of the broker. The broker is | |
5257 also allowed to invoke (directly or indirectly) the function passed as a | |
5258 callback through another use. Finally, the broker is also allowed to relay the | |
5259 callback callee invocation to a different thread. | |
5260 | |
5261 The metadata is structured as follows: At the outer level, ``callback`` | |
5262 metadata is a list of ``callback`` encodings. Each encoding starts with a | |
5263 constant ``i64`` which describes the argument position of the callback function | |
5264 in the call to the broker. The following elements, except the last, describe | |
5265 what arguments are passed to the callback function. Each element is again an | |
5266 ``i64`` constant identifying the argument of the broker that is passed through, | |
5267 or ``i64 -1`` to indicate an unknown or inspected argument. The order in which | |
5268 they are listed has to be the same in which they are passed to the callback | |
5269 callee. The last element of the encoding is a boolean which specifies how | |
5270 variadic arguments of the broker are handled. If it is true, all variadic | |
5271 arguments of the broker are passed through to the callback function *after* the | |
5272 arguments encoded explicitly before. | |
5273 | |
5274 In the code below, the ``pthread_create`` function is marked as a broker | |
5275 through the ``!callback !1`` metadata. In the example, there is only one | |
5276 callback encoding, namely ``!2``, associated with the broker. This encoding | |
5277 identifies the callback function as the broker argument at index 2 (``i64 2``, | |
5278 counting from zero) and the sole argument of the callback function as the | |
5279 broker argument at index 3 (``i64 3``). | |
5280 | |
5281 .. FIXME why does the llvm-sphinx-docs builder give a highlighting | |
5282 error if the below is set to highlight as 'llvm', despite that we | |
5283 have misc.highlighting_failure set? | |
5284 | |
5285 .. code-block:: text | |
5286 | |
5287 declare !callback !1 dso_local i32 @pthread_create(i64*, %union.pthread_attr_t*, i8* (i8*)*, i8*) | |
5288 | |
5289 ... | |
5290 !2 = !{i64 2, i64 3, i1 false} | |
5291 !1 = !{!2} | |
5292 | |
5293 Another example is shown below. The callback callee is the argument at index 2 | |
5294 (counting from zero) of the ``__kmpc_fork_call`` function (``i64 2``). The callee is given two unknown | |
5295 values (each identified by a ``i64 -1``) and afterwards all | |
5296 variadic arguments that are passed to the ``__kmpc_fork_call`` call (due to the | |
5297 final ``i1 true``). | |
5298 | |
5299 .. FIXME why does the llvm-sphinx-docs builder give a highlighting | |
5300 error if the below is set to highlight as 'llvm', despite that we | |
5301 have misc.highlighting_failure set? | |
5302 | |
5303 .. code-block:: text | |
5304 | |
5305 declare !callback !0 dso_local void @__kmpc_fork_call(%struct.ident_t*, i32, void (i32*, i32*, ...)*, ...) | |
5306 | |
5307 ... | |
5308 !1 = !{i64 2, i64 -1, i64 -1, i1 true} | |
5309 !0 = !{!1} | |
5310 | |
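To make the encoding concrete, the following Python sketch decodes a single ``callback`` encoding against the arguments of a call to the broker. It is only an illustration: ``callback_arguments``, ``UNKNOWN`` and ``fixed_count`` are names invented here, and argument positions are taken to be zero-based, as the signatures in the two examples above imply.

```python
UNKNOWN = "<unknown>"  # stands in for an ``i64 -1`` entry


def callback_arguments(encoding, broker_args, fixed_count):
    """Decode one callback encoding (hypothetical helper, not an LLVM API).

    encoding:    [callee_index, arg_index..., pass_varargs], mirroring
                 !{i64 callee, i64 arg..., i1 varargs}.
    broker_args: the arguments of one call to the broker, in order.
    fixed_count: number of non-variadic parameters of the broker.
    Returns (callee, arguments passed to the callee).
    """
    callee_index, *arg_indices, pass_varargs = encoding
    callee = broker_args[callee_index]
    # Explicitly encoded arguments, in the order the callee receives them.
    passed = [UNKNOWN if i == -1 else broker_args[i] for i in arg_indices]
    if pass_varargs:
        # Variadic broker arguments follow the explicitly encoded ones.
        passed.extend(broker_args[fixed_count:])
    return callee, passed


# pthread_create(thread, attr, start_routine, arg) with !{i64 2, i64 3, i1 false}
print(callback_arguments([2, 3, False],
                         ["thread", "attr", "start_routine", "arg"], 4))

# __kmpc_fork_call(loc, argc, microtask, ...) with !{i64 2, i64 -1, i64 -1, i1 true}
print(callback_arguments([2, -1, -1, True],
                         ["loc", 2, "microtask", "a", "b"], 3))
```

The sketch models only the data flow described by the metadata; it says nothing about whether, how often, or on which thread the broker actually invokes the callee.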
5311 | |
4947 '``unpredictable``' Metadata | 5312 '``unpredictable``' Metadata |
4948 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 5313 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
4949 | 5314 |
4950 ``unpredictable`` metadata may be attached to any branch or switch | 5315 ``unpredictable`` metadata may be attached to any branch or switch |
4951 instruction. It can be used to express the unpredictability of control | 5316 instruction. It can be used to express the unpredictability of control |
4952 flow. Similar to the llvm.expect intrinsic, it may be used to alter | 5317 flow. Similar to the llvm.expect intrinsic, it may be used to alter |
4953 optimizations related to compare and branch instructions. The metadata | 5318 optimizations related to compare and branch instructions. The metadata |
4954 is treated as a boolean value; if it exists, it signals that the branch | 5319 is treated as a boolean value; if it exists, it signals that the branch |
4955 or switch that it is attached to is completely unpredictable. | 5320 or switch that it is attached to is completely unpredictable. |
4956 | 5321 |
5322 .. _md_dereferenceable: | |
5323 | |
5324 '``dereferenceable``' Metadata | |
5325 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5326 | |
5327 The existence of the ``!dereferenceable`` metadata on the instruction | |
5328 tells the optimizer that the value loaded is known to be dereferenceable. | |
5329 The number of bytes known to be dereferenceable is specified by the integer | |
5330 value in the metadata node. This is analogous to the ``dereferenceable`` | |
5331 attribute on parameters and return values. | |
5332 | |
5333 .. _md_dereferenceable_or_null: | |
5334 | |
5335 '``dereferenceable_or_null``' Metadata | |
5336 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5337 | |
5338 The existence of the ``!dereferenceable_or_null`` metadata on the | |
5339 instruction tells the optimizer that the value loaded is known to be either | |
5340 dereferenceable or null. | |
5341 The number of bytes known to be dereferenceable is specified by the integer | |
5342 value in the metadata node. This is analogous to the ``dereferenceable_or_null`` | |
5343 attribute on parameters and return values. | |
5344 | |
5345 .. _llvm.loop: | |
5346 | |
4957 '``llvm.loop``' | 5347 '``llvm.loop``' |
4958 ^^^^^^^^^^^^^^^ | 5348 ^^^^^^^^^^^^^^^ |
4959 | 5349 |
4960 It is sometimes useful to attach information to loop constructs. Currently, | 5350 It is sometimes useful to attach information to loop constructs. Currently, |
4961 loop metadata is implemented as metadata attached to the branch instruction | 5351 loop metadata is implemented as metadata attached to the branch instruction |
4985 br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0 | 5375 br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0 |
4986 ... | 5376 ... |
4987 !0 = !{!0, !1} | 5377 !0 = !{!0, !1} |
4988 !1 = !{!"llvm.loop.unroll.count", i32 4} | 5378 !1 = !{!"llvm.loop.unroll.count", i32 4} |
4989 | 5379 |
5380 '``llvm.loop.disable_nonforced``' | |
5381 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5382 | |
5383 This metadata disables all optional loop transformations unless | |
5384 explicitly instructed using other transformation metadata such as | |
5385 ``llvm.loop.unroll.enable``. That is, no heuristic will try to determine | |
5386 whether a transformation is profitable. The purpose is to avoid that the | |
5387 loop is transformed to a different loop before an explicitly requested | |
5388 (forced) transformation is applied. For instance, loop fusion can make | |
5389 other transformations impossible. Mandatory loop canonicalizations such | |
5390 as loop rotation are still applied. | |
5391 | |
5392 It is recommended to use this metadata in addition to any llvm.loop.* | |
5393 transformation directive. Also, any loop should have at most one | |
5394 directive applied to it (and a sequence of transformations built using | |
5395 followup-attributes). Otherwise, which transformation will be applied | |
5396 depends on implementation details such as the pass pipeline order. | |
5397 | |
5398 See :ref:`transformation-metadata` for details. | |
5399 | |
4990 '``llvm.loop.vectorize``' and '``llvm.loop.interleave``' | 5400 '``llvm.loop.vectorize``' and '``llvm.loop.interleave``' |
4991 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 5401 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
4992 | 5402 |
4993 Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are | 5403 Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are |
4994 used to control per-loop vectorization and interleaving parameters such as | 5404 used to control per-loop vectorization and interleaving parameters such as |
4995 vectorization width and interleave count. These metadata should be used in | 5405 vectorization width and interleave count. These metadata should be used in |
4996 conjunction with ``llvm.loop`` loop identification metadata. The | 5406 conjunction with ``llvm.loop`` loop identification metadata. The |
4997 ``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only | 5407 ``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only |
4998 optimization hints and the optimizer will only interleave and vectorize loops if | 5408 optimization hints and the optimizer will only interleave and vectorize loops if |
4999 it believes it is safe to do so. The ``llvm.mem.parallel_loop_access`` metadata | 5409 it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata |
5000 which contains information about loop-carried memory dependencies can be helpful | 5410 which contains information about loop-carried memory dependencies can be helpful |
5001 in determining the safety of these transformations. | 5411 in determining the safety of these transformations. |
5002 | 5412 |
5003 '``llvm.loop.interleave.count``' Metadata | 5413 '``llvm.loop.interleave.count``' Metadata |
5004 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 5414 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
5027 .. code-block:: llvm | 5437 .. code-block:: llvm |
5028 | 5438 |
5029 !0 = !{!"llvm.loop.vectorize.enable", i1 0} | 5439 !0 = !{!"llvm.loop.vectorize.enable", i1 0} |
5030 !1 = !{!"llvm.loop.vectorize.enable", i1 1} | 5440 !1 = !{!"llvm.loop.vectorize.enable", i1 1} |
5031 | 5441 |
5442 '``llvm.loop.vectorize.predicate.enable``' Metadata | |
5443 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5444 | |
5445 This metadata selectively enables or disables creating predicated instructions | |
5446 for the loop, which can enable folding of the scalar epilogue loop into the | |
5447 main loop. The first operand is the string | |
5448 ``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If | |
5449 the bit operand value is 1, predication is enabled. A value of 0 disables | |
5450 predication: | |
5451 | |
5452 .. code-block:: llvm | |
5453 | |
5454 !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0} | |
5455 !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1} | |
5456 | |
5032 '``llvm.loop.vectorize.width``' Metadata | 5457 '``llvm.loop.vectorize.width``' Metadata |
5033 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 5458 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
5034 | 5459 |
5035 This metadata sets the target width of the vectorizer. The first | 5460 This metadata sets the target width of the vectorizer. The first |
5036 operand is the string ``llvm.loop.vectorize.width`` and the second | 5461 operand is the string ``llvm.loop.vectorize.width`` and the second |
5042 | 5467 |
5043 Note that setting ``llvm.loop.vectorize.width`` to 1 disables | 5468 Note that setting ``llvm.loop.vectorize.width`` to 1 disables |
5044 vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to | 5469 vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to |
5045 0 or if the loop does not have this metadata the width will be | 5470 0 or if the loop does not have this metadata the width will be |
5046 determined automatically. | 5471 determined automatically. |
5472 | |
5473 '``llvm.loop.vectorize.followup_vectorized``' Metadata | |
5474 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5475 | |
5476 This metadata defines which loop attributes the vectorized loop will | |
5477 have. See :ref:`transformation-metadata` for details. | |
5478 | |
5479 '``llvm.loop.vectorize.followup_epilogue``' Metadata | |
5480 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5481 | |
5482 This metadata defines which loop attributes the epilogue will have. The | |
5483 epilogue is not vectorized and is executed when either the vectorized | |
5484 loop is not known to preserve semantics (because e.g., it processes two | |
5485 arrays that are found to alias by a runtime check) or for the last | |
5486 iterations that do not fill a complete set of vector lanes. See | |
5487 :ref:`Transformation Metadata <transformation-metadata>` for details. | |
5488 | |
5489 '``llvm.loop.vectorize.followup_all``' Metadata | |
5490 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5491 | |
5492 Attributes in the metadata will be added to both the vectorized and | |
5493 epilogue loop. | |
5494 See :ref:`Transformation Metadata <transformation-metadata>` for details. | |
5047 | 5495 |
5048 '``llvm.loop.unroll``' | 5496 '``llvm.loop.unroll``' |
5049 ^^^^^^^^^^^^^^^^^^^^^^ | 5497 ^^^^^^^^^^^^^^^^^^^^^^ |
5050 | 5498 |
5051 Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling | 5499 Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling |
5111 | 5559 |
5112 .. code-block:: llvm | 5560 .. code-block:: llvm |
5113 | 5561 |
5114 !0 = !{!"llvm.loop.unroll.full"} | 5562 !0 = !{!"llvm.loop.unroll.full"} |
5115 | 5563 |
5564 '``llvm.loop.unroll.followup``' Metadata | |
5565 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5566 | |
5567 This metadata defines which loop attributes the unrolled loop will have. | |
5568 See :ref:`Transformation Metadata <transformation-metadata>` for details. | |
5569 | |
5570 '``llvm.loop.unroll.followup_remainder``' Metadata | |
5571 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5572 | |
5573 This metadata defines which loop attributes the remainder loop after | |
5574 partial/runtime unrolling will have. See | |
5575 :ref:`Transformation Metadata <transformation-metadata>` for details. | |
5576 | |
5577 '``llvm.loop.unroll_and_jam``' | |
5578 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5579 | |
5580 This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata | |
5581 above, but affects the unroll and jam pass. In addition, any loop with | |
5582 ``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will | |
5583 disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the | |
5584 unroller, plus ``llvm.loop.unroll.disable`` metadata will disable unroll and jam | |
5585 too.) | |
5586 | |
5587 The metadata for unroll and jam otherwise is the same as for ``unroll``. | |
5588 ``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and | |
5589 ``llvm.loop.unroll_and_jam.count`` do the same as for unroll. | |
5590 ``llvm.loop.unroll_and_jam.full`` is not supported. Again these are only hints | |
5591 and the normal safety checks will still be performed. | |
5592 | |
5593 '``llvm.loop.unroll_and_jam.count``' Metadata | |
5594 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5595 | |
5596 This metadata suggests an unroll and jam factor to use, similarly to | |
5597 ``llvm.loop.unroll.count``. The first operand is the string | |
5598 ``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer | |
5599 specifying the unroll factor. For example: | |
5600 | |
5601 .. code-block:: llvm | |
5602 | |
5603 !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4} | |
5604 | |
5605 If the trip count of the loop is less than the unroll count the loop | |
5606 will be partially unrolled and jammed. | |
5607 | |
5608 '``llvm.loop.unroll_and_jam.disable``' Metadata | |
5609 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5610 | |
5611 This metadata disables loop unroll and jamming. The metadata has a single | |
5612 operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example: | |
5613 | |
5614 .. code-block:: llvm | |
5615 | |
5616 !0 = !{!"llvm.loop.unroll_and_jam.disable"} | |
5617 | |
5618 '``llvm.loop.unroll_and_jam.enable``' Metadata | |
5619 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5620 | |
5621 This metadata suggests that the loop should be fully unrolled and jammed if the | |
5622 trip count is known at compile time and partially unrolled if the trip count is | |
5623 not known at compile time. The metadata has a single operand which is the | |
5624 string ``llvm.loop.unroll_and_jam.enable``. For example: | |
5625 | |
5626 .. code-block:: llvm | |
5627 | |
5628 !0 = !{!"llvm.loop.unroll_and_jam.enable"} | |
5629 | |
5630 '``llvm.loop.unroll_and_jam.followup_outer``' Metadata | |
5631 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5632 | |
5633 This metadata defines which loop attributes the outer unrolled loop will | |
5634 have. See :ref:`Transformation Metadata <transformation-metadata>` for | |
5635 details. | |
5636 | |
5637 '``llvm.loop.unroll_and_jam.followup_inner``' Metadata | |
5638 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5639 | |
5640 This metadata defines which loop attributes the inner jammed loop will | |
5641 have. See :ref:`Transformation Metadata <transformation-metadata>` for | |
5642 details. | |
5643 | |
5644 '``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata | |
5645 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5646 | |
5647 This metadata defines which attributes the epilogue of the outer loop | |
5648 will have. This loop is usually unrolled, meaning there is no such | |
5649 loop. This attribute will be ignored in this case. See | |
5650 :ref:`Transformation Metadata <transformation-metadata>` for details. | |
5651 | |
5652 '``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata | |
5653 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5654 | |
5655 This metadata defines which attributes the inner loop of the epilogue | |
5656 will have. The outer epilogue will usually be unrolled, meaning there | |
5657 can be multiple inner remainder loops. See | |
5658 :ref:`Transformation Metadata <transformation-metadata>` for details. | |
5659 | |
5660 '``llvm.loop.unroll_and_jam.followup_all``' Metadata | |
5661 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5662 | |
5663 Attributes specified in the metadata are added to all | |
5664 ``llvm.loop.unroll_and_jam.*`` loops. See | |
5665 :ref:`Transformation Metadata <transformation-metadata>` for details. | |
5666 | |
5116 '``llvm.loop.licm_versioning.disable``' Metadata | 5667 '``llvm.loop.licm_versioning.disable``' Metadata |
5117 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 5668 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
5118 | 5669 |
5119 This metadata indicates that the loop should not be versioned for the purpose | 5670 This metadata indicates that the loop should not be versioned for the purpose |
5120 of enabling loop-invariant code motion (LICM). The metadata has a single operand | 5671 of enabling loop-invariant code motion (LICM). The metadata has a single operand |
5143 !1 = !{!"llvm.loop.distribute.enable", i1 1} | 5694 !1 = !{!"llvm.loop.distribute.enable", i1 1} |
5144 | 5695 |
5145 This metadata should be used in conjunction with ``llvm.loop`` loop | 5696 This metadata should be used in conjunction with ``llvm.loop`` loop |
5146 identification metadata. | 5697 identification metadata. |
5147 | 5698 |
5148 '``llvm.mem``' | 5699 '``llvm.loop.distribute.followup_coincident``' Metadata |
5149 ^^^^^^^^^^^^^^^ | 5700 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
5150 | 5701 |
5151 Metadata types used to annotate memory accesses with information helpful | 5702 This metadata defines which attributes extracted loops with no cyclic |
5152 for optimizations are prefixed with ``llvm.mem``. | 5703 dependencies will have (i.e. can be vectorized). See |
5153 | 5704 :ref:`Transformation Metadata <transformation-metadata>` for details. |
5154 '``llvm.mem.parallel_loop_access``' Metadata | 5705 |
5155 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 5706 '``llvm.loop.distribute.followup_sequential``' Metadata |
5156 | 5707 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
5157 The ``llvm.mem.parallel_loop_access`` metadata refers to a loop identifier, | 5708 |
5158 or metadata containing a list of loop identifiers for nested loops. | 5709 This metadata defines which attributes the isolated loops with unsafe |
5159 The metadata is attached to memory accessing instructions and denotes that | 5710 memory dependencies will have. See |
5160 no loop carried memory dependence exist between it and other instructions denoted | 5711 :ref:`Transformation Metadata <transformation-metadata>` for details. |
5161 with the same loop identifier. The metadata on memory reads also implies that | 5712 |
5162 if conversion (i.e. speculative execution within a loop iteration) is safe. | 5713 '``llvm.loop.distribute.followup_fallback``' Metadata |
5163 | 5714 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
5164 Precisely, given two instructions ``m1`` and ``m2`` that both have the | 5715 |
5165 ``llvm.mem.parallel_loop_access`` metadata, with ``L1`` and ``L2`` being the | 5716 If loop versioning is necessary, this metadata defines the attributes |
5166 set of loops associated with that metadata, respectively, then there is no loop | 5717 the non-distributed fallback version will have. See |
5167 carried dependence between ``m1`` and ``m2`` for loops in both ``L1`` and | 5718 :ref:`Transformation Metadata <transformation-metadata>` for details. |
5168 ``L2``. | 5719 |
5169 | 5720 '``llvm.loop.distribute.followup_all``' Metadata |
5170 As a special case, if all memory accessing instructions in a loop have | 5721 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
5171 ``llvm.mem.parallel_loop_access`` metadata that refers to that loop, then the | 5722 |
5172 loop has no loop carried memory dependences and is considered to be a parallel | 5723 The attributes in this metadata are added to all followup loops of the |
5173 loop. | 5724 loop distribution pass. See |
5174 | 5725 :ref:`Transformation Metadata <transformation-metadata>` for details. |
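
As a hedged illustration, a loop that requests distribution and asks that every resulting loop carry ``llvm.loop.unroll.disable`` might be annotated as follows. The choice of followup attribute is an assumption made for this sketch and is not taken from the text above.

.. code-block:: llvm

    !0 = distinct !{!0, !1, !2}
    !1 = !{!"llvm.loop.distribute.enable", i1 1}
    ; Each operand after the name is an attribute added to all followup loops.
    !2 = !{!"llvm.loop.distribute.followup_all", !3}
    !3 = !{!"llvm.loop.unroll.disable"}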
5175 Note that if not all memory access instructions have such metadata referring to | 5726 |
5176 the loop, then the loop is considered not being trivially parallel. Additional | 5727 '``llvm.licm.disable``' Metadata |
5728 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5729 | |
5730 This metadata indicates that loop-invariant code motion (LICM) should not be | |
5731 performed on this loop. The metadata has a single operand which is the string | |
5732 ``llvm.licm.disable``. For example: | |
5733 | |
5734 .. code-block:: llvm | |
5735 | |
5736 !0 = !{!"llvm.licm.disable"} | |
5737 | |
5738 Note that although it operates per loop it isn't given the llvm.loop prefix | |
5739 as it is not affected by the ``llvm.loop.disable_nonforced`` metadata. | |
5740 | |
5741 '``llvm.access.group``' Metadata | |
5742 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5743 | |
5744 ``llvm.access.group`` metadata can be attached to any instruction that | |
5745 potentially accesses memory. It can point to a single distinct metadata | |
5746 node, which we call an access group. This node represents all memory access | |
5747 instructions referring to it via ``llvm.access.group``. When an | |
5748 instruction belongs to multiple access groups, it can also point to a | |
5749 list of access groups, illustrated by the following example. | |
5750 | |
5751 .. code-block:: llvm | |
5752 | |
5753 %val = load i32, i32* %arrayidx, !llvm.access.group !0 | |
5754 ... | |
5755 !0 = !{!1, !2} | |
5756 !1 = distinct !{} | |
5757 !2 = distinct !{} | |
5758 | |
5759 It is illegal for the list node to be empty since it might be confused | |
5760 with an access group. | |
5761 | |
5762 The access group metadata node must be 'distinct' to avoid collapsing | |
5763 multiple access groups by content. An access group metadata node must | |
5764 always be empty, which can be used to distinguish an access group | |
5765 metadata node from a list of access groups. Being empty avoids the | |
5766 situation that the content must be updated which, because metadata is | |
5767 immutable by design, would require finding and updating all references | |
5768 to the access group node. | |
5769 | |
5770 The access group can be used to refer to a memory access instruction | |
5771 without pointing to it directly (which is not possible in global | |
5772 metadata). Currently, the only metadata making use of it is | |
5773 ``llvm.loop.parallel_accesses``. | |
5774 | |
5775 '``llvm.loop.parallel_accesses``' Metadata | |
5776 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
5777 | |
5778 The ``llvm.loop.parallel_accesses`` metadata refers to one or more | |
5779 access group metadata nodes (see ``llvm.access.group``). It denotes that | |
5780 no loop-carried memory dependence exists between the instructions in the | |
5781 loop that belong to these access groups. | |
5782 | |
5783 Let ``m1`` and ``m2`` be two instructions that both have the | |
5784 ``llvm.access.group`` metadata to the access group ``g1``, respectively | |
5785 ``g2`` (which might be identical). If a loop contains both access groups | |
5786 in its ``llvm.loop.parallel_accesses`` metadata, then the compiler can | |
5787 assume that there is no dependency between ``m1`` and ``m2`` carried by | |
5788 this loop. Instructions that belong to multiple access groups are | |
5789 considered having this property if at least one of the access groups | |
5790 matches the ``llvm.loop.parallel_accesses`` list. | |
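
The multiple-group rule can be sketched as follows. The value and block names are hypothetical. The load belongs to groups ``!3`` and ``!4``, but the loop lists only ``!3``, which is enough for the load to be covered.

.. code-block:: llvm

    for.body:
      ...
      %val = load i32, i32* %arrayidx, !llvm.access.group !2  ; member of !3 and !4
      ...
      store i32 %val, i32* %arrayidx1, !llvm.access.group !3  ; member of !3 only
      ...
      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

    for.end:
    ...
    !0 = distinct !{!0, !1}
    !1 = !{!"llvm.loop.parallel_accesses", !3}  ; only !3 is listed for this loop
    !2 = !{!3, !4}                              ; list of access groups
    !3 = distinct !{}
    !4 = distinct !{}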
5791 | |
5792 If all memory-accessing instructions in a loop belong to an access group | |
5793 listed in that loop's ``llvm.loop.parallel_accesses`` metadata, then the | |
5794 loop has no loop-carried memory dependences and is considered to be a | |
5795 parallel loop. | |
5796 | |
5797 Note that if not all memory access instructions belong to an access | |
5798 group referred to by ``llvm.loop.parallel_accesses``, then the loop must | |
5799 not be considered trivially parallel. Additional | |
5177 memory dependence analysis is required to make that determination. As a fail | 5800 memory dependence analysis is required to make that determination. As a fail |
5178 safe mechanism, this causes loops that were originally parallel to be considered | 5801 safe mechanism, this causes loops that were originally parallel to be considered |
5179 sequential (if optimization passes that are unaware of the parallel semantics | 5802 sequential (if optimization passes that are unaware of the parallel semantics |
5180 insert new memory instructions into the loop body). | 5803 insert new memory instructions into the loop body). |
5181 | 5804 |
5182 Example of a loop that is considered parallel due to its correct use of | 5805 Example of a loop that is considered parallel due to its correct use of |
5183 both ``llvm.loop`` and ``llvm.mem.parallel_loop_access`` | 5806 both ``llvm.access.group`` and ``llvm.loop.parallel_accesses`` |
5184 metadata types that refer to the same loop identifier metadata. | 5807 metadata types. |
5185 | 5808 |
5186 .. code-block:: llvm | 5809 .. code-block:: llvm |
5187 | 5810 |
5188 for.body: | 5811 for.body: |
5189 ... | 5812 ... |
5190 %val0 = load i32, i32* %arrayidx, !llvm.mem.parallel_loop_access !0 | 5813 %val0 = load i32, i32* %arrayidx, !llvm.access.group !1 |
5191 ... | 5814 ... |
5192 store i32 %val0, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0 | 5815 store i32 %val0, i32* %arrayidx1, !llvm.access.group !1 |
5193 ... | 5816 ... |
5194 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0 | 5817 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0 |
5195 | 5818 |
5196 for.end: | 5819 for.end: |
5197 ... | 5820 ... |
5198 !0 = !{!0} | 5821 !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}} |
5199 | 5822 !1 = distinct !{} |
5200 It is also possible to have nested parallel loops. In that case the | 5823 |
5201 memory accesses refer to a list of loop identifier metadata nodes instead of | 5824 It is also possible to have nested parallel loops: |
5202 the loop identifier metadata node directly: | |
5203 | 5825 |
5204 .. code-block:: llvm | 5826 .. code-block:: llvm |
5205 | 5827 |
5206 outer.for.body: | 5828 outer.for.body: |
5207 ... | 5829 ... |
5208 %val1 = load i32, i32* %arrayidx3, !llvm.mem.parallel_loop_access !2 | 5830 %val1 = load i32, i32* %arrayidx3, !llvm.access.group !4 |
5209 ... | 5831 ... |
5210 br label %inner.for.body | 5832 br label %inner.for.body |
5211 | 5833 |
5212 inner.for.body: | 5834 inner.for.body: |
5213 ... | 5835 ... |
5214 %val0 = load i32, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0 | 5836 %val0 = load i32, i32* %arrayidx1, !llvm.access.group !3 |
5215 ... | 5837 ... |
5216 store i32 %val0, i32* %arrayidx2, !llvm.mem.parallel_loop_access !0 | 5838 store i32 %val0, i32* %arrayidx2, !llvm.access.group !3 |
5217 ... | 5839 ... |
5218 br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1 | 5840 br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1 |
5219 | 5841 |
5220 inner.for.end: | 5842 inner.for.end: |
5221 ... | 5843 ... |
5222 store i32 %val1, i32* %arrayidx4, !llvm.mem.parallel_loop_access !2 | 5844 store i32 %val1, i32* %arrayidx4, !llvm.access.group !4 |
5223 ... | 5845 ... |
5224 br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2 | 5846 br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2 |
5225 | 5847 |
5226 outer.for.end: ; preds = %for.body | 5848 outer.for.end: ; preds = %for.body |
5227 ... | 5849 ... |
5228 !0 = !{!1, !2} ; a list of loop identifiers | 5850 !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}} ; metadata for the inner loop |
5229 !1 = !{!1} ; an identifier for the inner loop | 5851 !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop |
5230 !2 = !{!2} ; an identifier for the outer loop | 5852 !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well) |
5853 !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop | |
5231 | 5854 |
5232 '``irr_loop``' Metadata | 5855 '``irr_loop``' Metadata |
5233 ^^^^^^^^^^^^^^^^^^^^^^^ | 5856 ^^^^^^^^^^^^^^^^^^^^^^^ |
5234 | 5857 |
5235 ``irr_loop`` metadata may be attached to the terminator instruction of a basic | 5858 ``irr_loop`` metadata may be attached to the terminator instruction of a basic |
5250 ... | 5873 ... |
5251 !0 = !{"loop_header_weight", i64 100} | 5874 !0 = !{"loop_header_weight", i64 100} |
5252 | 5875 |
5253 Irreducible loop header weights are typically based on profile data. | 5876 Irreducible loop header weights are typically based on profile data. |
5254 | 5877 |
5878 .. _md_invariant.group: | |
5879 | |
5255 '``invariant.group``' Metadata | 5880 '``invariant.group``' Metadata |
5256 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 5881 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
5257 | 5882 |
5258 The ``invariant.group`` metadata may be attached to ``load``/``store`` instructions. | 5883 The experimental ``invariant.group`` metadata may be attached to |
5884 ``load``/``store`` instructions referencing a single metadata with no entries. | |
5259 The existence of the ``invariant.group`` metadata on the instruction tells | 5885 The existence of the ``invariant.group`` metadata on the instruction tells |
5260 the optimizer that every ``load`` and ``store`` to the same pointer operand | 5886 the optimizer that every ``load`` and ``store`` to the same pointer operand |
5261 within the same invariant group can be assumed to load or store the same | 5887 can be assumed to load or store the same |
5262 value (but see the ``llvm.invariant.group.barrier`` intrinsic which affects | 5888 value (but see the ``llvm.launder.invariant.group`` intrinsic which affects |
5263 when two pointers are considered the same). Pointers returned by bitcast or | 5889 when two pointers are considered the same). Pointers returned by bitcast or |
5264 getelementptr with only zero indices are considered the same. | 5890 getelementptr with only zero indices are considered the same. |
5265 | 5891 |
5266 Examples: | 5892 Examples: |
5267 | 5893 |
5273 store i8 42, i8* %ptr, !invariant.group !0 | 5899 store i8 42, i8* %ptr, !invariant.group !0 |
5274 call void @foo(i8* %ptr) | 5900 call void @foo(i8* %ptr) |
5275 | 5901 |
5276 %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change | 5902 %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change |
5277 call void @foo(i8* %ptr) | 5903 call void @foo(i8* %ptr) |
5278 %b = load i8, i8* %ptr, !invariant.group !1 ; Can't assume anything, because group changed | |
5279 | 5904 |
5280 %newPtr = call i8* @getPointer(i8* %ptr) | 5905 %newPtr = call i8* @getPointer(i8* %ptr) |
5281 %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr | 5906 %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr |
5282 | 5907 |
5283 %unknownValue = load i8, i8* @unknownPtr | 5908 %unknownValue = load i8, i8* @unknownPtr |
5284 store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42 | 5909 store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42 |
5285 | 5910 |
5286 call void @foo(i8* %ptr) | 5911 call void @foo(i8* %ptr) |
5287 %newPtr2 = call i8* @llvm.invariant.group.barrier(i8* %ptr) | 5912 %newPtr2 = call i8* @llvm.launder.invariant.group(i8* %ptr) |
5288 %d = load i8, i8* %newPtr2, !invariant.group !0 ; Can't step through invariant.group.barrier to get value of %ptr | 5913 %d = load i8, i8* %newPtr2, !invariant.group !0 ; Can't step through launder.invariant.group to get value of %ptr |
5289 | 5914 |
5290 ... | 5915 ... |
5291 declare void @foo(i8*) | 5916 declare void @foo(i8*) |
5292 declare i8* @getPointer(i8*) | 5917 declare i8* @getPointer(i8*) |
5293 declare i8* @llvm.invariant.group.barrier(i8*) | 5918 declare i8* @llvm.launder.invariant.group(i8*) |
5294 | 5919 |
5295 !0 = !{!"magic ptr"} | 5920 !0 = !{} |
5296 !1 = !{!"other ptr"} | |
5297 | 5921 |
5298 The invariant.group metadata must be dropped when replacing one pointer by | 5922 The invariant.group metadata must be dropped when replacing one pointer by |
5299 another based on aliasing information. This is because invariant.group is tied | 5923 another based on aliasing information. This is because invariant.group is tied |
5300 to the SSA value of the pointer operand. | 5924 to the SSA value of the pointer operand. |
5301 | 5925 |
5303 | 5927 |
5304 %v = load i8, i8* %x, !invariant.group !0 | 5928 %v = load i8, i8* %x, !invariant.group !0 |
5305 ; if %x mustalias %y then we can replace the above instruction with | 5929 ; if %x mustalias %y then we can replace the above instruction with |
5306 %v = load i8, i8* %y | 5930 %v = load i8, i8* %y |
5307 | 5931 |
5932 Note that this is an experimental feature, which means that its semantics might | |
5933 change in the future. | |
5308 | 5934 |
5309 '``type``' Metadata | 5935 '``type``' Metadata |
5310 ^^^^^^^^^^^^^^^^^^^ | 5936 ^^^^^^^^^^^^^^^^^^^ |
5311 | 5937 |
5312 See :doc:`TypeMetadata`. | 5938 See :doc:`TypeMetadata`. |
5611 !1 = !{i32 1, !"short_enum", i32 0} | 6237 !1 = !{i32 1, !"short_enum", i32 0} |
5612 | 6238 |
5613 Automatic Linker Flags Named Metadata | 6239 Automatic Linker Flags Named Metadata |
5614 ===================================== | 6240 ===================================== |
5615 | 6241 |
5616 Some targets support embedding flags to the linker inside individual object | 6242 Some targets support embedding of flags to the linker inside individual object |
5617 files. Typically this is used in conjunction with language extensions which | 6243 files. Typically this is used in conjunction with language extensions which |
5618 allow source files to explicitly declare the libraries they depend on, and have | 6244 allow source files to contain linker command line options, and have these |
5619 these automatically be transmitted to the linker via object files. | 6245 automatically be transmitted to the linker via object files. |
5620 | 6246 |
5621 These flags are encoded in the IR using named metadata with the name | 6247 These flags are encoded in the IR using named metadata with the name |
5622 ``!llvm.linker.options``. Each operand is expected to be a metadata node | 6248 ``!llvm.linker.options``. Each operand is expected to be a metadata node |
5623 which should be a list of other metadata nodes, each of which should be a | 6249 which should be a list of other metadata nodes, each of which should be a |
5624 list of metadata strings defining linker options. | 6250 list of metadata strings defining linker options. |
5625 | 6251 |
5626 For example, the following metadata section specifies two separate sets of | 6252 For example, the following metadata section specifies two separate sets of |
5627 linker options, presumably to link against ``libz`` and the ``Cocoa`` | 6253 linker options, presumably to link against ``libz`` and the ``Cocoa`` |
5628 framework:: | 6254 framework:: |
5629 | 6255 |
5630 !0 = !{ !"-lz" }, | 6256 !0 = !{ !"-lz" } |
5631 !1 = !{ !"-framework", !"Cocoa" } } } | 6257 !1 = !{ !"-framework", !"Cocoa" } |
5632 !llvm.linker.options = !{ !0, !1 } | 6258 !llvm.linker.options = !{ !0, !1 } |
5633 | 6259 |
5634 The metadata encoding as lists of lists of options, as opposed to a collapsed | 6260 The metadata encoding as lists of lists of options, as opposed to a collapsed |
5635 list of options, is chosen so that the IR encoding can use multiple option | 6261 list of options, is chosen so that the IR encoding can use multiple option |
5636 strings to specify e.g., a single library, while still having that specifier be | 6262 strings to specify e.g., a single library, while still having that specifier be |
5638 assembly writer or object file emitter. | 6264 assembly writer or object file emitter. |
5639 | 6265 |
5640 Each individual option is required to be either a valid option for the target's | 6266 Each individual option is required to be either a valid option for the target's |
5641 linker, or an option that is reserved by the target specific assembly writer or | 6267 linker, or an option that is reserved by the target specific assembly writer or |
5642 object file emitter. No other aspect of these options is defined by the IR. | 6268 object file emitter. No other aspect of these options is defined by the IR. |
6269 | |
6270 Dependent Libs Named Metadata | |
6271 ============================= | |
6272 | |
6273 Some targets support embedding of strings into object files to indicate | |
6274 a set of libraries to add to the link. Typically this is used in conjunction | |
6275 with language extensions which allow source files to explicitly declare the | |
6276 libraries they depend on, and have these automatically be transmitted to the | |
6277 linker via object files. | |
6278 | |
6279 The list is encoded in the IR using named metadata with the name | |
6280 ``!llvm.dependent-libraries``. Each operand is expected to be a metadata node | |
6281 which should contain a single string operand. | |
6282 | |
6283 For example, the following metadata section contains two library specifiers:: | |
6284 | |
6285 !0 = !{!"a library specifier"} | |
6286 !1 = !{!"another library specifier"} | |
6287 !llvm.dependent-libraries = !{ !0, !1 } | |
6288 | |
6289 Each library specifier will be handled independently by the consuming linker. | |
6290 The effect of the library specifiers is defined by the consuming linker. | |
6291 | |
6292 .. _summary: | |
6293 | |
6294 ThinLTO Summary | |
6295 =============== | |
6296 | |
6297 Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_ | |
6298 causes the building of a compact summary of the module that is emitted into | |
6299 the bitcode. The summary is emitted into the LLVM assembly and identified | |
6300 in syntax by a caret ('``^``'). | |
6301 | |
6302 The summary is parsed into a bitcode output, along with the Module | |
6303 IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes | |
6304 of optimization (e.g. "``clang -x ir``" and "``opt``"), will ignore the | |
6305 summary entries (just as they currently ignore summary entries in a bitcode | |
6306 input file). | |
6307 | |
6308 Eventually, the summary will be parsed into a ModuleSummaryIndex object under | |
6309 the same conditions where summary index is currently built from bitcode. | |
6310 Specifically, this applies to tools that test the Thin Link portion of a | |
6311 ThinLTO compile (i.e. llvm-lto and llvm-lto2), and to parsing a combined index | |
6312 for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag | |
6313 (this part is not yet implemented; use llvm-as to create a bitcode object | |
6314 before feeding it into thin link tools for now). | |
6315 | |
6316 There are currently 3 types of summary entries in the LLVM assembly: | |
6317 :ref:`module paths<module_path_summary>`, | |
6318 :ref:`global values<gv_summary>`, and | |
6319 :ref:`type identifiers<typeid_summary>`. | |
6320 | |
6321 .. _module_path_summary: | |
6322 | |
6323 Module Path Summary Entry | |
6324 ------------------------- | |
6325 | |
6326 Each module path summary entry lists a module containing global values included | |
6327 in the summary. For a single IR module there will be one such entry, but | |
6328 in a combined summary index produced during the thin link, there will be | |
6329 one module path entry per linked module with summary. | |
6330 | |
6331 Example: | |
6332 | |
6333 .. code-block:: text | |
6334 | |
6335 ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418)) | |
6336 | |
6337 The ``path`` field is a string path to the bitcode file, and the ``hash`` | |
6338 field is the 160-bit SHA-1 hash of the IR bitcode contents, used for | |
6339 incremental builds and caching. | |
6340 | |
6341 .. _gv_summary: | |
6342 | |
6343 Global Value Summary Entry | |
6344 -------------------------- | |
6345 | |
6346 Each global value summary entry corresponds to a global value defined or | |
6347 referenced by a summarized module. | |
6348 | |
6349 Example: | |
6350 | |
6351 .. code-block:: text | |
6352 | |
6353 ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831 | |
6354 | |
6355 For declarations, there will not be a summary list. For definitions, a | |
6356 global value will contain a list of summaries, one per module containing | |
6357 a definition. There can be multiple entries in a combined summary index | |
6358 for symbols with weak linkage. | |
6359 | |
6360 Each ``Summary`` format will depend on whether the global value is a | |
6361 :ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or | |
6362 :ref:`alias<alias_summary>`. | |
6363 | |
6364 .. _function_summary: | |
6365 | |
6366 Function Summary | |
6367 ^^^^^^^^^^^^^^^^ | |
6368 | |
6369 If the global value is a function, the ``Summary`` entry will look like: | |
6370 | |
6371 .. code-block:: text | |
6372 | |
6373 function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Refs]?) | |
6374 | |
6375 The ``module`` field includes the summary entry id for the module containing | |
6376 this definition, and the ``flags`` field contains information such as | |
6377 the linkage type, a flag indicating whether it is legal to import the | |
6378 definition, whether it is globally live and whether the linker resolved it | |
6379 to a local definition (the latter two are populated during the thin link). | |
6380 The ``insts`` field contains the number of IR instructions in the function. | |
6381 Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`, | |
6382 :ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`, | |
6383 :ref:`Refs<refs_summary>`. | |
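
For illustration only, a small summary with one module, one declaration, and one function definition that calls it could be written as below. The path, the all-zero hash, and the names ``ext`` and ``f`` are placeholders, and the trailing ``; guid = ...`` comments are omitted.

.. code-block:: text

    ^0 = module: (path: "a.o", hash: (0, 0, 0, 0, 0))
    ^1 = gv: (name: "ext")
    ^2 = gv: (name: "f", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2, calls: ((callee: ^1)))))

Here ``^1`` has no summary list because it is only declared, while ``^2`` carries a single function summary that refers back to module ``^0``.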
6384 | |
6385 .. _variable_summary: | |
6386 | |
6387 Global Variable Summary | |
6388 ^^^^^^^^^^^^^^^^^^^^^^^ | |
6389 | |
6390 If the global value is a variable, the ``Summary`` entry will look like: | |
6391 | |
6392 .. code-block:: text | |
6393 | |
6394 variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?) | |
6395 | |
6396 The variable entry contains a subset of the fields in a | |
6397 :ref:`function summary <function_summary>`, see the descriptions there. | |
6398 | |
6399 .. _alias_summary: | |
6400 | |
6401 Alias Summary | |
6402 ^^^^^^^^^^^^^ | |
6403 | |
6404 If the global value is an alias, the ``Summary`` entry will look like: | |
6405 | |
6406 .. code-block:: text | |
6407 | |
6408 alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2) | |
6409 | |
6410 The ``module`` and ``flags`` fields are as described for a | |
6411 :ref:`function summary <function_summary>`. The ``aliasee`` field | |
6412 contains a reference to the global value summary entry of the aliasee. | |
6413 | |
6414 .. _funcflags_summary: | |
6415 | |
6416 Function Flags | |
6417 ^^^^^^^^^^^^^^ | |
6418 | |
6419 The optional ``FuncFlags`` field looks like: | |
6420 | |
6421 .. code-block:: text | |
6422 | |
6423 funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0) | |
6424 | |
6425 If unspecified, flags are assumed to hold the conservative ``false`` value of | |
6426 ``0``. | |
6427 | |
6428 .. _calls_summary: | |
6429 | |
6430 Calls | |
6431 ^^^^^ | |
6432 | |
6433 The optional ``Calls`` field looks like: | |
6434 | |
6435 .. code-block:: text | |
6436 | |
6437 calls: ((Callee)[, (Callee)]*) | |
6438 | |
6439 where each ``Callee`` looks like: | |
6440 | |
6441 .. code-block:: text | |
6442 | |
6443 callee: ^1[, hotness: None]?[, relbf: 0]? | |
6444 | |
6445 The ``callee`` refers to the summary entry id of the callee. At most one | |
6446 of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``, | |
6447 ``Hot``, and ``Critical``), and ``relbf`` (which holds the integer | |
6448 branch frequency relative to the entry frequency, scaled down by 2^8) | |
6449 may be specified. The defaults are ``Unknown`` and ``0``, respectively. | |
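
As a sketch, a ``Calls`` field with two callees, only the first of which carries a relative block frequency, might look like this. Reading the 2^8 scaling as a fixed-point encoding with 8 fractional bits is an assumption here. Under it, ``relbf: 256`` would mean the call executes about once per function entry.

.. code-block:: text

    calls: ((callee: ^1, relbf: 256), (callee: ^2))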
6450 | |
6451 .. _refs_summary: | |
6452 | |
6453 Refs | |
6454 ^^^^ | |
6455 | |
6456 The optional ``Refs`` field looks like: | |
6457 | |
6458 .. code-block:: text | |
6459 | |
6460 refs: ((Ref)[, (Ref)]*) | |
6461 | |
6462 where each ``Ref`` contains a reference to the summary id of the referenced | |
6463 value (e.g. ``^1``). | |
6464 | |
6465 .. _typeidinfo_summary: | |
6466 | |
6467 TypeIdInfo | |
6468 ^^^^^^^^^^ | |
6469 | |
6470 The optional ``TypeIdInfo`` field, used for | |
6471 `Control Flow Integrity <http://clang.llvm.org/docs/ControlFlowIntegrity.html>`_, | |
6472 looks like: | |
6473 | |
6474 .. code-block:: text | |
6475 | |
6476 typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]? | |
6477 | |
6478 These optional fields have the following forms: | |
6479 | |
6480 TypeTests | |
6481 """"""""" | |
6482 | |
6483 .. code-block:: text | |
6484 | |
6485 typeTests: (TypeIdRef[, TypeIdRef]*) | |
6486 | |
6487 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>` | |
6488 by summary id or ``GUID``. | |
6489 | |
6490 TypeTestAssumeVCalls | |
6491 """""""""""""""""""" | |
6492 | |
6493 .. code-block:: text | |
6494 | |
6495 typeTestAssumeVCalls: (VFuncId[, VFuncId]*) | |
6496 | |
6497 Where each VFuncId has the format: | |
6498 | |
6499 .. code-block:: text | |
6500 | |
6501 vFuncId: (TypeIdRef, offset: 16) | |
6502 | |
6503 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>` | |
6504 by summary id or ``GUID`` preceded by a ``guid:`` tag. | |
6505 | |
6506 TypeCheckedLoadVCalls | |
6507 """"""""""""""""""""" | |
6508 | |
6509 .. code-block:: text | |
6510 | |
6511 typeCheckedLoadVCalls: (VFuncId[, VFuncId]*) | |
6512 | |
6513 Where each VFuncId has the format described for ``TypeTestAssumeVCalls``. | |
6514 | |
6515 TypeTestAssumeConstVCalls | |
6516 """"""""""""""""""""""""" | |
6517 | |
6518 .. code-block:: text | |
6519 | |
6520 typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*) | |
6521 | |
6522 Where each ConstVCall has the format: | |
6523 | |
6524 .. code-block:: text | |
6525 | |
6526 (VFuncId, args: (Arg[, Arg]*)) | |
6527 | |
6528 and where each VFuncId has the format described for ``TypeTestAssumeVCalls``, | |
6529 and each Arg is an integer argument number. | |
6530 | |
6531 TypeCheckedLoadConstVCalls | |
6532 """""""""""""""""""""""""" | |
6533 | |
6534 .. code-block:: text | |
6535 | |
6536 typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*) | |
6537 | |
6538 Where each ConstVCall has the format described for | |
6539 ``TypeTestAssumeConstVCalls``. | |
6540 | |
6541 .. _typeid_summary: | |
6542 | |
6543 Type ID Summary Entry | |
6544 --------------------- | |
6545 | |
6546 Each type id summary entry corresponds to a type identifier resolution | |
6547 which is generated during the LTO link portion of the compile when building | |
6548 with `Control Flow Integrity <http://clang.llvm.org/docs/ControlFlowIntegrity.html>`_, | |
6549 so these are only present in a combined summary index. | |
6550 | |
6551 Example: | |
6552 | |
6553 .. code-block:: text | |
6554 | |
6555 ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778 | |
6556 | |
6557 The ``typeTestRes`` gives the type test resolution ``kind`` (which may | |
6558 be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and | |
6559 the ``size-1`` bit width. It is followed by optional flags, which default to 0, | |
6560 and an optional WpdResolutions (whole program devirtualization resolution) | |
6561 field that looks like: | |
6562 | |
6563 .. code-block:: text | |
6564 | |
6565 wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*) | |
6566 | |
6567 where each entry is a mapping from the given byte offset to the whole-program | |
6568 devirtualization resolution WpdRes, that has one of the following formats: | |
6569 | |
6570 .. code-block:: text | |
6571 | |
6572 wpdRes: (kind: branchFunnel) | |
6573 wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi") | |
6574 wpdRes: (kind: indir) | |
6575 | |
6576 Additionally, each wpdRes has an optional ``resByArg`` field, which | |
6577 describes the resolutions for calls with all constant integer arguments: | |
6578 | |
6579 .. code-block:: text | |
6580 | |
6581 resByArg: (ResByArg[, ResByArg]*) | |
6582 | |
6583 where ResByArg is: | |
6584 | |
6585 .. code-block:: text | |
6586 | |
6587 args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0]) | |
6588 | |
6589 Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal`` | |
6590 or ``VirtualConstProp``. The ``info`` field is only used if the kind | |
6591 is ``UniformRetVal`` (indicates the uniform return value), or | |
6592 ``UniqueRetVal`` (holds the return value associated with the unique vtable | |
6593 (0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does | |
6594 not support the use of absolute symbols to store constants. | |
5643 | 6595 |
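As a rough illustration of how these pieces nest, the sketch below combines the
``typeTestRes`` and ``wpdResolutions`` fields into a single entry. It is
assembled by hand from the grammar fragments above, reusing the type name and
function name from the earlier examples, and is not taken from real compiler
output; the optional fields are simply left out.

.. code-block:: text

    ; Hand-assembled sketch: one wpdResolutions entry at byte offset 0,
    ; resolved to a single implementation. Optional flags are omitted.
    ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7), wpdResolutions: ((offset: 0, wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")))))
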
5644 .. _intrinsicglobalvariables: | 6596 .. _intrinsicglobalvariables: |
5645 | 6597 |
5646 Intrinsic Global Variables | 6598 Intrinsic Global Variables |
5647 ========================== | 6599 ========================== |
5708 | 6660 |
5709 %0 = type { i32, void ()*, i8* } | 6661 %0 = type { i32, void ()*, i8* } |
5710 @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }] | 6662 @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }] |
5711 | 6663 |
5712 The ``@llvm.global_ctors`` array contains a list of constructor | 6664 The ``@llvm.global_ctors`` array contains a list of constructor |
5713 functions, priorities, and an optional associated global or function. | 6665 functions, priorities, and an associated global or function. |
5714 The functions referenced by this array will be called in ascending order | 6666 The functions referenced by this array will be called in ascending order |
5715 of priority (i.e. lowest first) when the module is loaded. The order of | 6667 of priority (i.e. lowest first) when the module is loaded. The order of |
5716 functions with the same priority is not defined. | 6668 functions with the same priority is not defined. |
5717 | 6669 |
5718 If the third field is present, non-null, and points to a global variable | 6670 If the third field is non-null, and points to a global variable |
5719 or function, the initializer function will only run if the associated | 6671 or function, the initializer function will only run if the associated |
5720 data from the current module is not discarded. | 6672 data from the current module is not discarded. |
5721 | 6673 |
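The priority ordering can be sketched with two entries. This is a minimal
example under stated assumptions: ``@init_early`` and ``@init_late`` are
hypothetical functions introduced only for illustration, and the third field is
null, so neither constructor is tied to associated data.

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }

    ; Hypothetical constructors, named here only for illustration.
    define void @init_early() {
      ret void
    }

    define void @init_late() {
      ret void
    }

    ; Priority 101 is lower than 65535, so @init_early is called first.
    @llvm.global_ctors = appending global [2 x %0] [
      %0 { i32 101, void ()* @init_early, i8* null },
      %0 { i32 65535, void ()* @init_late, i8* null }
    ]

Had both entries used the same priority, their relative order would not be
defined.
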
5722 .. _llvmglobaldtors: | 6674 .. _llvmglobaldtors: |
5723 | 6675 |
5728 | 6680 |
5729 %0 = type { i32, void ()*, i8* } | 6681 %0 = type { i32, void ()*, i8* } |
5730 @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }] | 6682 @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }] |
5731 | 6683 |
5732 The ``@llvm.global_dtors`` array contains a list of destructor | 6684 The ``@llvm.global_dtors`` array contains a list of destructor |
5733 functions, priorities, and an optional associated global or function. | 6685 functions, priorities, and an associated global or function. |
5734 The functions referenced by this array will be called in descending | 6686 The functions referenced by this array will be called in descending |
5735 order of priority (i.e. highest first) when the module is unloaded. The | 6687 order of priority (i.e. highest first) when the module is unloaded. The |
5736 order of functions with the same priority is not defined. | 6688 order of functions with the same priority is not defined. |
5737 | 6689 |
5738 If the third field is present, non-null, and points to a global variable | 6690 If the third field is non-null, and points to a global variable |
5739 or function, the destructor function will only run if the associated | 6691 or function, the destructor function will only run if the associated |
5740 data from the current module is not discarded. | 6692 data from the current module is not discarded. |
5741 | 6693 |
5742 Instruction Reference | 6694 Instruction Reference |
5743 ===================== | 6695 ===================== |
5761 ':ref:`invoke <i_invoke>`' instruction). | 6713 ':ref:`invoke <i_invoke>`' instruction). |
5762 | 6714 |
5763 The terminator instructions are: ':ref:`ret <i_ret>`', | 6715 The terminator instructions are: ':ref:`ret <i_ret>`', |
5764 ':ref:`br <i_br>`', ':ref:`switch <i_switch>`', | 6716 ':ref:`br <i_br>`', ':ref:`switch <i_switch>`', |
5765 ':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`', | 6717 ':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`', |
6718 ':ref:`callbr <i_callbr>`', | |
5766 ':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`', | 6719 ':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`', |
5767 ':ref:`catchret <i_catchret>`', | 6720 ':ref:`catchret <i_catchret>`', |
5768 ':ref:`cleanupret <i_cleanupret>`', | 6721 ':ref:`cleanupret <i_cleanupret>`', |
5769 and ':ref:`unreachable <i_unreachable>`'. | 6722 and ':ref:`unreachable <i_unreachable>`'. |
5770 | 6723 |
5796 | 6749 |
5797 The '``ret``' instruction optionally accepts a single argument, the | 6750 The '``ret``' instruction optionally accepts a single argument, the |
5798 return value. The type of the return value must be a ':ref:`first | 6751 return value. The type of the return value must be a ':ref:`first |
5799 class <t_firstclass>`' type. | 6752 class <t_firstclass>`' type. |
5800 | 6753 |
5801 A function is not :ref:`well formed <wellformed>` if it it has a non-void | 6754 A function is not :ref:`well formed <wellformed>` if it has a non-void |
5802 return type and contains a '``ret``' instruction with no return value or | 6755 return type and contains a '``ret``' instruction with no return value or |
5803 a return value with a type that does not match its type, or if it has a | 6756 a return value with a type that does not match its type, or if it has a |
5804 void return type and contains a '``ret``' instruction with a return | 6757 void return type and contains a '``ret``' instruction with a return |
5805 value. | 6758 value. |
5806 | 6759 |
5995 Syntax: | 6948 Syntax: |
5996 """"""" | 6949 """"""" |
5997 | 6950 |
5998 :: | 6951 :: |
5999 | 6952 |
6000 <result> = invoke [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] | 6953 <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] |
6001 [operand bundles] to label <normal label> unwind label <exception label> | 6954 [operand bundles] to label <normal label> unwind label <exception label> |
6002 | 6955 |
6003 Overview: | 6956 Overview: |
6004 """"""""" | 6957 """"""""" |
6005 | 6958 |
6031 convention <callingconv>` the call should use. If none is | 6984 convention <callingconv>` the call should use. If none is |
6032 specified, the call defaults to using C calling conventions. | 6985 specified, the call defaults to using C calling conventions. |
6033 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return | 6986 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return |
6034 values. Only '``zeroext``', '``signext``', and '``inreg``' attributes | 6987 values. Only '``zeroext``', '``signext``', and '``inreg``' attributes |
6035 are valid here. | 6988 are valid here. |
6989 #. The optional addrspace attribute can be used to indicate the address space | |
6990 of the called function. If it is not specified, the program address space | |
6991 from the :ref:`datalayout string<langref_datalayout>` will be used. | |
6036 #. '``ty``': the type of the call instruction itself which is also the | 6992 #. '``ty``': the type of the call instruction itself which is also the |
6037 type of the return value. Functions that return no value are marked | 6993 type of the return value. Functions that return no value are marked |
6038 ``void``. | 6994 ``void``. |
6039 #. '``fnty``': shall be the signature of the function being invoked. The | 6995 #. '``fnty``': shall be the signature of the function being invoked. The |
6040 argument types must match the types implied by this signature. This | 6996 argument types must match the types implied by this signature. This |
6082 %retval = invoke i32 @Test(i32 15) to label %Continue | 7038 %retval = invoke i32 @Test(i32 15) to label %Continue |
6083 unwind label %TestCleanup ; i32:retval set | 7039 unwind label %TestCleanup ; i32:retval set |
6084 %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue | 7040 %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue |
6085 unwind label %TestCleanup ; i32:retval set | 7041 unwind label %TestCleanup ; i32:retval set |
6086 | 7042 |
7043 .. _i_callbr: | |
7044 | |
7045 '``callbr``' Instruction | |
7046 ^^^^^^^^^^^^^^^^^^^^^^^^ | |
7047 | |
7048 Syntax: | |
7049 """"""" | |
7050 | |
7051 :: | |
7052 | |
7053 <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] | |
7054 [operand bundles] to label <normal label> or jump [other labels] | |
7055 | |
7056 Overview: | |
7057 """"""""" | |
7058 | |
7059 The '``callbr``' instruction causes control to transfer to a specified | |
7060 function, with the possibility of control flow transfer to either the | |
7061 '``normal``' label or one of the '``other``' labels. | |
7062 | |
7063 This instruction should only be used to implement the "goto" feature of gcc | |
7064 style inline assembly. Any other usage is an error in the IR verifier. | |
7065 | |
7066 Arguments: | |
7067 """""""""" | |
7068 | |
7069 This instruction requires several arguments: | |
7070 | |
7071 #. The optional "cconv" marker indicates which :ref:`calling | |
7072 convention <callingconv>` the call should use. If none is | |
7073 specified, the call defaults to using C calling conventions. | |
7074 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return | |
7075 values. Only '``zeroext``', '``signext``', and '``inreg``' attributes | |
7076 are valid here. | |
7077 #. The optional addrspace attribute can be used to indicate the address space | |
7078 of the called function. If it is not specified, the program address space | |
7079 from the :ref:`datalayout string<langref_datalayout>` will be used. | |
7080 #. '``ty``': the type of the call instruction itself which is also the | |
7081 type of the return value. Functions that return no value are marked | |
7082 ``void``. | |
7083 #. '``fnty``': shall be the signature of the function being called. The | |
7084 argument types must match the types implied by this signature. This | |
7085 type can be omitted if the function is not varargs. | |
7086 #. '``fnptrval``': An LLVM value containing a pointer to a function to | |
7087 be called. In most cases, this is a direct function call, but | |
7088 indirect ``callbr``'s are just as possible, calling an arbitrary pointer | |
7089 to function value. | |
7090 #. '``function args``': argument list whose types match the function | |
7091 signature argument types and parameter attributes. All arguments must | |
7092 be of :ref:`first class <t_firstclass>` type. If the function signature | |
7093 indicates the function accepts a variable number of arguments, the | |
7094 extra arguments can be specified. | |
7095 #. '``normal label``': the label reached when the called function | |
7096 executes a '``ret``' instruction. | |
7097 #. '``other labels``': the labels reached when a callee transfers control | |
7098 to a location other than the '``normal label``'. | |
7099 #. The optional :ref:`function attributes <fnattrs>` list. | |
7100 #. The optional :ref:`operand bundles <opbundles>` list. | |
7101 | |
7102 Semantics: | |
7103 """""""""" | |
7104 | |
7105 This instruction is designed to operate as a standard '``call``' | |
7106 instruction in most regards. The primary difference is that it | |
7107 establishes an association with additional labels to define where control | |
7108 flow goes after the call. | |
7109 | |
7110 The only use of this today is to implement the "goto" feature of gcc inline | |
7111 assembly where additional labels can be provided as locations for the inline | |
7112 assembly to jump to. | |
7113 | |
7114 Example: | |
7115 """""""" | |
7116 | |
7117 .. code-block:: text | |
7118 | |
7119 callbr void asm "", "r,x"(i32 %x, i8 *blockaddress(@foo, %fail)) | |
7120 to label %normal or jump [label %fail] | |
7121 | |
6087 .. _i_resume: | 7122 .. _i_resume: |
6088 | 7123 |
6089 '``resume``' Instruction | 7124 '``resume``' Instruction |
6090 ^^^^^^^^^^^^^^^^^^^^^^^^ | 7125 ^^^^^^^^^^^^^^^^^^^^^^^^ |
6091 | 7126 |
6301 Semantics: | 7336 Semantics: |
6302 """""""""" | 7337 """""""""" |
6303 | 7338 |
6304 The '``unreachable``' instruction has no defined semantics. | 7339 The '``unreachable``' instruction has no defined semantics. |
6305 | 7340 |
7341 .. _unaryops: | |
7342 | |
7343 Unary Operations | |
7344 ----------------- | |
7345 | |
7346 Unary operators require a single operand, execute an operation on | |
7347 it, and produce a single value. The operand might represent multiple | |
7348 data, as is the case with the :ref:`vector <t_vector>` data type. The | |
7349 result value has the same type as its operand. | |
7350 | |
7351 .. _i_fneg: | |
7352 | |
7353 '``fneg``' Instruction | |
7354 ^^^^^^^^^^^^^^^^^^^^^^ | |
7355 | |
7356 Syntax: | |
7357 """"""" | |
7358 | |
7359 :: | |
7360 | |
7361 <result> = fneg [fast-math flags]* <ty> <op1> ; yields ty:result | |
7362 | |
7363 Overview: | |
7364 """"""""" | |
7365 | |
7366 The '``fneg``' instruction returns the negation of its operand. | |
7367 | |
7368 Arguments: | |
7369 """""""""" | |
7370 | |
7371 The argument to the '``fneg``' instruction must be a | |
7372 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of | |
7373 floating-point values. | |
7374 | |
7375 Semantics: | |
7376 """""""""" | |
7377 | |
7378 The value produced is a copy of the operand with its sign bit flipped. | |
7379 This instruction can also take any number of :ref:`fast-math | |
7380 flags <fastmath>`, which are optimization hints to enable otherwise | |
7381 unsafe floating-point optimizations: | |
7382 | |
7383 Example: | |
7384 """""""" | |
7385 | |
7386 .. code-block:: text | |
7387 | |
7388 <result> = fneg float %val ; yields float:result = -%val | |
7389 | |
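For context, a minimal sketch of ``fneg`` inside complete functions, covering
both a scalar and a vector operand. The function names ``@negate`` and
``@negate_vec`` are hypothetical, and the ``nnan`` flag on the vector form is
shown only to illustrate where fast-math flags are written.

.. code-block:: llvm

    ; Scalar form: the result has the same type as the operand.
    define float @negate(float %val) {
      %neg = fneg float %val
      ret float %neg
    }

    ; Vector form with a fast-math flag: every element is negated.
    define <4 x float> @negate_vec(<4 x float> %v) {
      %neg = fneg nnan <4 x float> %v
      ret <4 x float> %neg
    }
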
6306 .. _binaryops: | 7390 .. _binaryops: |
6307 | 7391 |
6308 Binary Operations | 7392 Binary Operations |
6309 ----------------- | 7393 ----------------- |
6310 | 7394 |
6385 The '``fadd``' instruction returns the sum of its two operands. | 7469 The '``fadd``' instruction returns the sum of its two operands. |
6386 | 7470 |
6387 Arguments: | 7471 Arguments: |
6388 """""""""" | 7472 """""""""" |
6389 | 7473 |
6390 The two arguments to the '``fadd``' instruction must be :ref:`floating | 7474 The two arguments to the '``fadd``' instruction must be |
6391 point <t_floating>` or :ref:`vector <t_vector>` of floating point values. | 7475 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of |
6392 Both arguments must have identical types. | 7476 floating-point values. Both arguments must have identical types. |
6393 | 7477 |
6394 Semantics: | 7478 Semantics: |
6395 """""""""" | 7479 """""""""" |
6396 | 7480 |
6397 The value produced is the floating point sum of the two operands. This | 7481 The value produced is the floating-point sum of the two operands. |
6398 instruction can also take any number of :ref:`fast-math flags <fastmath>`, | 7482 This instruction is assumed to execute in the default :ref:`floating-point |
6399 which are optimization hints to enable otherwise unsafe floating point | 7483 environment <floatenv>`. |
6400 optimizations: | 7484 This instruction can also take any number of :ref:`fast-math |
7485 flags <fastmath>`, which are optimization hints to enable otherwise | |
7486 unsafe floating-point optimizations: | |
6401 | 7487 |
6402 Example: | 7488 Example: |
6403 """""""" | 7489 """""""" |
6404 | 7490 |
6405 .. code-block:: text | 7491 .. code-block:: text |
6474 Overview: | 7560 Overview: |
6475 """"""""" | 7561 """"""""" |
6476 | 7562 |
6477 The '``fsub``' instruction returns the difference of its two operands. | 7563 The '``fsub``' instruction returns the difference of its two operands. |
6478 | 7564 |
6479 Note that the '``fsub``' instruction is used to represent the '``fneg``' | 7565 Arguments: |
6480 instruction present in most other intermediate representations. | 7566 """""""""" |
6481 | 7567 |
6482 Arguments: | 7568 The two arguments to the '``fsub``' instruction must be |
6483 """""""""" | 7569 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of |
6484 | 7570 floating-point values. Both arguments must have identical types. |
6485 The two arguments to the '``fsub``' instruction must be :ref:`floating | 7571 |
6486 point <t_floating>` or :ref:`vector <t_vector>` of floating point values. | 7572 Semantics: |
6487 Both arguments must have identical types. | 7573 """""""""" |
6488 | 7574 |
6489 Semantics: | 7575 The value produced is the floating-point difference of the two operands. |
6490 """""""""" | 7576 This instruction is assumed to execute in the default :ref:`floating-point |
6491 | 7577 environment <floatenv>`. |
6492 The value produced is the floating point difference of the two operands. | |
6493 This instruction can also take any number of :ref:`fast-math | 7578 This instruction can also take any number of :ref:`fast-math |
6494 flags <fastmath>`, which are optimization hints to enable otherwise | 7579 flags <fastmath>`, which are optimization hints to enable otherwise |
6495 unsafe floating point optimizations: | 7580 unsafe floating-point optimizations: |
6496 | 7581 |
6497 Example: | 7582 Example: |
6498 """""""" | 7583 """""""" |
6499 | 7584 |
6500 .. code-block:: text | 7585 .. code-block:: text |
6573 The '``fmul``' instruction returns the product of its two operands. | 7658 The '``fmul``' instruction returns the product of its two operands. |
6574 | 7659 |
6575 Arguments: | 7660 Arguments: |
6576 """""""""" | 7661 """""""""" |
6577 | 7662 |
6578 The two arguments to the '``fmul``' instruction must be :ref:`floating | 7663 The two arguments to the '``fmul``' instruction must be |
6579 point <t_floating>` or :ref:`vector <t_vector>` of floating point values. | 7664 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of |
6580 Both arguments must have identical types. | 7665 floating-point values. Both arguments must have identical types. |
6581 | 7666 |
6582 Semantics: | 7667 Semantics: |
6583 """""""""" | 7668 """""""""" |
6584 | 7669 |
6585 The value produced is the floating point product of the two operands. | 7670 The value produced is the floating-point product of the two operands. |
7671 This instruction is assumed to execute in the default :ref:`floating-point | |
7672 environment <floatenv>`. | |
6586 This instruction can also take any number of :ref:`fast-math | 7673 This instruction can also take any number of :ref:`fast-math |
6587 flags <fastmath>`, which are optimization hints to enable otherwise | 7674 flags <fastmath>`, which are optimization hints to enable otherwise |
6588 unsafe floating point optimizations: | 7675 unsafe floating-point optimizations: |
6589 | 7676 |
6590 Example: | 7677 Example: |
6591 """""""" | 7678 """""""" |
6592 | 7679 |
6593 .. code-block:: text | 7680 .. code-block:: text |
6705 The '``fdiv``' instruction returns the quotient of its two operands. | 7792 The '``fdiv``' instruction returns the quotient of its two operands. |
6706 | 7793 |
6707 Arguments: | 7794 Arguments: |
6708 """""""""" | 7795 """""""""" |
6709 | 7796 |
6710 The two arguments to the '``fdiv``' instruction must be :ref:`floating | 7797 The two arguments to the '``fdiv``' instruction must be |
6711 point <t_floating>` or :ref:`vector <t_vector>` of floating point values. | 7798 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of |
6712 Both arguments must have identical types. | 7799 floating-point values. Both arguments must have identical types. |
6713 | 7800 |
6714 Semantics: | 7801 Semantics: |
6715 """""""""" | 7802 """""""""" |
6716 | 7803 |
6717 The value produced is the floating point quotient of the two operands. | 7804 The value produced is the floating-point quotient of the two operands. |
7805 This instruction is assumed to execute in the default :ref:`floating-point | |
7806 environment <floatenv>`. | |
6718 This instruction can also take any number of :ref:`fast-math | 7807 This instruction can also take any number of :ref:`fast-math |
6719 flags <fastmath>`, which are optimization hints to enable otherwise | 7808 flags <fastmath>`, which are optimization hints to enable otherwise |
6720 unsafe floating point optimizations: | 7809 unsafe floating-point optimizations: |
6721 | 7810 |
6722 Example: | 7811 Example: |
6723 """""""" | 7812 """""""" |
6724 | 7813 |
6725 .. code-block:: text | 7814 .. code-block:: text |
6846 its two operands. | 7935 its two operands. |
6847 | 7936 |
6848 Arguments: | 7937 Arguments: |
6849 """""""""" | 7938 """""""""" |
6850 | 7939 |
6851 The two arguments to the '``frem``' instruction must be :ref:`floating | 7940 The two arguments to the '``frem``' instruction must be |
6852 point <t_floating>` or :ref:`vector <t_vector>` of floating point values. | 7941 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of |
6853 Both arguments must have identical types. | 7942 floating-point values. Both arguments must have identical types. |
6854 | 7943 |
6855 Semantics: | 7944 Semantics: |
6856 """""""""" | 7945 """""""""" |
6857 | 7946 |
6858 Return the same value as a libm '``fmod``' function but without trapping or | 7947 The value produced is the floating-point remainder of the two operands. |
6859 setting ``errno``. | 7948 This is the same output as a libm '``fmod``' function, but without any |
6860 | 7949 possibility of setting ``errno``. The remainder has the same sign as the |
6861 The remainder has the same sign as the dividend. This instruction can also | 7950 dividend. |
6862 take any number of :ref:`fast-math flags <fastmath>`, which are optimization | 7951 This instruction is assumed to execute in the default :ref:`floating-point |
6863 hints to enable otherwise unsafe floating-point optimizations: | 7952 environment <floatenv>`. |
7953 This instruction can also take any number of :ref:`fast-math | |
7954 flags <fastmath>`, which are optimization hints to enable otherwise | |
7955 unsafe floating-point optimizations: | |
6864 | 7956 |
6865 Example: | 7957 Example: |
6866 """""""" | 7958 """""""" |
6867 | 7959 |
6868 .. code-block:: text | 7960 .. code-block:: text |
6917 by the corresponding shift amount in ``op2``. | 8009 by the corresponding shift amount in ``op2``. |
6918 | 8010 |
6919 If the ``nuw`` keyword is present, then the shift produces a poison | 8011 If the ``nuw`` keyword is present, then the shift produces a poison |
6920 value if it shifts out any non-zero bits. | 8012 value if it shifts out any non-zero bits. |
6921 If the ``nsw`` keyword is present, then the shift produces a poison | 8013 If the ``nsw`` keyword is present, then the shift produces a poison |
6922 value it shifts out any bits that disagree with the resultant sign bit. | 8014 value if it shifts out any bits that disagree with the resultant sign bit. |
6923 | 8015 |
6924 Example: | 8016 Example: |
6925 """""""" | 8017 """""""" |
6926 | 8018 |
6927 .. code-block:: text | 8019 .. code-block:: text |
7199 """"""" | 8291 """"""" |
7200 | 8292 |
7201 :: | 8293 :: |
7202 | 8294 |
7203 <result> = extractelement <n x <ty>> <val>, <ty2> <idx> ; yields <ty> | 8295 <result> = extractelement <n x <ty>> <val>, <ty2> <idx> ; yields <ty> |
8296 <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty> | |
7204 | 8297 |
7205 Overview: | 8298 Overview: |
7206 """"""""" | 8299 """"""""" |
7207 | 8300 |
7208 The '``extractelement``' instruction extracts a single scalar element | 8301 The '``extractelement``' instruction extracts a single scalar element |
7219 Semantics: | 8312 Semantics: |
7220 """""""""" | 8313 """""""""" |
7221 | 8314 |
7222 The result is a scalar of the same type as the element type of ``val``. | 8315 The result is a scalar of the same type as the element type of ``val``. |
7223 Its value is the value at position ``idx`` of ``val``. If ``idx`` | 8316 Its value is the value at position ``idx`` of ``val``. If ``idx`` |
7224 exceeds the length of ``val``, the results are undefined. | 8317 exceeds the length of ``val`` for a fixed-length vector, the result is a |
8318 :ref:`poison value <poisonvalues>`. For a scalable vector, if the value | |
8319 of ``idx`` exceeds the runtime length of the vector, the result is a | |
8320 :ref:`poison value <poisonvalues>`. | |
7225 | 8321 |
7226 Example: | 8322 Example: |
7227 """""""" | 8323 """""""" |
7228 | 8324 |
7229 .. code-block:: text | 8325 .. code-block:: text |
7239 """"""" | 8335 """"""" |
7240 | 8336 |
7241 :: | 8337 :: |
7242 | 8338 |
7243 <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <n x <ty>> | 8339 <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <n x <ty>> |
8340 <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>> | |
7244 | 8341 |
7245 Overview: | 8342 Overview: |
7246 """"""""" | 8343 """"""""" |
7247 | 8344 |
7248 The '``insertelement``' instruction inserts a scalar element into a | 8345 The '``insertelement``' instruction inserts a scalar element into a |
7260 Semantics: | 8357 Semantics: |
7261 """""""""" | 8358 """""""""" |
7262 | 8359 |
7263 The result is a vector of the same type as ``val``. Its element values | 8360 The result is a vector of the same type as ``val``. Its element values |
7264 are those of ``val`` except at position ``idx``, where it gets the value | 8361 are those of ``val`` except at position ``idx``, where it gets the value |
7265 ``elt``. If ``idx`` exceeds the length of ``val``, the results are | 8362 ``elt``. If ``idx`` exceeds the length of ``val`` for a fixed-length vector, |
7266 undefined. | 8363 the result is a :ref:`poison value <poisonvalues>`. For a scalable vector, |
8364 if the value of ``idx`` exceeds the runtime length of the vector, the result | |
8365 is a :ref:`poison value <poisonvalues>`. | |
7267 | 8366 |
7268 Example: | 8367 Example: |
7269 """""""" | 8368 """""""" |
7270 | 8369 |
7271 .. code-block:: text | 8370 .. code-block:: text |
7281 """"""" | 8380 """"""" |
7282 | 8381 |
7283 :: | 8382 :: |
7284 | 8383 |
7285 <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> ; yields <m x <ty>> | 8384 <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> ; yields <m x <ty>> |
8385 <result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask> ; yields <vscale x m x <ty>> | |
7286 | 8386 |
7287 Overview: | 8387 Overview: |
7288 """"""""" | 8388 """"""""" |
7289 | 8389 |
7290 The '``shufflevector``' instruction constructs a permutation of elements | 8390 The '``shufflevector``' instruction constructs a permutation of elements |
7311 element of the result vector, which element of the two input vectors the | 8411 element of the result vector, which element of the two input vectors the |
7312 result element gets. If the shuffle mask is undef, the result vector is | 8412 result element gets. If the shuffle mask is undef, the result vector is |
7313 undef. If any element of the mask operand is undef, that element of the | 8413 undef. If any element of the mask operand is undef, that element of the |
7314 result is undef. If the shuffle mask selects an undef element from one | 8414 result is undef. If the shuffle mask selects an undef element from one |
7315 of the input vectors, the resulting element is undef. | 8415 of the input vectors, the resulting element is undef. |
8416 | |
8417 For scalable vectors, the only valid mask values at present are | |
8418 ``zeroinitializer`` and ``undef``, since we cannot write all indices as | |
8419 literals for a vector with a length unknown at compile time. | |
7316 | 8420 |
7317 Example: | 8421 Example: |
7318 """""""" | 8422 """""""" |
7319 | 8423 |
7320 .. code-block:: text | 8424 .. code-block:: text |
7471 '``type``' may be any sized type. | 8575 '``type``' may be any sized type. |
7472 | 8576 |
7473 Semantics: | 8577 Semantics: |
7474 """""""""" | 8578 """""""""" |
7475 | 8579 |
7476 Memory is allocated; a pointer is returned. The operation is undefined | 8580 Memory is allocated; a pointer is returned. The allocated memory is |
7477 if there is insufficient stack space for the allocation. '``alloca``'d | 8581 uninitialized, and loading from uninitialized memory produces an undefined |
7478 memory is automatically released when the function returns. The | 8582 value. The operation itself is undefined if there is insufficient stack |
7479 '``alloca``' instruction is commonly used to represent automatic | 8583 space for the allocation. '``alloca``'d memory is automatically released |
7480 variables that must have an address available. When the function returns | 8584 when the function returns. The '``alloca``' instruction is commonly used |
7481 (either with the ``ret`` or ``resume`` instructions), the memory is | 8585 to represent automatic variables that must have an address available. When |
7482 reclaimed. Allocating zero bytes is legal, but the result is undefined. | 8586 the function returns (either with the ``ret`` or ``resume`` instructions), |
7483 The order in which memory is allocated (ie., which way the stack grows) | 8587 the memory is reclaimed. Allocating zero bytes is legal, but the returned |
7484 is not specified. | 8588 pointer may not be unique. The order in which memory is allocated (i.e., |
8589 which way the stack grows) is not specified. | |
7485 | 8590 |
7486 Example: | 8591 Example: |
7487 """""""" | 8592 """""""" |
7488 | 8593 |
7489 .. code-block:: llvm | 8594 .. code-block:: llvm |
7560 The optional ``!invariant.load`` metadata must reference a single | 8665 The optional ``!invariant.load`` metadata must reference a single |
7561 metadata name ``<index>`` corresponding to a metadata node with no | 8666 metadata name ``<index>`` corresponding to a metadata node with no |
7562 entries. If a load instruction tagged with the ``!invariant.load`` | 8667 entries. If a load instruction tagged with the ``!invariant.load`` |
7563 metadata is executed, the optimizer may assume the memory location | 8668 metadata is executed, the optimizer may assume the memory location |
7564 referenced by the load contains the same value at all points in the | 8669 referenced by the load contains the same value at all points in the |
7565 program where the memory location is known to be dereferenceable. | 8670 program where the memory location is known to be dereferenceable; |
8671 otherwise, the behavior is undefined. | |
7566 | 8672 |
7567 The optional ``!invariant.group`` metadata must reference a single metadata name | 8673 The optional ``!invariant.group`` metadata must reference a single metadata name |
7568 ``<index>`` corresponding to a metadata node. See ``invariant.group`` metadata. | 8674 ``<index>`` corresponding to a metadata node with no entries. |
8675 See :ref:`invariant.group <md_invariant.group>` metadata. | |
7569 | 8676 |
7570 The optional ``!nonnull`` metadata must reference a single | 8677 The optional ``!nonnull`` metadata must reference a single |
7571 metadata name ``<index>`` corresponding to a metadata node with no | 8678 metadata name ``<index>`` corresponding to a metadata node with no |
7572 entries. The existence of the ``!nonnull`` metadata on the | 8679 entries. The existence of the ``!nonnull`` metadata on the |
7573 instruction tells the optimizer that the value loaded is known to | 8680 instruction tells the optimizer that the value loaded is known to |
7574 never be null. This is analogous to the ``nonnull`` attribute | 8681 never be null. If the value is null at runtime, the behavior is undefined. |
7575 on parameters and return values. This metadata can only be applied | 8682 This is analogous to the ``nonnull`` attribute on parameters and return |
7576 to loads of a pointer type. | 8683 values. This metadata can only be applied to loads of a pointer type. |
7577 | 8684 |
7578 The optional ``!dereferenceable`` metadata must reference a single metadata | 8685 The optional ``!dereferenceable`` metadata must reference a single metadata |
7579 name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64`` | 8686 name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64`` |
7580 entry. The existence of the ``!dereferenceable`` metadata on the instruction | 8687 entry. |
7581 tells the optimizer that the value loaded is known to be dereferenceable. | 8688 See ``dereferenceable`` metadata :ref:`dereferenceable <md_dereferenceable>` |
7582 The number of bytes known to be dereferenceable is specified by the integer | |
7583 value in the metadata node. This is analogous to the ''dereferenceable'' | |
7584 attribute on parameters and return values. This metadata can only be applied | |
7585 to loads of a pointer type. | |
7586 | 8689 |
7587 The optional ``!dereferenceable_or_null`` metadata must reference a single | 8690 The optional ``!dereferenceable_or_null`` metadata must reference a single |
7588 metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one | 8691 metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one |
7589 ``i64`` entry. The existence of the ``!dereferenceable_or_null`` metadata on the | 8692 ``i64`` entry. |
7590 instruction tells the optimizer that the value loaded is known to be either | 8693 See ``dereferenceable_or_null`` metadata :ref:`dereferenceable_or_null |
7591 dereferenceable or null. | 8694 <md_dereferenceable_or_null>` |
7592 The number of bytes known to be dereferenceable is specified by the integer | |
7593 value in the metadata node. This is analogous to the ''dereferenceable_or_null'' | |
7594 attribute on parameters and return values. This metadata can only be applied | |
7595 to loads of a pointer type. | |
7596 | 8695 |
7597 The optional ``!align`` metadata must reference a single metadata name | 8696 The optional ``!align`` metadata must reference a single metadata name |
7598 ``<align_node>`` corresponding to a metadata node with one ``i64`` entry. | 8697 ``<align_node>`` corresponding to a metadata node with one ``i64`` entry. |
7599 The existence of the ``!align`` metadata on the instruction tells the | 8698 The existence of the ``!align`` metadata on the instruction tells the |
7600 optimizer that the value loaded is known to be aligned to a boundary specified | 8699 optimizer that the value loaded is known to be aligned to a boundary specified |
7601 by the integer value in the metadata node. The alignment must be a power of 2. | 8700 by the integer value in the metadata node. The alignment must be a power of 2. |
7602 This is analogous to the ``align`` attribute on parameters and return values. | 8701 This is analogous to the ``align`` attribute on parameters and return values. |
7603 This metadata can only be applied to loads of a pointer type. | 8702 This metadata can only be applied to loads of a pointer type. If the returned |
8703 value is not appropriately aligned at runtime, the behavior is undefined. | |
7604 | 8704 |
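
The load metadata described above can be combined on a single instruction. The
following is a minimal sketch, not taken from the original text: the function
name ``@load_through`` and the metadata node numbers are illustrative
assumptions. It shows a pointer load annotated with ``!nonnull``, ``!align``
and ``!dereferenceable``.

.. code-block:: llvm

    ; Hypothetical example: load a pointer and assert facts about it.
    define i32 @load_through(i32** %pp) {
    entry:
      ; The loaded pointer is asserted to be non-null, aligned to 4 bytes,
      ; and dereferenceable for 4 bytes. Violating the !nonnull or !align
      ; assertion at runtime is undefined behavior, as stated above.
      %p = load i32*, i32** %pp, !nonnull !0, !align !1, !dereferenceable !1
      %v = load i32, i32* %p
      ret i32 %v
    }

    !0 = !{}        ; empty node, as required by !nonnull
    !1 = !{i64 4}   ; one i64 entry, shared here by !align and !dereferenceable
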
7605 Semantics: | 8705 Semantics: |
7606 """""""""" | 8706 """""""""" |
7607 | 8707 |
7608 The location of memory pointed to is loaded. If the value being loaded | 8708 The location of memory pointed to is loaded. If the value being loaded |
7884 - xor | 8984 - xor |
7885 - max | 8985 - max |
7886 - min | 8986 - min |
7887 - umax | 8987 - umax |
7888 - umin | 8988 - umin |
7889 | 8989 - fadd |
7890 The type of '<value>' must be an integer type whose bit width is a power | 8990 - fsub |
7891 of two greater than or equal to eight and less than or equal to a | 8991 |
7892 target-specific size limit. The type of the '``<pointer>``' operand must | 8992 For most of these operations, the type of '<value>' must be an integer |
7893 be a pointer to that type. If the ``atomicrmw`` is marked as | 8993 type whose bit width is a power of two greater than or equal to eight |
7894 ``volatile``, then the optimizer is not allowed to modify the number or | 8994 and less than or equal to a target-specific size limit. For xchg, this |
7895 order of execution of this ``atomicrmw`` with other :ref:`volatile | 8995 may also be a floating point type with the same size constraints as |
7896 operations <volatile>`. | 8996 integers. For fadd/fsub, this must be a floating point type. The |
8997 type of the '``<pointer>``' operand must be a pointer to that type. If | |
8998 the ``atomicrmw`` is marked as ``volatile``, then the optimizer is not | |
8999 allowed to modify the number or order of execution of this | |
9000 ``atomicrmw`` with other :ref:`volatile operations <volatile>`. | |
7897 | 9001 |
7898 An ``atomicrmw`` instruction can also take an optional | 9002 An ``atomicrmw`` instruction can also take an optional |
7899 ":ref:`syncscope <syncscope>`" argument. | 9003 ":ref:`syncscope <syncscope>`" argument. |
7900 | 9004 |
7901 Semantics: | 9005 Semantics: |
7917 - min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison) | 9021 - min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison) |
7918 - umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned | 9022 - umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned |
7919 comparison) | 9023 comparison) |
7920 - umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned | 9024 - umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned |
7921 comparison) | 9025 comparison) |
9026 - fadd: ``*ptr = *ptr + val`` (using floating point arithmetic) | |
9027 - fsub: ``*ptr = *ptr - val`` (using floating point arithmetic) | |
7922 | 9028 |
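
As a minimal sketch of the floating-point operations added in the new revision
(the function names ``@accumulate`` and ``@swap`` are illustrative assumptions,
not part of the original text), ``fadd`` and ``xchg`` can be applied to a
``float`` location as follows. As with the integer operations, the instruction
yields the value that was in memory before the update.

.. code-block:: llvm

    ; Hypothetical example: atomically add %val to the float at %ptr.
    define float @accumulate(float* %ptr, float %val) {
    entry:
      ; *ptr = *ptr + val, using floating-point arithmetic
      %old = atomicrmw fadd float* %ptr, float %val seq_cst
      ret float %old
    }

    ; Hypothetical example: atomically exchange a float value.
    define float @swap(float* %ptr, float %val) {
    entry:
      %old = atomicrmw xchg float* %ptr, float %val seq_cst
      ret float %old
    }
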
7923 Example: | 9029 Example: |
7924 """""""" | 9030 """""""" |
7925 | 9031 |
7926 .. code-block:: llvm | 9032 .. code-block:: llvm |
8295 The '``fptrunc``' instruction truncates ``value`` to type ``ty2``. | 9401 The '``fptrunc``' instruction truncates ``value`` to type ``ty2``. |
8296 | 9402 |
8297 Arguments: | 9403 Arguments: |
8298 """""""""" | 9404 """""""""" |
8299 | 9405 |
8300 The '``fptrunc``' instruction takes a :ref:`floating point <t_floating>` | 9406 The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>` |
8301 value to cast and a :ref:`floating point <t_floating>` type to cast it to. | 9407 value to cast and a :ref:`floating-point <t_floating>` type to cast it to. |
8302 The size of ``value`` must be larger than the size of ``ty2``. This | 9408 The size of ``value`` must be larger than the size of ``ty2``. This |
8303 implies that ``fptrunc`` cannot be used to make a *no-op cast*. | 9409 implies that ``fptrunc`` cannot be used to make a *no-op cast*. |
8304 | 9410 |
8305 Semantics: | 9411 Semantics: |
8306 """""""""" | 9412 """""""""" |
8307 | 9413 |
8308 The '``fptrunc``' instruction casts a ``value`` from a larger | 9414 The '``fptrunc``' instruction casts a ``value`` from a larger |
8309 :ref:`floating point <t_floating>` type to a smaller :ref:`floating | 9415 :ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point |
8310 point <t_floating>` type. If the value cannot fit (i.e. overflows) within the | 9416 <t_floating>` type. |
8311 destination type, ``ty2``, then the results are undefined. If the cast produces | 9417 This instruction is assumed to execute in the default :ref:`floating-point |
8312 an inexact result, how rounding is performed (e.g. truncation, also known as | 9418 environment <floatenv>`. |
8313 round to zero) is undefined. | |
8314 | 9419 |
8315 Example: | 9420 Example: |
8316 """""""" | 9421 """""""" |
8317 | 9422 |
8318 .. code-block:: llvm | 9423 .. code-block:: llvm |
8319 | 9424 |
8320 %X = fptrunc double 123.0 to float ; yields float:123.0 | 9425 %X = fptrunc double 16777217.0 to float ; yields float:16777216.0 |
8321 %Y = fptrunc double 1.0E+300 to float ; yields undefined | 9426 %Y = fptrunc double 1.0E+300 to half ; yields half:+infinity |
8322 | 9427 |
8323 '``fpext .. to``' Instruction | 9428 '``fpext .. to``' Instruction |
8324 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 9429 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
8325 | 9430 |
8326 Syntax: | 9431 Syntax: |
8331 <result> = fpext <ty> <value> to <ty2> ; yields ty2 | 9436 <result> = fpext <ty> <value> to <ty2> ; yields ty2 |
8332 | 9437 |
8333 Overview: | 9438 Overview: |
8334 """"""""" | 9439 """"""""" |
8335 | 9440 |
8336 The '``fpext``' extends a floating point ``value`` to a larger floating | 9441 The '``fpext``' extends a floating-point ``value`` to a larger floating-point |
8337 point value. | 9442 value. |
8338 | 9443 |
8339 Arguments: | 9444 Arguments: |
8340 """""""""" | 9445 """""""""" |
8341 | 9446 |
8342 The '``fpext``' instruction takes a :ref:`floating point <t_floating>` | 9447 The '``fpext``' instruction takes a :ref:`floating-point <t_floating>` |
8343 ``value`` to cast, and a :ref:`floating point <t_floating>` type to cast it | 9448 ``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it |
8344 to. The source type must be smaller than the destination type. | 9449 to. The source type must be smaller than the destination type. |
8345 | 9450 |
8346 Semantics: | 9451 Semantics: |
8347 """""""""" | 9452 """""""""" |
8348 | 9453 |
8349 The '``fpext``' instruction extends the ``value`` from a smaller | 9454 The '``fpext``' instruction extends the ``value`` from a smaller |
8350 :ref:`floating point <t_floating>` type to a larger :ref:`floating | 9455 :ref:`floating-point <t_floating>` type to a larger :ref:`floating-point |
8351 point <t_floating>` type. The ``fpext`` cannot be used to make a | 9456 <t_floating>` type. The ``fpext`` cannot be used to make a |
8352 *no-op cast* because it always changes bits. Use ``bitcast`` to make a | 9457 *no-op cast* because it always changes bits. Use ``bitcast`` to make a |
8353 *no-op cast* for a floating point cast. | 9458 *no-op cast* for a floating-point cast. |
8354 | 9459 |
8355 Example: | 9460 Example: |
8356 """""""" | 9461 """""""" |
8357 | 9462 |
8358 .. code-block:: llvm | 9463 .. code-block:: llvm |
8371 <result> = fptoui <ty> <value> to <ty2> ; yields ty2 | 9476 <result> = fptoui <ty> <value> to <ty2> ; yields ty2 |
8372 | 9477 |
8373 Overview: | 9478 Overview: |
8374 """"""""" | 9479 """"""""" |
8375 | 9480 |
8376 The '``fptoui``' converts a floating point ``value`` to its unsigned | 9481 The '``fptoui``' converts a floating-point ``value`` to its unsigned |
8377 integer equivalent of type ``ty2``. | 9482 integer equivalent of type ``ty2``. |
8378 | 9483 |
8379 Arguments: | 9484 Arguments: |
8380 """""""""" | 9485 """""""""" |
8381 | 9486 |
8382 The '``fptoui``' instruction takes a value to cast, which must be a | 9487 The '``fptoui``' instruction takes a value to cast, which must be a |
8383 scalar or vector :ref:`floating point <t_floating>` value, and a type to | 9488 scalar or vector :ref:`floating-point <t_floating>` value, and a type to |
8384 cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type. If | 9489 cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type. If |
8385 ``ty`` is a vector floating point type, ``ty2`` must be a vector integer | 9490 ``ty`` is a vector floating-point type, ``ty2`` must be a vector integer |
8386 type with the same number of elements as ``ty`` | 9491 type with the same number of elements as ``ty`` |
8387 | 9492 |
8388 Semantics: | 9493 Semantics: |
8389 """""""""" | 9494 """""""""" |
8390 | 9495 |
8391 The '``fptoui``' instruction converts its :ref:`floating | 9496 The '``fptoui``' instruction converts its :ref:`floating-point |
8392 point <t_floating>` operand into the nearest (rounding towards zero) | 9497 <t_floating>` operand into the nearest (rounding towards zero) |
8393 unsigned integer value. If the value cannot fit in ``ty2``, the results | 9498 unsigned integer value. If the value cannot fit in ``ty2``, the result |
8394 are undefined. | 9499 is a :ref:`poison value <poisonvalues>`. |
8395 | 9500 |
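
The difference between an in-range and an out-of-range conversion can be
sketched as follows. The function names are illustrative assumptions. The
first conversion rounds toward zero; the second cannot fit in ``i8``, so under
the new wording its result is a poison value.

.. code-block:: llvm

    ; Hypothetical example: 3.9 rounds toward zero, so the result is 3.
    define i8 @in_range() {
    entry:
      %r = fptoui double 3.9 to i8
      ret i8 %r
    }

    ; Hypothetical example: 300.0 does not fit in i8 (maximum 255),
    ; so %r is a poison value.
    define i8 @out_of_range() {
    entry:
      %r = fptoui double 300.0 to i8
      ret i8 %r
    }
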
8396 Example: | 9501 Example: |
8397 """""""" | 9502 """""""" |
8398 | 9503 |
8399 .. code-block:: llvm | 9504 .. code-block:: llvm |
8413 <result> = fptosi <ty> <value> to <ty2> ; yields ty2 | 9518 <result> = fptosi <ty> <value> to <ty2> ; yields ty2 |
8414 | 9519 |
8415 Overview: | 9520 Overview: |
8416 """"""""" | 9521 """"""""" |
8417 | 9522 |
8418 The '``fptosi``' instruction converts :ref:`floating point <t_floating>` | 9523 The '``fptosi``' instruction converts :ref:`floating-point <t_floating>` |
8419 ``value`` to type ``ty2``. | 9524 ``value`` to type ``ty2``. |
8420 | 9525 |
8421 Arguments: | 9526 Arguments: |
8422 """""""""" | 9527 """""""""" |
8423 | 9528 |
8424 The '``fptosi``' instruction takes a value to cast, which must be a | 9529 The '``fptosi``' instruction takes a value to cast, which must be a |
8425 scalar or vector :ref:`floating point <t_floating>` value, and a type to | 9530 scalar or vector :ref:`floating-point <t_floating>` value, and a type to |
8426 cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type. If | 9531 cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type. If |
8427 ``ty`` is a vector floating point type, ``ty2`` must be a vector integer | 9532 ``ty`` is a vector floating-point type, ``ty2`` must be a vector integer |
8428 type with the same number of elements as ``ty`` | 9533 type with the same number of elements as ``ty`` |
8429 | 9534 |
8430 Semantics: | 9535 Semantics: |
8431 """""""""" | 9536 """""""""" |
8432 | 9537 |
8433 The '``fptosi``' instruction converts its :ref:`floating | 9538 The '``fptosi``' instruction converts its :ref:`floating-point |
8434 point <t_floating>` operand into the nearest (rounding towards zero) | 9539 <t_floating>` operand into the nearest (rounding towards zero) |
8435 signed integer value. If the value cannot fit in ``ty2``, the results | 9540 signed integer value. If the value cannot fit in ``ty2``, the result |
8436 are undefined. | 9541 is a :ref:`poison value <poisonvalues>`. |
8437 | 9542 |
8438 Example: | 9543 Example: |
8439 """""""" | 9544 """""""" |
8440 | 9545 |
8441 .. code-block:: llvm | 9546 .. code-block:: llvm |
8463 Arguments: | 9568 Arguments: |
8464 """""""""" | 9569 """""""""" |
8465 | 9570 |
8466 The '``uitofp``' instruction takes a value to cast, which must be a | 9571 The '``uitofp``' instruction takes a value to cast, which must be a |
8467 scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to | 9572 scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to |
8468 ``ty2``, which must be a :ref:`floating point <t_floating>` type. If | 9573 ``ty2``, which must be a :ref:`floating-point <t_floating>` type. If |
8469 ``ty`` is a vector integer type, ``ty2`` must be a vector floating point | 9574 ``ty`` is a vector integer type, ``ty2`` must be a vector floating-point |
8470 type with the same number of elements as ``ty`` | 9575 type with the same number of elements as ``ty`` |
8471 | 9576 |
8472 Semantics: | 9577 Semantics: |
8473 """""""""" | 9578 """""""""" |
8474 | 9579 |
8475 The '``uitofp``' instruction interprets its operand as an unsigned | 9580 The '``uitofp``' instruction interprets its operand as an unsigned |
8476 integer quantity and converts it to the corresponding floating point | 9581 integer quantity and converts it to the corresponding floating-point |
8477 value. If the value cannot fit in the floating point value, the results | 9582 value. If the value cannot be exactly represented, it is rounded using |
8478 are undefined. | 9583 the default rounding mode. |
9584 | |
8479 | 9585 |
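
A small sketch of the rounding behavior described in the new wording, assuming
the default rounding mode is round-to-nearest with ties to even (the function
name ``@to_float`` is an illustrative assumption): the integer 16777217 is
2\ :sup:`24` + 1, which ``float`` cannot represent exactly.

.. code-block:: llvm

    ; Hypothetical example: 16777217 lies halfway between the two nearest
    ; float values, 16777216.0 and 16777218.0. Assuming round-to-nearest
    ; with ties to even, the result is 16777216.0.
    define float @to_float() {
    entry:
      %r = uitofp i32 16777217 to float
      ret float %r
    }
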
8480 Example: | 9586 Example: |
8481 """""""" | 9587 """""""" |
8482 | 9588 |
8483 .. code-block:: llvm | 9589 .. code-block:: llvm |
8504 Arguments: | 9610 Arguments: |
8505 """""""""" | 9611 """""""""" |
8506 | 9612 |
8507 The '``sitofp``' instruction takes a value to cast, which must be a | 9613 The '``sitofp``' instruction takes a value to cast, which must be a |
8508 scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to | 9614 scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to |
8509 ``ty2``, which must be a :ref:`floating point <t_floating>` type. If | 9615 ``ty2``, which must be a :ref:`floating-point <t_floating>` type. If |
8510 ``ty`` is a vector integer type, ``ty2`` must be a vector floating point | 9616 ``ty`` is a vector integer type, ``ty2`` must be a vector floating-point |
8511 type with the same number of elements as ``ty`` | 9617 type with the same number of elements as ``ty`` |
8512 | 9618 |
8513 Semantics: | 9619 Semantics: |
8514 """""""""" | 9620 """""""""" |
8515 | 9621 |
8516 The '``sitofp``' instruction interprets its operand as a signed integer | 9622 The '``sitofp``' instruction interprets its operand as a signed integer |
8517 quantity and converts it to the corresponding floating point value. If | 9623 quantity and converts it to the corresponding floating-point value. If the |
8518 the value cannot fit in the floating point value, the results are | 9624 value cannot be exactly represented, it is rounded using the default rounding |
8519 undefined. | 9625 mode. |
8520 | 9626 |
8521 Example: | 9627 Example: |
8522 """""""" | 9628 """""""" |
8523 | 9629 |
8524 .. code-block:: llvm | 9630 .. code-block:: llvm |
8580 Syntax: | 9686 Syntax: |
8581 """"""" | 9687 """"""" |
8582 | 9688 |
8583 :: | 9689 :: |
8584 | 9690 |
8585 <result> = inttoptr <ty> <value> to <ty2> ; yields ty2 | 9691 <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>] ; yields ty2 |
8586 | 9692 |
8587 Overview: | 9693 Overview: |
8588 """"""""" | 9694 """"""""" |
8589 | 9695 |
8590 The '``inttoptr``' instruction converts an integer ``value`` to a | 9696 The '``inttoptr``' instruction converts an integer ``value`` to a |
8594 """""""""" | 9700 """""""""" |
8595 | 9701 |
8596 The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to | 9702 The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to |
8597 cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>` | 9703 cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>` |
8598 type. | 9704 type. |
9705 | |
9706 The optional ``!dereferenceable`` metadata must reference a single metadata | |
9707 name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64`` | |
9708 entry. | |
9709 See ``dereferenceable`` metadata. | |
9710 | |
9711 The optional ``!dereferenceable_or_null`` metadata must reference a single | |
9712 metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one | |
9713 ``i64`` entry. | |
9714 See ``dereferenceable_or_null`` metadata. | |
8599 | 9715 |
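
A minimal sketch of the metadata form added in the new revision (the function
name ``@read_at`` and the node number are illustrative assumptions): the
result of the cast is annotated as dereferenceable for 4 bytes.

.. code-block:: llvm

    ; Hypothetical example: cast an integer address to a pointer and
    ; assert that 4 bytes are dereferenceable through it.
    define i32 @read_at(i64 %addr) {
    entry:
      %p = inttoptr i64 %addr to i32*, !dereferenceable !0
      %v = load i32, i32* %p
      ret i32 %v
    }

    !0 = !{i64 4}   ; one i64 entry: the number of dereferenceable bytes
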
8600 Semantics: | 9716 Semantics: |
8601 """""""""" | 9717 """""""""" |
8602 | 9718 |
8603 The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by | 9719 The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by |
8826 """"""""" | 9942 """"""""" |
8827 | 9943 |
8828 The '``fcmp``' instruction returns a boolean value or vector of boolean | 9944 The '``fcmp``' instruction returns a boolean value or vector of boolean |
8829 values based on comparison of its operands. | 9945 values based on comparison of its operands. |
8830 | 9946 |
8831 If the operands are floating point scalars, then the result type is a | 9947 If the operands are floating-point scalars, then the result type is a |
8832 boolean (:ref:`i1 <t_integer>`). | 9948 boolean (:ref:`i1 <t_integer>`). |
8833 | 9949 |
8834 If the operands are floating point vectors, then the result type is a | 9950 If the operands are floating-point vectors, then the result type is a |
8835 vector of boolean with the same number of elements as the operands being | 9951 vector of boolean with the same number of elements as the operands being |
8836 compared. | 9952 compared. |
8837 | 9953 |
8838 Arguments: | 9954 Arguments: |
8839 """""""""" | 9955 """""""""" |
8860 #. ``true``: no comparison, always returns true | 9976 #. ``true``: no comparison, always returns true |
8861 | 9977 |
8862 *Ordered* means that neither operand is a QNAN while *unordered* means | 9978 *Ordered* means that neither operand is a QNAN while *unordered* means |
8863 that either operand may be a QNAN. | 9979 that either operand may be a QNAN. |
8864 | 9980 |
8865 Each of ``val1`` and ``val2`` arguments must be either a :ref:`floating | 9981 Each of ``val1`` and ``val2`` arguments must be either a :ref:`floating-point |
8866 point <t_floating>` type or a :ref:`vector <t_vector>` of floating point | 9982 <t_floating>` type or a :ref:`vector <t_vector>` of floating-point type. |
8867 type. They must have identical types. | 9983 They must have identical types. |
8868 | 9984 |
8869 Semantics: | 9985 Semantics: |
8870 """""""""" | 9986 """""""""" |
8871 | 9987 |
8872 The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the | 9988 The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the |
8903 #. ``uno``: yields ``true`` if either operand is a QNAN. | 10019 #. ``uno``: yields ``true`` if either operand is a QNAN. |
8904 #. ``true``: always yields ``true``, regardless of operands. | 10020 #. ``true``: always yields ``true``, regardless of operands. |
8905 | 10021 |
8906 The ``fcmp`` instruction can also optionally take any number of | 10022 The ``fcmp`` instruction can also optionally take any number of |
8907 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable | 10023 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable |
8908 otherwise unsafe floating point optimizations. | 10024 otherwise unsafe floating-point optimizations. |
8909 | 10025 |
8910 Any set of fast-math flags are legal on an ``fcmp`` instruction, but the | 10026 Any set of fast-math flags are legal on an ``fcmp`` instruction, but the |
8911 only flags that have any effect on its semantics are those that allow | 10027 only flags that have any effect on its semantics are those that allow |
8912 assumptions to be made about the values of input arguments; namely | 10028 assumptions to be made about the values of input arguments; namely |
8913 ``nnan``, ``ninf``, and ``nsz``. See :ref:`fastmath` for more information. | 10029 ``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information. |
8914 | 10030 |
8915 Example: | 10031 Example: |
8916 """""""" | 10032 """""""" |
8917 | 10033 |
8918 .. code-block:: text | 10034 .. code-block:: text |
8984 Syntax: | 10100 Syntax: |
8985 """"""" | 10101 """"""" |
8986 | 10102 |
8987 :: | 10103 :: |
8988 | 10104 |
8989 <result> = select selty <cond>, <ty> <val1>, <ty> <val2> ; yields ty | 10105 <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2> ; yields ty |
8990 | 10106 |
8991 selty is either i1 or {<N x i1>} | 10107 selty is either i1 or {<N x i1>} |
8992 | 10108 |
8993 Overview: | 10109 Overview: |
8994 """"""""" | 10110 """"""""" |
9001 | 10117 |
9002 The '``select``' instruction requires an 'i1' value or a vector of 'i1' | 10118 The '``select``' instruction requires an 'i1' value or a vector of 'i1' |
9003 values indicating the condition, and two values of the same :ref:`first | 10119 values indicating the condition, and two values of the same :ref:`first |
9004 class <t_firstclass>` type. | 10120 class <t_firstclass>` type. |
9005 | 10121 |
10122 #. The optional ``fast-math flags`` marker indicates that the select has one or more | |
10123 :ref:`fast-math flags <fastmath>`. These are optimization hints to enable | |
10124 otherwise unsafe floating-point optimizations. Fast-math flags are only valid | |
10125 for selects that return a floating-point scalar or vector type. | |
10126 | |
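
As a minimal sketch of the new syntax (the function name ``@pick`` is an
illustrative assumption), a fast-math flag is written between ``select`` and
the condition type. The flag is accepted here because the select returns a
floating-point type.

.. code-block:: llvm

    ; Hypothetical example: select between two floats with the nnan flag.
    define float @pick(i1 %cond, float %a, float %b) {
    entry:
      %r = select nnan i1 %cond, float %a, float %b
      ret float %r
    }
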
9006 Semantics: | 10127 Semantics: |
9007 """""""""" | 10128 """""""""" |
9008 | 10129 |
9009 If the condition is an i1 and it evaluates to 1, the instruction returns | 10130 If the condition is an i1 and it evaluates to 1, the instruction returns |
9010 the first value argument; otherwise, it returns the second value | 10131 the first value argument; otherwise, it returns the second value |
9031 Syntax: | 10152 Syntax: |
9032 """"""" | 10153 """"""" |
9033 | 10154 |
9034 :: | 10155 :: |
9035 | 10156 |
9036 <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] | 10157 <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)] |
9037 [ operand bundles ] | 10158 <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ] |
9038 | 10159 |
9039 Overview: | 10160 Overview: |
9040 """"""""" | 10161 """"""""" |
9041 | 10162 |
9042 The '``call``' instruction represents a simple function call. | 10163 The '``call``' instruction represents a simple function call. |
9054 | 10175 |
9055 #. The call will not cause unbounded stack growth if it is part of a | 10176 #. The call will not cause unbounded stack growth if it is part of a |
9056 recursive cycle in the call graph. | 10177 recursive cycle in the call graph. |
9057 #. Arguments with the :ref:`inalloca <attr_inalloca>` attribute are | 10178 #. Arguments with the :ref:`inalloca <attr_inalloca>` attribute are |
9058 forwarded in place. | 10179 forwarded in place. |
10180 #. If the musttail call appears in a function with the ``"thunk"`` attribute | |
10181 and the caller and callee both have varargs, then any unprototyped | |
10182 arguments in registers or memory are forwarded to the callee. Similarly, | |
10183 the return value of the callee is returned to the caller's caller, even | |
10184 if a void return type is in use. | |
9059 | 10185 |
9060 Both markers imply that the callee does not access allocas from the caller. | 10186 Both markers imply that the callee does not access allocas from the caller. |
9061 The ``tail`` marker additionally implies that the callee does not access | 10187 The ``tail`` marker additionally implies that the callee does not access |
9062 varargs from the caller, while ``musttail`` implies that varargs from the | 10188 varargs from the caller. Calls marked ``musttail`` must obey the following |
9063 caller are passed to the callee. Calls marked ``musttail`` must obey the | 10189 additional rules: |
9064 following additional rules: | |
9065 | 10190 |
9066 - The call must immediately precede a :ref:`ret <i_ret>` instruction, | 10191 - The call must immediately precede a :ref:`ret <i_ret>` instruction, |
9067 or a pointer bitcast followed by a ret instruction. | 10192 or a pointer bitcast followed by a ret instruction. |
9068 - The ret instruction must return the (possibly bitcasted) value | 10193 - The ret instruction must return the (possibly bitcasted) value |
9069 produced by the call or void. | 10194 produced by the call or void. |
9103 calling convention of the call must match the calling convention of | 10228 calling convention of the call must match the calling convention of |
9104 the target function, or else the behavior is undefined. | 10229 the target function, or else the behavior is undefined. |
9105 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return | 10230 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return |
9106 values. Only '``zeroext``', '``signext``', and '``inreg``' attributes | 10231 values. Only '``zeroext``', '``signext``', and '``inreg``' attributes |
9107 are valid here. | 10232 are valid here. |
10233 #. The optional addrspace attribute can be used to indicate the address space | |
10234 of the called function. If it is not specified, the program address space | |
10235 from the :ref:`datalayout string<langref_datalayout>` will be used. | |
9108 #. '``ty``': the type of the call instruction itself which is also the | 10236 #. '``ty``': the type of the call instruction itself which is also the |
9109 type of the return value. Functions that return no value are marked | 10237 type of the return value. Functions that return no value are marked |
9110 ``void``. | 10238 ``void``. |
9111 #. '``fnty``': shall be the signature of the function being called. The | 10239 #. '``fnty``': shall be the signature of the function being called. The |
9112 argument types must match the types implied by this signature. This | 10240 argument types must match the types implied by this signature. This |
9468 ``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is | 10596 ``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is |
9469 overloaded, and only one type suffix is required. Because the argument's | 10597 overloaded, and only one type suffix is required. Because the argument's |
9470 type is matched against the return type, it does not require its own | 10598 type is matched against the return type, it does not require its own |
9471 name suffix. | 10599 name suffix. |
9472 | 10600 |
10601 For target developers who are defining intrinsics for back-end code | |
10602 generation, any intrinsic overloads based solely on the distinction between | |
10603 integer or floating point types should not be relied upon for correct | |
10604 code generation. In such cases, the recommended approach for target | |
10605 maintainers when defining intrinsics is to create separate integer and | |
10606 FP intrinsics rather than rely on overloading. For example, if different | |
10607 codegen is required for ``llvm.target.foo(<4 x i32>)`` and | |
10608 ``llvm.target.foo(<4 x float>)`` then these should be split into | |
10609 different intrinsics. | |
10610 | |
9473 To learn how to add an intrinsic function, please see the `Extending | 10611 To learn how to add an intrinsic function, please see the `Extending |
9474 LLVM Guide <ExtendingLLVM.html>`_. | 10612 LLVM Guide <ExtendingLLVM.html>`_. |
9475 | 10613 |
9476 .. _int_varargs: | 10614 .. _int_varargs: |
9477 | 10615 |
9821 | 10959 |
9822 Note that calling this intrinsic does not prevent function inlining or | 10960 Note that calling this intrinsic does not prevent function inlining or |
9823 other aggressive transformations, so the value returned may not be that | 10961 other aggressive transformations, so the value returned may not be that |
9824 of the obvious source-language caller. | 10962 of the obvious source-language caller. |
9825 | 10963 |
9826 This intrinsic is only implemented for x86. | 10964 This intrinsic is only implemented for x86 and aarch64. |
10965 | |
10966 '``llvm.sponentry``' Intrinsic | |
10967 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
10968 | |
10969 Syntax: | |
10970 """"""" | |
10971 | |
10972 :: | |
10973 | |
10974 declare i8* @llvm.sponentry() | |
10975 | |
10976 Overview: | |
10977 """"""""" | |
10978 | |
10979 The '``llvm.sponentry``' intrinsic returns the stack pointer value at | |
10980 the entry of the current function calling this intrinsic. | |
10981 | |
10982 Semantics: | |
10983 """""""""" | |
10984 | |
10985 Note this intrinsic is only verified on AArch64. | |
9827 | 10986 |
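
A minimal usage sketch for the intrinsic declared above (the wrapper function
name ``@entry_sp`` is an illustrative assumption). Because the text notes the
intrinsic is only verified on AArch64, behavior on other targets should not be
assumed.

.. code-block:: llvm

    declare i8* @llvm.sponentry()

    ; Hypothetical example: return the stack pointer value at function entry.
    define i8* @entry_sp() {
    entry:
      %sp = call i8* @llvm.sponentry()
      ret i8* %sp
    }
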
9828 '``llvm.frameaddress``' Intrinsic | 10987 '``llvm.frameaddress``' Intrinsic |
9829 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 10988 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
9830 | 10989 |
9831 Syntax: | 10990 Syntax: |
10398 """""""""" | 11557 """""""""" |
10399 | 11558 |
10400 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the | 11559 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the |
10401 source location to the destination location, which are not allowed to | 11560 source location to the destination location, which are not allowed to |
10402 overlap. It copies "len" bytes of memory over. If the argument is known | 11561 overlap. It copies "len" bytes of memory over. If the argument is known |
10403 to be aligned to some boundary, this can be specified as the fourth | 11562 to be aligned to some boundary, this can be specified as an attribute on |
10404 argument, otherwise it should be set to 0 or 1 (both meaning no alignment). | 11563 the argument. |
11564 | |
11565 If "len" is 0, the pointers may be NULL or dangling. However, they must still | |
11566 be appropriately aligned. | |
10405 | 11567 |
10406 .. _int_memmove: | 11568 .. _int_memmove: |
10407 | 11569 |
10408 '``llvm.memmove``' Intrinsic | 11570 '``llvm.memmove``' Intrinsic |
10409 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 11571 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
10453 """""""""" | 11615 """""""""" |
10454 | 11616 |
10455 The '``llvm.memmove.*``' intrinsics copy a block of memory from the | 11617 The '``llvm.memmove.*``' intrinsics copy a block of memory from the |
10456 source location to the destination location, which may overlap. It | 11618 source location to the destination location, which may overlap. It |
10457 copies "len" bytes of memory over. If the argument is known to be | 11619 copies "len" bytes of memory over. If the argument is known to be |
10458 aligned to some boundary, this can be specified as the fourth argument, | 11620 aligned to some boundary, this can be specified as an attribute on |
10459 otherwise it should be set to 0 or 1 (both meaning no alignment). | 11621 the argument. |
11622 | |
11623 If "len" is 0, the pointers may be NULL or dangling. However, they must still | |
11624 be appropriately aligned. | |
10460 | 11625 |
10461 .. _int_memset: | 11626 .. _int_memset: |
10462 | 11627 |
10463 '``llvm.memset.*``' Intrinsics | 11628 '``llvm.memset.*``' Intrinsics |
10464 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 11629 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
10504 | 11669 |
10505 Semantics: | 11670 Semantics: |
10506 """""""""" | 11671 """""""""" |
10507 | 11672 |
10508 The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting | 11673 The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting |
10509 at the destination location. | 11674 at the destination location. If the argument is known to be |
11675 aligned to some boundary, this can be specified as an attribute on | |
11676 the argument. | |
11677 | |
11678 If "len" is 0, the pointers may be NULL or dangling. However, they must still | |
11679 be appropriately aligned. | |
10510 | 11680 |
10511 '``llvm.sqrt.*``' Intrinsic | 11681 '``llvm.sqrt.*``' Intrinsic |
10512 ^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 11682 ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
10513 | 11683 |
10514 Syntax: | 11684 Syntax: |
10551 | 11721 |
10552 Syntax: | 11722 Syntax: |
10553 """"""" | 11723 """"""" |
10554 | 11724 |
10555 This is an overloaded intrinsic. You can use ``llvm.powi`` on any | 11725 This is an overloaded intrinsic. You can use ``llvm.powi`` on any |
10556 floating point or vector of floating point type. Not all targets support | 11726 floating-point or vector of floating-point type. Not all targets support |
10557 all types however. | 11727 all types however. |
10558 | 11728 |
10559 :: | 11729 :: |
10560 | 11730 |
10561 declare float @llvm.powi.f32(float %Val, i32 %power) | 11731 declare float @llvm.powi.f32(float %Val, i32 %power) |
10567 Overview: | 11737 Overview: |
10568 """"""""" | 11738 """"""""" |
10569 | 11739 |
10570 The '``llvm.powi.*``' intrinsics return the first operand raised to the | 11740 The '``llvm.powi.*``' intrinsics return the first operand raised to the |
10571 specified (positive or negative) power. The order of evaluation of | 11741 specified (positive or negative) power. The order of evaluation of |
10572 multiplications is not defined. When a vector of floating point type is | 11742 multiplications is not defined. When a vector of floating-point type is |
10573 used, the second argument remains a scalar integer value. | 11743 used, the second argument remains a scalar integer value. |
10574 | 11744 |
10575 Arguments: | 11745 Arguments: |
10576 """""""""" | 11746 """""""""" |
10577 | 11747 |
10928 | 12098 |
10929 Syntax: | 12099 Syntax: |
10930 """"""" | 12100 """"""" |
10931 | 12101 |
10932 This is an overloaded intrinsic. You can use ``llvm.fabs`` on any | 12102 This is an overloaded intrinsic. You can use ``llvm.fabs`` on any |
10933 floating point or vector of floating point type. Not all targets support | 12103 floating-point or vector of floating-point type. Not all targets support |
10934 all types however. | 12104 all types however. |
10935 | 12105 |
10936 :: | 12106 :: |
10937 | 12107 |
10938 declare float @llvm.fabs.f32(float %Val) | 12108 declare float @llvm.fabs.f32(float %Val) |
10948 operand. | 12118 operand. |
10949 | 12119 |
10950 Arguments: | 12120 Arguments: |
10951 """""""""" | 12121 """""""""" |
10952 | 12122 |
10953 The argument and return value are floating point numbers of the same | 12123 The argument and return value are floating-point numbers of the same |
10954 type. | 12124 type. |
10955 | 12125 |
10956 Semantics: | 12126 Semantics: |
10957 """""""""" | 12127 """""""""" |
10958 | 12128 |
10964 | 12134 |
10965 Syntax: | 12135 Syntax: |
10966 """"""" | 12136 """"""" |
10967 | 12137 |
10968 This is an overloaded intrinsic. You can use ``llvm.minnum`` on any | 12138 This is an overloaded intrinsic. You can use ``llvm.minnum`` on any |
10969 floating point or vector of floating point type. Not all targets support | 12139 floating-point or vector of floating-point type. Not all targets support |
10970 all types however. | 12140 all types however. |
10971 | 12141 |
10972 :: | 12142 :: |
10973 | 12143 |
10974 declare float @llvm.minnum.f32(float %Val0, float %Val1) | 12144 declare float @llvm.minnum.f32(float %Val0, float %Val1) |
10985 | 12155 |
10986 | 12156 |
10987 Arguments: | 12157 Arguments: |
10988 """""""""" | 12158 """""""""" |
10989 | 12159 |
10990 The arguments and return value are floating point numbers of the same | 12160 The arguments and return value are floating-point numbers of the same |
10991 type. | 12161 type. |
10992 | 12162 |
10993 Semantics: | 12163 Semantics: |
10994 """""""""" | 12164 """""""""" |
10995 | 12165 |
10996 Follows the IEEE-754 semantics for minNum, which also match for libm's | 12166 Follows the IEEE-754 semantics for minNum, except for handling of |
10997 fmin. | 12167 signaling NaNs. This matches the behavior of libm's fmin. |
10998 | 12168 |
10999 If either operand is a NaN, returns the other non-NaN operand. Returns | 12169 If either operand is a NaN, returns the other non-NaN operand. Returns |
11000 NaN only if both operands are NaN. If the operands compare equal, | 12170 NaN only if both operands are NaN. The returned NaN is always |
11001 returns a value that compares equal to both operands. This means that | 12171 quiet. If the operands compare equal, returns a value that compares |
11002 fmin(+/-0.0, +/-0.0) could return either -0.0 or 0.0. | 12172 equal to both operands. This means that fmin(+/-0.0, +/-0.0) could |
12173 return either -0.0 or 0.0. | |
12174 | |
12175 Unlike the IEEE-754 2008 behavior, this does not distinguish between | |
12176 signaling and quiet NaN inputs. If a target's implementation follows | |
12177 the standard and returns a quiet NaN if either input is a signaling | |
12178 NaN, the intrinsic lowering is responsible for quieting the inputs to | |
12179 correctly return the non-NaN input (e.g. by using the equivalent of | |
12180 ``llvm.canonicalize``). | |
12181 | |
11003 | 12182 |
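The NaN handling described for ``llvm.minnum`` can be modeled in a few lines. This is a minimal Python sketch, not LLVM code: ``minnum_model`` is a hypothetical name, and Python floats cannot distinguish signaling from quiet NaNs, so only the "return the non-NaN operand" rule is illustrated.

```python
import math

def minnum_model(x: float, y: float) -> float:
    """Hypothetical model of the llvm.minnum rules for scalar operands."""
    if math.isnan(x):
        # If y is also NaN, the result is NaN, as the text requires.
        return y
    if math.isnan(y):
        return x
    # For operands that compare equal (including +0.0 and -0.0),
    # either operand is an acceptable result; this model returns y.
    return x if x < y else y
```

For example, ``minnum_model(float("nan"), 1.0)`` yields ``1.0``, while a call with two NaN operands yields NaN.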
11004 '``llvm.maxnum.*``' Intrinsic | 12183 '``llvm.maxnum.*``' Intrinsic |
11005 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 12184 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
11006 | 12185 |
11007 Syntax: | 12186 Syntax: |
11008 """"""" | 12187 """"""" |
11009 | 12188 |
11010 This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any | 12189 This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any |
11011 floating point or vector of floating point type. Not all targets support | 12190 floating-point or vector of floating-point type. Not all targets support |
11012 all types however. | 12191 all types however. |
11013 | 12192 |
11014 :: | 12193 :: |
11015 | 12194 |
11016 declare float @llvm.maxnum.f32(float %Val0, float %Val1) | 12195 declare float @llvm.maxnum.f32(float %Val0, float %Val1) |
11027 | 12206 |
11028 | 12207 |
11029 Arguments: | 12208 Arguments: |
11030 """""""""" | 12209 """""""""" |
11031 | 12210 |
11032 The arguments and return value are floating point numbers of the same | 12211 The arguments and return value are floating-point numbers of the same |
11033 type. | 12212 type. |
11034 | 12213 |
11035 Semantics: | 12214 Semantics: |
11036 """""""""" | 12215 """""""""" |
11037 Follows the IEEE-754 semantics for maxNum, which also match for libm's | 12216 Follows the IEEE-754 semantics for maxNum except for the handling of |
11038 fmax. | 12217 signaling NaNs. This matches the behavior of libm's fmax. |
11039 | 12218 |
11040 If either operand is a NaN, returns the other non-NaN operand. Returns | 12219 If either operand is a NaN, returns the other non-NaN operand. Returns |
11041 NaN only if both operands are NaN. If the operands compare equal, | 12220 NaN only if both operands are NaN. The returned NaN is always |
11042 returns a value that compares equal to both operands. This means that | 12221 quiet. If the operands compare equal, returns a value that compares |
11043 fmax(+/-0.0, +/-0.0) could return either -0.0 or 0.0. | 12222 equal to both operands. This means that fmax(+/-0.0, +/-0.0) could |
12223 return either -0.0 or 0.0. | |
12224 | |
12225 Unlike the IEEE-754 2008 behavior, this does not distinguish between | |
12226 signaling and quiet NaN inputs. If a target's implementation follows | |
12227 the standard and returns a quiet NaN if either input is a signaling | |
12228 NaN, the intrinsic lowering is responsible for quieting the inputs to | |
12229 correctly return the non-NaN input (e.g. by using the equivalent of | |
12230 ``llvm.canonicalize``). | |
12231 | |
12232 '``llvm.minimum.*``' Intrinsic | |
12233 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
12234 | |
12235 Syntax: | |
12236 """"""" | |
12237 | |
12238 This is an overloaded intrinsic. You can use ``llvm.minimum`` on any | |
12239 floating-point or vector of floating-point type. Not all targets support | |
12240 all types however. | |
12241 | |
12242 :: | |
12243 | |
12244 declare float @llvm.minimum.f32(float %Val0, float %Val1) | |
12245 declare double @llvm.minimum.f64(double %Val0, double %Val1) | |
12246 declare x86_fp80 @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1) | |
12247 declare fp128 @llvm.minimum.f128(fp128 %Val0, fp128 %Val1) | |
12248 declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1) | |
12249 | |
12250 Overview: | |
12251 """"""""" | |
12252 | |
12253 The '``llvm.minimum.*``' intrinsics return the minimum of the two | |
12254 arguments, propagating NaNs and treating -0.0 as less than +0.0. | |
12255 | |
12256 | |
12257 Arguments: | |
12258 """""""""" | |
12259 | |
12260 The arguments and return value are floating-point numbers of the same | |
12261 type. | |
12262 | |
12263 Semantics: | |
12264 """""""""" | |
12265 If either operand is a NaN, returns NaN. Otherwise returns the lesser | |
12266 of the two arguments. -0.0 is considered to be less than +0.0 for this | |
12267 intrinsic. Note that these are the semantics specified in the draft of | |
12268 IEEE 754-2018. | |
12269 | |
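By contrast with ``llvm.minnum``, the ``llvm.minimum`` rules propagate NaNs and order the two zeros. The following Python sketch models that behavior for scalars; ``minimum_model`` is a hypothetical name used only for illustration, not an LLVM API.

```python
import math

def minimum_model(x: float, y: float) -> float:
    """Hypothetical model of the llvm.minimum rules for scalar operands."""
    if math.isnan(x) or math.isnan(y):
        # Any NaN operand makes the result NaN.
        return math.nan
    if x == y == 0.0:
        # -0.0 is treated as less than +0.0, so prefer the negative zero.
        return x if math.copysign(1.0, x) < 0 else y
    return x if x < y else y
```

With this model, ``minimum_model(0.0, -0.0)`` returns a zero whose sign bit is set, and any call with a NaN operand returns NaN.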
12270 '``llvm.maximum.*``' Intrinsic | |
12271 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
12272 | |
12273 Syntax: | |
12274 """"""" | |
12275 | |
12276 This is an overloaded intrinsic. You can use ``llvm.maximum`` on any | |
12277 floating-point or vector of floating-point type. Not all targets support | |
12278 all types however. | |
12279 | |
12280 :: | |
12281 | |
12282 declare float @llvm.maximum.f32(float %Val0, float %Val1) | |
12283 declare double @llvm.maximum.f64(double %Val0, double %Val1) | |
12284 declare x86_fp80 @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1) | |
12285 declare fp128 @llvm.maximum.f128(fp128 %Val0, fp128 %Val1) | |
12286 declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1) | |
12287 | |
12288 Overview: | |
12289 """"""""" | |
12290 | |
12291 The '``llvm.maximum.*``' intrinsics return the maximum of the two | |
12292 arguments, propagating NaNs and treating -0.0 as less than +0.0. | |
12293 | |
12294 | |
12295 Arguments: | |
12296 """""""""" | |
12297 | |
12298 The arguments and return value are floating-point numbers of the same | |
12299 type. | |
12300 | |
12301 Semantics: | |
12302 """""""""" | |
12303 If either operand is a NaN, returns NaN. Otherwise returns the greater | |
12304 of the two arguments. -0.0 is considered to be less than +0.0 for this | |
12305 intrinsic. Note that these are the semantics specified in the draft of | |
12306 IEEE 754-2018. | |
11044 | 12307 |
11045 '``llvm.copysign.*``' Intrinsic | 12308 '``llvm.copysign.*``' Intrinsic |
11046 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 12309 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
11047 | 12310 |
11048 Syntax: | 12311 Syntax: |
11049 """"""" | 12312 """"""" |
11050 | 12313 |
11051 This is an overloaded intrinsic. You can use ``llvm.copysign`` on any | 12314 This is an overloaded intrinsic. You can use ``llvm.copysign`` on any |
11052 floating point or vector of floating point type. Not all targets support | 12315 floating-point or vector of floating-point type. Not all targets support |
11053 all types however. | 12316 all types however. |
11054 | 12317 |
11055 :: | 12318 :: |
11056 | 12319 |
11057 declare float @llvm.copysign.f32(float %Mag, float %Sgn) | 12320 declare float @llvm.copysign.f32(float %Mag, float %Sgn) |
11067 first operand and the sign of the second operand. | 12330 first operand and the sign of the second operand. |
11068 | 12331 |
11069 Arguments: | 12332 Arguments: |
11070 """""""""" | 12333 """""""""" |
11071 | 12334 |
11072 The arguments and return value are floating point numbers of the same | 12335 The arguments and return value are floating-point numbers of the same |
11073 type. | 12336 type. |
11074 | 12337 |
11075 Semantics: | 12338 Semantics: |
11076 """""""""" | 12339 """""""""" |
11077 | 12340 |
11083 | 12346 |
11084 Syntax: | 12347 Syntax: |
11085 """"""" | 12348 """"""" |
11086 | 12349 |
11087 This is an overloaded intrinsic. You can use ``llvm.floor`` on any | 12350 This is an overloaded intrinsic. You can use ``llvm.floor`` on any |
11088 floating point or vector of floating point type. Not all targets support | 12351 floating-point or vector of floating-point type. Not all targets support |
11089 all types however. | 12352 all types however. |
11090 | 12353 |
11091 :: | 12354 :: |
11092 | 12355 |
11093 declare float @llvm.floor.f32(float %Val) | 12356 declare float @llvm.floor.f32(float %Val) |
11102 The '``llvm.floor.*``' intrinsics return the floor of the operand. | 12365 The '``llvm.floor.*``' intrinsics return the floor of the operand. |
11103 | 12366 |
11104 Arguments: | 12367 Arguments: |
11105 """""""""" | 12368 """""""""" |
11106 | 12369 |
11107 The argument and return value are floating point numbers of the same | 12370 The argument and return value are floating-point numbers of the same |
11108 type. | 12371 type. |
11109 | 12372 |
11110 Semantics: | 12373 Semantics: |
11111 """""""""" | 12374 """""""""" |
11112 | 12375 |
11118 | 12381 |
11119 Syntax: | 12382 Syntax: |
11120 """"""" | 12383 """"""" |
11121 | 12384 |
11122 This is an overloaded intrinsic. You can use ``llvm.ceil`` on any | 12385 This is an overloaded intrinsic. You can use ``llvm.ceil`` on any |
11123 floating point or vector of floating point type. Not all targets support | 12386 floating-point or vector of floating-point type. Not all targets support |
11124 all types however. | 12387 all types however. |
11125 | 12388 |
11126 :: | 12389 :: |
11127 | 12390 |
11128 declare float @llvm.ceil.f32(float %Val) | 12391 declare float @llvm.ceil.f32(float %Val) |
11137 The '``llvm.ceil.*``' intrinsics return the ceiling of the operand. | 12400 The '``llvm.ceil.*``' intrinsics return the ceiling of the operand. |
11138 | 12401 |
11139 Arguments: | 12402 Arguments: |
11140 """""""""" | 12403 """""""""" |
11141 | 12404 |
11142 The argument and return value are floating point numbers of the same | 12405 The argument and return value are floating-point numbers of the same |
11143 type. | 12406 type. |
11144 | 12407 |
11145 Semantics: | 12408 Semantics: |
11146 """""""""" | 12409 """""""""" |
11147 | 12410 |
11153 | 12416 |
11154 Syntax: | 12417 Syntax: |
11155 """"""" | 12418 """"""" |
11156 | 12419 |
11157 This is an overloaded intrinsic. You can use ``llvm.trunc`` on any | 12420 This is an overloaded intrinsic. You can use ``llvm.trunc`` on any |
11158 floating point or vector of floating point type. Not all targets support | 12421 floating-point or vector of floating-point type. Not all targets support |
11159 all types however. | 12422 all types however. |
11160 | 12423 |
11161 :: | 12424 :: |
11162 | 12425 |
11163 declare float @llvm.trunc.f32(float %Val) | 12426 declare float @llvm.trunc.f32(float %Val) |
11173 nearest integer not larger in magnitude than the operand. | 12436 nearest integer not larger in magnitude than the operand. |
11174 | 12437 |
11175 Arguments: | 12438 Arguments: |
11176 """""""""" | 12439 """""""""" |
11177 | 12440 |
11178 The argument and return value are floating point numbers of the same | 12441 The argument and return value are floating-point numbers of the same |
11179 type. | 12442 type. |
11180 | 12443 |
11181 Semantics: | 12444 Semantics: |
11182 """""""""" | 12445 """""""""" |
11183 | 12446 |
11189 | 12452 |
11190 Syntax: | 12453 Syntax: |
11191 """"""" | 12454 """"""" |
11192 | 12455 |
11193 This is an overloaded intrinsic. You can use ``llvm.rint`` on any | 12456 This is an overloaded intrinsic. You can use ``llvm.rint`` on any |
11194 floating point or vector of floating point type. Not all targets support | 12457 floating-point or vector of floating-point type. Not all targets support |
11195 all types however. | 12458 all types however. |
11196 | 12459 |
11197 :: | 12460 :: |
11198 | 12461 |
11199 declare float @llvm.rint.f32(float %Val) | 12462 declare float @llvm.rint.f32(float %Val) |
11210 operand isn't an integer. | 12473 operand isn't an integer. |
11211 | 12474 |
11212 Arguments: | 12475 Arguments: |
11213 """""""""" | 12476 """""""""" |
11214 | 12477 |
11215 The argument and return value are floating point numbers of the same | 12478 The argument and return value are floating-point numbers of the same |
11216 type. | 12479 type. |
11217 | 12480 |
11218 Semantics: | 12481 Semantics: |
11219 """""""""" | 12482 """""""""" |
11220 | 12483 |
11226 | 12489 |
11227 Syntax: | 12490 Syntax: |
11228 """"""" | 12491 """"""" |
11229 | 12492 |
11230 This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any | 12493 This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any |
11231 floating point or vector of floating point type. Not all targets support | 12494 floating-point or vector of floating-point type. Not all targets support |
11232 all types however. | 12495 all types however. |
11233 | 12496 |
11234 :: | 12497 :: |
11235 | 12498 |
11236 declare float @llvm.nearbyint.f32(float %Val) | 12499 declare float @llvm.nearbyint.f32(float %Val) |
11246 nearest integer. | 12509 nearest integer. |
11247 | 12510 |
11248 Arguments: | 12511 Arguments: |
11249 """""""""" | 12512 """""""""" |
11250 | 12513 |
11251 The argument and return value are floating point numbers of the same | 12514 The argument and return value are floating-point numbers of the same |
11252 type. | 12515 type. |
11253 | 12516 |
11254 Semantics: | 12517 Semantics: |
11255 """""""""" | 12518 """""""""" |
11256 | 12519 |
11262 | 12525 |
11263 Syntax: | 12526 Syntax: |
11264 """"""" | 12527 """"""" |
11265 | 12528 |
11266 This is an overloaded intrinsic. You can use ``llvm.round`` on any | 12529 This is an overloaded intrinsic. You can use ``llvm.round`` on any |
11267 floating point or vector of floating point type. Not all targets support | 12530 floating-point or vector of floating-point type. Not all targets support |
11268 all types however. | 12531 all types however. |
11269 | 12532 |
11270 :: | 12533 :: |
11271 | 12534 |
11272 declare float @llvm.round.f32(float %Val) | 12535 declare float @llvm.round.f32(float %Val) |
11282 nearest integer. | 12545 nearest integer. |
11283 | 12546 |
11284 Arguments: | 12547 Arguments: |
11285 """""""""" | 12548 """""""""" |
11286 | 12549 |
11287 The argument and return value are floating point numbers of the same | 12550 The argument and return value are floating-point numbers of the same |
11288 type. | 12551 type. |
11289 | 12552 |
11290 Semantics: | 12553 Semantics: |
11291 """""""""" | 12554 """""""""" |
11292 | 12555 |
11293 This function returns the same values as the libm ``round`` | 12556 This function returns the same values as the libm ``round`` |
11294 functions would, and handles error conditions in the same way. | 12557 functions would, and handles error conditions in the same way. |
12558 | |
12559 '``llvm.lround.*``' Intrinsic | |
12560 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
12561 | |
12562 Syntax: | |
12563 """"""" | |
12564 | |
12565 This is an overloaded intrinsic. You can use ``llvm.lround`` on any | |
12566 floating-point type. Not all targets support all types however. | |
12567 | |
12568 :: | |
12569 | |
12570 declare i32 @llvm.lround.i32.f32(float %Val) | |
12571 declare i32 @llvm.lround.i32.f64(double %Val) | |
12572 declare i32 @llvm.lround.i32.f80(x86_fp80 %Val) | |
12573 declare i32 @llvm.lround.i32.f128(fp128 %Val) | |
12574 declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val) | |
12575 | |
12576 declare i64 @llvm.lround.i64.f32(float %Val) | |
12577 declare i64 @llvm.lround.i64.f64(double %Val) | |
12578 declare i64 @llvm.lround.i64.f80(x86_fp80 %Val) | |
12579 declare i64 @llvm.lround.i64.f128(fp128 %Val) | |
12580 declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val) | |
12581 | |
12582 Overview: | |
12583 """"""""" | |
12584 | |
12585 The '``llvm.lround.*``' intrinsics return the operand rounded to the | |
12586 nearest integer, with ties rounded away from zero. | |
12587 | |
12588 Arguments: | |
12589 """""""""" | |
12590 | |
12591 The argument is a floating-point number and the return value has an integer type. | |
12592 | |
12593 Semantics: | |
12594 """""""""" | |
12595 | |
12596 This function returns the same values as the libm ``lround`` | |
12597 functions would, but without setting errno. | |
12598 | |
12599 '``llvm.llround.*``' Intrinsic | |
12600 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
12601 | |
12602 Syntax: | |
12603 """"""" | |
12604 | |
12605 This is an overloaded intrinsic. You can use ``llvm.llround`` on any | |
12606 floating-point type. Not all targets support all types however. | |
12607 | |
12608 :: | |
12609 | |
12610 declare i64 @llvm.llround.i64.f32(float %Val) | |
12611 declare i64 @llvm.llround.i64.f64(double %Val) | |
12612 declare i64 @llvm.llround.i64.f80(x86_fp80 %Val) | |
12613 declare i64 @llvm.llround.i64.f128(fp128 %Val) | |
12614 declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val) | |
12615 | |
12616 Overview: | |
12617 """"""""" | |
12618 | |
12619 The '``llvm.llround.*``' intrinsics return the operand rounded to the | |
12620 nearest integer, with ties rounded away from zero. | |
12621 | |
12622 Arguments: | |
12623 """""""""" | |
12624 | |
12625 The argument is a floating-point number and the return value has an integer type. | |
12626 | |
12627 Semantics: | |
12628 """""""""" | |
12629 | |
12630 This function returns the same values as the libm ``llround`` | |
12631 functions would, but without setting errno. | |
12632 | |
12633 '``llvm.lrint.*``' Intrinsic | |
12634 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
12635 | |
12636 Syntax: | |
12637 """"""" | |
12638 | |
12639 This is an overloaded intrinsic. You can use ``llvm.lrint`` on any | |
12640 floating-point type. Not all targets support all types however. | |
12641 | |
12642 :: | |
12643 | |
12644 declare i32 @llvm.lrint.i32.f32(float %Val) | |
12645 declare i32 @llvm.lrint.i32.f64(double %Val) | |
12646 declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val) | |
12647 declare i32 @llvm.lrint.i32.f128(fp128 %Val) | |
12648 declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val) | |
12649 | |
12650 declare i64 @llvm.lrint.i64.f32(float %Val) | |
12651 declare i64 @llvm.lrint.i64.f64(double %Val) | |
12652 declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val) | |
12653 declare i64 @llvm.lrint.i64.f128(fp128 %Val) | |
12654 declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val) | |
12655 | |
12656 Overview: | |
12657 """"""""" | |
12658 | |
12659 The '``llvm.lrint.*``' intrinsics return the operand rounded to the | |
12660 nearest integer. | |
12661 | |
12662 Arguments: | |
12663 """""""""" | |
12664 | |
12665 The argument is a floating-point number and the return value has an integer type. | |
12666 | |
12667 Semantics: | |
12668 """""""""" | |
12669 | |
12670 This function returns the same values as the libm ``lrint`` | |
12671 functions would, but without setting errno. | |
12672 | |
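Because both ``llvm.lround`` and ``llvm.lrint`` defer to their libm counterparts, the two differ only in how halfway cases are handled: libm ``lround`` rounds ties away from zero, while ``lrint`` honors the current rounding mode, which is round-to-nearest-even by default. The Python sketch below models that difference for finite inputs. The function names are hypothetical, and the fixed result width (``i32`` or ``i64``) is not modeled.

```python
import math
from fractions import Fraction

def lround_model(x: float) -> int:
    """Round to nearest, ties away from zero (libm lround behavior)."""
    # Fraction keeps the arithmetic exact, so adding one half never
    # rounds the intermediate sum the way float addition could.
    magnitude = math.floor(Fraction(abs(x)) + Fraction(1, 2))
    return -magnitude if x < 0 else magnitude

def lrint_model(x: float) -> int:
    """Round to nearest, ties to even (lrint in the default rounding mode)."""
    # Python's built-in round() on a float uses round-half-to-even.
    return round(x)
```

A halfway input such as ``2.5`` shows the difference: ``lround_model(2.5)`` gives 3, whereas ``lrint_model(2.5)`` gives 2.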
12673 '``llvm.llrint.*``' Intrinsic | |
12674 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
12675 | |
12676 Syntax: | |
12677 """"""" | |
12678 | |
12679 This is an overloaded intrinsic. You can use ``llvm.llrint`` on any | |
12680 floating-point type. Not all targets support all types however. | |
12681 | |
12682 :: | |
12683 | |
12684 declare i64 @llvm.llrint.i64.f32(float %Val) | |
12685 declare i64 @llvm.llrint.i64.f64(double %Val) | |
12686 declare i64 @llvm.llrint.i64.f80(x86_fp80 %Val) | |
12687 declare i64 @llvm.llrint.i64.f128(fp128 %Val) | |
12688 declare i64 @llvm.llrint.i64.ppcf128(ppc_fp128 %Val) | |
12689 | |
12690 Overview: | |
12691 """"""""" | |
12692 | |
12693 The '``llvm.llrint.*``' intrinsics return the operand rounded to the | |
12694 nearest integer. | |
12695 | |
12696 Arguments: | |
12697 """""""""" | |
12698 | |
12699 The argument is a floating-point number and the return value has an integer type. | |
12700 | |
12701 Semantics: | |
12702 """""""""" | |
12703 | |
12704 This function returns the same values as the libm ``llrint`` | |
12705 functions would, but without setting errno. | |
11295 | 12706 |
11296 Bit Manipulation Intrinsics | 12707 Bit Manipulation Intrinsics |
11297 --------------------------- | 12708 --------------------------- |
11298 | 12709 |
11299 LLVM provides intrinsics for a few important bit manipulation | 12710 LLVM provides intrinsics for a few important bit manipulation |
11311 :: | 12722 :: |
11312 | 12723 |
11313 declare i16 @llvm.bitreverse.i16(i16 <id>) | 12724 declare i16 @llvm.bitreverse.i16(i16 <id>) |
11314 declare i32 @llvm.bitreverse.i32(i32 <id>) | 12725 declare i32 @llvm.bitreverse.i32(i32 <id>) |
11315 declare i64 @llvm.bitreverse.i64(i64 <id>) | 12726 declare i64 @llvm.bitreverse.i64(i64 <id>) |
12727 declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>) | |
11316 | 12728 |
11317 Overview: | 12729 Overview: |
11318 """"""""" | 12730 """"""""" |
11319 | 12731 |
11320 The '``llvm.bitreverse``' family of intrinsics is used to reverse the | 12732 The '``llvm.bitreverse``' family of intrinsics is used to reverse the |
11321 bitpattern of an integer value; for example ``0b10110110`` becomes | 12733 bitpattern of an integer value or vector of integer values; for example |
11322 ``0b01101101``. | 12734 ``0b10110110`` becomes ``0b01101101``. |
11323 | 12735 |
11324 Semantics: | 12736 Semantics: |
11325 """""""""" | 12737 """""""""" |
11326 | 12738 |
11327 The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit | 12739 The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit |
11328 ``M`` in the input moved to bit ``N-M-1`` in the output. | 12740 ``M`` in the input moved to bit ``N-M-1`` in the output. The vector |
12741 intrinsics, such as ``llvm.bitreverse.v4i32``, operate on a per-element | |
12742 basis and the element order is not affected. | |
11329 | 12743 |
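The bit mapping can be checked with a small model. This Python sketch assumes zero-based bit numbering, so that bit ``M`` of an ``N``-bit value lands at bit ``N-1-M``; ``bitreverse_model`` is a hypothetical helper, not part of LLVM.

```python
def bitreverse_model(value: int, bits: int) -> int:
    """Reverse the low `bits` bits of `value` (model of llvm.bitreverse.iN)."""
    result = 0
    for m in range(bits):
        if (value >> m) & 1:
            # Bit m of the input becomes bit (bits - 1 - m) of the output.
            result |= 1 << (bits - 1 - m)
    return result

def bitreverse_vector_model(elements: list[int], bits: int) -> list[int]:
    """Vector form: each element is reversed; element order is unchanged."""
    return [bitreverse_model(e, bits) for e in elements]
```

Applied to the 8-bit pattern from the overview, ``bitreverse_model(0b10110110, 8)`` produces ``0b01101101``.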
11330 '``llvm.bswap.*``' Intrinsics | 12744 '``llvm.bswap.*``' Intrinsics |
11331 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 12745 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
11332 | 12746 |
11333 Syntax: | 12747 Syntax: |
11339 :: | 12753 :: |
11340 | 12754 |
11341 declare i16 @llvm.bswap.i16(i16 <id>) | 12755 declare i16 @llvm.bswap.i16(i16 <id>) |
11342 declare i32 @llvm.bswap.i32(i32 <id>) | 12756 declare i32 @llvm.bswap.i32(i32 <id>) |
11343 declare i64 @llvm.bswap.i64(i64 <id>) | 12757 declare i64 @llvm.bswap.i64(i64 <id>) |
11344 | 12758 declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>) |
11345 Overview: | 12759 |
11346 """"""""" | 12760 Overview: |
11347 | 12761 """"""""" |
11348 The '``llvm.bswap``' family of intrinsics is used to byte swap integer | 12762 |
11349 values with an even number of bytes (positive multiple of 16 bits). | 12763 The '``llvm.bswap``' family of intrinsics is used to byte swap an integer |
11350 These are useful for performing operations on data that is not in the | 12764 value or vector of integer values with an even number of bytes (positive |
11351 target's native byte order. | 12765 multiple of 16 bits). |
11352 | 12766 |
11353 Semantics: | 12767 Semantics: |
11354 """""""""" | 12768 """""""""" |
11355 | 12769 |
11356 The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high | 12770 The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high |
11358 intrinsic returns an i32 value that has the four bytes of the input i32 | 12772 intrinsic returns an i32 value that has the four bytes of the input i32 |
11359 swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the | 12773 swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the |
11360 returned i32 will have its bytes in 3, 2, 1, 0 order. The | 12774 returned i32 will have its bytes in 3, 2, 1, 0 order. The |
11361 ``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this | 12775 ``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this |
11362 concept to additional even-byte lengths (6 bytes, 8 bytes and more, | 12776 concept to additional even-byte lengths (6 bytes, 8 bytes and more, |
11363 respectively). | 12777 respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``, |
12778 operate on a per-element basis and the element order is not affected. | |
11364 | 12779 |
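The byte reordering described above can be sketched as follows. This is an illustrative Python model under the stated width restriction (a positive multiple of 16 bits); ``bswap_model`` is a hypothetical name and not an LLVM API.

```python
def bswap_model(value: int, bits: int) -> int:
    """Reverse the byte order of the low `bits` bits of `value`."""
    if bits <= 0 or bits % 16 != 0:
        raise ValueError("bit width must be a positive multiple of 16")
    masked = value & ((1 << bits) - 1)
    # Serialize least-significant byte first, then reinterpret the same
    # bytes most-significant first, which reverses their order.
    return int.from_bytes(masked.to_bytes(bits // 8, "little"), "big")

def bswap_vector_model(elements: list[int], bits: int) -> list[int]:
    """Vector form: swap bytes within each element, keep element order."""
    return [bswap_model(e, bits) for e in elements]
```

For instance, ``bswap_model(0x11223344, 32)`` gives ``0x44332211``, and the same function handles the 48-bit case mentioned in the text.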
11365 '``llvm.ctpop.*``' Intrinsic | 12780 '``llvm.ctpop.*``' Intrinsic |
11366 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 12781 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
11367 | 12782 |
11368 Syntax: | 12783 Syntax: |
11493 then the result is the size in bits of the type of ``src`` if | 12908 then the result is the size in bits of the type of ``src`` if |
11494 ``is_zero_undef == 0`` and ``undef`` otherwise. For example, | 12909 ``is_zero_undef == 0`` and ``undef`` otherwise. For example, |
11495 ``llvm.cttz(2) = 1``. | 12910 ``llvm.cttz(2) = 1``. |
11496 | 12911 |
11497 .. _int_overflow: | 12912 .. _int_overflow: |
12913 | |
12914 '``llvm.fshl.*``' Intrinsic | |
12915 ^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
12916 | |
12917 Syntax: | |
12918 """"""" | |
12919 | |
12920 This is an overloaded intrinsic. You can use ``llvm.fshl`` on any | |
12921 integer bit width or any vector of integer elements. Not all targets | |
12922 support all bit widths or vector types, however. | |
12923 | |
12924 :: | |
12925 | |
12926 declare i8 @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c) | |
12927 declare i67 @llvm.fshl.i67(i67 %a, i67 %b, i67 %c) | |
12928 declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c) | |
12929 | |
12930 Overview: | |
12931 """"""""" | |
12932 | |
12933 The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left: | |
12934 the first two values are concatenated as { %a : %b } (%a is the most significant | |
12935 bits of the wide value), the combined value is shifted left, and the most | |
12936 significant bits are extracted to produce a result that is the same size as the | |
12937 original arguments. If the first 2 arguments are identical, this is equivalent | |
12938 to a rotate left operation. For vector types, the operation occurs for each | |
12939 element of the vector. The shift argument is treated as an unsigned amount | |
12940 modulo the element size of the arguments. | |
12941 | |
12942 Arguments: | |
12943 """""""""" | |
12944 | |
12945 The first two arguments are the values to be concatenated. The third | |
12946 argument is the shift amount. The arguments may be any integer type or a | |
12947 vector with integer element type. All arguments and the return value must | |
12948 have the same type. | |
12949 | |
12950 Example: | |
12951 """""""" | |
12952 | |
12953 .. code-block:: text | |
12954 | |
12955 %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z) ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8) | |
12956 %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15) ; %r = i8: 128 (0b10000000) | |
12957 %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11) ; %r = i8: 120 (0b01111000) | |
12958 %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8) ; %r = i8: 0 (0b00000000) | |
12959 | |
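The funnel shift left described above can be modelled outside of LLVM. The following is a minimal Python sketch rather than LLVM's implementation; the helper name ``fshl`` and its ``width`` parameter are assumptions made for illustration, and operands are treated as unsigned bit patterns.

```python
def fshl(a: int, b: int, c: int, width: int = 8) -> int:
    """Illustrative model of llvm.fshl on unsigned `width`-bit values."""
    mask = (1 << width) - 1
    shift = c % width                          # shift amount is taken modulo the element size
    wide = ((a & mask) << width) | (b & mask)  # concat(a, b): a supplies the high half
    shifted = (wide << shift) & ((1 << (2 * width)) - 1)  # keep the value 2*width bits wide
    return shifted >> width                    # extract the most significant `width` bits


print(fshl(255, 0, 15))  # 128
print(fshl(15, 15, 11))  # 120
print(fshl(0, 255, 8))   # 0
```

The three calls mirror the constant examples in the text, so the model can be checked against them directly.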
12960 '``llvm.fshr.*``' Intrinsic | |
12961 ^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
12962 | |
12963 Syntax: | |
12964 """"""" | |
12965 | |
12966 This is an overloaded intrinsic. You can use ``llvm.fshr`` on any | |
12967 integer bit width or any vector of integer elements. Not all targets | |
12968 support all bit widths or vector types, however. | |
12969 | |
12970 :: | |
12971 | |
12972 declare i8 @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c) | |
12973 declare i67 @llvm.fshr.i67(i67 %a, i67 %b, i67 %c) | |
12974 declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c) | |
12975 | |
12976 Overview: | |
12977 """"""""" | |
12978 | |
12979 The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right: | |
12980 the first two values are concatenated as { %a : %b } (%a is the most significant | |
12981 bits of the wide value), the combined value is shifted right, and the least | |
12982 significant bits are extracted to produce a result that is the same size as the | |
12983 original arguments. If the first 2 arguments are identical, this is equivalent | |
12984 to a rotate right operation. For vector types, the operation occurs for each | |
12985 element of the vector. The shift argument is treated as an unsigned amount | |
12986 modulo the element size of the arguments. | |
12987 | |
12988 Arguments: | |
12989 """""""""" | |
12990 | |
12991 The first two arguments are the values to be concatenated. The third | |
12992 argument is the shift amount. The arguments may be any integer type or a | |
12993 vector with integer element type. All arguments and the return value must | |
12994 have the same type. | |
12995 | |
12996 Example: | |
12997 """""""" | |
12998 | |
12999 .. code-block:: text | |
13000 | |
13001 %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z) ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8) | |
13002 %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15) ; %r = i8: 254 (0b11111110) | |
13003 %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11) ; %r = i8: 225 (0b11100001) | |
13004 %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8) ; %r = i8: 255 (0b11111111) | |
11498 | 13005 |
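As a rough cross-check of the funnel shift right semantics, the Python sketch below models the operation on unsigned bit patterns and compares it with a plain rotate when both inputs are equal. The names ``fshr`` and ``rotr`` and the ``width`` parameter are hypothetical helpers, not part of LLVM.

```python
def fshr(a: int, b: int, c: int, width: int = 8) -> int:
    """Illustrative model of llvm.fshr on unsigned `width`-bit values."""
    mask = (1 << width) - 1
    shift = c % width                          # shift amount is taken modulo the element size
    wide = ((a & mask) << width) | (b & mask)  # concat(a, b): a supplies the high half
    return (wide >> shift) & mask              # extract the least significant `width` bits


def rotr(x: int, n: int, width: int = 8) -> int:
    """Plain rotate right, used only to compare against fshr(x, x, n)."""
    mask = (1 << width) - 1
    n %= width
    return ((x >> n) | (x << (width - n))) & mask


print(fshr(255, 0, 15))  # 254
print(fshr(15, 15, 11))  # 225
print(fshr(0, 255, 8))   # 255
print(all(fshr(x, x, n) == rotr(x, n) for x in range(256) for n in range(16)))  # True
```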
11499 Arithmetic with Overflow Intrinsics | 13006 Arithmetic with Overflow Intrinsics |
11500 ----------------------------------- | 13007 ----------------------------------- |
11501 | 13008 |
11502 LLVM provides intrinsics for fast arithmetic overflow checking. | 13009 LLVM provides intrinsics for fast arithmetic overflow checking. |
11525 | 13032 |
11526 Syntax: | 13033 Syntax: |
11527 """"""" | 13034 """"""" |
11528 | 13035 |
11529 This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow`` | 13036 This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow`` |
11530 on any integer bit width. | 13037 on any integer bit width or vectors of integers. |
11531 | 13038 |
11532 :: | 13039 :: |
11533 | 13040 |
11534 declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b) | 13041 declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b) |
11535 declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b) | 13042 declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b) |
11536 declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b) | 13043 declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b) |
13044 declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b) | |
11537 | 13045 |
11538 Overview: | 13046 Overview: |
11539 """"""""" | 13047 """"""""" |
11540 | 13048 |
11541 The '``llvm.sadd.with.overflow``' family of intrinsic functions perform | 13049 The '``llvm.sadd.with.overflow``' family of intrinsic functions perform |
11575 | 13083 |
11576 Syntax: | 13084 Syntax: |
11577 """"""" | 13085 """"""" |
11578 | 13086 |
11579 This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow`` | 13087 This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow`` |
11580 on any integer bit width. | 13088 on any integer bit width or vectors of integers. |
11581 | 13089 |
11582 :: | 13090 :: |
11583 | 13091 |
11584 declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b) | 13092 declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b) |
11585 declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b) | 13093 declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b) |
11586 declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b) | 13094 declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b) |
13095 declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b) | |
11587 | 13096 |
11588 Overview: | 13097 Overview: |
11589 """"""""" | 13098 """"""""" |
11590 | 13099 |
11591 The '``llvm.uadd.with.overflow``' family of intrinsic functions perform | 13100 The '``llvm.uadd.with.overflow``' family of intrinsic functions perform |
11624 | 13133 |
11625 Syntax: | 13134 Syntax: |
11626 """"""" | 13135 """"""" |
11627 | 13136 |
11628 This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow`` | 13137 This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow`` |
11629 on any integer bit width. | 13138 on any integer bit width or vectors of integers. |
11630 | 13139 |
11631 :: | 13140 :: |
11632 | 13141 |
11633 declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b) | 13142 declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b) |
11634 declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b) | 13143 declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b) |
11635 declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b) | 13144 declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b) |
13145 declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b) | |
11636 | 13146 |
11637 Overview: | 13147 Overview: |
11638 """"""""" | 13148 """"""""" |
11639 | 13149 |
11640 The '``llvm.ssub.with.overflow``' family of intrinsic functions perform | 13150 The '``llvm.ssub.with.overflow``' family of intrinsic functions perform |
11674 | 13184 |
11675 Syntax: | 13185 Syntax: |
11676 """"""" | 13186 """"""" |
11677 | 13187 |
11678 This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow`` | 13188 This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow`` |
11679 on any integer bit width. | 13189 on any integer bit width or vectors of integers. |
11680 | 13190 |
11681 :: | 13191 :: |
11682 | 13192 |
11683 declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b) | 13193 declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b) |
11684 declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b) | 13194 declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b) |
11685 declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b) | 13195 declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b) |
13196 declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b) | |
11686 | 13197 |
11687 Overview: | 13198 Overview: |
11688 """"""""" | 13199 """"""""" |
11689 | 13200 |
11690 The '``llvm.usub.with.overflow``' family of intrinsic functions perform | 13201 The '``llvm.usub.with.overflow``' family of intrinsic functions perform |
11724 | 13235 |
11725 Syntax: | 13236 Syntax: |
11726 """"""" | 13237 """"""" |
11727 | 13238 |
11728 This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow`` | 13239 This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow`` |
11729 on any integer bit width. | 13240 on any integer bit width or vectors of integers. |
11730 | 13241 |
11731 :: | 13242 :: |
11732 | 13243 |
11733 declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b) | 13244 declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b) |
11734 declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b) | 13245 declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b) |
11735 declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b) | 13246 declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b) |
13247 declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b) | |
11736 | 13248 |
11737 Overview: | 13249 Overview: |
11738 """"""""" | 13250 """"""""" |
11739 | 13251 |
11740 The '``llvm.smul.with.overflow``' family of intrinsic functions perform | 13252 The '``llvm.smul.with.overflow``' family of intrinsic functions perform |
11774 | 13286 |
11775 Syntax: | 13287 Syntax: |
11776 """"""" | 13288 """"""" |
11777 | 13289 |
11778 This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow`` | 13290 This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow`` |
11779 on any integer bit width. | 13291 on any integer bit width or vectors of integers. |
11780 | 13292 |
11781 :: | 13293 :: |
11782 | 13294 |
11783 declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b) | 13295 declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b) |
11784 declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b) | 13296 declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b) |
11785 declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b) | 13297 declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b) |
13298 declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b) | |
11786 | 13299 |
11787 Overview: | 13300 Overview: |
11788 """"""""" | 13301 """"""""" |
11789 | 13302 |
11790 The '``llvm.umul.with.overflow``' family of intrinsic functions perform | 13303 The '``llvm.umul.with.overflow``' family of intrinsic functions perform |
11817 %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b) | 13330 %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b) |
11818 %sum = extractvalue {i32, i1} %res, 0 | 13331 %sum = extractvalue {i32, i1} %res, 0 |
11819 %obit = extractvalue {i32, i1} %res, 1 | 13332 %obit = extractvalue {i32, i1} %res, 1 |
11820 br i1 %obit, label %overflow, label %normal | 13333 br i1 %obit, label %overflow, label %normal |
11821 | 13334 |
13335 Saturation Arithmetic Intrinsics | |
13336 --------------------------------- | |
13337 | |
13338 Saturation arithmetic is a version of arithmetic in which operations are | |
13339 limited to a fixed range between a minimum and maximum value. If the result of | |
13340 an operation is greater than the maximum value, the result is set (or | |
13341 "clamped") to this maximum. If it is below the minimum, it is clamped to this | |
13342 minimum. | |
13343 | |
13344 | |
13345 '``llvm.sadd.sat.*``' Intrinsics | |
13346 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
13347 | |
13348 Syntax | |
13349 """"""" | |
13350 | |
13351 This is an overloaded intrinsic. You can use ``llvm.sadd.sat`` | |
13352 on any integer bit width or vectors of integers. | |
13353 | |
13354 :: | |
13355 | |
13356 declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b) | |
13357 declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b) | |
13358 declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b) | |
13359 declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b) | |
13360 | |
13361 Overview | |
13362 """"""""" | |
13363 | |
13364 The '``llvm.sadd.sat``' family of intrinsic functions perform signed | |
13365 saturation addition on the 2 arguments. | |
13366 | |
13367 Arguments | |
13368 """""""""" | |
13369 | |
13370 The arguments (%a and %b) and the result may be of integer types of any bit | |
13371 width, but they must have the same bit width. ``%a`` and ``%b`` are the two | |
13372 values that will undergo signed addition. | |
13373 | |
13374 Semantics: | |
13375 """""""""" | |
13376 | |
13377 The maximum value this operation can clamp to is the largest signed value | |
13378 representable by the bit width of the arguments. The minimum value is the | |
13379 smallest signed value representable by this bit width. | |
13380 | |
13381 | |
13382 Examples | |
13383 """"""""" | |
13384 | |
13385 .. code-block:: llvm | |
13386 | |
13387 %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2) ; %res = 3 | |
13388 %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6) ; %res = 7 | |
13389 %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2) ; %res = -2 | |
13390 %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5) ; %res = -8 | |
13391 | |
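The clamping rule for signed saturating addition can be written down in a few lines. This Python sketch is only a model of the semantics, assuming operands are given as signed integers already within range; ``sadd_sat`` and ``width`` are names chosen for illustration.

```python
def sadd_sat(a: int, b: int, width: int = 4) -> int:
    """Illustrative model of llvm.sadd.sat for signed `width`-bit operands."""
    lo = -(1 << (width - 1))        # smallest signed value, e.g. -8 for i4
    hi = (1 << (width - 1)) - 1     # largest signed value, e.g. 7 for i4
    exact = a + b                   # Python integers do not overflow
    return max(lo, min(hi, exact))  # clamp the exact sum into [lo, hi]


print(sadd_sat(1, 2))    # 3
print(sadd_sat(5, 6))    # 7
print(sadd_sat(-4, 2))   # -2
print(sadd_sat(-4, -5))  # -8
```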
13392 | |
13393 '``llvm.uadd.sat.*``' Intrinsics | |
13394 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
13395 | |
13396 Syntax | |
13397 """"""" | |
13398 | |
13399 This is an overloaded intrinsic. You can use ``llvm.uadd.sat`` | |
13400 on any integer bit width or vectors of integers. | |
13401 | |
13402 :: | |
13403 | |
13404 declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b) | |
13405 declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b) | |
13406 declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b) | |
13407 declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b) | |
13408 | |
13409 Overview | |
13410 """"""""" | |
13411 | |
13412 The '``llvm.uadd.sat``' family of intrinsic functions perform unsigned | |
13413 saturation addition on the 2 arguments. | |
13414 | |
13415 Arguments | |
13416 """""""""" | |
13417 | |
13418 The arguments (%a and %b) and the result may be of integer types of any bit | |
13419 width, but they must have the same bit width. ``%a`` and ``%b`` are the two | |
13420 values that will undergo unsigned addition. | |
13421 | |
13422 Semantics: | |
13423 """""""""" | |
13424 | |
13425 The maximum value this operation can clamp to is the largest unsigned value | |
13426 representable by the bit width of the arguments. Because this is an unsigned | |
13427 operation, the result will never saturate towards zero. | |
13428 | |
13429 | |
13430 Examples | |
13431 """"""""" | |
13432 | |
13433 .. code-block:: llvm | |
13434 | |
13435 %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2) ; %res = 3 | |
13436 %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6) ; %res = 11 | |
13437 %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8) ; %res = 15 | |
13438 | |
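One way to picture unsigned saturating addition is as a wrapping add followed by an overflow check. The Python sketch below takes that approach; it is an assumption-laden model, not how any particular target lowers the intrinsic, and ``uadd_sat`` and ``width`` are illustrative names.

```python
def uadd_sat(a: int, b: int, width: int = 4) -> int:
    """Illustrative model of llvm.uadd.sat for unsigned `width`-bit operands."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    wrapped = (a + b) & mask  # what a plain, wrapping `add` would produce
    # Unsigned overflow happened exactly when the wrapped sum fell below an operand.
    return mask if wrapped < a else wrapped


print(uadd_sat(1, 2))  # 3
print(uadd_sat(5, 6))  # 11
print(uadd_sat(8, 8))  # 15
```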
13439 | |
13440 '``llvm.ssub.sat.*``' Intrinsics | |
13441 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
13442 | |
13443 Syntax | |
13444 """"""" | |
13445 | |
13446 This is an overloaded intrinsic. You can use ``llvm.ssub.sat`` | |
13447 on any integer bit width or vectors of integers. | |
13448 | |
13449 :: | |
13450 | |
13451 declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b) | |
13452 declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b) | |
13453 declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b) | |
13454 declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b) | |
13455 | |
13456 Overview | |
13457 """"""""" | |
13458 | |
13459 The '``llvm.ssub.sat``' family of intrinsic functions perform signed | |
13460 saturation subtraction on the 2 arguments. | |
13461 | |
13462 Arguments | |
13463 """""""""" | |
13464 | |
13465 The arguments (%a and %b) and the result may be of integer types of any bit | |
13466 width, but they must have the same bit width. ``%a`` and ``%b`` are the two | |
13467 values that will undergo signed subtraction. | |
13468 | |
13469 Semantics: | |
13470 """""""""" | |
13471 | |
13472 The maximum value this operation can clamp to is the largest signed value | |
13473 representable by the bit width of the arguments. The minimum value is the | |
13474 smallest signed value representable by this bit width. | |
13475 | |
13476 | |
13477 Examples | |
13478 """"""""" | |
13479 | |
13480 .. code-block:: llvm | |
13481 | |
13482 %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1) ; %res = 1 | |
13483 %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6) ; %res = -4 | |
13484 %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5) ; %res = -8 | |
13485 %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5) ; %res = 7 | |
13486 | |
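Because the intrinsic is also declared for vectors of integers, the sketch below applies the scalar rule lane by lane. It is a Python model under the assumption that vector overloads operate element-wise; ``ssub_sat`` and ``ssub_sat_vec`` are hypothetical helper names.

```python
def ssub_sat(a: int, b: int, width: int = 4) -> int:
    """Illustrative model of llvm.ssub.sat for signed `width`-bit operands."""
    lo = -(1 << (width - 1))
    hi = (1 << (width - 1)) - 1
    return max(lo, min(hi, a - b))  # clamp the exact difference into [lo, hi]


def ssub_sat_vec(xs, ys, width: int = 4):
    """Apply ssub_sat to each pair of lanes (assumed element-wise semantics)."""
    if len(xs) != len(ys):
        raise ValueError("vector operands must have the same length")
    return [ssub_sat(x, y, width) for x, y in zip(xs, ys)]


print(ssub_sat_vec([2, 2, -4, 4], [1, 6, 5, -5]))  # [1, -4, -8, 7]
```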
13487 | |
13488 '``llvm.usub.sat.*``' Intrinsics | |
13489 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
13490 | |
13491 Syntax | |
13492 """"""" | |
13493 | |
13494 This is an overloaded intrinsic. You can use ``llvm.usub.sat`` | |
13495 on any integer bit width or vectors of integers. | |
13496 | |
13497 :: | |
13498 | |
13499 declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b) | |
13500 declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b) | |
13501 declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b) | |
13502 declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b) | |
13503 | |
13504 Overview | |
13505 """"""""" | |
13506 | |
13507 The '``llvm.usub.sat``' family of intrinsic functions perform unsigned | |
13508 saturation subtraction on the 2 arguments. | |
13509 | |
13510 Arguments | |
13511 """""""""" | |
13512 | |
13513 The arguments (%a and %b) and the result may be of integer types of any bit | |
13514 width, but they must have the same bit width. ``%a`` and ``%b`` are the two | |
13515 values that will undergo unsigned subtraction. | |
13516 | |
13517 Semantics: | |
13518 """""""""" | |
13519 | |
13520 The minimum value this operation can clamp to is 0, which is the smallest | |
13521 unsigned value representable by the bit width of the unsigned arguments. | |
13522 Because this is an unsigned operation, the result will never saturate towards | |
13523 the largest possible value representable by this bit width. | |
13524 | |
13525 | |
13526 Examples | |
13527 """"""""" | |
13528 | |
13529 .. code-block:: llvm | |
13530 | |
13531 %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1) ; %res = 1 | |
13532 %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6) ; %res = 0 | |
13533 | |
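Unsigned saturating subtraction only ever clamps at zero, which the following Python sketch makes explicit. The helper name ``usub_sat`` is an assumption for illustration; the final line checks the identity ``usub_sat(a, b) == max(a, b) - b`` over every pair of i4 values.

```python
def usub_sat(a: int, b: int, width: int = 4) -> int:
    """Illustrative model of llvm.usub.sat for unsigned `width`-bit operands."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    return a - b if a > b else 0  # a difference below zero is clamped to 0


print(usub_sat(2, 1))  # 1
print(usub_sat(2, 6))  # 0
print(all(usub_sat(a, b) == max(a, b) - b for a in range(16) for b in range(16)))  # True
```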
13534 | |
13535 Fixed Point Arithmetic Intrinsics | |
13536 --------------------------------- | |
13537 | |
13538 A fixed point number represents a real data type for a number that has a fixed | |
13539 number of digits after a radix point (equivalent to the decimal point '.'). | |
13540 The number of digits after the radix point is referred to as the ``scale``. These | |
13541 are useful for representing fractional values to a specific precision. The | |
13542 following intrinsics perform fixed point arithmetic operations on 2 operands | |
13543 of the same scale, specified as the third argument. | |
13544 | |
13545 The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication | |
13546 of fixed point numbers through scaled integers. Therefore, fixed point | |
13547 multiplication can be represented as | |
13548 | |
13549 :: | |
13550 %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale) | |
13551 | |
13552 ; Expands to | |
13553 %a2 = sext i4 %a to i8 | |
13554 %b2 = sext i4 %b to i8 | |
13555 %mul = mul nsw nuw i8 %a2, %b2 | |
13556 %scale2 = trunc i32 %scale to i8 | |
13557 %r = ashr i8 %mul, %scale2 ; this is for a target rounding down towards negative infinity | |
13558 %result = trunc i8 %r to i4 | |
13559 | |
13560 For each of these functions, if the result cannot be represented exactly with | |
13561 the provided scale, the result is rounded. The rounding direction is not fixed | |
13562 by the IR, since the preferred rounding may vary between targets; it is instead | |
13563 specified through a target hook. Different pipelines should legalize or optimize this | |
13564 using the rounding specified by this hook if it is provided. Operations like | |
13565 constant folding, instruction combining, KnownBits, and ValueTracking should | |
13566 also use this hook, if provided, and not assume the direction of rounding. A | |
13567 rounded result must always be within one unit of precision from the true | |
13568 result. That is, the error between the returned result and the true result must | |
13569 be less than 1/2^(scale). | |
13570 | |
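The error bound stated above can be checked numerically for one rounding choice. The Python sketch below assumes rounding toward negative infinity (as in the expansion shown earlier) and ignores the final truncation to the operand width; ``smul_fix_floor`` and ``rounding_error`` are hypothetical helper names.

```python
from fractions import Fraction


def smul_fix_floor(a: int, b: int, scale: int) -> int:
    """Scaled-integer product, rounded toward negative infinity (assumed rounding)."""
    # Python integers are unbounded, so the product is exact; >> on ints floors.
    return (a * b) >> scale


def rounding_error(a: int, b: int, scale: int) -> Fraction:
    """True real-valued product minus the value represented by the result."""
    unit = Fraction(1, 1 << scale)  # one unit of precision, 1/2^scale
    true_value = (a * unit) * (b * unit)
    returned = smul_fix_floor(a, b, scale) * unit
    return true_value - returned


print(rounding_error(3, -3, 1))  # 1/4
```

For the operands 3 and -3 at scale 1 the true product is -2.25 and the returned value is -2.5, so the error of 1/4 stays below the bound of 1/2.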
13571 | |
13572 '``llvm.smul.fix.*``' Intrinsics | |
13573 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
13574 | |
13575 Syntax | |
13576 """"""" | |
13577 | |
13578 This is an overloaded intrinsic. You can use ``llvm.smul.fix`` | |
13579 on any integer bit width or vectors of integers. | |
13580 | |
13581 :: | |
13582 | |
13583 declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale) | |
13584 declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale) | |
13585 declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale) | |
13586 declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale) | |
13587 | |
13588 Overview | |
13589 """"""""" | |
13590 | |
13591 The '``llvm.smul.fix``' family of intrinsic functions perform signed | |
13592 fixed point multiplication on 2 arguments of the same scale. | |
13593 | |
13594 Arguments | |
13595 """""""""" | |
13596 | |
13597 The arguments (%a and %b) and the result may be of integer types of any bit | |
13598 width, but they must have the same bit width. The arguments may also be | |
13599 integer vectors of the same length and element width. ``%a`` and ``%b`` are the two | |
13600 values that will undergo signed fixed point multiplication. The argument | |
13601 ``%scale`` represents the scale of both operands, and must be a constant | |
13602 integer. | |
13603 | |
13604 Semantics: | |
13605 """""""""" | |
13606 | |
13607 This operation performs fixed point multiplication on the 2 arguments of a | |
13608 specified scale. The result will also be returned in the same scale specified | |
13609 in the third argument. | |
13610 | |
13611 If the result value cannot be precisely represented in the given scale, the | |
13612 value is rounded up or down to the closest representable value. The rounding | |
13613 direction is unspecified. | |
13614 | |
13615 It is undefined behavior if the result value does not fit within the range of | |
13616 the fixed point type. | |
13617 | |
13618 | |
13619 Examples | |
13620 """"""""" | |
13621 | |
13622 .. code-block:: llvm | |
13623 | |
13624 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6) | |
13625 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5) | |
13626 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 x -1 = -1.5) | |
13627 | |
13628 ; The result in the following could be rounded up to -2 or down to -2.5 | |
13629 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1) ; %res = -5 (or -4) (1.5 x -1.5 = -2.25) | |
13630 | |
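Since the rounding direction is unspecified, a given call may legitimately produce either of two neighbouring results. The Python sketch below lists both candidates; it is a model only, ``smul_fix_candidates`` is a hypothetical name, and it conservatively rejects inputs for which either candidate falls outside the signed range (the case the text describes as undefined behavior).

```python
def smul_fix_candidates(a: int, b: int, scale: int, width: int = 4) -> set:
    """Return the results allowed for llvm.smul.fix under either rounding direction."""
    lo = -(1 << (width - 1))
    hi = (1 << (width - 1)) - 1
    product = a * b                # exact, since Python integers are unbounded
    down = product >> scale        # rounded toward negative infinity
    up = -((-product) >> scale)    # rounded toward positive infinity
    if down < lo or up > hi:
        raise OverflowError("result does not fit; undefined behavior in the IR")
    return {down, up}


print(sorted(smul_fix_candidates(3, 2, 0)))   # [6]
print(sorted(smul_fix_candidates(3, 2, 1)))   # [3]
print(sorted(smul_fix_candidates(3, -2, 1)))  # [-3]
print(sorted(smul_fix_candidates(3, -3, 1)))  # [-5, -4]
```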
13631 | |
13632 '``llvm.umul.fix.*``' Intrinsics | |
13633 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
13634 | |
13635 Syntax | |
13636 """"""" | |
13637 | |
13638 This is an overloaded intrinsic. You can use ``llvm.umul.fix`` | |
13639 on any integer bit width or vectors of integers. | |
13640 | |
13641 :: | |
13642 | |
13643 declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale) | |
13644 declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale) | |
13645 declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale) | |
13646 declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale) | |
13647 | |
13648 Overview | |
13649 """"""""" | |
13650 | |
13651 The '``llvm.umul.fix``' family of intrinsic functions perform unsigned | |
13652 fixed point multiplication on 2 arguments of the same scale. | |
13653 | |
13654 Arguments | |
13655 """""""""" | |
13656 | |
13657 The arguments (%a and %b) and the result may be of integer types of any bit | |
13658 width, but they must have the same bit width. The arguments may also be | |
13659 integer vectors of the same length and element width. ``%a`` and ``%b`` are the two | |
13660 values that will undergo unsigned fixed point multiplication. The argument | |
13661 ``%scale`` represents the scale of both operands, and must be a constant | |
13662 integer. | |
13663 | |
13664 Semantics: | |
13665 """""""""" | |
13666 | |
13667 This operation performs unsigned fixed point multiplication on the 2 arguments of a | |
13668 specified scale. The result will also be returned in the same scale specified | |
13669 in the third argument. | |
13670 | |
13671 If the result value cannot be precisely represented in the given scale, the | |
13672 value is rounded up or down to the closest representable value. The rounding | |
13673 direction is unspecified. | |
13674 | |
13675 It is undefined behavior if the result value does not fit within the range of | |
13676 the fixed point type. | |
13677 | |
13678 | |
13679 Examples | |
13680 """"""""" | |
13681 | |
13682 .. code-block:: llvm | |
13683 | |
13684 %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6) | |
13685 %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5) | |
13686 | |
13687 ; The result in the following could be rounded down to 3.5 or up to 4 | |
13688 %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1) ; %res = 7 (or 8) (7.5 x 0.5 = 3.75) | |
13689 | |
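It can help to decode the raw integers into the real values they stand for. The Python sketch below does this with exact rationals and reports the two representable neighbours of the true product; ``decode`` and ``umul_fix_neighbours`` are names invented for this illustration.

```python
import math
from fractions import Fraction


def decode(raw: int, scale: int) -> Fraction:
    """Real value represented by an unsigned raw integer at the given scale."""
    return Fraction(raw, 1 << scale)


def umul_fix_neighbours(a: int, b: int, scale: int):
    """Return (exact product, raw result rounded down, raw result rounded up)."""
    exact = decode(a, scale) * decode(b, scale)
    scaled = exact * (1 << scale)  # exact product measured in units of 1/2^scale
    return exact, math.floor(scaled), math.ceil(scaled)


exact, down, up = umul_fix_neighbours(15, 1, 1)
print(exact, down, up)  # 15/4 7 8
```

Here 15 and 1 at scale 1 stand for 7.5 and 0.5; the exact product 3.75 lies between the raw results 7 (3.5) and 8 (4).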
13690 | |
13691 '``llvm.smul.fix.sat.*``' Intrinsics | |
13692 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
13693 | |
13694 Syntax | |
13695 """"""" | |
13696 | |
13697 This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat`` | |
13698 on any integer bit width or vectors of integers. | |
13699 | |
13700 :: | |
13701 | |
13702 declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale) | |
13703 declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale) | |
13704 declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale) | |
13705 declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale) | |
13706 | |
13707 Overview | |
13708 """"""""" | |
13709 | |
13710 The '``llvm.smul.fix.sat``' family of intrinsic functions perform signed | |
13711 fixed point saturation multiplication on 2 arguments of the same scale. | |
13712 | |
13713 Arguments | |
13714 """""""""" | |
13715 | |
13716 The arguments (%a and %b) and the result may be of integer types of any bit | |
13717 width, but they must have the same bit width. ``%a`` and ``%b`` are the two | |
13718 values that will undergo signed fixed point multiplication. The argument | |
13719 ``%scale`` represents the scale of both operands, and must be a constant | |
13720 integer. | |
13721 | |
13722 Semantics: | |
13723 """""""""" | |
13724 | |
13725 This operation performs fixed point multiplication on the 2 arguments of a | |
13726 specified scale. The result will also be returned in the same scale specified | |
13727 in the third argument. | |
13728 | |
13729 If the result value cannot be precisely represented in the given scale, the | |
13730 value is rounded up or down to the closest representable value. The rounding | |
13731 direction is unspecified. | |
13732 | |
13733 The maximum value this operation can clamp to is the largest signed value | |
13734 representable by the bit width of the first 2 arguments. The minimum value is the | |
13735 smallest signed value representable by this bit width. | |
13736 | |
13737 | |
13738 Examples | |
13739 """"""""" | |
13740 | |
13741 .. code-block:: llvm | |
13742 | |
13743 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6) | |
13744 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5) | |
13745 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 x -1 = -1.5) | |
13746 | |
13747 ; The result in the following could be rounded up to -2 or down to -2.5 | |
13748 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1) ; %res = -5 (or -4) (1.5 x -1.5 = -2.25) | |
13749 | |
13750 ; Saturation | |
13751 %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0) ; %res = 7 | |
13752 %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2) ; %res = 7 (1.75 x 1 = 1.75) | |
13753 %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2) ; %res = -8 (-2 x 1.25 = -2.5 -> clamped to -2) | |
13754 %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1) ; %res = 7 (-4 x -1 = 4 -> clamped to 3.5) | |
13755 | |
13756 ; Scale can affect the saturation result | |
13757 %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0) ; %res = 7 (2 x 4 -> clamped to 7) | |
13758 %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1) ; %res = 4 (1 x 2 = 2) | |
13759 | |
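Combining the scaled multiplication with clamping gives a compact model of the saturating variant. The Python sketch below assumes rounding toward negative infinity, which is only one of the permitted choices, and clamps after rounding; ``smul_fix_sat`` is a hypothetical helper name.

```python
def smul_fix_sat(a: int, b: int, scale: int, width: int = 4) -> int:
    """Illustrative model of llvm.smul.fix.sat for signed `width`-bit operands."""
    lo = -(1 << (width - 1))
    hi = (1 << (width - 1)) - 1
    rounded = (a * b) >> scale        # exact product, rounded down (assumed rounding)
    return max(lo, min(hi, rounded))  # clamp into the signed range instead of overflowing


print(smul_fix_sat(7, 2, 0))   # 7
print(smul_fix_sat(2, 4, 0))   # 7
print(smul_fix_sat(2, 4, 1))   # 4
print(smul_fix_sat(-8, 7, 0))  # -8
```

The second and third calls show the same operands saturating at scale 0 but fitting at scale 1, matching the note that the scale can affect the saturation result.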
13760 | |
11822 Specialised Arithmetic Intrinsics | 13761 Specialised Arithmetic Intrinsics |
11823 --------------------------------- | 13762 --------------------------------- |
11824 | 13763 |
11825 '``llvm.canonicalize.*``' Intrinsic | 13764 '``llvm.canonicalize.*``' Intrinsic |
11826 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 13765 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
11835 | 13774 |
11836 Overview: | 13775 Overview: |
11837 """"""""" | 13776 """"""""" |
11838 | 13777 |
11839 The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical | 13778 The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical |
11840 encoding of a floating point number. This canonicalization is useful for | 13779 encoding of a floating-point number. This canonicalization is useful for |
11841 implementing certain numeric primitives such as frexp. The canonical encoding is | 13780 implementing certain numeric primitives such as frexp. The canonical encoding is |
11842 defined by IEEE-754-2008 to be: | 13781 defined by IEEE-754-2008 to be: |
11843 | 13782 |
11844 :: | 13783 :: |
11845 | 13784 |
11853 | 13792 |
11854 Examples of non-canonical encodings: | 13793 Examples of non-canonical encodings: |
11855 | 13794 |
11856 - x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are | 13795 - x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are |
11857 converted to a canonical representation per hardware-specific protocol. | 13796 converted to a canonical representation per hardware-specific protocol. |
11858 - Many normal decimal floating point numbers have non-canonical alternative | 13797 - Many normal decimal floating-point numbers have non-canonical alternative |
11859 encodings. | 13798 encodings. |
11860 - Some machines, like GPUs or ARMv7 NEON, do not support subnormal values. | 13799 - Some machines, like GPUs or ARMv7 NEON, do not support subnormal values. |
11861 These are treated as non-canonical encodings of zero and will be flushed to | 13800 These are treated as non-canonical encodings of zero and will be flushed to |
11862 a zero of the same sign by this operation. | 13801 a zero of the same sign by this operation. |
11863 | 13802 |
11887 The canonicalization operation may be optimized away if: | 13826 The canonicalization operation may be optimized away if: |
11888 | 13827 |
11889 - The input is known to be canonical. For example, it was produced by a | 13828 - The input is known to be canonical. For example, it was produced by a |
11890 floating-point operation that is required by the standard to be canonical. | 13829 floating-point operation that is required by the standard to be canonical. |
11891 - The result is consumed only by (or fused with) other floating-point | 13830 - The result is consumed only by (or fused with) other floating-point |
11892 operations. That is, the bits of the floating point value are not examined. | 13831 operations. That is, the bits of the floating-point value are not examined. |
11893 | 13832 |
11894 '``llvm.fmuladd.*``' Intrinsic | 13833 '``llvm.fmuladd.*``' Intrinsic |
11895 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 13834 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
11896 | 13835 |
11897 Syntax: | 13836 Syntax: |
11956 Syntax: | 13895 Syntax: |
11957 """"""" | 13896 """"""" |
11958 | 13897 |
11959 :: | 13898 :: |
11960 | 13899 |
11961 declare i32 @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x i32> %a) | 13900 declare i32 @llvm.experimental.vector.reduce.add.v4i32(<4 x i32> %a) |
11962 declare i64 @llvm.experimental.vector.reduce.add.i64.v2i64(<2 x i64> %a) | 13901 declare i64 @llvm.experimental.vector.reduce.add.v2i64(<2 x i64> %a) |
11963 | 13902 |
11964 Overview: | 13903 Overview: |
11965 """"""""" | 13904 """"""""" |
11966 | 13905 |
11967 The '``llvm.experimental.vector.reduce.add.*``' intrinsics do an integer ``ADD`` | 13906 The '``llvm.experimental.vector.reduce.add.*``' intrinsics do an integer ``ADD`` |
11970 | 13909 |
11971 Arguments: | 13910 Arguments: |
11972 """""""""" | 13911 """""""""" |
11973 The argument to this intrinsic must be a vector of integer values. | 13912 The argument to this intrinsic must be a vector of integer values. |
11974 | 13913 |
11975 '``llvm.experimental.vector.reduce.fadd.*``' Intrinsic | 13914 '``llvm.experimental.vector.reduce.v2.fadd.*``' Intrinsic |
11976 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 13915 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
11977 | 13916 |
11978 Syntax: | 13917 Syntax: |
11979 """"""" | 13918 """"""" |
11980 | 13919 |
11981 :: | 13920 :: |
11982 | 13921 |
11983 declare float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %a) | 13922 declare float @llvm.experimental.vector.reduce.v2.fadd.f32.v4f32(float %start_value, <4 x float> %a) |
11984 declare double @llvm.experimental.vector.reduce.fadd.f64.v2f64(double %acc, <2 x double> %a) | 13923 declare double @llvm.experimental.vector.reduce.v2.fadd.f64.v2f64(double %start_value, <2 x double> %a) |
11985 | 13924 |
11986 Overview: | 13925 Overview: |
11987 """"""""" | 13926 """"""""" |
11988 | 13927 |
11989 The '``llvm.experimental.vector.reduce.fadd.*``' intrinsics do a floating point | 13928 The '``llvm.experimental.vector.reduce.v2.fadd.*``' intrinsics do a floating-point |
11990 ``ADD`` reduction of a vector, returning the result as a scalar. The return type | 13929 ``ADD`` reduction of a vector, returning the result as a scalar. The return type |
11991 matches the element-type of the vector input. | 13930 matches the element-type of the vector input. |
11992 | 13931 |
11993 If the intrinsic call has fast-math flags, then the reduction will not preserve | 13932 If the intrinsic call has the 'reassoc' or 'fast' flags set, then the |
11994 the associativity of an equivalent scalarized counterpart. If it does not have | 13933 reduction will not preserve the associativity of an equivalent scalarized |
11995 fast-math flags, then the reduction will be *ordered*, implying that the | 13934 counterpart. Otherwise the reduction will be *ordered*, thus implying that |
11996 operation respects the associativity of a scalarized reduction. | 13935 the operation respects the associativity of a scalarized reduction. |
11997 | 13936 |
11998 | 13937 |
11999 Arguments: | 13938 Arguments: |
12000 """""""""" | 13939 """""""""" |
12001 The first argument to this intrinsic is a scalar accumulator value, which is | 13940 The first argument to this intrinsic is a scalar start value for the reduction. |
12002 only used when there are no fast-math flags attached. This argument may be undef | 13941 The type of the start value matches the element-type of the vector input. |
12003 when fast-math flags are used. | 13942 The second argument must be a vector of floating-point values. |
12004 | |
12005 The second argument must be a vector of floating point values. | |
12006 | 13943 |
12007 Examples: | 13944 Examples: |
12008 """"""""" | 13945 """"""""" |
12009 | 13946 |
12010 .. code-block:: llvm | 13947 :: |
12011 | 13948 |
12012 %fast = call fast float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float undef, <4 x float> %input) ; fast reduction | 13949 %unord = call reassoc float @llvm.experimental.vector.reduce.v2.fadd.f32.v4f32(float 0.0, <4 x float> %input) ; unordered reduction |
12013 %ord = call float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction | 13950 %ord = call float @llvm.experimental.vector.reduce.v2.fadd.f32.v4f32(float %start_value, <4 x float> %input) ; ordered reduction |
12014 | 13951 |
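The difference between the ordered and the unordered form can be sketched in scalar C. The function names are hypothetical, and the pairwise tree is only one of many evaluation orders that a reassociating reduction might use; a real target is free to choose another.

```c
#include <stddef.h>

/* Ordered reduction: ((start + a[0]) + a[1]) + ..., strictly left to right. */
float reduce_fadd_ordered(float start, const float *a, size_t n) {
    float acc = start;
    for (size_t i = 0; i < n; ++i)
        acc += a[i];
    return acc;
}

/* Helper: sums a[0..n) as a balanced tree of additions. */
static float pairwise_sum(const float *a, size_t n) {
    if (n == 0) return 0.0f;
    if (n == 1) return a[0];
    size_t half = n / 2;
    return pairwise_sum(a, half) + pairwise_sum(a + half, n - half);
}

/* Unordered reduction (assumed evaluation order): the vector is summed
   as a tree first and the start value is added at the end. */
float reduce_fadd_reassoc(float start, const float *a, size_t n) {
    return start + pairwise_sum(a, n);
}
```

Both functions agree whenever every intermediate sum is exactly representable. They may differ once rounding occurs, which is why the unordered form requires the ``reassoc`` or ``fast`` flag.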
12015 | 13952 |
12016 '``llvm.experimental.vector.reduce.mul.*``' Intrinsic | 13953 '``llvm.experimental.vector.reduce.mul.*``' Intrinsic |
12017 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 13954 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
12018 | 13955 |
12019 Syntax: | 13956 Syntax: |
12020 """"""" | 13957 """"""" |
12021 | 13958 |
12022 :: | 13959 :: |
12023 | 13960 |
12024 declare i32 @llvm.experimental.vector.reduce.mul.i32.v4i32(<4 x i32> %a) | 13961 declare i32 @llvm.experimental.vector.reduce.mul.v4i32(<4 x i32> %a) |
12025 declare i64 @llvm.experimental.vector.reduce.mul.i64.v2i64(<2 x i64> %a) | 13962 declare i64 @llvm.experimental.vector.reduce.mul.v2i64(<2 x i64> %a) |
12026 | 13963 |
12027 Overview: | 13964 Overview: |
12028 """"""""" | 13965 """"""""" |
12029 | 13966 |
12030 The '``llvm.experimental.vector.reduce.mul.*``' intrinsics do an integer ``MUL`` | 13967 The '``llvm.experimental.vector.reduce.mul.*``' intrinsics do an integer ``MUL`` |
12033 | 13970 |
12034 Arguments: | 13971 Arguments: |
12035 """""""""" | 13972 """""""""" |
12036 The argument to this intrinsic must be a vector of integer values. | 13973 The argument to this intrinsic must be a vector of integer values. |
12037 | 13974 |
12038 '``llvm.experimental.vector.reduce.fmul.*``' Intrinsic | 13975 '``llvm.experimental.vector.reduce.v2.fmul.*``' Intrinsic |
12039 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 13976 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
12040 | 13977 |
12041 Syntax: | 13978 Syntax: |
12042 """"""" | 13979 """"""" |
12043 | 13980 |
12044 :: | 13981 :: |
12045 | 13982 |
12046 declare float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %a) | 13983 declare float @llvm.experimental.vector.reduce.v2.fmul.f32.v4f32(float %start_value, <4 x float> %a) |
12047 declare double @llvm.experimental.vector.reduce.fmul.f64.v2f64(double %acc, <2 x double> %a) | 13984 declare double @llvm.experimental.vector.reduce.v2.fmul.f64.v2f64(double %start_value, <2 x double> %a) |
12048 | 13985 |
12049 Overview: | 13986 Overview: |
12050 """"""""" | 13987 """"""""" |
12051 | 13988 |
12052 The '``llvm.experimental.vector.reduce.fmul.*``' intrinsics do a floating point | 13989 The '``llvm.experimental.vector.reduce.v2.fmul.*``' intrinsics do a floating-point |
12053 ``MUL`` reduction of a vector, returning the result as a scalar. The return type | 13990 ``MUL`` reduction of a vector, returning the result as a scalar. The return type |
12054 matches the element-type of the vector input. | 13991 matches the element-type of the vector input. |
12055 | 13992 |
12056 If the intrinsic call has fast-math flags, then the reduction will not preserve | 13993 If the intrinsic call has the 'reassoc' or 'fast' flags set, then the |
12057 the associativity of an equivalent scalarized counterpart. If it does not have | 13994 reduction will not preserve the associativity of an equivalent scalarized |
12058 fast-math flags, then the reduction will be *ordered*, implying that the | 13995 counterpart. Otherwise the reduction will be *ordered*, thus implying that |
12059 operation respects the associativity of a scalarized reduction. | 13996 the operation respects the associativity of a scalarized reduction. |
12060 | 13997 |
12061 | 13998 |
12062 Arguments: | 13999 Arguments: |
12063 """""""""" | 14000 """""""""" |
12064 The first argument to this intrinsic is a scalar accumulator value, which is | 14001 The first argument to this intrinsic is a scalar start value for the reduction. |
12065 only used when there are no fast-math flags attached. This argument may be undef | 14002 The type of the start value matches the element-type of the vector input. |
12066 when fast-math flags are used. | 14003 The second argument must be a vector of floating-point values. |
12067 | |
12068 The second argument must be a vector of floating point values. | |
12069 | 14004 |
12070 Examples: | 14005 Examples: |
12071 """"""""" | 14006 """"""""" |
12072 | 14007 |
12073 .. code-block:: llvm | 14008 :: |
12074 | 14009 |
12075 %fast = call fast float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float undef, <4 x float> %input) ; fast reduction | 14010 %unord = call reassoc float @llvm.experimental.vector.reduce.v2.fmul.f32.v4f32(float 1.0, <4 x float> %input) ; unordered reduction |
12076 %ord = call float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction | 14011 %ord = call float @llvm.experimental.vector.reduce.v2.fmul.f32.v4f32(float %start_value, <4 x float> %input) ; ordered reduction |
12077 | 14012 |
12078 '``llvm.experimental.vector.reduce.and.*``' Intrinsic | 14013 '``llvm.experimental.vector.reduce.and.*``' Intrinsic |
12079 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 14014 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
12080 | 14015 |
12081 Syntax: | 14016 Syntax: |
12082 """"""" | 14017 """"""" |
12083 | 14018 |
12084 :: | 14019 :: |
12085 | 14020 |
12086 declare i32 @llvm.experimental.vector.reduce.and.i32.v4i32(<4 x i32> %a) | 14021 declare i32 @llvm.experimental.vector.reduce.and.v4i32(<4 x i32> %a) |
12087 | 14022 |
12088 Overview: | 14023 Overview: |
12089 """"""""" | 14024 """"""""" |
12090 | 14025 |
12091 The '``llvm.experimental.vector.reduce.and.*``' intrinsics do a bitwise ``AND`` | 14026 The '``llvm.experimental.vector.reduce.and.*``' intrinsics do a bitwise ``AND`` |
12102 Syntax: | 14037 Syntax: |
12103 """"""" | 14038 """"""" |
12104 | 14039 |
12105 :: | 14040 :: |
12106 | 14041 |
12107 declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a) | 14042 declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a) |
12108 | 14043 |
12109 Overview: | 14044 Overview: |
12110 """"""""" | 14045 """"""""" |
12111 | 14046 |
12112 The '``llvm.experimental.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction | 14047 The '``llvm.experimental.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction |
12123 Syntax: | 14058 Syntax: |
12124 """"""" | 14059 """"""" |
12125 | 14060 |
12126 :: | 14061 :: |
12127 | 14062 |
12128 declare i32 @llvm.experimental.vector.reduce.xor.i32.v4i32(<4 x i32> %a) | 14063 declare i32 @llvm.experimental.vector.reduce.xor.v4i32(<4 x i32> %a) |
12129 | 14064 |
12130 Overview: | 14065 Overview: |
12131 """"""""" | 14066 """"""""" |
12132 | 14067 |
12133 The '``llvm.experimental.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR`` | 14068 The '``llvm.experimental.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR`` |
12144 Syntax: | 14079 Syntax: |
12145 """"""" | 14080 """"""" |
12146 | 14081 |
12147 :: | 14082 :: |
12148 | 14083 |
12149 declare i32 @llvm.experimental.vector.reduce.smax.i32.v4i32(<4 x i32> %a) | 14084 declare i32 @llvm.experimental.vector.reduce.smax.v4i32(<4 x i32> %a) |
12150 | 14085 |
12151 Overview: | 14086 Overview: |
12152 """"""""" | 14087 """"""""" |
12153 | 14088 |
12154 The '``llvm.experimental.vector.reduce.smax.*``' intrinsics do a signed integer | 14089 The '``llvm.experimental.vector.reduce.smax.*``' intrinsics do a signed integer |
12165 Syntax: | 14100 Syntax: |
12166 """"""" | 14101 """"""" |
12167 | 14102 |
12168 :: | 14103 :: |
12169 | 14104 |
12170 declare i32 @llvm.experimental.vector.reduce.smin.i32.v4i32(<4 x i32> %a) | 14105 declare i32 @llvm.experimental.vector.reduce.smin.v4i32(<4 x i32> %a) |
12171 | 14106 |
12172 Overview: | 14107 Overview: |
12173 """"""""" | 14108 """"""""" |
12174 | 14109 |
12175 The '``llvm.experimental.vector.reduce.smin.*``' intrinsics do a signed integer | 14110 The '``llvm.experimental.vector.reduce.smin.*``' intrinsics do a signed integer |
12186 Syntax: | 14121 Syntax: |
12187 """"""" | 14122 """"""" |
12188 | 14123 |
12189 :: | 14124 :: |
12190 | 14125 |
12191 declare i32 @llvm.experimental.vector.reduce.umax.i32.v4i32(<4 x i32> %a) | 14126 declare i32 @llvm.experimental.vector.reduce.umax.v4i32(<4 x i32> %a) |
12192 | 14127 |
12193 Overview: | 14128 Overview: |
12194 """"""""" | 14129 """"""""" |
12195 | 14130 |
12196 The '``llvm.experimental.vector.reduce.umax.*``' intrinsics do an unsigned | 14131 The '``llvm.experimental.vector.reduce.umax.*``' intrinsics do an unsigned |
12207 Syntax: | 14142 Syntax: |
12208 """"""" | 14143 """"""" |
12209 | 14144 |
12210 :: | 14145 :: |
12211 | 14146 |
12212 declare i32 @llvm.experimental.vector.reduce.umin.i32.v4i32(<4 x i32> %a) | 14147 declare i32 @llvm.experimental.vector.reduce.umin.v4i32(<4 x i32> %a) |
12213 | 14148 |
12214 Overview: | 14149 Overview: |
12215 """"""""" | 14150 """"""""" |
12216 | 14151 |
12217 The '``llvm.experimental.vector.reduce.umin.*``' intrinsics do an unsigned | 14152 The '``llvm.experimental.vector.reduce.umin.*``' intrinsics do an unsigned |
12228 Syntax: | 14163 Syntax: |
12229 """"""" | 14164 """"""" |
12230 | 14165 |
12231 :: | 14166 :: |
12232 | 14167 |
12233 declare float @llvm.experimental.vector.reduce.fmax.f32.v4f32(<4 x float> %a) | 14168 declare float @llvm.experimental.vector.reduce.fmax.v4f32(<4 x float> %a) |
12234 declare double @llvm.experimental.vector.reduce.fmax.f64.v2f64(<2 x double> %a) | 14169 declare double @llvm.experimental.vector.reduce.fmax.v2f64(<2 x double> %a) |
12235 | 14170 |
12236 Overview: | 14171 Overview: |
12237 """"""""" | 14172 """"""""" |
12238 | 14173 |
12239 The '``llvm.experimental.vector.reduce.fmax.*``' intrinsics do a floating point | 14174 The '``llvm.experimental.vector.reduce.fmax.*``' intrinsics do a floating-point |
12240 ``MAX`` reduction of a vector, returning the result as a scalar. The return type | 14175 ``MAX`` reduction of a vector, returning the result as a scalar. The return type |
12241 matches the element-type of the vector input. | 14176 matches the element-type of the vector input. |
12242 | 14177 |
12243 If the intrinsic call has the ``nnan`` fast-math flag then the operation can | 14178 If the intrinsic call has the ``nnan`` fast-math flag then the operation can |
12244 assume that NaNs are not present in the input vector. | 14179 assume that NaNs are not present in the input vector. |
12245 | 14180 |
12246 Arguments: | 14181 Arguments: |
12247 """""""""" | 14182 """""""""" |
12248 The argument to this intrinsic must be a vector of floating point values. | 14183 The argument to this intrinsic must be a vector of floating-point values. |
12249 | 14184 |
12250 '``llvm.experimental.vector.reduce.fmin.*``' Intrinsic | 14185 '``llvm.experimental.vector.reduce.fmin.*``' Intrinsic |
12251 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 14186 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
12252 | 14187 |
12253 Syntax: | 14188 Syntax: |
12254 """"""" | 14189 """"""" |
12255 | 14190 |
12256 :: | 14191 :: |
12257 | 14192 |
12258 declare float @llvm.experimental.vector.reduce.fmin.f32.v4f32(<4 x float> %a) | 14193 declare float @llvm.experimental.vector.reduce.fmin.v4f32(<4 x float> %a) |
12259 declare double @llvm.experimental.vector.reduce.fmin.f64.v2f64(<2 x double> %a) | 14194 declare double @llvm.experimental.vector.reduce.fmin.v2f64(<2 x double> %a) |
12260 | 14195 |
12261 Overview: | 14196 Overview: |
12262 """"""""" | 14197 """"""""" |
12263 | 14198 |
12264 The '``llvm.experimental.vector.reduce.fmin.*``' intrinsics do a floating point | 14199 The '``llvm.experimental.vector.reduce.fmin.*``' intrinsics do a floating-point |
12265 ``MIN`` reduction of a vector, returning the result as a scalar. The return type | 14200 ``MIN`` reduction of a vector, returning the result as a scalar. The return type |
12266 matches the element-type of the vector input. | 14201 matches the element-type of the vector input. |
12267 | 14202 |
12268 If the intrinsic call has the ``nnan`` fast-math flag then the operation can | 14203 If the intrinsic call has the ``nnan`` fast-math flag then the operation can |
12269 assume that NaNs are not present in the input vector. | 14204 assume that NaNs are not present in the input vector. |
12270 | 14205 |
12271 Arguments: | 14206 Arguments: |
12272 """""""""" | 14207 """""""""" |
12273 The argument to this intrinsic must be a vector of floating point values. | 14208 The argument to this intrinsic must be a vector of floating-point values. |
12274 | 14209 |
12275 Half Precision Floating Point Intrinsics | 14210 Half Precision Floating-Point Intrinsics |
12276 ---------------------------------------- | 14211 ---------------------------------------- |
12277 | 14212 |
12278 For most target platforms, half precision floating point is a | 14213 For most target platforms, half precision floating-point is a |
12279 storage-only format. This means that it is a dense encoding (in memory) | 14214 storage-only format. This means that it is a dense encoding (in memory) |
12280 but does not support computation in the format. | 14215 but does not support computation in the format. |
12281 | 14216 |
12282 This means that code must first load the half-precision floating point | 14217 This means that code must first load the half-precision floating-point |
12283 value as an i16, then convert it to float with | 14218 value as an i16, then convert it to float with |
12284 :ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can | 14219 :ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can |
12285 then be performed on the float value (including extending to double | 14220 then be performed on the float value (including extending to double |
12286 etc). To store the value back to memory, it is first converted to float | 14221 etc). To store the value back to memory, it is first converted to float |
12287 if needed, then converted to i16 with | 14222 if needed, then converted to i16 with |
12303 | 14238 |
12304 Overview: | 14239 Overview: |
12305 """"""""" | 14240 """"""""" |
12306 | 14241 |
12307 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a | 14242 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a |
12308 conventional floating point type to half precision floating point format. | 14243 conventional floating-point type to half precision floating-point format. |
12309 | 14244 |
12310 Arguments: | 14245 Arguments: |
12311 """""""""" | 14246 """""""""" |
12312 | 14247 |
12313 The intrinsic function takes a single argument - the value to be | 14248 The intrinsic function takes a single argument - the value to be |
12315 | 14250 |
12316 Semantics: | 14251 Semantics: |
12317 """""""""" | 14252 """""""""" |
12318 | 14253 |
12319 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a | 14254 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a |
12320 conventional floating point format to half precision floating point format. The | 14255 conventional floating-point format to half precision floating-point format. The |
12321 return value is an ``i16`` which contains the converted number. | 14256 return value is an ``i16`` which contains the converted number. |
12322 | 14257 |
12323 Examples: | 14258 Examples: |
12324 """"""""" | 14259 """"""""" |
12325 | 14260 |
12343 | 14278 |
12344 Overview: | 14279 Overview: |
12345 """"""""" | 14280 """"""""" |
12346 | 14281 |
12347 The '``llvm.convert.from.fp16``' intrinsic function performs a | 14282 The '``llvm.convert.from.fp16``' intrinsic function performs a |
12348 conversion from half precision floating point format to single precision | 14283 conversion from half precision floating-point format to single precision |
12349 floating point format. | 14284 floating-point format. |
12350 | 14285 |
12351 Arguments: | 14286 Arguments: |
12352 """""""""" | 14287 """""""""" |
12353 | 14288 |
12354 The intrinsic function takes a single argument - the value to be | 14289 The intrinsic function takes a single argument - the value to be |
12356 | 14291 |
12357 Semantics: | 14292 Semantics: |
12358 """""""""" | 14293 """""""""" |
12359 | 14294 |
12360 The '``llvm.convert.from.fp16``' intrinsic function performs a | 14295 The '``llvm.convert.from.fp16``' intrinsic function performs a |
12361 conversion from half precision floating point format to single | 14296 conversion from half precision floating-point format to single |
12362 precision floating point format. The input half-float value is | 14297 precision floating-point format. The input half-float value is |
12363 represented by an ``i16`` value. | 14298 represented by an ``i16`` value. |
12364 | 14299 |
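The conversion can be sketched in portable C, with the ``i16`` input held in a ``uint16_t``. The function name ``half_bits_to_float`` is hypothetical, and the sketch assumes the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits) and an IEEE 754 binary32 ``float`` on the host.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen IEEE 754 binary16 bits to a binary32 float. */
float half_bits_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;

    if (exp == 0) {
        if (mant == 0) {
            bits = sign;                       /* signed zero */
        } else {
            /* Subnormal half: shift until the implicit bit appears. */
            int shift = 0;
            while ((mant & 0x400u) == 0) {
                mant <<= 1;
                ++shift;
            }
            mant &= 0x3FFu;                    /* drop the implicit bit */
            bits = sign | ((uint32_t)(113 - shift) << 23) | (mant << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (mant << 13);  /* infinity or NaN */
    } else {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    }

    float f;
    memcpy(&f, &bits, sizeof f);               /* bit cast without aliasing issues */
    return f;
}
```

Every binary16 value is exactly representable in binary32, so this direction needs no rounding.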
12365 Examples: | 14300 Examples: |
12366 """"""""" | 14301 """"""""" |
12367 | 14302 |
12508 '``llvm.masked.load.*``' Intrinsics | 14443 '``llvm.masked.load.*``' Intrinsics |
12509 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 14444 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
12510 | 14445 |
12511 Syntax: | 14446 Syntax: |
12512 """"""" | 14447 """"""" |
12513 This is an overloaded intrinsic. The loaded data is a vector of any integer, floating point or pointer data type. | 14448 This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type. |
12514 | 14449 |
12515 :: | 14450 :: |
12516 | 14451 |
12517 declare <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>) | 14452 declare <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>) |
12518 declare <2 x double> @llvm.masked.load.v2f64.p0v2f64 (<2 x double>* <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>) | 14453 declare <2 x double> @llvm.masked.load.v2f64.p0v2f64 (<2 x double>* <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>) |
12553 '``llvm.masked.store.*``' Intrinsics | 14488 '``llvm.masked.store.*``' Intrinsics |
12554 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 14489 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
12555 | 14490 |
12556 Syntax: | 14491 Syntax: |
12557 """"""" | 14492 """"""" |
12558 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating point or pointer data type. | 14493 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. |
12559 | 14494 |
12560 :: | 14495 :: |
12561 | 14496 |
12562 declare void @llvm.masked.store.v8i32.p0v8i32 (<8 x i32> <value>, <8 x i32>* <ptr>, i32 <alignment>, <8 x i1> <mask>) | 14497 declare void @llvm.masked.store.v8i32.p0v8i32 (<8 x i32> <value>, <8 x i32>* <ptr>, i32 <alignment>, <8 x i1> <mask>) |
12563 declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>) | 14498 declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>) |
12603 '``llvm.masked.gather.*``' Intrinsics | 14538 '``llvm.masked.gather.*``' Intrinsics |
12604 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 14539 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
12605 | 14540 |
12606 Syntax: | 14541 Syntax: |
12607 """"""" | 14542 """"""" |
12608 This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating point or pointer data type gathered together into one vector. | 14543 This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector. |
12609 | 14544 |
12610 :: | 14545 :: |
12611 | 14546 |
12612 declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32 (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>) | 14547 declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32 (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>) |
12613 declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64 (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>) | 14548 declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64 (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>) |
12657 '``llvm.masked.scatter.*``' Intrinsics | 14592 '``llvm.masked.scatter.*``' Intrinsics |
12658 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 14593 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
12659 | 14594 |
12660 Syntax: | 14595 Syntax: |
12661 """"""" | 14596 """"""" |
12662 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element. | 14597 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element. |
12663 | 14598 |
12664 :: | 14599 :: |
12665 | 14600 |
12666 declare void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> <value>, <8 x i32*> <ptrs>, i32 <alignment>, <8 x i1> <mask>) | 14601 declare void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> <value>, <8 x i32*> <ptrs>, i32 <alignment>, <8 x i1> <mask>) |
12667 declare void @llvm.masked.scatter.v16f32.v16p1f32 (<16 x float> <value>, <16 x float addrspace(1)*> <ptrs>, i32 <alignment>, <16 x i1> <mask>) | 14602 declare void @llvm.masked.scatter.v16f32.v16p1f32 (<16 x float> <value>, <16 x float addrspace(1)*> <ptrs>, i32 <alignment>, <16 x i1> <mask>) |
12702 store i32 %val1, i32* %ptr1, align 4 | 14637 store i32 %val1, i32* %ptr1, align 4 |
12703 .. | 14638 .. |
12704 store i32 %val7, i32* %ptr7, align 4 | 14639 store i32 %val7, i32* %ptr7, align 4 |
12705 | 14640 |
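The ordering guarantee for overlapping addresses can be illustrated with a scalar C model. The name ``masked_scatter_ref`` is hypothetical, and the sketch ignores the alignment operand.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a masked scatter: enabled lanes are
   stored in order from lane 0 upwards, so when two enabled lanes share
   an address the higher-numbered lane's value is the one that remains. */
void masked_scatter_ref(const int32_t *value, int32_t *const *ptrs,
                        const bool *mask, size_t lanes) {
    for (size_t i = 0; i < lanes; ++i) {
        if (mask[i])
            *ptrs[i] = value[i];
    }
}
```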
12706 | 14641 |
14642 Masked Vector Expanding Load and Compressing Store Intrinsics | |
14643 ------------------------------------------------------------- | |
14644 | |
14645 LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`. | |
14646 | |
14647 .. _int_expandload: | |
14648 | |
14649 '``llvm.masked.expandload.*``' Intrinsics | |
14650 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
14651 | |
14652 Syntax: | |
14653 """"""" | |
14654 This is an overloaded intrinsic. Several values of integer, floating-point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask. | |
14655 | |
14656 :: | |
14657 | |
14658 declare <16 x float> @llvm.masked.expandload.v16f32 (float* <ptr>, <16 x i1> <mask>, <16 x float> <passthru>) | |
14659 declare <2 x i64> @llvm.masked.expandload.v2i64 (i64* <ptr>, <2 x i1> <mask>, <2 x i64> <passthru>) | |
14660 | |
14661 Overview: | |
14662 """"""""" | |
14663 | |
14664 Reads a number of scalar values sequentially from the memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 accordingly. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand. | |
14665 | |
14666 | |
14667 Arguments: | |
14668 """""""""" | |
14669 | |
14670 The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type. | |
14671 | |
14672 Semantics: | |
14673 """""""""" | |
14674 | |
14675 The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies, as in the following example: | |
14676 | |
14677 .. code-block:: c | |
14678 | |
14679 // In this loop we load from B and spread the elements into array A. | |
14680 double *A, *B; int *C; int j = 0; | |
14681 for (int i = 0; i < size; ++i) { | |
14682 if (C[i] != 0) | |
14683 A[i] = B[j++]; | |
14684 } | |
14685 | |
14686 | |
14687 .. code-block:: llvm | |
14688 | |
14689 ; Load several elements from array B and expand them in a vector. | |
14690 ; The number of loaded elements is equal to the number of '1' elements in the Mask. | |
14691 %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(double* %Bptr, <8 x i1> %Mask, <8 x double> undef) | |
14692 ; Store the result in A | |
14693 call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> %Tmp, <8 x double>* %Aptr, i32 8, <8 x i1> %Mask) | |
14694 | |
14695 ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask. | |
14696 %MaskI = bitcast <8 x i1> %Mask to i8 | |
14697 %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI) | |
14698 %MaskI64 = zext i8 %MaskIPopcnt to i64 | |
14699 %BNextInd = add i64 %BInd, %MaskI64 | |
14700 | |
14701 | |
14702 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles. | |
14703 If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load. | |
14704 | |
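The lane placement described for '``llvm.masked.expandload``' can be sketched as a scalar C model. The name ``expandload_ref`` and its parameter names are hypothetical; the model works on ``double`` lanes only and returns the number of elements consumed from memory, which is the amount by which the source pointer would advance between iterations.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar model of an expanding load. Enabled lanes take
   consecutive elements from ptr; disabled lanes take the matching lane
   of passthru. Returns how many elements were read from ptr. */
size_t expandload_ref(const double *ptr, const bool *mask,
                      const double *passthru, double *result,
                      size_t lanes) {
    size_t j = 0;
    for (size_t i = 0; i < lanes; ++i)
        result[i] = mask[i] ? ptr[j++] : passthru[i];
    return j;
}
```

With the mask '10010001' from the overview, the model reads three elements and places them in lanes 0, 3 and 7.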
14705 .. _int_compressstore: | |
14706 | |
14707 '``llvm.masked.compressstore.*``' Intrinsics | |
14708 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
14709 | |
14710 Syntax: | |
14711 """"""" | |
14712 This is an overloaded intrinsic. A number of scalar values of integer, floating-point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector. | |
14713 | |
14714 :: | |
14715 | |
14716 declare void @llvm.masked.compressstore.v8i32 (<8 x i32> <value>, i32* <ptr>, <8 x i1> <mask>) | |
14717 declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, float* <ptr>, <16 x i1> <mask>) | |
14718 | |
14719 Overview: | |
14720 """"""""" | |
14721 | |
14722 Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask. | |
14723 | |
14724 Arguments: | |
14725 """""""""" | |
14726 | |
14727 The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store; it has the same underlying type as the element of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements. | |
14728 | |
14729 | |
14730 Semantics: | |
14731 """""""""" | |
14732 | |
14733 The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependences, as in the following example: | |
14734 | |
14735 .. code-block:: c | |
14736 | |
14737 // In this loop we load elements from A and store them consecutively in B | |
14738 double *A, *B; int *C; | |
14739 for (int i = 0, j = 0; i < size; ++i) { | |
14740 if (C[i] != 0) | |
14741 B[j++] = A[i]; | |
14742 } | |
14743 | |
14744 | |
14745 .. code-block:: llvm | |
14746 | |
14747 ; Load elements from A. | |
14748 %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* %Aptr, i32 8, <8 x i1> %Mask, <8 x double> undef) | |
14749 ; Store all selected elements consecutively in array B | |
14750 call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, double* %Bptr, <8 x i1> %Mask) | |
14751 | |
14752 ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask. | |
14753 %MaskI = bitcast <8 x i1> %Mask to i8 | |
14754 %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI) | |
14755 %MaskI64 = zext i8 %MaskIPopcnt to i64 | |
14756 %BNextInd = add i64 %BInd, %MaskI64 | |
14757 | |
14758 | |
14759 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations. | |
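The scalar fallback described above can be sketched in C. This is only an illustration of the semantics, not LLVM's actual lowering: the helper name ``compress_store_i32`` and the use of a ``bool`` array for the mask are assumptions made for the example. The return value is the number of active mask lanes, which corresponds to the popcount used above to advance ``%Bptr``.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference for a compressing store over n lanes of i32.
 * compress_store_i32 is a hypothetical helper name, not an LLVM API.
 * Returns the number of elements written, i.e. the number of active
 * mask lanes, which is how far the caller should advance the
 * destination pointer for the next iteration. */
size_t compress_store_i32(const int32_t *value, int32_t *ptr,
                          const bool *mask, size_t n)
{
    size_t written = 0;
    for (size_t lane = 0; lane < n; ++lane) {
        if (mask[lane]) {
            /* One branch guarding one scalar store, lowest lane first. */
            ptr[written++] = value[lane];
        }
    }
    return written;
}
```

Memory past the last written element is left untouched, matching the statement that only as many elements as there are active mask bits are stored.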
14760 | |
14761 | |
12707 Memory Use Markers | 14762 Memory Use Markers |
12708 ------------------ | 14763 ------------------ |
12709 | 14764 |
12710 This class of intrinsics provides information about the lifetime of | 14765 This class of intrinsics provides information about the lifetime of |
12711 memory objects and ranges where variables are immutable. | 14766 memory objects and ranges where variables are immutable. |
12835 Semantics: | 14890 Semantics: |
12836 """""""""" | 14891 """""""""" |
12837 | 14892 |
12838 This intrinsic indicates that the memory is mutable again. | 14893 This intrinsic indicates that the memory is mutable again. |
12839 | 14894 |
12840 '``llvm.invariant.group.barrier``' Intrinsic | 14895 '``llvm.launder.invariant.group``' Intrinsic |
12841 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 14896 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
12842 | 14897 |
12843 Syntax: | 14898 Syntax: |
12844 """"""" | 14899 """"""" |
12845 This is an overloaded intrinsic. The memory object can belong to any address | 14900 This is an overloaded intrinsic. The memory object can belong to any address |
12846 space. The returned pointer must belong to the same address space as the | 14901 space. The returned pointer must belong to the same address space as the |
12847 argument. | 14902 argument. |
12848 | 14903 |
12849 :: | 14904 :: |
12850 | 14905 |
12851 declare i8* @llvm.invariant.group.barrier.p0i8(i8* <ptr>) | 14906 declare i8* @llvm.launder.invariant.group.p0i8(i8* <ptr>) |
12852 | 14907 |
12853 Overview: | 14908 Overview: |
12854 """"""""" | 14909 """"""""" |
12855 | 14910 |
12856 The '``llvm.invariant.group.barrier``' intrinsic can be used when an invariant | 14911 The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant |
12857 established by invariant.group metadata no longer holds, to obtain a new pointer | 14912 established by ``invariant.group`` metadata no longer holds, to obtain a new |
12858 value that does not carry the invariant information. | 14913 pointer value that carries fresh invariant group information. It is an |
12859 | 14914 experimental intrinsic, which means that its semantics might change in the |
12860 | 14915 future. |
12861 Arguments: | 14916 |
12862 """""""""" | 14917 |
12863 | 14918 Arguments: |
12864 The ``llvm.invariant.group.barrier`` takes only one argument, which is | 14919 """""""""" |
12865 the pointer to the memory for which the ``invariant.group`` no longer holds. | 14920 |
14921 The ``llvm.launder.invariant.group`` intrinsic takes only one argument, which is a pointer | |
14922 to the memory. | |
12866 | 14923 |
12867 Semantics: | 14924 Semantics: |
12868 """""""""" | 14925 """""""""" |
12869 | 14926 |
12870 Returns another pointer that aliases its argument but which is considered different | 14927 Returns another pointer that aliases its argument but which is considered different |
12871 for the purposes of ``load``/``store`` ``invariant.group`` metadata. | 14928 for the purposes of ``load``/``store`` ``invariant.group`` metadata. |
12872 | 14929 It does not read any accessible memory and the execution can be speculated. |
12873 Constrained Floating Point Intrinsics | 14930 |
14931 '``llvm.strip.invariant.group``' Intrinsic | |
14932 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
14933 | |
14934 Syntax: | |
14935 """"""" | |
14936 This is an overloaded intrinsic. The memory object can belong to any address | |
14937 space. The returned pointer must belong to the same address space as the | |
14938 argument. | |
14939 | |
14940 :: | |
14941 | |
14942 declare i8* @llvm.strip.invariant.group.p0i8(i8* <ptr>) | |
14943 | |
14944 Overview: | |
14945 """"""""" | |
14946 | |
14947 The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant | |
14948 established by ``invariant.group`` metadata no longer holds, to obtain a new pointer | |
14949 value that does not carry the invariant information. It is an experimental | |
14950 intrinsic, which means that its semantics might change in the future. | |
14951 | |
14952 | |
14953 Arguments: | |
14954 """""""""" | |
14955 | |
14956 The ``llvm.strip.invariant.group`` intrinsic takes only one argument, which is a pointer | |
14957 to the memory. | |
14958 | |
14959 Semantics: | |
14960 """""""""" | |
14961 | |
14962 Returns another pointer that aliases its argument but which has no associated | |
14963 ``invariant.group`` metadata. | |
14964 It does not read any memory and can be speculated. | |
14965 | |
14966 | |
14967 | |
14968 .. _constrainedfp: | |
14969 | |
14970 Constrained Floating-Point Intrinsics | |
12874 ------------------------------------- | 14971 ------------------------------------- |
12875 | 14972 |
12876 These intrinsics are used to provide special handling of floating point | 14973 These intrinsics are used to provide special handling of floating-point |
12877 operations when specific rounding mode or floating point exception behavior is | 14974 operations when specific rounding mode or floating-point exception behavior is |
12878 required. By default, LLVM optimization passes assume that the rounding mode is | 14975 required. By default, LLVM optimization passes assume that the rounding mode is |
12879 round-to-nearest and that floating point exceptions will not be monitored. | 14976 round-to-nearest and that floating-point exceptions will not be monitored. |
12880 Constrained FP intrinsics are used to support non-default rounding modes and | 14977 Constrained FP intrinsics are used to support non-default rounding modes and |
12881 accurately preserve exception behavior without compromising LLVM's ability to | 14978 accurately preserve exception behavior without compromising LLVM's ability to |
12882 optimize FP code when the default behavior is used. | 14979 optimize FP code when the default behavior is used. |
12883 | 14980 |
12884 Each of these intrinsics corresponds to a normal floating point operation. The | 14981 Each of these intrinsics corresponds to a normal floating-point operation. The |
12885 first two arguments and the return value are the same as the corresponding FP | 14982 first two arguments and the return value are the same as the corresponding FP |
12886 operation. | 14983 operation. |
12887 | 14984 |
12888 The third argument is a metadata argument specifying the rounding mode to be | 14985 The third argument is a metadata argument specifying the rounding mode to be |
12889 assumed. This argument must be one of the following strings: | 14986 assumed. This argument must be one of the following strings: |
12914 actual runtime rounding mode (as defined in a target-specific manner) matches | 15011 actual runtime rounding mode (as defined in a target-specific manner) matches |
12915 the specified rounding mode, but this is not guaranteed. Using a specific | 15012 the specified rounding mode, but this is not guaranteed. Using a specific |
12916 non-dynamic rounding mode which does not match the actual rounding mode at | 15013 non-dynamic rounding mode which does not match the actual rounding mode at |
12917 runtime results in undefined behavior. | 15014 runtime results in undefined behavior. |
12918 | 15015 |
12919 The fourth argument to the constrained floating point intrinsics specifies the | 15016 The fourth argument to the constrained floating-point intrinsics specifies the |
12920 required exception behavior. This argument must be one of the following | 15017 required exception behavior. This argument must be one of the following |
12921 strings: | 15018 strings: |
12922 | 15019 |
12923 :: | 15020 :: |
12924 | 15021 |
12925 "fpexcept.ignore" | 15022 "fpexcept.ignore" |
12926 "fpexcept.maytrap" | 15023 "fpexcept.maytrap" |
12927 "fpexcept.strict" | 15024 "fpexcept.strict" |
12928 | 15025 |
12929 If this argument is "fpexcept.ignore" optimization passes may assume that the | 15026 If this argument is "fpexcept.ignore" optimization passes may assume that the |
12930 exception status flags will not be read and that floating point exceptions will | 15027 exception status flags will not be read and that floating-point exceptions will |
12931 be masked. This allows transformations to be performed that may change the | 15028 be masked. This allows transformations to be performed that may change the |
12932 exception semantics of the original code. For example, FP operations may be | 15029 exception semantics of the original code. For example, FP operations may be |
12933 speculatively executed in this case whereas they must not be for either of the | 15030 speculatively executed in this case whereas they must not be for either of the |
12934 other possible values of this argument. | 15031 other possible values of this argument. |
12935 | 15032 |
12939 passes are not required to preserve all exceptions that are implied by the | 15036 passes are not required to preserve all exceptions that are implied by the |
12940 original code. For example, exceptions may be potentially hidden by constant | 15037 original code. For example, exceptions may be potentially hidden by constant |
12941 folding. | 15038 folding. |
12942 | 15039 |
12943 If the exception behavior argument is "fpexcept.strict" all transformations must | 15040 If the exception behavior argument is "fpexcept.strict" all transformations must |
12944 strictly preserve the floating point exception semantics of the original code. | 15041 strictly preserve the floating-point exception semantics of the original code. |
12945 Any FP exception that would have been raised by the original code must be raised | 15042 Any FP exception that would have been raised by the original code must be raised |
12946 by the transformed code, and the transformed code must not raise any FP | 15043 by the transformed code, and the transformed code must not raise any FP |
12947 exceptions that would not have been raised by the original code. This is the | 15044 exceptions that would not have been raised by the original code. This is the |
12948 exception behavior argument that will be used if the code being compiled reads | 15045 exception behavior argument that will be used if the code being compiled reads |
12949 the FP exception status flags, but this mode can also be used with code that | 15046 the FP exception status flags, but this mode can also be used with code that |
12950 unmasks FP exceptions. | 15047 unmasks FP exceptions. |
12951 | 15048 |
12952 The number and order of floating point exceptions is NOT guaranteed. For | 15049 The number and order of floating-point exceptions is NOT guaranteed. For |
12953 example, a series of FP operations that each may raise exceptions may be | 15050 example, a series of FP operations that each may raise exceptions may be |
12954 vectorized into a single instruction that raises each unique exception a single | 15051 vectorized into a single instruction that raises each unique exception a single |
12955 time. | 15052 time. |
12956 | 15053 |
12957 | 15054 |
12977 | 15074 |
12978 Arguments: | 15075 Arguments: |
12979 """""""""" | 15076 """""""""" |
12980 | 15077 |
12981 The first two arguments to the '``llvm.experimental.constrained.fadd``' | 15078 The first two arguments to the '``llvm.experimental.constrained.fadd``' |
12982 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>` | 15079 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` |
12983 of floating point values. Both arguments must have identical types. | 15080 of floating-point values. Both arguments must have identical types. |
12984 | 15081 |
12985 The third and fourth arguments specify the rounding mode and exception | 15082 The third and fourth arguments specify the rounding mode and exception |
12986 behavior as described above. | 15083 behavior as described above. |
12987 | 15084 |
12988 Semantics: | 15085 Semantics: |
12989 """""""""" | 15086 """""""""" |
12990 | 15087 |
12991 The value produced is the floating point sum of the two value operands and has | 15088 The value produced is the floating-point sum of the two value operands and has |
12992 the same type as the operands. | 15089 the same type as the operands. |
12993 | 15090 |
12994 | 15091 |
12995 '``llvm.experimental.constrained.fsub``' Intrinsic | 15092 '``llvm.experimental.constrained.fsub``' Intrinsic |
12996 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 15093 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
13014 | 15111 |
13015 Arguments: | 15112 Arguments: |
13016 """""""""" | 15113 """""""""" |
13017 | 15114 |
13018 The first two arguments to the '``llvm.experimental.constrained.fsub``' | 15115 The first two arguments to the '``llvm.experimental.constrained.fsub``' |
13019 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>` | 15116 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` |
13020 of floating point values. Both arguments must have identical types. | 15117 of floating-point values. Both arguments must have identical types. |
13021 | 15118 |
13022 The third and fourth arguments specify the rounding mode and exception | 15119 The third and fourth arguments specify the rounding mode and exception |
13023 behavior as described above. | 15120 behavior as described above. |
13024 | 15121 |
13025 Semantics: | 15122 Semantics: |
13026 """""""""" | 15123 """""""""" |
13027 | 15124 |
13028 The value produced is the floating point difference of the two value operands | 15125 The value produced is the floating-point difference of the two value operands |
13029 and has the same type as the operands. | 15126 and has the same type as the operands. |
13030 | 15127 |
13031 | 15128 |
13032 '``llvm.experimental.constrained.fmul``' Intrinsic | 15129 '``llvm.experimental.constrained.fmul``' Intrinsic |
13033 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 15130 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
13051 | 15148 |
13052 Arguments: | 15149 Arguments: |
13053 """""""""" | 15150 """""""""" |
13054 | 15151 |
13055 The first two arguments to the '``llvm.experimental.constrained.fmul``' | 15152 The first two arguments to the '``llvm.experimental.constrained.fmul``' |
13056 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>` | 15153 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` |
13057 of floating point values. Both arguments must have identical types. | 15154 of floating-point values. Both arguments must have identical types. |
13058 | 15155 |
13059 The third and fourth arguments specify the rounding mode and exception | 15156 The third and fourth arguments specify the rounding mode and exception |
13060 behavior as described above. | 15157 behavior as described above. |
13061 | 15158 |
13062 Semantics: | 15159 Semantics: |
13063 """""""""" | 15160 """""""""" |
13064 | 15161 |
13065 The value produced is the floating point product of the two value operands and | 15162 The value produced is the floating-point product of the two value operands and |
13066 has the same type as the operands. | 15163 has the same type as the operands. |
13067 | 15164 |
13068 | 15165 |
13069 '``llvm.experimental.constrained.fdiv``' Intrinsic | 15166 '``llvm.experimental.constrained.fdiv``' Intrinsic |
13070 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 15167 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
13088 | 15185 |
13089 Arguments: | 15186 Arguments: |
13090 """""""""" | 15187 """""""""" |
13091 | 15188 |
13092 The first two arguments to the '``llvm.experimental.constrained.fdiv``' | 15189 The first two arguments to the '``llvm.experimental.constrained.fdiv``' |
13093 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>` | 15190 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` |
13094 of floating point values. Both arguments must have identical types. | 15191 of floating-point values. Both arguments must have identical types. |
13095 | 15192 |
13096 The third and fourth arguments specify the rounding mode and exception | 15193 The third and fourth arguments specify the rounding mode and exception |
13097 behavior as described above. | 15194 behavior as described above. |
13098 | 15195 |
13099 Semantics: | 15196 Semantics: |
13100 """""""""" | 15197 """""""""" |
13101 | 15198 |
13102 The value produced is the floating point quotient of the two value operands and | 15199 The value produced is the floating-point quotient of the two value operands and |
13103 has the same type as the operands. | 15200 has the same type as the operands. |
13104 | 15201 |
13105 | 15202 |
13106 '``llvm.experimental.constrained.frem``' Intrinsic | 15203 '``llvm.experimental.constrained.frem``' Intrinsic |
13107 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 15204 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
13125 | 15222 |
13126 Arguments: | 15223 Arguments: |
13127 """""""""" | 15224 """""""""" |
13128 | 15225 |
13129 The first two arguments to the '``llvm.experimental.constrained.frem``' | 15226 The first two arguments to the '``llvm.experimental.constrained.frem``' |
13130 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>` | 15227 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` |
13131 of floating point values. Both arguments must have identical types. | 15228 of floating-point values. Both arguments must have identical types. |
13132 | 15229 |
13133 The third and fourth arguments specify the rounding mode and exception | 15230 The third and fourth arguments specify the rounding mode and exception |
13134 behavior as described above. The rounding mode argument has no effect, since | 15231 behavior as described above. The rounding mode argument has no effect, since |
13135 the result of frem is never rounded, but the argument is included for | 15232 the result of frem is never rounded, but the argument is included for |
13136 consistency with the other constrained floating point intrinsics. | 15233 consistency with the other constrained floating-point intrinsics. |
13137 | 15234 |
13138 Semantics: | 15235 Semantics: |
13139 """""""""" | 15236 """""""""" |
13140 | 15237 |
13141 The value produced is the floating point remainder from the division of the two | 15238 The value produced is the floating-point remainder from the division of the two |
13142 value operands and has the same type as the operands. The remainder has the | 15239 value operands and has the same type as the operands. The remainder has the |
13143 same sign as the dividend. | 15240 same sign as the dividend. |
13144 | 15241 |
13145 '``llvm.experimental.constrained.fma``' Intrinsic | 15242 '``llvm.experimental.constrained.fma``' Intrinsic |
13146 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 15243 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
13147 | 15244 |
13148 Syntax: | 15245 Syntax: |
13149 """"""" | 15246 """"""" |
13150 | 15247 |
13151 :: | 15248 :: |
13163 | 15260 |
13164 Arguments: | 15261 Arguments: |
13165 """""""""" | 15262 """""""""" |
13166 | 15263 |
13167 The first three arguments to the '``llvm.experimental.constrained.fma``' | 15264 The first three arguments to the '``llvm.experimental.constrained.fma``' |
13168 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector | 15265 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector |
13169 <t_vector>` of floating point values. All arguments must have identical types. | 15266 <t_vector>` of floating-point values. All arguments must have identical types. |
13170 | 15267 |
13171 The fourth and fifth arguments specify the rounding mode and exception behavior | 15268 The fourth and fifth arguments specify the rounding mode and exception behavior |
13172 as described above. | 15269 as described above. |
13173 | 15270 |
13174 Semantics: | 15271 Semantics: |
13176 | 15273 |
13177 The result produced is the product of the first two operands added to the third | 15274 The result produced is the product of the first two operands added to the third |
13178 operand computed with infinite precision, and then rounded to the target | 15275 operand computed with infinite precision, and then rounded to the target |
13179 precision. | 15276 precision. |
13180 | 15277 |
15278 '``llvm.experimental.constrained.fptrunc``' Intrinsic | |
15279 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
15280 | |
15281 Syntax: | |
15282 """"""" | |
15283 | |
15284 :: | |
15285 | |
15286 declare <ty2> | |
15287 @llvm.experimental.constrained.fptrunc(<type> <value>, | |
15288 metadata <rounding mode>, | |
15289 metadata <exception behavior>) | |
15290 | |
15291 Overview: | |
15292 """"""""" | |
15293 | |
15294 The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value`` | |
15295 to type ``ty2``. | |
15296 | |
15297 Arguments: | |
15298 """""""""" | |
15299 | |
15300 The first argument to the '``llvm.experimental.constrained.fptrunc``' | |
15301 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector | |
15302 <t_vector>` of floating-point values. This argument must be larger in size | |
15303 than the result. | |
15304 | |
15305 The second and third arguments specify the rounding mode and exception | |
15306 behavior as described above. | |
15307 | |
15308 Semantics: | |
15309 """""""""" | |
15310 | |
15311 The result produced is a floating-point value truncated to be smaller in size | |
15312 than the operand. | |
15313 | |
15314 '``llvm.experimental.constrained.fpext``' Intrinsic | |
15315 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
15316 | |
15317 Syntax: | |
15318 """"""" | |
15319 | |
15320 :: | |
15321 | |
15322 declare <ty2> | |
15323 @llvm.experimental.constrained.fpext(<type> <value>, | |
15324 metadata <exception behavior>) | |
15325 | |
15326 Overview: | |
15327 """"""""" | |
15328 | |
15329 The '``llvm.experimental.constrained.fpext``' intrinsic extends a | |
15330 floating-point ``value`` to a larger floating-point value. | |
15331 | |
15332 Arguments: | |
15333 """""""""" | |
15334 | |
15335 The first argument to the '``llvm.experimental.constrained.fpext``' | |
15336 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector | |
15337 <t_vector>` of floating-point values. This argument must be smaller in size | |
15338 than the result. | |
15339 | |
15340 The second argument specifies the exception behavior as described above. | |
15341 | |
15342 Semantics: | |
15343 """""""""" | |
15344 | |
15345 The result produced is a floating-point value extended to be larger in size | |
15346 than the operand. All restrictions that apply to the fpext instruction also | |
15347 apply to this intrinsic. | |
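One way to see why the ``fptrunc`` variant takes a rounding mode argument while the ``fpext`` variant does not is to compare the two conversions in C. The sketch below assumes IEEE 754 ``float`` and ``double``; the function names are hypothetical and the C casts only stand in for the conversions, they are not the constrained intrinsics themselves.

```c
/* Widening float -> double is exact for every non-NaN input,
 * so converting back returns the original value. */
int fpext_round_trips(float f)
{
    double wide = (double)f;   /* analogous to fpext: nothing to round */
    return (float)wide == f;
}

/* Narrowing double -> float may have to round, so the result can
 * depend on the rounding mode in effect. */
int fptrunc_is_exact(double d)
{
    float narrow = (float)d;   /* analogous to fptrunc: may round */
    return (double)narrow == d;
}
```

For instance, ``fptrunc_is_exact(0.5)`` holds because 0.5 is representable in both types, whereas ``fptrunc_is_exact(0.1)`` does not, since the ``double`` nearest to 0.1 has more significant bits than a ``float`` can hold.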
15348 | |
13181 Constrained libm-equivalent Intrinsics | 15349 Constrained libm-equivalent Intrinsics |
13182 -------------------------------------- | 15350 -------------------------------------- |
13183 | 15351 |
13184 In addition to the basic floating point operations for which constrained | 15352 In addition to the basic floating-point operations for which constrained |
13185 intrinsics are described above, there are constrained versions of various | 15353 intrinsics are described above, there are constrained versions of various |
13186 operations which provide equivalent behavior to a corresponding libm function. | 15354 operations which provide equivalent behavior to a corresponding libm function. |
13187 These intrinsics allow the precise behavior of these operations with respect to | 15355 These intrinsics allow the precise behavior of these operations with respect to |
13188 rounding mode and exception behavior to be controlled. | 15356 rounding mode and exception behavior to be controlled. |
13189 | 15357 |
13190 As with the basic constrained floating point intrinsics, the rounding mode | 15358 As with the basic constrained floating-point intrinsics, the rounding mode |
13191 and exception behavior arguments only control the behavior of the optimizer. | 15359 and exception behavior arguments only control the behavior of the optimizer. |
13192 They do not change the runtime floating point environment. | 15360 They do not change the runtime floating-point environment. |
13193 | 15361 |
13194 | 15362 |
13195 '``llvm.experimental.constrained.sqrt``' Intrinsic | 15363 '``llvm.experimental.constrained.sqrt``' Intrinsic |
13196 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 15364 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
13197 | 15365 |
13213 functions would, but without setting ``errno``. | 15381 functions would, but without setting ``errno``. |
13214 | 15382 |
13215 Arguments: | 15383 Arguments: |
13216 """""""""" | 15384 """""""""" |
13217 | 15385 |
13218 The first argument and the return type are floating point numbers of the same | 15386 The first argument and the return type are floating-point numbers of the same |
13219 type. | 15387 type. |
13220 | 15388 |
13221 The second and third arguments specify the rounding mode and exception | 15389 The second and third arguments specify the rounding mode and exception |
13222 behavior as described above. | 15390 behavior as described above. |
13223 | 15391 |
13224 Semantics: | 15392 Semantics: |
13225 """""""""" | 15393 """""""""" |
13226 | 15394 |
13227 This function returns the nonnegative square root of the specified value. | 15395 This function returns the nonnegative square root of the specified value. |
13228 If the value is less than negative zero, a floating point exception occurs | 15396 If the value is less than negative zero, a floating-point exception occurs |
13229 and the return value is architecture specific. | 15397 and the return value is architecture specific. |
13230 | 15398 |
13231 | 15399 |
13232 '``llvm.experimental.constrained.pow``' Intrinsic | 15400 '``llvm.experimental.constrained.pow``' Intrinsic |
13233 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 15401 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
13249 raised to the (positive or negative) power specified by the second operand. | 15417 raised to the (positive or negative) power specified by the second operand. |
13250 | 15418 |
13251 Arguments: | 15419 Arguments: |
13252 """""""""" | 15420 """""""""" |
13253 | 15421 |
13254 The first two arguments and the return value are floating point numbers of the | 15422 The first two arguments and the return value are floating-point numbers of the |
13255 same type. The second argument specifies the power to which the first argument | 15423 same type. The second argument specifies the power to which the first argument |
13256 should be raised. | 15424 should be raised. |
13257 | 15425 |
13258 The third and fourth arguments specify the rounding mode and exception | 15426 The third and fourth arguments specify the rounding mode and exception |
13259 behavior as described above. | 15427 behavior as described above. |
13282 Overview: | 15450 Overview: |
13283 """"""""" | 15451 """"""""" |
13284 | 15452 |
13285 The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand | 15453 The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand |
13286 raised to the (positive or negative) power specified by the second operand. The | 15454 raised to the (positive or negative) power specified by the second operand. The |
13287 order of evaluation of multiplications is not defined. When a vector of floating | 15455 order of evaluation of multiplications is not defined. When a vector of |
13288 point type is used, the second argument remains a scalar integer value. | 15456 floating-point type is used, the second argument remains a scalar integer value. |
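Because the order of the multiplications is not defined, an implementation is free to choose any expansion. The C sketch below shows one possible choice, exponentiation by squaring; the name ``powi_sketch`` and the handling of negative powers through a final reciprocal are assumptions made for illustration, and a different multiplication order may round differently for inexact inputs.

```c
/* One possible expansion of an integer power; powi_sketch is a
 * hypothetical name, not what any particular target emits. */
double powi_sketch(double x, int n)
{
    /* Take the magnitude as unsigned so that INT_MIN does not overflow. */
    unsigned int e = n < 0 ? 0u - (unsigned int)n : (unsigned int)n;
    double result = 1.0;

    while (e != 0) {
        if (e & 1u)
            result *= x;   /* fold in the current power of x */
        x *= x;            /* square for the next exponent bit */
        e >>= 1;
    }
    /* Assumed convention: a negative power is the reciprocal. */
    return n < 0 ? 1.0 / result : result;
}
```

With exactly representable inputs such as ``powi_sketch(2.0, 10)`` every multiplication order gives the same result; the ordering only matters once intermediate products need rounding.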
13289 | 15457 |
13290 | 15458 |
13291 Arguments: | 15459 Arguments: |
13292 """""""""" | 15460 """""""""" |
13293 | 15461 |
13294 The first argument and the return value are floating point numbers of the same | 15462 The first argument and the return value are floating-point numbers of the same |
13295 type. The second argument is a 32-bit signed integer specifying the power to | 15463 type. The second argument is a 32-bit signed integer specifying the power to |
13296 which the first argument should be raised. | 15464 which the first argument should be raised. |
13297 | 15465 |
13298 The third and fourth arguments specify the rounding mode and exception | 15466 The third and fourth arguments specify the rounding mode and exception |
13299 behavior as described above. | 15467 behavior as described above. |
13325 first operand. | 15493 first operand. |
13326 | 15494 |
13327 Arguments: | 15495 Arguments: |
13328 """""""""" | 15496 """""""""" |
13329 | 15497 |
13330 The first argument and the return type are floating point numbers of the same | 15498 The first argument and the return type are floating-point numbers of the same |
13331 type. | 15499 type. |
13332 | 15500 |
13333 The second and third arguments specify the rounding mode and exception | 15501 The second and third arguments specify the rounding mode and exception |
13334 behavior as described above. | 15502 behavior as described above. |
13335 | 15503 |
13361 first operand. | 15529 first operand. |
13362 | 15530 |
13363 Arguments: | 15531 Arguments: |
13364 """""""""" | 15532 """""""""" |
13365 | 15533 |
13366 The first argument and the return type are floating point numbers of the same | 15534 The first argument and the return type are floating-point numbers of the same |
13367 type. | 15535 type. |
13368 | 15536 |
13369 The second and third arguments specify the rounding mode and exception | 15537 The second and third arguments specify the rounding mode and exception |
13370 behavior as described above. | 15538 behavior as described above. |
13371 | 15539 |
13397 exponential of the specified value. | 15565 exponential of the specified value. |
13398 | 15566 |
13399 Arguments: | 15567 Arguments: |
13400 """""""""" | 15568 """""""""" |
13401 | 15569 |
13402 The first argument and the return value are floating point numbers of the same | 15570 The first argument and the return value are floating-point numbers of the same |
13403 type. | 15571 type. |
13404 | 15572 |
13405 The second and third arguments specify the rounding mode and exception | 15573 The second and third arguments specify the rounding mode and exception |
13406 behavior as described above. | 15574 behavior as described above. |
13407 | 15575 |
13433 | 15601 |
13434 | 15602 |
13435 Arguments: | 15603 Arguments: |
13436 """""""""" | 15604 """""""""" |
13437 | 15605 |
13438 The first argument and the return value are floating point numbers of the same | 15606 The first argument and the return value are floating-point numbers of the same |
13439 type. | 15607 type. |
13440 | 15608 |
13441 The second and third arguments specify the rounding mode and exception | 15609 The second and third arguments specify the rounding mode and exception |
13442 behavior as described above. | 15610 behavior as described above. |
13443 | 15611 |
13468 logarithm of the specified value. | 15636 logarithm of the specified value. |
13469 | 15637 |
13470 Arguments: | 15638 Arguments: |
13471 """""""""" | 15639 """""""""" |
13472 | 15640 |
13473 The first argument and the return value are floating point numbers of the same | 15641 The first argument and the return value are floating-point numbers of the same |
13474 type. | 15642 type. |
13475 | 15643 |
13476 The second and third arguments specify the rounding mode and exception | 15644 The second and third arguments specify the rounding mode and exception |
13477 behavior as described above. | 15645 behavior as described above. |
13478 | 15646 |
13504 logarithm of the specified value. | 15672 logarithm of the specified value. |
13505 | 15673 |
13506 Arguments: | 15674 Arguments: |
13507 """""""""" | 15675 """""""""" |
13508 | 15676 |
13509 The first argument and the return value are floating point numbers of the same | 15677 The first argument and the return value are floating-point numbers of the same |
13510 type. | 15678 type. |
13511 | 15679 |
13512 The second and third arguments specify the rounding mode and exception | 15680 The second and third arguments specify the rounding mode and exception |
13513 behavior as described above. | 15681 behavior as described above. |
13514 | 15682 |
13539 logarithm of the specified value. | 15707 logarithm of the specified value. |
13540 | 15708 |
13541 Arguments: | 15709 Arguments: |
13542 """""""""" | 15710 """""""""" |
13543 | 15711 |
13544 The first argument and the return value are floating point numbers of the same | 15712 The first argument and the return value are floating-point numbers of the same |
13545 type. | 15713 type. |
13546 | 15714 |
13547 The second and third arguments specify the rounding mode and exception | 15715 The second and third arguments specify the rounding mode and exception |
13548 behavior as described above. | 15716 behavior as described above. |
13549 | 15717 |
13569 | 15737 |
13570 Overview: | 15738 Overview: |
13571 """"""""" | 15739 """"""""" |
13572 | 15740 |
13573 The '``llvm.experimental.constrained.rint``' intrinsic returns the first | 15741 The '``llvm.experimental.constrained.rint``' intrinsic returns the first |
13574 operand rounded to the nearest integer. It may raise an inexact floating point | 15742 operand rounded to the nearest integer. It may raise an inexact floating-point |
13575 exception if the operand is not an integer. | 15743 exception if the operand is not an integer. |
13576 | 15744 |
13577 Arguments: | 15745 Arguments: |
13578 """""""""" | 15746 """""""""" |
13579 | 15747 |
13580 The first argument and the return value are floating point numbers of the same | 15748 The first argument and the return value are floating-point numbers of the same |
13581 type. | 15749 type. |
13582 | 15750 |
13583 The second and third arguments specify the rounding mode and exception | 15751 The second and third arguments specify the rounding mode and exception |
13584 behavior as described above. | 15752 behavior as described above. |
13585 | 15753 |
13587 """""""""" | 15755 """""""""" |
13588 | 15756 |
13589 This function returns the same values as the libm ``rint`` functions | 15757 This function returns the same values as the libm ``rint`` functions |
13590 would, and handles error conditions in the same way. The rounding mode is | 15758 would, and handles error conditions in the same way. The rounding mode is |
13591 described, not determined, by the rounding mode argument. The actual rounding | 15759 described, not determined, by the rounding mode argument. The actual rounding |
13592 mode is determined by the runtime floating point environment. The rounding | 15760 mode is determined by the runtime floating-point environment. The rounding |
13593 mode argument is only intended as information to the compiler. | 15761 mode argument is only intended as information to the compiler. |
13594 | 15762 |
13595 | 15763 |
13596 '``llvm.experimental.constrained.nearbyint``' Intrinsic | 15764 '``llvm.experimental.constrained.nearbyint``' Intrinsic |
13597 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 15765 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
13608 | 15776 |
13609 Overview: | 15777 Overview: |
13610 """"""""" | 15778 """"""""" |
13611 | 15779 |
13612 The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first | 15780 The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first |
13613 operand rounded to the nearest integer. It will not raise an inexact floating | 15781 operand rounded to the nearest integer. It will not raise an inexact |
13614 point exception if the operand is not an integer. | 15782 floating-point exception if the operand is not an integer. |
13615 | 15783 |
13616 | 15784 |
13617 Arguments: | 15785 Arguments: |
13618 """""""""" | 15786 """""""""" |
13619 | 15787 |
13620 The first argument and the return value are floating point numbers of the same | 15788 The first argument and the return value are floating-point numbers of the same |
13621 type. | 15789 type. |
13622 | 15790 |
13623 The second and third arguments specify the rounding mode and exception | 15791 The second and third arguments specify the rounding mode and exception |
13624 behavior as described above. | 15792 behavior as described above. |
13625 | 15793 |
13627 """""""""" | 15795 """""""""" |
13628 | 15796 |
13629 This function returns the same values as the libm ``nearbyint`` functions | 15797 This function returns the same values as the libm ``nearbyint`` functions |
13630 would, and handles error conditions in the same way. The rounding mode is | 15798 would, and handles error conditions in the same way. The rounding mode is |
13631 described, not determined, by the rounding mode argument. The actual rounding | 15799 described, not determined, by the rounding mode argument. The actual rounding |
13632 mode is determined by the runtime floating point environment. The rounding | 15800 mode is determined by the runtime floating-point environment. The rounding |
13633 mode argument is only intended as information to the compiler. | 15801 mode argument is only intended as information to the compiler. |
15802 | |
15803 | |
15804 '``llvm.experimental.constrained.maxnum``' Intrinsic | |
15805 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
15806 | |
15807 Syntax: | |
15808 """"""" | |
15809 | |
15810 :: | |
15811 | |
15812 declare <type> | |
15813 @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>, | |
15814 metadata <rounding mode>, | |
15815 metadata <exception behavior>) | |
15816 | |
15817 Overview: | |
15818 """"""""" | |
15819 | |
15820 The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum | |
15821 of the two arguments. | |
15822 | |
15823 Arguments: | |
15824 """""""""" | |
15825 | |
15826 The first two arguments and the return value are floating-point numbers | |
15827 of the same type. | |
15828 | |
15829 The third and fourth arguments specify the rounding mode and exception | |
15830 behavior as described above. | |
15831 | |
15832 Semantics: | |
15833 """""""""" | |
15834 | |
15835 This function follows the IEEE-754 semantics for maxNum. The rounding mode is | |
15836 described, not determined, by the rounding mode argument. The actual rounding | |
15837 mode is determined by the runtime floating-point environment. The rounding | |
15838 mode argument is only intended as information to the compiler. | |
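
As an illustration of the signature above, a call on ``double`` might look like the following sketch. The function name ``@clamp_low`` is hypothetical. The ``.f64`` suffix is the usual overload mangling, and the metadata strings ``!"round.dynamic"`` and ``!"fpexcept.strict"`` are assumed to be among the rounding mode and exception behavior values described earlier in this document.

```llvm
; Overloaded intrinsic: the suffix names the floating-point type.
declare double @llvm.experimental.constrained.maxnum.f64(double, double,
                                                         metadata, metadata)

; Hypothetical helper: returns the larger of %x and %lo (maxNum semantics).
define double @clamp_low(double %x, double %lo) {
entry:
  %r = call double @llvm.experimental.constrained.maxnum.f64(
                       double %x, double %lo,
                       metadata !"round.dynamic",
                       metadata !"fpexcept.strict")
  ret double %r
}
```

Under IEEE-754 maxNum, if exactly one operand is a quiet NaN, the other operand is returned. The ``minnum`` variant is called the same way.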
15839 | |
15840 | |
15841 '``llvm.experimental.constrained.minnum``' Intrinsic | |
15842 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
15843 | |
15844 Syntax: | |
15845 """"""" | |
15846 | |
15847 :: | |
15848 | |
15849 declare <type> | |
15850 @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>, | |
15851 metadata <rounding mode>, | |
15852 metadata <exception behavior>) | |
15853 | |
15854 Overview: | |
15855 """"""""" | |
15856 | |
15857 The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum | |
15858 of the two arguments. | |
15859 | |
15860 Arguments: | |
15861 """""""""" | |
15862 | |
15863 The first two arguments and the return value are floating-point numbers | |
15864 of the same type. | |
15865 | |
15866 The third and fourth arguments specify the rounding mode and exception | |
15867 behavior as described above. | |
15868 | |
15869 Semantics: | |
15870 """""""""" | |
15871 | |
15872 This function follows the IEEE-754 semantics for minNum. The rounding mode is | |
15873 described, not determined, by the rounding mode argument. The actual rounding | |
15874 mode is determined by the runtime floating-point environment. The rounding | |
15875 mode argument is only intended as information to the compiler. | |
15876 | |
15877 | |
15878 '``llvm.experimental.constrained.ceil``' Intrinsic | |
15879 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
15880 | |
15881 Syntax: | |
15882 """"""" | |
15883 | |
15884 :: | |
15885 | |
15886 declare <type> | |
15887 @llvm.experimental.constrained.ceil(<type> <op1>, | |
15888 metadata <rounding mode>, | |
15889 metadata <exception behavior>) | |
15890 | |
15891 Overview: | |
15892 """"""""" | |
15893 | |
15894 The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the | |
15895 first operand. | |
15896 | |
15897 Arguments: | |
15898 """""""""" | |
15899 | |
15900 The first argument and the return value are floating-point numbers of the same | |
15901 type. | |
15902 | |
15903 The second and third arguments specify the rounding mode and exception | |
15904 behavior as described above. The rounding mode is currently unused for this | |
15905 intrinsic. | |
15906 | |
15907 Semantics: | |
15908 """""""""" | |
15909 | |
15910 This function returns the same values as the libm ``ceil`` functions | |
15911 would and handles error conditions in the same way. | |
15912 | |
15913 | |
15914 '``llvm.experimental.constrained.floor``' Intrinsic | |
15915 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
15916 | |
15917 Syntax: | |
15918 """"""" | |
15919 | |
15920 :: | |
15921 | |
15922 declare <type> | |
15923 @llvm.experimental.constrained.floor(<type> <op1>, | |
15924 metadata <rounding mode>, | |
15925 metadata <exception behavior>) | |
15926 | |
15927 Overview: | |
15928 """"""""" | |
15929 | |
15930 The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the | |
15931 first operand. | |
15932 | |
15933 Arguments: | |
15934 """""""""" | |
15935 | |
15936 The first argument and the return value are floating-point numbers of the same | |
15937 type. | |
15938 | |
15939 The second and third arguments specify the rounding mode and exception | |
15940 behavior as described above. The rounding mode is currently unused for this | |
15941 intrinsic. | |
15942 | |
15943 Semantics: | |
15944 """""""""" | |
15945 | |
15946 This function returns the same values as the libm ``floor`` functions | |
15947 would and handles error conditions in the same way. | |
15948 | |
15949 | |
15950 '``llvm.experimental.constrained.round``' Intrinsic | |
15951 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
15952 | |
15953 Syntax: | |
15954 """"""" | |
15955 | |
15956 :: | |
15957 | |
15958 declare <type> | |
15959 @llvm.experimental.constrained.round(<type> <op1>, | |
15960 metadata <rounding mode>, | |
15961 metadata <exception behavior>) | |
15962 | |
15963 Overview: | |
15964 """"""""" | |
15965 | |
15966 The '``llvm.experimental.constrained.round``' intrinsic returns the first | |
15967 operand rounded to the nearest integer. | |
15968 | |
15969 Arguments: | |
15970 """""""""" | |
15971 | |
15972 The first argument and the return value are floating-point numbers of the same | |
15973 type. | |
15974 | |
15975 The second and third arguments specify the rounding mode and exception | |
15976 behavior as described above. The rounding mode is currently unused for this | |
15977 intrinsic. | |
15978 | |
15979 Semantics: | |
15980 """""""""" | |
15981 | |
15982 This function returns the same values as the libm ``round`` functions | |
15983 would and handles error conditions in the same way. | |
15984 | |
15985 | |
15986 '``llvm.experimental.constrained.trunc``' Intrinsic | |
15987 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
15988 | |
15989 Syntax: | |
15990 """"""" | |
15991 | |
15992 :: | |
15993 | |
15994 declare <type> | |
15995 @llvm.experimental.constrained.trunc(<type> <op1>, | |
15996 metadata <rounding mode>, | |
15997 metadata <exception behavior>) | |
15998 | |
15999 Overview: | |
16000 """"""""" | |
16001 | |
16002 The '``llvm.experimental.constrained.trunc``' intrinsic returns the first | |
16003 operand rounded to the nearest integer not larger in magnitude than the | |
16004 operand. | |
16005 | |
16006 Arguments: | |
16007 """""""""" | |
16008 | |
16009 The first argument and the return value are floating-point numbers of the same | |
16010 type. | |
16011 | |
16012 The second and third arguments specify the rounding mode and exception | |
16013 behavior as described above. The rounding mode is currently unused for this | |
16014 intrinsic. | |
16015 | |
16016 Semantics: | |
16017 """""""""" | |
16018 | |
16019 This function returns the same values as the libm ``trunc`` functions | |
16020 would and handles error conditions in the same way. | |
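
The ``ceil``, ``floor``, ``round`` and ``trunc`` intrinsics above share one shape, so a single sketch covers them. The function names ``@round_up`` and ``@drop_fraction`` are hypothetical, and the metadata strings are assumed to be among the values described earlier in this document. A rounding mode argument is still passed, even though it is documented as unused for these intrinsics.

```llvm
declare float @llvm.experimental.constrained.ceil.f32(float, metadata, metadata)
declare float @llvm.experimental.constrained.trunc.f32(float, metadata, metadata)

; Hypothetical helper: smallest integral value not less than %x.
define float @round_up(float %x) {
entry:
  %r = call float @llvm.experimental.constrained.ceil.f32(
                      float %x,
                      metadata !"round.dynamic",
                      metadata !"fpexcept.strict")
  ret float %r
}

; Hypothetical helper: %x with its fractional part discarded.
define float @drop_fraction(float %x) {
entry:
  %r = call float @llvm.experimental.constrained.trunc.f32(
                      float %x,
                      metadata !"round.dynamic",
                      metadata !"fpexcept.strict")
  ret float %r
}
```

For reference, the corresponding libm functions give ``ceil`` 3.0, ``floor`` 2.0, ``round`` 3.0 and ``trunc`` 2.0 for an input of 2.5, and -2.0, -3.0, -3.0 and -2.0 for an input of -2.5.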
13634 | 16021 |
13635 | 16022 |
13636 General Intrinsics | 16023 General Intrinsics |
13637 ------------------ | 16024 ------------------ |
13638 | 16025 |
13774 Syntax: | 16161 Syntax: |
13775 """"""" | 16162 """"""" |
13776 | 16163 |
13777 :: | 16164 :: |
13778 | 16165 |
13779 declare void @llvm.trap() noreturn nounwind | 16166 declare void @llvm.trap() cold noreturn nounwind |
13780 | 16167 |
13781 Overview: | 16168 Overview: |
13782 """"""""" | 16169 """"""""" |
13783 | 16170 |
13784 The '``llvm.trap``' intrinsic. | 16171 The '``llvm.trap``' intrinsic. |
13899 Syntax: | 16286 Syntax: |
13900 """"""" | 16287 """"""" |
13901 | 16288 |
13902 :: | 16289 :: |
13903 | 16290 |
13904 declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>, i1 <nullunknown>) | 16291 declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>) |
13905 declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>, i1 <nullunknown>) | 16292 declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>) |
13906 | 16293 |
13907 Overview: | 16294 Overview: |
13908 """"""""" | 16295 """"""""" |
13909 | 16296 |
13910 The ``llvm.objectsize`` intrinsic is designed to provide information to | 16297 The ``llvm.objectsize`` intrinsic is designed to provide information to the |
13911 the optimizers to determine at compile time whether a) an operation | 16298 optimizer to determine whether a) an operation (like memcpy) will overflow a |
13912 (like memcpy) will overflow a buffer that corresponds to an object, or | 16299 buffer that corresponds to an object, or b) that a runtime check for overflow |
13913 b) that a runtime check for overflow isn't necessary. An object in this | 16300 isn't necessary. An object in this context means an allocation of a specific |
13914 context means an allocation of a specific class, structure, array, or | 16301 class, structure, array, or other object. |
13915 other object. | 16302 |
13916 | 16303 Arguments: |
13917 Arguments: | 16304 """""""""" |
13918 """""""""" | 16305 |
13919 | 16306 The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a |
13920 The ``llvm.objectsize`` intrinsic takes three arguments. The first argument is | 16307 pointer to or into the ``object``. The second argument determines whether |
13921 a pointer to or into the ``object``. The second argument determines whether | 16308 ``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is |
13922 ``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size | 16309 unknown. The third argument controls how ``llvm.objectsize`` acts when ``null`` |
13923 is unknown. The third argument controls how ``llvm.objectsize`` acts when | 16310 in address space 0 is used as its pointer argument. If it's ``false``, |
13924 ``null`` is used as its pointer argument. If it's true and the pointer is in | 16311 ``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if |
13925 address space 0, ``null`` is treated as an opaque value with an unknown number | 16312 the ``null`` is in a non-zero address space or if ``true`` is given for the |
13926 of bytes. Otherwise, ``llvm.objectsize`` reports 0 bytes available when given | 16313 third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth |
13927 ``null``. | 16314 argument to ``llvm.objectsize`` determines if the value should be evaluated at |
13928 | 16315 runtime. |
13929 The second and third arguments only accept constants. | 16316 |
13930 | 16317 The second, third, and fourth arguments only accept constants. |
13931 Semantics: | 16318 |
13932 """""""""" | 16319 Semantics: |
13933 | 16320 """""""""" |
13934 The ``llvm.objectsize`` intrinsic is lowered to a constant representing | 16321 |
13935 the size of the object concerned. If the size cannot be determined at | 16322 The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of |
13936 compile time, ``llvm.objectsize`` returns ``i32/i64 -1 or 0`` (depending | 16323 the object concerned. If the size cannot be determined, ``llvm.objectsize`` |
13937 on the ``min`` argument). | 16324 returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument). |
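
A minimal sketch of the four-argument form follows. The function name ``@remaining_bytes`` is hypothetical. In actual IR of this vintage the intrinsic name also carries a pointer-type suffix, so the declaration is written ``@llvm.objectsize.i64.p0i8`` rather than the abbreviated name shown in the syntax block.

```llvm
declare i64 @llvm.objectsize.i64.p0i8(i8*, i1, i1, i1)

define i64 @remaining_bytes() {
entry:
  %buf = alloca [16 x i8]
  ; Point 4 bytes into the 16-byte buffer.
  %p = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 4
  ; min = false:        report -1 if the size is unknown.
  ; nullunknown = true: a null pointer is treated as having unknown size.
  ; dynamic = false:    the value is not to be evaluated at runtime.
  %n = call i64 @llvm.objectsize.i64.p0i8(i8* %p, i1 false, i1 true, i1 false)
  ret i64 %n
}
```

Because the allocation is visible here, an optimizer would be expected to replace ``%n`` with 12, the number of bytes from ``%p`` to the end of the buffer. If the pointer came from an opaque source instead, the result would be -1 with these arguments.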
13938 | 16325 |
13939 '``llvm.expect``' Intrinsic | 16326 '``llvm.expect``' Intrinsic |
13940 ^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 16327 ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
13941 | 16328 |
13942 Syntax: | 16329 Syntax: |
13959 | 16346 |
13960 Arguments: | 16347 Arguments: |
13961 """""""""" | 16348 """""""""" |
13962 | 16349 |
13963 The ``llvm.expect`` intrinsic takes two arguments. The first argument is | 16350 The ``llvm.expect`` intrinsic takes two arguments. The first argument is |
13964 a value. The second argument is an expected value, this needs to be a | 16351 a value. The second argument is an expected value. |
13965 constant value, variables are not allowed. | |
13966 | 16352 |
13967 Semantics: | 16353 Semantics: |
13968 """""""""" | 16354 """""""""" |
13969 | 16355 |
13970 This intrinsic is lowered to the ``val``. | 16356 This intrinsic is lowered to the ``val``. |
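
As a sketch of typical use, the result of ``llvm.expect`` can feed a branch so that the optimizer knows which successor is likely. The function name ``@pick`` and the block names are hypothetical. The ``.i1`` suffix is the overload mangling for the value type.

```llvm
declare i1 @llvm.expect.i1(i1, i1)

define i32 @pick(i1 %c) {
entry:
  ; Tell the optimizer that %c is expected to be true.
  %e = call i1 @llvm.expect.i1(i1 %c, i1 true)
  br i1 %e, label %likely, label %unlikely

likely:
  ret i32 1

unlikely:
  ret i32 0
}
```

The program's result does not depend on the hint: ``%e`` always has the same value as ``%c``.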
14268 if"); and this allows for "check widening" type optimizations. | 16654 if"); and this allows for "check widening" type optimizations. |
14269 | 16655 |
14270 ``@llvm.experimental.guard`` cannot be invoked. | 16656 ``@llvm.experimental.guard`` cannot be invoked. |
14271 | 16657 |
14272 | 16658 |
16659 '``llvm.experimental.widenable.condition``' Intrinsic | |
16660 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
16661 | |
16662 Syntax: | |
16663 """"""" | |
16664 | |
16665 :: | |
16666 | |
16667 declare i1 @llvm.experimental.widenable.condition() | |
16668 | |
16669 Overview: | |
16670 """"""""" | |
16671 | |
16672 This intrinsic represents a "widenable condition", which is a | |
16673 boolean expression with the following property: whether this | |
16674 expression is `true` or `false`, the program is correct and | |
16675 well-defined. | |
16676 | |
16677 Together with :ref:`deoptimization operand bundles <deopt_opbundles>`, | |
16678 ``@llvm.experimental.widenable.condition`` allows frontends to | |
16679 express guards or checks on optimistic assumptions made during | |
16680 compilation and represent them as branch instructions on special | |
16681 conditions. | |
16682 | |
16683 While this may appear similar in semantics to `undef`, it is very | |
16684 different in that an invocation produces a particular, singular | |
16685 value. It is also intended to be lowered late, and remain available | |
16686 for specific optimizations and transforms that can benefit from its | |
16687 special properties. | |
16688 | |
16689 Arguments: | |
16690 """""""""" | |
16691 | |
16692 None. | |
16693 | |
16694 Semantics: | |
16695 """""""""" | |
16696 | |
16697 The intrinsic ``@llvm.experimental.widenable.condition()`` | |
16698 returns either `true` or `false`. For each evaluation of a call | |
16699 to this intrinsic, the program must be valid and correct both if | |
16700 it returns `true` and if it returns `false`. This allows | |
16701 transformation passes to replace evaluations of this intrinsic | |
16702 with either value whenever one is beneficial. | |
16703 | |
16704 When used in a branch condition, it allows us to choose between | |
16705 two alternative correct solutions for the same problem, as in | |
16706 the example below: | |
16707 | |
16708 .. code-block:: text | |
16709 | |
16710 %cond = call i1 @llvm.experimental.widenable.condition() | |
16711 br i1 %cond, label %fast_path, label %slow_path | |
16712 | |
16713 fast_path: | |
16714 ; Apply memory-consuming but fast solution for a task. | |
16715 | |
16716 slow_path: | |
16717 ; Cheap in memory but slow solution. | |
16718 | |
16719 Whether the intrinsic call returns `true` or `false`, | |
16720 it should be correct to pick either solution. We can switch | |
16721 between them by replacing the result of | |
16722 ``@llvm.experimental.widenable.condition`` with different | |
16723 `i1` expressions. | |
16724 | |
16725 This is how it can be used to represent guards as widenable branches: | |
16726 | |
16727 .. code-block:: text | |
16728 | |
16729 block: | |
16730 ; Unguarded instructions | |
16731 call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)] | |
16732 ; Guarded instructions | |
16733 | |
16734 This can be expressed in an equivalent form as an explicit branch using | |
16735 ``@llvm.experimental.widenable.condition``: | |
16736 | |
16737 .. code-block:: text | |
16738 | |
16739 block: | |
16740 ; Unguarded instructions | |
16741 %widenable_condition = call i1 @llvm.experimental.widenable.condition() | |
16742 %guard_condition = and i1 %cond, %widenable_condition | |
16743 br i1 %guard_condition, label %guarded, label %deopt | |
16744 | |
16745 guarded: | |
16746 ; Guarded instructions | |
16747 | |
16748 deopt: | |
16749 call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ] | |
16750 | |
16751 So the block `guarded` is only reachable when `%cond` is `true`, | |
16752 and it should be valid to branch to the block `deopt` regardless of | |
16753 whether `%cond` is `true` or `false`. | |
16754 | |
16755 ``@llvm.experimental.widenable.condition`` will never throw, thus | |
16756 it cannot be invoked. | |
16757 | |
16758 Guard widening: | |
16759 """"""""""""""" | |
16760 | |
16761 When ``@llvm.experimental.widenable.condition()`` is used in the | |
16762 condition of a guard represented as an explicit branch, it is | |
16763 legal to widen the guard's condition with any additional | |
16764 conditions. | |
16765 | |
16766 Guard widening replaces | |
16767 | |
16768 .. code-block:: text | |
16769 | |
16770 %widenable_cond = call i1 @llvm.experimental.widenable.condition() | |
16771 %guard_cond = and i1 %cond, %widenable_cond | |
16772 br i1 %guard_cond, label %guarded, label %deopt | |
16773 | |
16774 with | |
16775 | |
16776 .. code-block:: text | |
16777 | |
16778 %widenable_cond = call i1 @llvm.experimental.widenable.condition() | |
16779 %new_cond = and i1 %any_other_cond, %widenable_cond | |
16780 %new_guard_cond = and i1 %cond, %new_cond | |
16781 br i1 %new_guard_cond, label %guarded, label %deopt | |
16782 | |
16783 for this branch. Here `%any_other_cond` is an arbitrarily chosen | |
16784 well-defined `i1` value. By widening the guard, we may | |
16785 impose stricter conditions on the `guarded` block and bail to the | |
16786 `deopt` block when the new condition is not met. | |
16787 | |
16788 Lowering: | |
16789 """"""""" | |
16790 | |
16791 The default lowering strategy replaces the result of a | |
16792 call to ``@llvm.experimental.widenable.condition`` with the | |
16793 constant `true`. However, it is always correct to replace | |
16794 it with any other `i1` value. Any pass can | |
16795 freely do so if it can benefit from non-default lowering. | |
16796 | |
16797 | |
14273 '``llvm.load.relative``' Intrinsic | 16798 '``llvm.load.relative``' Intrinsic |
14274 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 16799 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
14275 | 16800 |
14276 Syntax: | 16801 Syntax: |
14277 """"""" | 16802 """"""" |
14322 Semantics: | 16847 Semantics: |
14323 """""""""" | 16848 """""""""" |
14324 | 16849 |
14325 This intrinsic actually does nothing, but optimizers must assume that it | 16850 This intrinsic actually does nothing, but optimizers must assume that it |
14326 has externally observable side effects. | 16851 has externally observable side effects. |
16852 | |
16853 '``llvm.is.constant.*``' Intrinsic | |
16854 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
16855 | |
16856 Syntax: | |
16857 """"""" | |
16858 | |
16859 This is an overloaded intrinsic. You can use llvm.is.constant with any argument type. | |
16860 | |
16861 :: | |
16862 | |
16863 declare i1 @llvm.is.constant.i32(i32 %operand) nounwind readnone | |
16864 declare i1 @llvm.is.constant.f32(float %operand) nounwind readnone | |
16865 declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind readnone | |
16866 | |
16867 Overview: | |
16868 """"""""" | |
16869 | |
16870 The '``llvm.is.constant``' intrinsic will return true if the argument | |
16871 is known to be a manifest compile-time constant. It is guaranteed to | |
16872 fold to either true or false before generating machine code. | |
16873 | |
16874 Semantics: | |
16875 """""""""" | |
16876 | |
16877 This intrinsic generates no code. If its argument is known to be a | |
16878 manifest compile-time constant value, then the intrinsic will be | |
16879 converted to a constant true value. Otherwise, it will be converted to | |
16880 a constant false value. | |
16881 | |
16882 In particular, note that if the argument is a constant expression | |
16883 which refers to a global (the address of which *is* a constant, but | |
16884 not manifest during the compile), then the intrinsic evaluates to | |
16885 false. | |
16886 | |
16887 The result also intentionally depends on the result of optimization | |
16888 passes -- e.g., the result can change depending on whether a | |
16889 function gets inlined or not. A function's parameters are | |
16890 obviously not constant. However, a call like | |
16891 ``llvm.is.constant.i32(i32 %param)`` *can* return true after the | |
16892 function is inlined, if the value passed to the function parameter was | |
16893 a constant. | |
16894 | |
16895 On the other hand, if constant folding is not run, it will never | |
16896 evaluate to true, even in simple cases. | |
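
The two cases discussed above can be sketched as follows. The function names ``@check_literal`` and ``@check_param`` are hypothetical.

```llvm
declare i1 @llvm.is.constant.i32(i32)

; The argument is a literal, so the call is expected to fold to true.
define i1 @check_literal() {
entry:
  %c = call i1 @llvm.is.constant.i32(i32 42)
  ret i1 %c
}

; The argument is a function parameter, so on its own this folds to false.
; If @check_param is inlined into a caller that passes a constant, the
; inlined copy of the call can fold to true instead.
define i1 @check_param(i32 %x) {
entry:
  %c = call i1 @llvm.is.constant.i32(i32 %x)
  ret i1 %c
}
```

In both functions the call has been replaced by a constant by the time machine code is generated, so no code is emitted for the intrinsic itself.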
14327 | 16897 |
14328 Stack Map Intrinsics | 16898 Stack Map Intrinsics |
14329 -------------------- | 16899 -------------------- |
14330 | 16900 |
14331 LLVM provides experimental intrinsics to support runtime patching | 16901 LLVM provides experimental intrinsics to support runtime patching |
14558 In the most general case call to the '``llvm.memset.element.unordered.atomic.*``' is | 17128 In the most general case call to the '``llvm.memset.element.unordered.atomic.*``' is |
14559 lowered to a call to the symbol ``__llvm_memset_element_unordered_atomic_*``. Where '*' | 17129 lowered to a call to the symbol ``__llvm_memset_element_unordered_atomic_*``. Where '*' |
14560 is replaced with an actual element size. | 17130 is replaced with an actual element size. |
14561 | 17131 |
14562 The optimizer is allowed to inline the memory assignment when it's profitable to do so. | 17132 The optimizer is allowed to inline the memory assignment when it's profitable to do so. |
17133 | |
17134 Objective-C ARC Runtime Intrinsics | |
17135 ---------------------------------- | |
17136 | |
17137 LLVM provides intrinsics that lower to Objective-C ARC runtime entry points. | |
17138 LLVM is aware of the semantics of these functions, and optimizes based on that | |
17139 knowledge. You can read more about the details of Objective-C ARC `here | |
17140 <https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_. | |
17141 | |
17142 '``llvm.objc.autorelease``' Intrinsic | |
17143 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17144 | |
17145 Syntax: | |
17146 """"""" | |
17147 :: | |
17148 | |
17149 declare i8* @llvm.objc.autorelease(i8*) | |
17150 | |
17151 Lowering: | |
17152 """"""""" | |
17153 | |
17154 Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_. | |
17155 | |
17156 '``llvm.objc.autoreleasePoolPop``' Intrinsic | |
17157 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17158 | |
17159 Syntax: | |
17160 """"""" | |
17161 :: | |
17162 | |
17163 declare void @llvm.objc.autoreleasePoolPop(i8*) | |
17164 | |
17165 Lowering: | |
17166 """"""""" | |
17167 | |
17168 Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_. | |
17169 | |
17170 '``llvm.objc.autoreleasePoolPush``' Intrinsic | |
17171 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17172 | |
17173 Syntax: | |
17174 """"""" | |
17175 :: | |
17176 | |
17177 declare i8* @llvm.objc.autoreleasePoolPush() | |
17178 | |
17179 Lowering: | |
17180 """"""""" | |
17181 | |
17182 Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_. | |
17183 | |
17184 '``llvm.objc.autoreleaseReturnValue``' Intrinsic | |
17185 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17186 | |
17187 Syntax: | |
17188 """"""" | |
17189 :: | |
17190 | |
17191 declare i8* @llvm.objc.autoreleaseReturnValue(i8*) | |
17192 | |
17193 Lowering: | |
17194 """"""""" | |
17195 | |
17196 Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_. | |
17197 | |
17198 '``llvm.objc.copyWeak``' Intrinsic | |
17199 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17200 | |
17201 Syntax: | |
17202 """"""" | |
17203 :: | |
17204 | |
17205 declare void @llvm.objc.copyWeak(i8**, i8**) | |
17206 | |
17207 Lowering: | |
17208 """"""""" | |
17209 | |
17210 Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_. | |
17211 | |
17212 '``llvm.objc.destroyWeak``' Intrinsic | |
17213 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17214 | |
17215 Syntax: | |
17216 """"""" | |
17217 :: | |
17218 | |
17219 declare void @llvm.objc.destroyWeak(i8**) | |
17220 | |
17221 Lowering: | |
17222 """"""""" | |
17223 | |
17224 Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_. | |
17225 | |
17226 '``llvm.objc.initWeak``' Intrinsic | |
17227 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17228 | |
17229 Syntax: | |
17230 """"""" | |
17231 :: | |
17232 | |
17233 declare i8* @llvm.objc.initWeak(i8**, i8*) | |
17234 | |
17235 Lowering: | |
17236 """"""""" | |
17237 | |
17238 Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_. | |
17239 | |
17240 '``llvm.objc.loadWeak``' Intrinsic | |
17241 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17242 | |
17243 Syntax: | |
17244 """"""" | |
17245 :: | |
17246 | |
17247 declare i8* @llvm.objc.loadWeak(i8**) | |
17248 | |
17249 Lowering: | |
17250 """"""""" | |
17251 | |
17252 Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_. | |
17253 | |
17254 '``llvm.objc.loadWeakRetained``' Intrinsic | |
17255 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17256 | |
17257 Syntax: | |
17258 """"""" | |
17259 :: | |
17260 | |
17261 declare i8* @llvm.objc.loadWeakRetained(i8**) | |
17262 | |
17263 Lowering: | |
17264 """"""""" | |
17265 | |
17266 Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_. | |
17267 | |
17268 '``llvm.objc.moveWeak``' Intrinsic | |
17269 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17270 | |
17271 Syntax: | |
17272 """"""" | |
17273 :: | |
17274 | |
17275 declare void @llvm.objc.moveWeak(i8**, i8**) | |
17276 | |
17277 Lowering: | |
17278 """"""""" | |
17279 | |
17280 Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_. | |
17281 | |
17282 '``llvm.objc.release``' Intrinsic | |
17283 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17284 | |
17285 Syntax: | |
17286 """"""" | |
17287 :: | |
17288 | |
17289 declare void @llvm.objc.release(i8*) | |
17290 | |
17291 Lowering: | |
17292 """"""""" | |
17293 | |
17294 Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_. | |
17295 | |
17296 '``llvm.objc.retain``' Intrinsic | |
17297 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17298 | |
17299 Syntax: | |
17300 """"""" | |
17301 :: | |
17302 | |
17303 declare i8* @llvm.objc.retain(i8*) | |
17304 | |
17305 Lowering: | |
17306 """"""""" | |
17307 | |
17308 Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_. | |
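
A minimal sketch of what this lowering looks like in IR. The function name ``@keep`` and the value ``%obj`` are assumptions made for this example, and any attributes the lowering pass may add to the call are not shown.

.. code-block:: llvm

      declare i8* @llvm.objc.retain(i8*)

      define i8* @keep(i8* %obj) {
      entry:
        ; Before lowering: the retain is expressed through the intrinsic.
        %r = call i8* @llvm.objc.retain(i8* %obj)
        ret i8* %r
      }

      ; After lowering, the call above would target the runtime function instead:
      ;   %r = call i8* @objc_retain(i8* %obj)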
17309 | |
17310 '``llvm.objc.retainAutorelease``' Intrinsic | |
17311 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17312 | |
17313 Syntax: | |
17314 """"""" | |
17315 :: | |
17316 | |
17317 declare i8* @llvm.objc.retainAutorelease(i8*) | |
17318 | |
17319 Lowering: | |
17320 """"""""" | |
17321 | |
17322 Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_. | |
17323 | |
17324 '``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic | |
17325 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17326 | |
17327 Syntax: | |
17328 """"""" | |
17329 :: | |
17330 | |
17331 declare i8* @llvm.objc.retainAutoreleaseReturnValue(i8*) | |
17332 | |
17333 Lowering: | |
17334 """"""""" | |
17335 | |
17336 Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_. | |
17337 | |
17338 '``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic | |
17339 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17340 | |
17341 Syntax: | |
17342 """"""" | |
17343 :: | |
17344 | |
17345 declare i8* @llvm.objc.retainAutoreleasedReturnValue(i8*) | |
17346 | |
17347 Lowering: | |
17348 """"""""" | |
17349 | |
17350 Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_. | |
17351 | |
17352 '``llvm.objc.retainBlock``' Intrinsic | |
17353 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17354 | |
17355 Syntax: | |
17356 """"""" | |
17357 :: | |
17358 | |
17359 declare i8* @llvm.objc.retainBlock(i8*) | |
17360 | |
17361 Lowering: | |
17362 """"""""" | |
17363 | |
17364 Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_. | |
17365 | |
17366 '``llvm.objc.storeStrong``' Intrinsic | |
17367 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17368 | |
17369 Syntax: | |
17370 """"""" | |
17371 :: | |
17372 | |
17373 declare void @llvm.objc.storeStrong(i8**, i8*) | |
17374 | |
17375 Lowering: | |
17376 """"""""" | |
17377 | |
17378 Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_. | |
17379 | |
17380 '``llvm.objc.storeWeak``' Intrinsic | |
17381 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17382 | |
17383 Syntax: | |
17384 """"""" | |
17385 :: | |
17386 | |
17387 declare i8* @llvm.objc.storeWeak(i8**, i8*) | |
17388 | |
17389 Lowering: | |
17390 """"""""" | |
17391 | |
17392 Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_. | |
17393 | |
17394 Preserving Debug Information Intrinsics | |
17395 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17396 | |
17397 These intrinsics carry certain debuginfo together with | |
17398 IR-level operations. For example, it may be desirable to | |
17399 know the structure/union name and the original user-level field | |
17400 indices. Such information is lost in the IR ``getelementptr`` instruction | |
17401 because the IR types differ from the debuginfo types and unions | |
17402 are converted to structs in IR. | |
17403 | |
17404 '``llvm.preserve.array.access.index``' Intrinsic | |
17405 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17406 | |
17407 Syntax: | |
17408 """"""" | |
17409 :: | |
17410 | |
17411 declare <ret_type> | |
17412 @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base, | |
17413 i32 dim, | |
17414 i32 index) | |
17415 | |
17416 Overview: | |
17417 """"""""" | |
17418 | |
17419 The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address | |
17420 based on array base ``base``, array dimension ``dim`` and the last access index ``index`` | |
17421 into the array. The return type ``ret_type`` is a pointer type to the array element. | |
17422 The array ``dim`` and ``index`` are preserved, which is more robust than a | |
17423 ``getelementptr`` instruction, which may be subject to compiler transformation. | |
17424 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction | |
17425 to provide the array or pointer debuginfo type. | |
17426 The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the | |
17427 debuginfo version of ``type``. | |
17428 | |
17429 Arguments: | |
17430 """""""""" | |
17431 | |
17432 The ``base`` is the array base address. The ``dim`` is the array dimension. | |
17433 The ``base`` is a pointer if ``dim`` equals 0. | |
17434 The ``index`` is the last access index into the array or pointer. | |
17435 | |
17436 Semantics: | |
17437 """""""""" | |
17438 | |
17439 The '``llvm.preserve.array.access.index``' intrinsic produces the same result | |
17440 as a ``getelementptr`` with base ``base`` and access operands consisting of ``dim`` zeros followed by ``index``. | |
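
As a hedged illustration of this equivalence, the sketch below writes out the plain ``getelementptr`` forms that two cases would correspond to: an array base with ``dim`` = 1 and a pointer base with ``dim`` = 0, both with ``index`` = 3. The function and value names (``@array_elem``, ``@ptr_elem``, ``%arr``, ``%ptr``) are assumptions made for this example, the debuginfo metadata is omitted, and the typed-pointer syntax follows this revision of the document.

.. code-block:: llvm

      ; dim = 1: one leading zero operand, then index = 3.
      define i32* @array_elem([10 x i32]* %arr) {
      entry:
        %p = getelementptr [10 x i32], [10 x i32]* %arr, i32 0, i32 3
        ret i32* %p
      }

      ; dim = 0: the base is a plain pointer, so index = 3 is the only operand.
      define i32* @ptr_elem(i32* %ptr) {
      entry:
        %p = getelementptr i32, i32* %ptr, i32 3
        ret i32* %p
      }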
17441 | |
17442 '``llvm.preserve.union.access.index``' Intrinsic | |
17443 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17444 | |
17445 Syntax: | |
17446 """"""" | |
17447 :: | |
17448 | |
17449 declare <type> | |
17450 @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base, | |
17451 i32 di_index) | |
17452 | |
17453 Overview: | |
17454 """"""""" | |
17455 | |
17456 The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index | |
17457 ``di_index`` and returns the ``base`` address. | |
17458 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction | |
17459 to provide the union debuginfo type. | |
17460 The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``. | |
17461 The return type ``type`` is the same as the ``base`` type. | |
17462 | |
17463 Arguments: | |
17464 """""""""" | |
17465 | |
17466 The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo. | |
17467 | |
17468 Semantics: | |
17469 """""""""" | |
17470 | |
17471 The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address. | |
17472 | |
17473 '``llvm.preserve.struct.access.index``' Intrinsic | |
17474 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
17475 | |
17476 Syntax: | |
17477 """"""" | |
17478 :: | |
17479 | |
17480 declare <ret_type> | |
17481 @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base, | |
17482 i32 gep_index, | |
17483 i32 di_index) | |
17484 | |
17485 Overview: | |
17486 """"""""" | |
17487 | |
17488 The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address | |
17489 based on struct base ``base`` and IR struct member index ``gep_index``. | |
17490 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction | |
17491 to provide the struct debuginfo type. | |
17492 The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``. | |
17493 The return type ``ret_type`` is a pointer type to the structure member. | |
17494 | |
17495 Arguments: | |
17496 """""""""" | |
17497 | |
17498 The ``base`` is the structure base address. The ``gep_index`` is the struct member index | |
17499 based on IR structures. The ``di_index`` is the struct member index based on debuginfo. | |
17500 | |
17501 Semantics: | |
17502 """""""""" | |
17503 | |
17504 The '``llvm.preserve.struct.access.index``' intrinsic produces the same result | |
17505 as a getelementptr with base ``base`` and access operands ``{0, gep_index}``. |
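
A minimal sketch of the equivalent ``getelementptr``, assuming a hypothetical two-member struct ``%struct.pair`` and ``gep_index`` = 1. The type, function, and value names are assumptions made for this example, and the debuginfo metadata is omitted. Note that ``di_index`` does not appear among the ``getelementptr`` operands; only ``gep_index`` determines the address.

.. code-block:: llvm

      %struct.pair = type { i32, i64 }

      ; The leading 0 steps through the base pointer; gep_index = 1 selects
      ; the second IR member of the struct.
      define i64* @second_member(%struct.pair* %base) {
      entry:
        %p = getelementptr %struct.pair, %struct.pair* %base, i32 0, i32 1
        ret i64* %p
      }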