Mercurial > hg > CbC > CbC_llvm
comparison docs/Statepoints.rst @ 122:36195a0db682
merging ( incomplete )
author | Shinji KONO <kono@ie.u-ryukyu.ac.jp> |
---|---|
date | Fri, 17 Nov 2017 20:32:31 +0900 |
parents | 803732b1fca8 |
children | c2174574ed3a |
comparison
119:d9df2cbd60cd | 122:36195a0db682 |
---|---|
7 :depth: 2 | 7 :depth: 2 |
8 | 8 |
9 Status | 9 Status |
10 ======= | 10 ======= |
11 | 11 |
12 This document describes a set of experimental extensions to LLVM. Use | 12 This document describes a set of extensions to LLVM to support garbage |
13 with caution. Because the intrinsics have experimental status, | 13 collection. By now, these mechanisms are well proven with commercial java |
14 compatibility across LLVM releases is not guaranteed. | 14 implementation with a fully relocating collector having shipped using them. |
15 | 15 There are a couple places where bugs might still linger; these are called out |
16 LLVM currently supports an alternate mechanism for conservative | 16 below. |
17 garbage collection support using the ``gcroot`` intrinsic. The mechanism | 17 |
18 described here shares little in common with the alternate ``gcroot`` | 18 They are still listed as "experimental" to indicate that no forward or backward |
19 implementation and it is hoped that this mechanism will eventually | 19 compatibility guarantees are offered across versions. If your use case is such |
20 replace the gc_root mechanism. | 20 that you need some form of forward compatibility guarantee, please raise the |
21 issue on the llvm-dev mailing list. | |
22 | |
23 LLVM still supports an alternate mechanism for conservative garbage collection | |
24 support using the ``gcroot`` intrinsic. The ``gcroot`` mechanism is mostly of | |
25 historical interest at this point with one exception - its implementation of | |
26 shadow stacks has been used successfully by a number of language frontends and | |
27 is still supported. | |
21 | 28 |
22 Overview | 29 Overview |
23 ======== | 30 ======== |
24 | 31 |
25 To collect dead objects, garbage collectors must be able to identify | 32 To collect dead objects, garbage collectors must be able to identify |
84 #. identify which object each pointer relates to, and | 91 #. identify which object each pointer relates to, and |
85 #. potentially update each of those copies. | 92 #. potentially update each of those copies. |
86 | 93 |
87 This document describes the mechanism by which an LLVM based compiler | 94 This document describes the mechanism by which an LLVM based compiler |
88 can provide this information to a language runtime/collector, and | 95 can provide this information to a language runtime/collector, and |
89 ensure that all pointers can be read and updated if desired. The | 96 ensure that all pointers can be read and updated if desired. |
90 heart of the approach is to construct (or rewrite) the IR in a manner | 97 |
91 where the possible updates performed by the garbage collector are | 98 At a high level, LLVM has been extended to support compiling to an abstract |
99 machine which extends the actual target with a non-integral pointer type | |
100 suitable for representing a garbage collected reference to an object. In | |
101 particular, such non-integral pointer types have no defined mapping to an | |
102 integer representation. This semantic quirk allows the runtime to pick an | |
103 integer mapping for each point in the program allowing relocations of objects | |
104 without visible effects. | |
105 | |
106 Warning: Non-Integral Pointer Types are a newly added concept in LLVM IR. | |
107 It's possible that we've missed disabling some of the optimizations which | |
108 assume an integral value for pointers. If you find such a case, please | |
109 file a bug or share a patch. | |
110 | |
111 Warning: There is one currently known semantic hole in the definition of | |
112 non-integral pointers which has not been addressed upstream. To work around | |
113 this, you need to disable speculation of loads unless the memory type | |
114 (non-integral pointer vs anything else) is known to be unchanged. That is, it is | |
115 not safe to speculate a load if doing so causes a non-integral pointer value to | |
116 be loaded as any other type or vice versa. In practice, this restriction is | |
117 well isolated to isSafeToSpeculate in ValueTracking.cpp. | |
118 | |
119 This high level abstract machine model is used for most of the LLVM optimizer. | |
120 Before starting code generation, we switch representations to an explicit form. | |
121 In theory, a frontend could directly generate this low level explicit form, but | |
122 doing so is likely to inhibit optimization. | |
123 | |
124 The heart of the explicit approach is to construct (or rewrite) the IR in a | |
125 manner where the possible updates performed by the garbage collector are | |
92 explicitly visible in the IR. Doing so requires that we: | 126 explicitly visible in the IR. Doing so requires that we: |
93 | 127 |
94 #. create a new SSA value for each potentially relocated pointer, and | 128 #. create a new SSA value for each potentially relocated pointer, and |
95 ensure that no use of the original (non relocated) value is | 129 ensure that no use of the original (non relocated) value is |
96 reachable after the safepoint, | 130 reachable after the safepoint, |
102 associated with) for each statepoint. | 136 associated with) for each statepoint. |
103 | 137 |
104 At the most abstract level, inserting a safepoint can be thought of as | 138 At the most abstract level, inserting a safepoint can be thought of as |
105 replacing a call instruction with a call to a multiple return value | 139 replacing a call instruction with a call to a multiple return value |
106 function which both calls the original target of the call, returns | 140 function which both calls the original target of the call, returns |
107 it's result, and returns updated values for any live pointers to | 141 its result, and returns updated values for any live pointers to |
108 garbage collected objects. | 142 garbage collected objects. |
109 | 143 |
110 Note that the task of identifying all live pointers to garbage | 144 Note that the task of identifying all live pointers to garbage |
111 collected values, transforming the IR to expose a pointer giving the | 145 collected values, transforming the IR to expose a pointer giving the |
112 base object for every such live pointer, and inserting all the | 146 base object for every such live pointer, and inserting all the |
198 .byte 2 | 232 .byte 2 |
199 .byte 8 | 233 .byte 8 |
200 .short 7 | 234 .short 7 |
201 .long 0 | 235 .long 0 |
202 | 236 |
203 This example was taken from the tests for the :ref:`RewriteStatepointsForGC` utility pass. As such, it's full StackMap can be easily examined with the following command. | 237 This example was taken from the tests for the :ref:`RewriteStatepointsForGC` |
238 utility pass. As such, its full StackMap can be easily examined with the | |
239 following command. | |
204 | 240 |
205 .. code-block:: bash | 241 .. code-block:: bash |
206 | 242 |
207 opt -rewrite-statepoints-for-gc test/Transforms/RewriteStatepointsForGC/basics.ll -S | llc -debug-only=stackmaps | 243 opt -rewrite-statepoints-for-gc test/Transforms/RewriteStatepointsForGC/basics.ll -S | llc -debug-only=stackmaps |
208 | 244 |
260 | 296 |
261 As a practical consideration, many garbage-collected systems allow code that is | 297 As a practical consideration, many garbage-collected systems allow code that is |
262 collector-aware ("managed code") to call code that is not collector-aware | 298 collector-aware ("managed code") to call code that is not collector-aware |
263 ("unmanaged code"). It is common that such calls must also be safepoints, since | 299 ("unmanaged code"). It is common that such calls must also be safepoints, since |
264 it is desirable to allow the collector to run during the execution of | 300 it is desirable to allow the collector to run during the execution of |
265 unmanaged code. Futhermore, it is common that coordinating the transition from | 301 unmanaged code. Furthermore, it is common that coordinating the transition from |
266 managed to unmanaged code requires extra code generation at the call site to | 302 managed to unmanaged code requires extra code generation at the call site to |
267 inform the collector of the transition. In order to support these needs, a | 303 inform the collector of the transition. In order to support these needs, a |
268 statepoint may be marked as a GC transition, and data that is necessary to | 304 statepoint may be marked as a GC transition, and data that is necessary to |
269 perform the transition (if any) may be provided as additional arguments to the | 305 perform the transition (if any) may be provided as additional arguments to the |
270 statepoint. | 306 statepoint. |
534 | 570 |
535 Semantics: | 571 Semantics: |
536 """""""""" | 572 """""""""" |
537 | 573 |
538 The return value of ``gc.relocate`` is the potentially relocated value | 574 The return value of ``gc.relocate`` is the potentially relocated value |
539 of the pointer specified by it's arguments. It is unspecified how the | 575 of the pointer specified by its arguments. It is unspecified how the |
540 value of the returned pointer relates to the argument to the | 576 value of the returned pointer relates to the argument to the |
541 ``gc.statepoint`` other than that a) it points to the same source | 577 ``gc.statepoint`` other than that a) it points to the same source |
542 language object with the same offset, and b) the 'based-on' | 578 language object with the same offset, and b) the 'based-on' |
543 relationship of the newly relocated pointers is a projection of the | 579 relationship of the newly relocated pointers is a projection of the |
544 unrelocated pointers. In particular, the integer value of the pointer | 580 unrelocated pointers. In particular, the integer value of the pointer |
652 .. _RewriteStatepointsForGC: | 688 .. _RewriteStatepointsForGC: |
653 | 689 |
654 RewriteStatepointsForGC | 690 RewriteStatepointsForGC |
655 ^^^^^^^^^^^^^^^^^^^^^^^^ | 691 ^^^^^^^^^^^^^^^^^^^^^^^^ |
656 | 692 |
657 The pass RewriteStatepointsForGC transforms a functions IR by replacing a | 693 The pass RewriteStatepointsForGC transforms a function's IR to lower from the |
658 ``gc.statepoint`` (with an optional ``gc.result``) with a full relocation | 694 abstract machine model described above to the explicit statepoint model of |
659 sequence, including all required ``gc.relocates``. To function, the pass | 695 relocations. To do this, it replaces all calls or invokes of functions which |
660 requires that the GC strategy specified for the function be able to reliably | 696 might contain a safepoint poll with a ``gc.statepoint`` and associated full |
661 distinguish between GC references and non-GC references in IR it is given. | 697 relocation sequence, including all required ``gc.relocates``. |
698 | |
699 Note that by default, this pass only runs for the "statepoint-example" or | |
700 "core-clr" gc strategies. You will need to add your custom strategy to this | |
701 whitelist or use one of the predefined ones. | |
662 | 702 |
663 As an example, given this code: | 703 As an example, given this code: |
664 | 704 |
665 .. code-block:: llvm | 705 .. code-block:: llvm |
666 | 706 |
667 define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) | 707 define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) |
668 gc "statepoint-example" { | 708 gc "statepoint-example" { |
669 call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 5, i32 0, i32 -1, i32 0, i32 0, i32 0) | 709 call void @foo() |
670 ret i8 addrspace(1)* %obj | 710 ret i8 addrspace(1)* %obj |
671 } | 711 } |
672 | 712 |
673 The pass would produce this IR: | 713 The pass would produce this IR: |
674 | 714 |
681 ret i8 addrspace(1)* %obj.relocated | 721 ret i8 addrspace(1)* %obj.relocated |
682 } | 722 } |
683 | 723 |
684 In the above examples, the addrspace(1) marker on the pointers is the mechanism | 724 In the above examples, the addrspace(1) marker on the pointers is the mechanism |
685 that the ``statepoint-example`` GC strategy uses to distinguish references from | 725 that the ``statepoint-example`` GC strategy uses to distinguish references from |
686 non references. Address space 1 is not globally reserved for this purpose. | 726 non references. The pass assumes that all addrspace(1) pointers are non-integral |
727 pointer types. Address space 1 is not globally reserved for this purpose. | |
687 | 728 |
688 This pass can be used as a utility function by a language frontend that doesn't | 729 This pass can be used as a utility function by a language frontend that doesn't |
689 want to manually reason about liveness, base pointers, or relocation when | 730 want to manually reason about liveness, base pointers, or relocation when |
690 constructing IR. As currently implemented, RewriteStatepointsForGC must be | 731 constructing IR. As currently implemented, RewriteStatepointsForGC must be |
691 run after SSA construction (i.e. mem2reg). | 732 run after SSA construction (i.e. mem2reg). |
699 as null) are also assumed to be base pointers. In practice, this constraint | 740 as null) are also assumed to be base pointers. In practice, this constraint |
700 can be relaxed to producing interior derived pointers provided the target | 741 can be relaxed to producing interior derived pointers provided the target |
701 collector can find the associated allocation from an arbitrary interior | 742 collector can find the associated allocation from an arbitrary interior |
702 derived pointer. | 743 derived pointer. |
703 | 744 |
704 In practice, RewriteStatepointsForGC can be run much later in the pass | 745 By default RewriteStatepointsForGC passes in ``0xABCDEF00`` as the statepoint |
705 pipeline, after most optimization is already done. This helps to improve | |
706 the quality of the generated code when compiled with garbage collection support. | |
707 In the long run, this is the intended usage model. At this time, a few details | |
708 have yet to be worked out about the semantic model required to guarantee this | |
709 is always correct. As such, please use with caution and report bugs. | |
710 | |
711 .. _PlaceSafepoints: | |
712 | |
713 PlaceSafepoints | |
714 ^^^^^^^^^^^^^^^^ | |
715 | |
716 The pass PlaceSafepoints transforms a function's IR by replacing any call or | |
717 invoke instructions with appropriate ``gc.statepoint`` and ``gc.result`` pairs, | |
718 and inserting safepoint polls sufficient to ensure running code checks for a | |
719 safepoint request on a timely manner. This pass is expected to be run before | |
720 RewriteStatepointsForGC and thus does not produce full relocation sequences. | |
721 | |
722 As an example, given input IR of the following: | |
723 | |
724 .. code-block:: llvm | |
725 | |
726 define void @test() gc "statepoint-example" { | |
727 call void @foo() | |
728 ret void | |
729 } | |
730 | |
731 declare void @do_safepoint() | |
732 define void @gc.safepoint_poll() { | |
733 call void @do_safepoint() | |
734 ret void | |
735 } | |
736 | |
737 | |
738 This pass would produce the following IR: | |
739 | |
740 .. code-block:: llvm | |
741 | |
742 define void @test() gc "statepoint-example" { | |
743 %safepoint_token = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @do_safepoint, i32 0, i32 0, i32 0, i32 0) | |
744 %safepoint_token1 = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0) | |
745 ret void | |
746 } | |
747 | |
748 In this case, we've added an (unconditional) entry safepoint poll and converted the call into a ``gc.statepoint``. Note that despite appearances, the entry poll is not necessarily redundant. We'd have to know that ``foo`` and ``test`` were not mutually recursive for the poll to be redundant. In practice, you'd probably want to your poll definition to contain a conditional branch of some form. | |
749 | |
750 | |
751 At the moment, PlaceSafepoints can insert safepoint polls at method entry and | |
752 loop backedges locations. Extending this to work with return polls would be | |
753 straight forward if desired. | |
754 | |
755 PlaceSafepoints includes a number of optimizations to avoid placing safepoint | |
756 polls at particular sites unless needed to ensure timely execution of a poll | |
757 under normal conditions. PlaceSafepoints does not attempt to ensure timely | |
758 execution of a poll under worst case conditions such as heavy system paging. | |
759 | |
760 The implementation of a safepoint poll action is specified by looking up a | |
761 function of the name ``gc.safepoint_poll`` in the containing Module. The body | |
762 of this function is inserted at each poll site desired. While calls or invokes | |
763 inside this method are transformed to a ``gc.statepoints``, recursive poll | |
764 insertion is not performed. | |
765 | |
766 By default PlaceSafepoints passes in ``0xABCDEF00`` as the statepoint | |
767 ID and ``0`` as the number of patchable bytes to the newly constructed | 746 ID and ``0`` as the number of patchable bytes to the newly constructed |
768 ``gc.statepoint``. These values can be configured on a per-callsite | 747 ``gc.statepoint``. These values can be configured on a per-callsite |
769 basis using the attributes ``"statepoint-id"`` and | 748 basis using the attributes ``"statepoint-id"`` and |
770 ``"statepoint-num-patch-bytes"``. If a call site is marked with a | 749 ``"statepoint-num-patch-bytes"``. If a call site is marked with a |
771 ``"statepoint-id"`` function attribute and its value is a positive | 750 ``"statepoint-id"`` function attribute and its value is a positive |
776 bytes' parameter of the newly constructed ``gc.statepoint``. The | 755 bytes' parameter of the newly constructed ``gc.statepoint``. The |
777 ``"statepoint-id"`` and ``"statepoint-num-patch-bytes"`` attributes | 756 ``"statepoint-id"`` and ``"statepoint-num-patch-bytes"`` attributes |
778 are not propagated to the ``gc.statepoint`` call or invoke if they | 757 are not propagated to the ``gc.statepoint`` call or invoke if they |
779 could be successfully parsed. | 758 could be successfully parsed. |
780 | 759 |
781 If you are scheduling the RewriteStatepointsForGC pass late in the pass order, | 760 In practice, RewriteStatepointsForGC should be run much later in the pass |
782 you should probably schedule this pass immediately before it. The exception | 761 pipeline, after most optimization is already done. This helps to improve |
783 would be if you need to preserve abstract frame information (e.g. for | 762 the quality of the generated code when compiled with garbage collection support. |
784 deoptimization or introspection) at safepoints. In that case, ask on the | 763 |
785 llvm-dev mailing list for suggestions. | 764 .. _PlaceSafepoints: |
765 | |
766 PlaceSafepoints | |
767 ^^^^^^^^^^^^^^^^ | |
768 | |
769 The pass PlaceSafepoints inserts safepoint polls sufficient to ensure running | |
770 code checks for a safepoint request in a timely manner. This pass is expected | |
771 to be run before RewriteStatepointsForGC and thus does not produce full | |
772 relocation sequences. | |
773 | |
774 As an example, given input IR of the following: | |
775 | |
776 .. code-block:: llvm | |
777 | |
778 define void @test() gc "statepoint-example" { | |
779 call void @foo() | |
780 ret void | |
781 } | |
782 | |
783 declare void @do_safepoint() | |
784 define void @gc.safepoint_poll() { | |
785 call void @do_safepoint() | |
786 ret void | |
787 } | |
788 | |
789 | |
790 This pass would produce the following IR: | |
791 | |
792 .. code-block:: llvm | |
793 | |
794 define void @test() gc "statepoint-example" { | |
795 call void @do_safepoint() | |
796 call void @foo() | |
797 ret void | |
798 } | |
799 | |
800 In this case, we've added an (unconditional) entry safepoint poll. Note that | |
801 despite appearances, the entry poll is not necessarily redundant. We'd have to | |
802 know that ``foo`` and ``test`` were not mutually recursive for the poll to be | |
803 redundant. In practice, you'd probably want your poll definition to contain | |
804 a conditional branch of some form. | |
805 | |
806 At the moment, PlaceSafepoints can insert safepoint polls at method entry and | |
807 loop backedges locations. Extending this to work with return polls would be | |
808 straight forward if desired. | |
809 | |
810 PlaceSafepoints includes a number of optimizations to avoid placing safepoint | |
811 polls at particular sites unless needed to ensure timely execution of a poll | |
812 under normal conditions. PlaceSafepoints does not attempt to ensure timely | |
813 execution of a poll under worst case conditions such as heavy system paging. | |
814 | |
815 The implementation of a safepoint poll action is specified by looking up a | |
816 function of the name ``gc.safepoint_poll`` in the containing Module. The body | |
817 of this function is inserted at each poll site desired. While calls or invokes | |
818 inside this method are transformed to a ``gc.statepoints``, recursive poll | |
819 insertion is not performed. | |
820 | |
821 This pass is useful for any language frontend which only has to support | |
822 garbage collection semantics at safepoints. If you need other abstract | |
823 frame information at safepoints (e.g. for deoptimization or introspection), | |
824 you can insert safepoint polls in the frontend. If you have the latter case, | |
825 please ask on llvm-dev for suggestions. There's been a good amount of work | |
826 done on making such a scheme work well in practice which is not yet documented | |
827 here. | |
786 | 828 |
787 | 829 |
788 Supported Architectures | 830 Supported Architectures |
789 ======================= | 831 ======================= |
790 | 832 |
791 Support for statepoint generation requires some code for each backend. | 833 Support for statepoint generation requires some code for each backend. |
792 Today, only X86_64 is supported. | 834 Today, only X86_64 is supported. |
793 | 835 |
836 Problem Areas and Active Work | |
837 ============================= | |
838 | |
839 #. Support for languages which allow unmanaged pointers to garbage collected | |
840 objects (i.e. pass a pointer to an object to a C routine) via pinning. | |
841 | |
842 #. Support for garbage collected objects allocated on the stack. Specifically, | |
843 allocas are always assumed to be in address space 0 and we need a | |
844 cast/promotion operator to let rewriting identify them. | |
845 | |
846 #. The current statepoint lowering is known to be somewhat poor. In the very | |
847 long term, we'd like to integrate statepoints with the register allocator; | |
848 in the near term this is unlikely to happen. We've found the quality of | |
849 lowering to be relatively unimportant as hot-statepoints are almost always | |
850 inliner bugs. | |
851 | |
852 #. Concerns have been raised that the statepoint representation results in a | |
853 large amount of IR being produced for some examples and that this | |
854 contributes to higher than expected memory usage and compile times. There are | |
855 no immediate plans to make changes due to this, but alternate models may be | |
856 explored in the future. | |
857 | |
858 #. Relocations along exceptional paths are currently broken in ToT. In | |
859 particular, there is currently no way to represent a rethrow on a path which | |
860 also has relocations. See `this llvm-dev discussion | |
861 <https://groups.google.com/forum/#!topic/llvm-dev/AE417XjgxvI>`_ for more | |
862 detail. | |
863 | |
794 Bugs and Enhancements | 864 Bugs and Enhancements |
795 ===================== | 865 ===================== |
796 | 866 |
797 Currently known bugs and enhancements under consideration can be | 867 Currently known bugs and enhancements under consideration can be |
798 tracked by performing a `bugzilla search | 868 tracked by performing a `bugzilla search |
799 <http://llvm.org/bugs/buglist.cgi?cmdtype=runnamed&namedcmd=Statepoint%20Bugs&list_id=64342>`_ | 869 <https://bugs.llvm.org/buglist.cgi?cmdtype=runnamed&namedcmd=Statepoint%20Bugs&list_id=64342>`_ |
800 for [Statepoint] in the summary field. When filing new bugs, please | 870 for [Statepoint] in the summary field. When filing new bugs, please |
801 use this tag so that interested parties see the newly filed bug. As | 871 use this tag so that interested parties see the newly filed bug. As |
802 with most LLVM features, design discussions take place on `llvm-dev | 872 with most LLVM features, design discussions take place on `llvm-dev |
803 <http://lists.llvm.org/mailman/listinfo/llvm-dev>`_, and patches | 873 <http://lists.llvm.org/mailman/listinfo/llvm-dev>`_, and patches |
804 should be sent to `llvm-commits | 874 should be sent to `llvm-commits |