comparison docs/Statepoints.rst @ 122:36195a0db682

merging ( incomplete )
author Shinji KONO <kono@ie.u-ryukyu.ac.jp>
date Fri, 17 Nov 2017 20:32:31 +0900
parents 803732b1fca8
children c2174574ed3a
119:d9df2cbd60cd 122:36195a0db682
7 :depth: 2 7 :depth: 2
8 8
9 Status 9 Status
10 ======= 10 =======
11 11
12 This document describes a set of experimental extensions to LLVM. Use 12 This document describes a set of extensions to LLVM to support garbage
13 with caution. Because the intrinsics have experimental status, 13 collection. By now, these mechanisms are well proven with a commercial Java
14 compatibility across LLVM releases is not guaranteed. 14 implementation with a fully relocating collector having shipped using them.
15 15 There are a couple places where bugs might still linger; these are called out
16 LLVM currently supports an alternate mechanism for conservative 16 below.
17 garbage collection support using the ``gcroot`` intrinsic. The mechanism 17
18 described here shares little in common with the alternate ``gcroot`` 18 They are still listed as "experimental" to indicate that no forward or backward
19 implementation and it is hoped that this mechanism will eventually 19 compatibility guarantees are offered across versions. If your use case is such
20 replace the gc_root mechanism. 20 that you need some form of forward compatibility guarantee, please raise the
21 issue on the llvm-dev mailing list.
22
23 LLVM still supports an alternate mechanism for conservative garbage collection
24 support using the ``gcroot`` intrinsic. The ``gcroot`` mechanism is mostly of
25 historical interest at this point with one exception - its implementation of
26 shadow stacks has been used successfully by a number of language frontends and
27 is still supported.
21 28
22 Overview 29 Overview
23 ======== 30 ========
24 31
25 To collect dead objects, garbage collectors must be able to identify 32 To collect dead objects, garbage collectors must be able to identify
84 #. identify which object each pointer relates to, and 91 #. identify which object each pointer relates to, and
85 #. potentially update each of those copies. 92 #. potentially update each of those copies.
86 93
87 This document describes the mechanism by which an LLVM based compiler 94 This document describes the mechanism by which an LLVM based compiler
88 can provide this information to a language runtime/collector, and 95 can provide this information to a language runtime/collector, and
89 ensure that all pointers can be read and updated if desired. The 96 ensure that all pointers can be read and updated if desired.
90 heart of the approach is to construct (or rewrite) the IR in a manner 97
91 where the possible updates performed by the garbage collector are 98 At a high level, LLVM has been extended to support compiling to an abstract
99 machine which extends the actual target with a non-integral pointer type
100 suitable for representing a garbage collected reference to an object. In
101 particular, such non-integral pointer types have no defined mapping to an
102 integer representation. This semantic quirk allows the runtime to pick an
103 integer mapping for each point in the program allowing relocations of objects
104 without visible effects.
105
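As a minimal sketch of what this looks like in a module, the snippet below
declares address space 1 as non-integral through the ``ni`` component of the
data layout string. Using address space 1 is an assumption borrowed from the
``statepoint-example`` strategy described later, and ``@foo`` is a hypothetical
callee; a real frontend would use its own target data layout and functions.

.. code-block:: llvm

  ; "ni:1" marks address space 1 as non-integral (assumed GC address space).
  target datalayout = "e-ni:1"

  ; Hypothetical callee that may reach a safepoint.
  declare void @foo()

  define i8 addrspace(1)* @test(i8 addrspace(1)* %obj) gc "statepoint-example" {
    ; %obj is a garbage collected reference; the optimizer may not assume
    ; any particular integer value for it across this call.
    call void @foo()
    ret i8 addrspace(1)* %obj
  }

Keeping the reference in a dedicated address space is what lets later passes
tell collected references apart from ordinary pointers.
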
106 Warning: Non-Integral Pointer Types are a newly added concept in LLVM IR.
107 It's possible that we've missed disabling some of the optimizations which
108 assume an integral value for pointers. If you find such a case, please
109 file a bug or share a patch.
110
111 Warning: There is one currently known semantic hole in the definition of
112 non-integral pointers which has not been addressed upstream. To work around
113 this, you need to disable speculation of loads unless the memory type
114 (non-integral pointer vs anything else) is known to be unchanged. That is, it is
115 not safe to speculate a load if doing so causes a non-integral pointer value to
116 be loaded as any other type or vice versa. In practice, this restriction is
117 well isolated to isSafeToSpeculate in ValueTracking.cpp.
118
119 This high level abstract machine model is used for most of the LLVM optimizer.
120 Before starting code generation, we switch representations to an explicit form.
121 In theory, a frontend could directly generate this low level explicit form, but
122 doing so is likely to inhibit optimization.
123
124 The heart of the explicit approach is to construct (or rewrite) the IR in a
125 manner where the possible updates performed by the garbage collector are
92 explicitly visible in the IR. Doing so requires that we: 126 explicitly visible in the IR. Doing so requires that we:
93 127
94 #. create a new SSA value for each potentially relocated pointer, and 128 #. create a new SSA value for each potentially relocated pointer, and
95 ensure that no use of the original (non relocated) value is 129 ensure that no use of the original (non relocated) value is
96 reachable after the safepoint, 130 reachable after the safepoint,
102 associated with) for each statepoint. 136 associated with) for each statepoint.
103 137
104 At the most abstract level, inserting a safepoint can be thought of as 138 At the most abstract level, inserting a safepoint can be thought of as
105 replacing a call instruction with a call to a multiple return value 139 replacing a call instruction with a call to a multiple return value
106 function which both calls the original target of the call, returns 140 function which both calls the original target of the call, returns
107 it's result, and returns updated values for any live pointers to 141 its result, and returns updated values for any live pointers to
108 garbage collected objects. 142 garbage collected objects.
109 143
110 Note that the task of identifying all live pointers to garbage 144 Note that the task of identifying all live pointers to garbage
111 collected values, transforming the IR to expose a pointer giving the 145 collected values, transforming the IR to expose a pointer giving the
112 base object for every such live pointer, and inserting all the 146 base object for every such live pointer, and inserting all the
198 .byte 2 232 .byte 2
199 .byte 8 233 .byte 8
200 .short 7 234 .short 7
201 .long 0 235 .long 0
202 236
203 This example was taken from the tests for the :ref:`RewriteStatepointsForGC` utility pass. As such, it's full StackMap can be easily examined with the following command. 237 This example was taken from the tests for the :ref:`RewriteStatepointsForGC`
238 utility pass. As such, its full StackMap can be easily examined with the
239 following command.
204 240
205 .. code-block:: bash 241 .. code-block:: bash
206 242
207 opt -rewrite-statepoints-for-gc test/Transforms/RewriteStatepointsForGC/basics.ll -S | llc -debug-only=stackmaps 243 opt -rewrite-statepoints-for-gc test/Transforms/RewriteStatepointsForGC/basics.ll -S | llc -debug-only=stackmaps
208 244
260 296
261 As a practical consideration, many garbage-collected systems allow code that is 297 As a practical consideration, many garbage-collected systems allow code that is
262 collector-aware ("managed code") to call code that is not collector-aware 298 collector-aware ("managed code") to call code that is not collector-aware
263 ("unmanaged code"). It is common that such calls must also be safepoints, since 299 ("unmanaged code"). It is common that such calls must also be safepoints, since
264 it is desirable to allow the collector to run during the execution of 300 it is desirable to allow the collector to run during the execution of
265 unmanaged code. Futhermore, it is common that coordinating the transition from 301 unmanaged code. Furthermore, it is common that coordinating the transition from
266 managed to unmanaged code requires extra code generation at the call site to 302 managed to unmanaged code requires extra code generation at the call site to
267 inform the collector of the transition. In order to support these needs, a 303 inform the collector of the transition. In order to support these needs, a
268 statepoint may be marked as a GC transition, and data that is necessary to 304 statepoint may be marked as a GC transition, and data that is necessary to
269 perform the transition (if any) may be provided as additional arguments to the 305 perform the transition (if any) may be provided as additional arguments to the
270 statepoint. 306 statepoint.
534 570
535 Semantics: 571 Semantics:
536 """""""""" 572 """"""""""
537 573
538 The return value of ``gc.relocate`` is the potentially relocated value 574 The return value of ``gc.relocate`` is the potentially relocated value
539 of the pointer specified by it's arguments. It is unspecified how the 575 of the pointer specified by its arguments. It is unspecified how the
540 value of the returned pointer relates to the argument to the 576 value of the returned pointer relates to the argument to the
541 ``gc.statepoint`` other than that a) it points to the same source 577 ``gc.statepoint`` other than that a) it points to the same source
542 language object with the same offset, and b) the 'based-on' 578 language object with the same offset, and b) the 'based-on'
543 relationship of the newly relocated pointers is a projection of the 579 relationship of the newly relocated pointers is a projection of the
544 unrelocated pointers. In particular, the integer value of the pointer 580 unrelocated pointers. In particular, the integer value of the pointer
652 .. _RewriteStatepointsForGC: 688 .. _RewriteStatepointsForGC:
653 689
654 RewriteStatepointsForGC 690 RewriteStatepointsForGC
655 ^^^^^^^^^^^^^^^^^^^^^^^^ 691 ^^^^^^^^^^^^^^^^^^^^^^^^
656 692
657 The pass RewriteStatepointsForGC transforms a functions IR by replacing a 693 The pass RewriteStatepointsForGC transforms a function's IR to lower from the
658 ``gc.statepoint`` (with an optional ``gc.result``) with a full relocation 694 abstract machine model described above to the explicit statepoint model of
659 sequence, including all required ``gc.relocates``. To function, the pass 695 relocations. To do this, it replaces all calls or invokes of functions which
660 requires that the GC strategy specified for the function be able to reliably 696 might contain a safepoint poll with a ``gc.statepoint`` and associated full
661 distinguish between GC references and non-GC references in IR it is given. 697 relocation sequence, including all required ``gc.relocates``.
698
699 Note that by default, this pass only runs for the "statepoint-example" or
700 "core-clr" gc strategies. You will need to add your custom strategy to this
701 whitelist or use one of the predefined ones.
662 702
663 As an example, given this code: 703 As an example, given this code:
664 704
665 .. code-block:: llvm 705 .. code-block:: llvm
666 706
667 define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) 707 define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj)
668 gc "statepoint-example" { 708 gc "statepoint-example" {
669 call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 5, i32 0, i32 -1, i32 0, i32 0, i32 0) 709 call void @foo()
670 ret i8 addrspace(1)* %obj 710 ret i8 addrspace(1)* %obj
671 } 711 }
672 712
673 The pass would produce this IR: 713 The pass would produce this IR:
674 714
681 ret i8 addrspace(1)* %obj.relocated 721 ret i8 addrspace(1)* %obj.relocated
682 } 722 }
683 723
684 In the above examples, the addrspace(1) marker on the pointers is the mechanism 724 In the above examples, the addrspace(1) marker on the pointers is the mechanism
685 that the ``statepoint-example`` GC strategy uses to distinguish references from 725 that the ``statepoint-example`` GC strategy uses to distinguish references from
686 non references. Address space 1 is not globally reserved for this purpose. 726 non references. The pass assumes that all addrspace(1) pointers are non-integral
727 pointer types. Address space 1 is not globally reserved for this purpose.
687 728
688 This pass can be used as a utility function by a language frontend that doesn't 729 This pass can be used as a utility function by a language frontend that doesn't
689 want to manually reason about liveness, base pointers, or relocation when 730 want to manually reason about liveness, base pointers, or relocation when
690 constructing IR. As currently implemented, RewriteStatepointsForGC must be 731 constructing IR. As currently implemented, RewriteStatepointsForGC must be
691 run after SSA construction (i.e. mem2reg). 732 run after SSA construction (i.e. mem2reg).
699 as null) are also assumed to be base pointers. In practice, this constraint 740 as null) are also assumed to be base pointers. In practice, this constraint
700 can be relaxed to producing interior derived pointers provided the target 741 can be relaxed to producing interior derived pointers provided the target
701 collector can find the associated allocation from an arbitrary interior 742 collector can find the associated allocation from an arbitrary interior
702 derived pointer. 743 derived pointer.
703 744
704 In practice, RewriteStatepointsForGC can be run much later in the pass 745 By default RewriteStatepointsForGC passes in ``0xABCDEF00`` as the statepoint
705 pipeline, after most optimization is already done. This helps to improve
706 the quality of the generated code when compiled with garbage collection support.
707 In the long run, this is the intended usage model. At this time, a few details
708 have yet to be worked out about the semantic model required to guarantee this
709 is always correct. As such, please use with caution and report bugs.
710
711 .. _PlaceSafepoints:
712
713 PlaceSafepoints
714 ^^^^^^^^^^^^^^^^
715
716 The pass PlaceSafepoints transforms a function's IR by replacing any call or
717 invoke instructions with appropriate ``gc.statepoint`` and ``gc.result`` pairs,
718 and inserting safepoint polls sufficient to ensure running code checks for a
719 safepoint request in a timely manner. This pass is expected to be run before
720 RewriteStatepointsForGC and thus does not produce full relocation sequences.
721
722 As an example, given input IR of the following:
723
724 .. code-block:: llvm
725
726 define void @test() gc "statepoint-example" {
727 call void @foo()
728 ret void
729 }
730
731 declare void @do_safepoint()
732 define void @gc.safepoint_poll() {
733 call void @do_safepoint()
734 ret void
735 }
736
737
738 This pass would produce the following IR:
739
740 .. code-block:: llvm
741
742 define void @test() gc "statepoint-example" {
743 %safepoint_token = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @do_safepoint, i32 0, i32 0, i32 0, i32 0)
744 %safepoint_token1 = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0)
745 ret void
746 }
747
748 In this case, we've added an (unconditional) entry safepoint poll and converted the call into a ``gc.statepoint``. Note that despite appearances, the entry poll is not necessarily redundant. We'd have to know that ``foo`` and ``test`` were not mutually recursive for the poll to be redundant. In practice, you'd probably want your poll definition to contain a conditional branch of some form.
749
750
751 At the moment, PlaceSafepoints can insert safepoint polls at method entry and
752 loop backedge locations. Extending this to work with return polls would be
753 straightforward if desired.
754
755 PlaceSafepoints includes a number of optimizations to avoid placing safepoint
756 polls at particular sites unless needed to ensure timely execution of a poll
757 under normal conditions. PlaceSafepoints does not attempt to ensure timely
758 execution of a poll under worst case conditions such as heavy system paging.
759
760 The implementation of a safepoint poll action is specified by looking up a
761 function of the name ``gc.safepoint_poll`` in the containing Module. The body
762 of this function is inserted at each poll site desired. While calls or invokes
763 inside this method are transformed to ``gc.statepoints``, recursive poll
764 insertion is not performed.
765
766 By default PlaceSafepoints passes in ``0xABCDEF00`` as the statepoint
767 ID and ``0`` as the number of patchable bytes to the newly constructed 746 ID and ``0`` as the number of patchable bytes to the newly constructed
768 ``gc.statepoint``. These values can be configured on a per-callsite 747 ``gc.statepoint``. These values can be configured on a per-callsite
769 basis using the attributes ``"statepoint-id"`` and 748 basis using the attributes ``"statepoint-id"`` and
770 ``"statepoint-num-patch-bytes"``. If a call site is marked with a 749 ``"statepoint-num-patch-bytes"``. If a call site is marked with a
771 ``"statepoint-id"`` function attribute and its value is a positive 750 ``"statepoint-id"`` function attribute and its value is a positive
776 bytes' parameter of the newly constructed ``gc.statepoint``. The 755 bytes' parameter of the newly constructed ``gc.statepoint``. The
777 ``"statepoint-id"`` and ``"statepoint-num-patch-bytes"`` attributes 756 ``"statepoint-id"`` and ``"statepoint-num-patch-bytes"`` attributes
778 are not propagated to the ``gc.statepoint`` call or invoke if they 757 are not propagated to the ``gc.statepoint`` call or invoke if they
779 could be successfully parsed. 758 could be successfully parsed.
780 759
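A small sketch of the per-callsite configuration described above, using an
attribute group on the call. The ID value ``42`` and the callee ``@foo`` are
arbitrary placeholders chosen for illustration, not values taken from the
LLVM test suite.

.. code-block:: llvm

  ; Hypothetical callee that may reach a safepoint.
  declare void @foo()

  define i8 addrspace(1)* @test(i8 addrspace(1)* %obj) gc "statepoint-example" {
    ; Attribute group #0 carries the statepoint configuration for this call.
    call void @foo() #0
    ret i8 addrspace(1)* %obj
  }

  ; Both values are strings that the pass parses as integers.
  attributes #0 = { "statepoint-id"="42" "statepoint-num-patch-bytes"="0" }

With these attributes in place, the rewritten ``gc.statepoint`` would be
expected to carry the given ID and patch-byte count instead of the defaults.
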
781 If you are scheduling the RewriteStatepointsForGC pass late in the pass order, 760 In practice, RewriteStatepointsForGC should be run much later in the pass
782 you should probably schedule this pass immediately before it. The exception 761 pipeline, after most optimization is already done. This helps to improve
783 would be if you need to preserve abstract frame information (e.g. for 762 the quality of the generated code when compiled with garbage collection support.
784 deoptimization or introspection) at safepoints. In that case, ask on the 763
785 llvm-dev mailing list for suggestions. 764 .. _PlaceSafepoints:
765
766 PlaceSafepoints
767 ^^^^^^^^^^^^^^^^
768
769 The pass PlaceSafepoints inserts safepoint polls sufficient to ensure running
770 code checks for a safepoint request in a timely manner. This pass is expected
771 to be run before RewriteStatepointsForGC and thus does not produce full
772 relocation sequences.
773
774 As an example, given input IR of the following:
775
776 .. code-block:: llvm
777
778 define void @test() gc "statepoint-example" {
779 call void @foo()
780 ret void
781 }
782
783 declare void @do_safepoint()
784 define void @gc.safepoint_poll() {
785 call void @do_safepoint()
786 ret void
787 }
788
789
790 This pass would produce the following IR:
791
792 .. code-block:: llvm
793
794 define void @test() gc "statepoint-example" {
795 call void @do_safepoint()
796 call void @foo()
797 ret void
798 }
799
800 In this case, we've added an (unconditional) entry safepoint poll. Note that
801 despite appearances, the entry poll is not necessarily redundant. We'd have to
802 know that ``foo`` and ``test`` were not mutually recursive for the poll to be
803 redundant. In practice, you'd probably want your poll definition to contain
804 a conditional branch of some form.
805
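One way such a conditional poll could look is sketched below. The global
``@safepoint_requested`` is a hypothetical flag that the runtime would set
when it wants threads to stop; it is not part of LLVM, and a real runtime
might instead test a thread-local field or a guard page.

.. code-block:: llvm

  ; Hypothetical flag set by the runtime to request a safepoint.
  @safepoint_requested = external global i1

  declare void @do_safepoint()

  define void @gc.safepoint_poll() {
  entry:
    ; Fast path: a single load and branch when no safepoint is pending.
    %requested = load volatile i1, i1* @safepoint_requested
    br i1 %requested, label %slow, label %done

  slow:
    ; Slow path: call into the runtime only when a safepoint is requested.
    call void @do_safepoint()
    br label %done

  done:
    ret void
  }

Because the body of ``gc.safepoint_poll`` is inserted at each poll site,
keeping the fast path this small limits the cost of every poll.
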
806 At the moment, PlaceSafepoints can insert safepoint polls at method entry and
807 loop backedge locations. Extending this to work with return polls would be
808 straightforward if desired.
809
810 PlaceSafepoints includes a number of optimizations to avoid placing safepoint
811 polls at particular sites unless needed to ensure timely execution of a poll
812 under normal conditions. PlaceSafepoints does not attempt to ensure timely
813 execution of a poll under worst case conditions such as heavy system paging.
814
815 The implementation of a safepoint poll action is specified by looking up a
816 function of the name ``gc.safepoint_poll`` in the containing Module. The body
817 of this function is inserted at each poll site desired. While calls or invokes
818 inside this method are transformed to ``gc.statepoints``, recursive poll
819 insertion is not performed.
820
821 This pass is useful for any language frontend which only has to support
822 garbage collection semantics at safepoints. If you need other abstract
823 frame information at safepoints (e.g. for deoptimization or introspection),
824 you can insert safepoint polls in the frontend. If you have the latter case,
825 please ask on llvm-dev for suggestions. There's been a good amount of work
826 done on making such a scheme work well in practice which is not yet documented
827 here.
786 828
787 829
788 Supported Architectures 830 Supported Architectures
789 ======================= 831 =======================
790 832
791 Support for statepoint generation requires some code for each backend. 833 Support for statepoint generation requires some code for each backend.
792 Today, only X86_64 is supported. 834 Today, only X86_64 is supported.
793 835
836 Problem Areas and Active Work
837 =============================
838
839 #. Support for languages which allow unmanaged pointers to garbage collected
840 objects (i.e. pass a pointer to an object to a C routine) via pinning.
841
842 #. Support for garbage collected objects allocated on the stack. Specifically,
843 allocas are always assumed to be in address space 0 and we need a
844 cast/promotion operator to let rewriting identify them.
845
846 #. The current statepoint lowering is known to be somewhat poor. In the very
847 long term, we'd like to integrate statepoints with the register allocator;
848 in the near term this is unlikely to happen. We've found the quality of
849 lowering to be relatively unimportant as hot-statepoints are almost always
850 inliner bugs.
851
852 #. Concerns have been raised that the statepoint representation results in a
853 large amount of IR being produced for some examples and that this
854 contributes to higher than expected memory usage and compile times. There
855 are no immediate plans to make changes due to this, but alternate models may be
856 explored in the future.
857
858 #. Relocations along exceptional paths are currently broken in ToT. In
859 particular, there is currently no way to represent a rethrow on a path which
860 also has relocations. See `this llvm-dev discussion
861 <https://groups.google.com/forum/#!topic/llvm-dev/AE417XjgxvI>`_ for more
862 detail.
863
794 Bugs and Enhancements 864 Bugs and Enhancements
795 ===================== 865 =====================
796 866
797 Currently known bugs and enhancements under consideration can be 867 Currently known bugs and enhancements under consideration can be
798 tracked by performing a `bugzilla search 868 tracked by performing a `bugzilla search
799 <http://llvm.org/bugs/buglist.cgi?cmdtype=runnamed&namedcmd=Statepoint%20Bugs&list_id=64342>`_ 869 <https://bugs.llvm.org/buglist.cgi?cmdtype=runnamed&namedcmd=Statepoint%20Bugs&list_id=64342>`_
800 for [Statepoint] in the summary field. When filing new bugs, please 870 for [Statepoint] in the summary field. When filing new bugs, please
801 use this tag so that interested parties see the newly filed bug. As 871 use this tag so that interested parties see the newly filed bug. As
802 with most LLVM features, design discussions take place on `llvm-dev 872 with most LLVM features, design discussions take place on `llvm-dev
803 <http://lists.llvm.org/mailman/listinfo/llvm-dev>`_, and patches 873 <http://lists.llvm.org/mailman/listinfo/llvm-dev>`_, and patches
804 should be sent to `llvm-commits 874 should be sent to `llvm-commits