annotate flang/docs/FortranIR.md @ 223:5f17cb93ff66 llvm-original

LLVM13 (2021/7/18)
author Shinji KONO <kono@ie.u-ryukyu.ac.jp>
date Sun, 18 Jul 2021 22:43:00 +0900
parents 79ff65ed7e25
children c4bab56944e8
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
221
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
1 <!--===- docs/FortranIR.md
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
2
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
3 Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
4 See https://llvm.org/LICENSE.txt for license information.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
5 SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
6
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
7 -->
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
8
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
9 # Design: Fortran IR
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
10
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
11 ```eval_rst
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
12 .. contents::
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
13 :local:
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
14 ```
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
15
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
16 ## Introduction
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
17
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
18 After semantic analysis is complete and it has been determined that the compiler has a legal Fortran program as input, the parse tree will be lowered to an intermediate representation for the purposes of high-level analysis and optimization. In this document, that intermediate representation will be called Fortran IR or FIR. The pass that converts from the parse tree and other data structures of the front-end to FIR will be called the "Burnside bridge".
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
19
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
20 FIR will be an explicit, operational, and strongly-typed representation, which shall encapsulate control-flow as graphs.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
21
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
22 ## Requirements
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
23
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
24 ### White Paper: [Control Flow Graph](ControlFlowGraph.md)<sup>1</sup>
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
25
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
26 This is a list of requirements extracted from that document, which will be referred to as CFG-WP.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
27
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
28 1. Control flow to be explicit (e.g. ERR= specifiers)
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
29 2. May need critical edge splitting
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
30 3. Lowering of procedures with ENTRY statements is specified
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
31 4. Procedures will have a start node
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
32 5. Non-executable statements will be ignored
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
33 6. Labeled DO loop execution with GOTO specified
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
34 7. Operations and actions (statements) are defined
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
35 8. The last statement in a basic block can represent a change in control flow
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
36 9. Scope transitions to be made explicit (special actions)
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
37 10. The IR will be in SSA form
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
38
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
39 ### Explicit Control Flow
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
40
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
41 In Fortran, there are a number of statements that result in control flow to statements other than the one immediately subsequent. These can be sorted these into two categories: structured and unstructured.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
42
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
43 #### Structured Control Flow
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
44
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
45 Fortran has executable constructs that imply three basic control flow forms. The first form is a structured loop (DO construct)<sup>2</sup>. The second form is a structured cascade of conditional branches (IF construct, IF statement,<sup>3</sup> WHERE construct). The third form is a structured multiway branch (SELECT CASE, SELECT RANK, and SELECT TYPE constructs). The FORALL construct, while it implies a semantic model of interleaved iterations, can be modeled as a special single-entry single-exit region in FIR perhaps with backstage marker statements.<sup>4</sup>
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
46
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
47 The CYCLE and EXIT statements interact with the above structured executable constructs by providing structured transfers of control.<sup>5</sup> CYCLE (possibly named) is only valid in DO constructs and creates an alternate backedge in the enclosing loop. EXIT transfers control out of the enclosing (possibly named) construct, which need not be a DO construct.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
48
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
49 #### Unstructured Control Flow
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
50
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
51 Fortran also has mechanisms of transferring control between a statement and another statement with a corresponding label. The origin of these edges can be GOTO statements, computed GOTO statements, assigned GOTO statements, arithmetic IF statements, alt-return specifications, and END/EOR/ERR I/O specifiers. These statements are "unstructured" in the sense that the target of the control-flow has fewer constraints and the labelled statements must be linked to their origins.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
52
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
53 Another category of unstructured control flow are statements that terminate execution. These include RETURN, FAIL IMAGE, STOP and ERROR STOP statements. The PAUSE statement can be modeled as a call to the runtime.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
54
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
55 ### Operations
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
56
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
57 The compiler's to be determined optimization passes will inform us as to the exact composition of FIR at the operations level. This details here will necessarily change, so please read them with a grain of salt.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
58
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
59 The plan (see CFG-WP) is that statements (actions) will be a veneer model of Fortran syntactical executable constructs. Fortran statements will correspond one to one with actions. Actions will be composed of and own objects of Fortran::evaluate::GenericExprWrapper. Values of type GenericExprWrapper will have Fortran types. This implies that actions will not be in an explicit data flow representation and have optional type information.<sup>6</sup> Initially, values will bind to symbols in a context and have an implicit use-def relation. An action statement may entail a "big step" operation with many side-effects. No semantics has been defined at this time. Actions may reference other non-executable statements from the parse tree in some to be determined manner.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
60
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
61 From the CFG-WP, it is stated that the FIR will ultimately be in an SSA form. It is clear that a later pass can rewrite the values/expressions and construct a factored use-def version of the expressions. This may/should also involve expanding "big step" actions to a series of instructions and introducing typing information for all instructions. Again, the exact "lowered representation" will be informed from the requirements of the optimization passes and is presently to be determined.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
62
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
63 ### Other
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
64
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
65 Overall project goals include becoming part of the LLVM ecosystem as well as using LLVM as a backend.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
66
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
67 Critical edge splitting can be constructed on-demand and as needed.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
68
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
69 Lowering of procedures with ENTRY statements is specified. The plan is to lower procedures with ENTRY statements as specified in the CFG-WP.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
70
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
71 In FIR, a procedure will have a method that returns the start node.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
72
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
73 When lowering to FIR statements, non-executable statements will be discarded.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
74
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
75 Labeled DO loops are converted to non-labeled DO loops in the semantics processing.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
76
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
77 The last statement in a basic block can represent a change in control flow. LLVM-IR and SIL<sup>7</sup> require that basic blocks end with a terminator. FIR will also have terminators.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
78
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
79 The CFG-WP states that scope transitions are to be made explicit. We will cover this more below.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
80
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
81 LLVM does not require the FIR to be in SSA form. LLVM's mem-to-reg pass does the conversion into SSA form. FIR can support SSA for optimization passes on-demand with its own mem-to-reg and reg-to-mem type passes.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
82
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
83 Data objects with process lifetime will be captured indirectly by a reference to the (global) symbol table.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
84
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
85 ## Exploration
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
86
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
87 ### Construction
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
88
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
89 Our aim to construct a CFG where all control-flow is explicitly modeled by relations. A basic block will be a sequence of statements for which if the first statement is executed then all other statements in the basic block will also be executed, in order.<sup>8</sup> A CFG is therefore this set of basic blocks and the control-flow relations between those blocks.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
90
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
91 #### Alternative: direct approach
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
92
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
93 The CFG can be directly constructed by traversing the parse tree, threading contextual state, and building basic blocks along with control-flow relationships.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
94
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
95 * Pro: Straightforward implementation when control-flow is well-structured as the contextual state parallels the syntax of the language closely.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
96 * Con: The contextual state needed can become large and difficult to manage in the presence of unstructured control-flow. For example, not every labeled statement in Fortran may be a control-flow destination.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
97 * Con: The contextual state must deal with the recursive nature of the parse tree.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
98 * Con: Complexity. Since structured constructs cohabitate with unstructured constructs, the context needs to carry information about all combinations until the basic blocks and relations are fully elaborated.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
99
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
100 #### Alternative: linearized approach (decomposing the problem)
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
101
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
102 Instead of constructing the CFG directly from a parse tree traversal, an intermediate form can be constructed to explicitly capture the executable statements, which ones give rise to control-flow graph edge sources, and which are control-flow graph edge targets. This linearized form flattens the tree structure of the parse tree. The linearized form does not require recursive visitation of nested constructs and can be used to directly identify the entries and exits of basic blocks.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
103
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
104 While each control-flow source statement is explicit in the traversal, it can be the case that not all of the targets have been traversed yet (references to forward basic blocks), and those basic blocks will not yet have been created. These relations can be captured at the time the source is traversed, added to a to do list, and then completed when all the basic blocks for the procedure have been created. Specifically, at the point when we create a terminator all information is known to create the FIR terminator, however all basic blocks that may be referenced may not have been created. Those are resolved in one final "clean up" pass over a list of closures.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
105
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
106 * Con: An extra representation must be defined and constructed.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
107 * Pro: This representation reifies all the information that is referred to as contextual state in the direct approach.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
108 * Pro: Constructing the linearized form can be done with a simple traversal of the parse tree.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
109 * Pro: Once composed the linearized form can be traversed and a CFG directly constructed. This greatly reduces bookkeeping of contextual state.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
110
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
111 ### Details
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
112
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
113 #### Grappling with Control Flow
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
114
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
115 Above, various Fortran executable constructs were discussed with respect to how they (may) give rise to control flow. These Fortran statements are mapped to a small number of FIR statements: ReturnStmt, BranchStmt, SwitchStmt, IndirectBrStmt, and UnreachableStmt.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
116
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
117 _ReturnStmt_: execution leaves the enclosing Procedure. A ReturnStmt can return an optional value. This would appear for RETURN statements or at END SUBROUTINE.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
118
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
119 _BranchStmt_: execution of the current basic block ends. If the branch is unconditional then control transfers to exactly one successor basic block. If the branch is conditional then control transfers to exactly one of two successor blocks depending on the true/false value of the condition. All successors must be in the current Procedure. Unconditional branches would appear for GOTO statements. Conditional branches would appear for IF constructs, IF statements, etc.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
120
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
121 _SwitchStmt_: Exactly one of multiple successors is selected based on the control expression. Successors are pairs of case expressions and basic blocks. If the control expression compares to the case expression and returns true, then that control transfers to that block. There may be one special block, the default block, that is selected if none of the case expressions compares true. This would appear for SELECT CASE, SELECT TYPE, SELECT RANK, COMPUTED GOTO, WRITE with exceptional condition label specificers, alternate return specifiers, etc.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
122
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
123 _IndirectBrStmt_: A variable is loaded with the address of a basic block in the containing Procedure. Control is transferred to the contents of this variable. An IndirectBrStmt also requires a complete list of potential basic blocks that may be loaded into the variable. This would appear for ASSIGNED GOTO.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
124
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
125 Supporting ASSIGNED GOTO offers a little extra challenge as the ASSIGN GOTO statement's list of target labels is optional. If that list is not present, then the procedure must be analyzed to find ASSIGN statements. The implementation proactively looks for ASSIGN statements and keeps a dictionary mapping an assigned Symbol to its set of targets. When constructing the CFG, ASSIGNED GOTOs can be processed as to potential targets either from the list provided in the ASSIGNED GOTO or from the analysis pass.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
126
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
127 Alternatively, ASSIGNED GOTO could be implemented as a _SwitchStmt_ that tests on a compiler-defined value and fully elaborates all potential target basic blocks.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
128
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
129 _UnreachableStmt_: If control reaches an unreachable statement, then an error has occurred. Calls to library routines that do not return should be followed by an UnreachableStmt. An example would be the STOP statement.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
130
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
131 #### Scope
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
132
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
133 In the CFG-WP, scopes are meant to be captured by a pair of backstage statements for entering and exiting a particular scope. In structured code, these pairs would not be problematic; however, control flow in Fortran is ad hoc, particularly in legacy Fortran. In short, Fortran does not have a clean sense of structure with respect to scope.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
134
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
135 To separate concerns, FIR will construct the ad hoc CFG and impose bounding boxes over regions of that graph to demarcate and superimpose scope structures on that CFG. Any GOTO-like statements that are side-entries and side-exits to the region will be explicit.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
136
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
137 Once the basic blocks are constructed, CFG edges defined, and the CFG is simplified, a simple pass that analyzes the region bounding boxes can decorate the basic blocks with the SCOPE ENTER and SCOPE EXIT statements and flatten/remove the region structure. It will then be the burden of any optimization passes to guarantee legal orderings of SCOPE ENTER and SCOPE EXIT pairs.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
138
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
139 * Pro: Separation of concerns allows for simpler, easier to maintain code
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
140 * Pro: Simplification of the CFG can be done without worrying about SCOPE markers
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
141 * Pro: Allows a precise superimposing of all Fortran constructs with scoping considerations over an otherwise ad hoc CFG.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
142 * Con: Adds "an extra layer" to FIR as compared to SIL. However, that can be mitigated/made inconsequential by a pass that flattens the Region tree and inserts the backstage SCOPE marker statements.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
143
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
144 #### Structure
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
145
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
146 _Program_: A program instance is the top-level object that contains the representation of all the code being compiled, the compilation unit. It contains a list of procedures and a reference to the global symbol table.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
147
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
148 _Procedure_: This is a named Fortran procedure (subroutine or function). It contains a (hierarchical) list of regions. It also owns the master list of all basic blocks for the procedure.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
149
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
150 _Region_: A region is owned by a procedure or by another region. A region owns a reference to a scope in the symbol table tree. The list of delineated basic blocks can also be requested from a region.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
151
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
152 _Basic block_: A basic block is owned by a procedure. A basic block owns a list of statements. The last statement in the list must be a terminator, and no other statement in the list can be a terminator. A basic block owns a list of its predecessors, which are also basic blocks. (Precisely, it is this level of FIR that is the CFG.)
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
153
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
154 _Statement_: An executable Fortran construct that owns/refers to expressions, symbols, scopes, etc. produced by the front-end.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
155
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
156 _Terminator_: A statement that orchestrates control-flow. Terminator statements may reference other basic blocks and can be accessed by their parent basic block to discover successor blocks, if any.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
157
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
158 #### Support
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
159
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
160 Since there is some state that needs to be maintained and forwarded as the FIR is constructed, a FIRBuilder can be used for convenience. The FIRBuilder constructs statements and updates the CFG accordingly.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
161
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
162 To support visualization, there is a support class to dump the FIR to a dotty graph.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
163
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
164 ### Data Structures
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
165
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
166 FIR is intentionally similar to SIL from the statement level up to the level of a program.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
167
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
168 #### Alternative: LLVM
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
169
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
170 Program, procedure, region, and basic block all leverage code from LLVM, in much the same way as SIL. These data structures have significant investment and engineering behind their use in compilers, and it makes sense to leverage that work.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
171
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
172 * Pro: Uses LLVM data structures, pervasive in compiler projects such as LLVM, SIL, etc.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
173 * Pro: Get used to seeing and using LLVM, as f18 aims to be an LLVM project
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
174 * Con: Uses LLVM data structures, which the project has been avoiding
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
175
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
176 #### Alternative: C++ Standard Template Library
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
177
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
178 Clearly, the STL can be used to maintain lists, etc.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
179
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
180 * Pro: Keeps the number of libraries minimal
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
181 * Con: The STL is general purpose and not necessarily tuned to support compiler construction
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
182
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
183 #### Alternative: Boost, Library XYZ, etc.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
184
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
185 * Con: Don't see a strong motivation at present for adding another library.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
186
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
187 Statements are a bit of a transition point. Instead of the LLVM-IR approach of strictly using subtype polymorphism (for hash consing, etc.), FIR statements are a hybrid between ad hoc/subtype polymorphism and parametric polymorphism. This gives us a middle ground of genericity through superclassing and the strong and exact type-safety of algebraic data types &mdash; effectively providing type casing and type classing.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
188
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
189 The operations (expressions) owned/referenced by a statement, variable references, etc. will be data structures from the Fortran::evaluate, Fortran::semantics, etc. namespaces.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
190
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
191
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
192
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
193 <hr>
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
194
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
195 <sup>1</sup> CFG paper. https://bit.ly/2q9IRaQ
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
196
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
197 <sup>2</sup> All labeled DO sequences will have been translated to DO constructs by semantic analysis.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
198
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
199 <sup>3</sup> IF statements are handled like IF constructs with no ELSE alternatives.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
200
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
201 <sup>4</sup> In a subsequent discussion, we may want to lower FORALL constructs to semantically distinct loops or even another canonical representation.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
202
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
203 <sup>5</sup> These statements are only valid in structured constructs and the branches are well-defined by that executable construct.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
204
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
205 <sup>6</sup> Unlike SIL and LLVM-IR.
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
206
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
207 <sup>7</sup> SIL is the Swift (high-level) intermediate language. https://bit.ly/2RHW0DQ
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
208
79ff65ed7e25 LLVM12 Original
Shinji KONO <kono@ie.u-ryukyu.ac.jp>
parents:
diff changeset
209 <sup>8</sup> Single-threaded semantics.