<!--===- documentation/FortranForCProgrammers.md

   Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
   See https://llvm.org/LICENSE.txt for license information.
   SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

-->

Fortran For C Programmers
=========================

This note is limited to essential information about Fortran so that
a C or C++ programmer can get started more quickly with the language,
at least as a reader, and avoid some common pitfalls when starting
to write or modify Fortran code.
Please see other sources to learn about Fortran's rich history,
current applications, and modern best practices in new code.

Know This At Least
------------------
* There have been many implementations of Fortran, often from competing
  vendors, and the standard language has been defined by U.S. and
  international standards organizations.  The various editions of
  the standard are known as the '66, '77, '90, '95, 2003, 2008, and
  (now) 2018 standards.
* Forward compatibility is important.  Fortran has outlasted many
  generations of computer systems hardware and software.  Standard
  compliance notwithstanding, Fortran programmers generally expect that
  code that has compiled successfully in the past will continue to
  compile and work indefinitely.  The standards sometimes designate
  features as being deprecated, obsolescent, or even deleted, but that
  can be read only as discouraging their use in new code -- they'll
  probably always work in any serious implementation.
* Fortran has two source forms, which are typically distinguished by
  filename suffixes.  `foo.f` is old-style "fixed-form" source, and
  `foo.f90` is new-style "free-form" source.  All language features
  are available in both source forms.  Neither form has reserved words
  in the sense that C does.  Spaces are not required between tokens
  in fixed form, and case is not significant in either form.
* Variable declarations are optional by default.  Variables whose
  names begin with the letters `I` through `N` are implicitly
  `INTEGER`, and others are implicitly `REAL`.  These implicit typing
  rules can be changed in the source.
* Fortran uses parentheses in both array references and function calls.
  All arrays must be declared as such; other names followed by parenthesized
  expressions are assumed to be function calls.
* Fortran has a _lot_ of built-in "intrinsic" functions.  They are always
  available without a need to declare or import them.  Their names reflect
  the implicit typing rules, so you will encounter names that have been
  modified so that they have the right type (e.g., `AIMAG` has a leading `A`
  so that it's `REAL` rather than `INTEGER`).
* The modern language has means for declaring types, data, and subprogram
  interfaces in compiled "modules", as well as legacy mechanisms for
  sharing data and interconnecting subprograms.

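The implicit typing rule in the list above is the one that most often surprises newcomers, so here is a minimal sketch of it in free-form source. The module and function names (`implicit_typing_demo`, `halve`, `half`, `strict_half`) are hypothetical and chosen only for illustration. The last function shows `IMPLICIT NONE`, the usual way to change the implicit typing rules so that every name must be declared.

```fortran
module implicit_typing_demo
  ! Deliberately no IMPLICIT NONE at module level, so the default
  ! implicit typing rules apply to the first two functions.
contains
  function halve(n)      ! HALVE starts with H: REAL result.  N: INTEGER.
    halve = n / 2        ! integer division happens first, so 7/2 is 3
  end function halve

  function half(x)       ! HALF and X are both implicitly REAL
    half = x / 2
  end function half

  function strict_half(x) result(h)
    implicit none        ! every name in this function must now be declared
    real, intent(in) :: x
    real :: h
    h = x / 2
  end function strict_half
end module implicit_typing_demo
```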
A Rosetta Stone
---------------
Fortran's language standard and other documentation use some terminology
in particular ways that might be unfamiliar.

| Fortran | English |
| ------- | ------- |
| Association | Making a name refer to something else |
| Assumed | Some attribute of an argument or interface that is not known until a call is made |
| Companion processor | A C compiler |
| Component | Class member |
| Deferred | Some attribute of a variable that is not known until an allocation or assignment |
| Derived type | C++ class |
| Dummy argument | C++ reference argument |
| Final procedure | C++ destructor |
| Generic | Overloaded function, resolved by actual arguments |
| Host procedure | The subprogram that contains a nested one |
| Implied DO | There's a loop inside a statement |
| Interface | Prototype |
| Internal I/O | `sscanf` and `snprintf` |
| Intrinsic | Built-in type or function |
| Polymorphic | Dynamically typed |
| Processor | Fortran compiler |
| Rank | Number of dimensions that an array has |
| `SAVE` attribute | Statically allocated |
| Type-bound procedure | Kind of a C++ member function but not really |
| Unformatted | Raw binary |

Data Types
----------
There are five built-in ("intrinsic") types: `INTEGER`, `REAL`, `COMPLEX`,
`LOGICAL`, and `CHARACTER`.
They are parameterized with "kind" values, which should be treated as
non-portable integer codes, although in practice today these are the
byte sizes of the data.
(For `COMPLEX`, the kind type parameter value is the byte size of one of the
two `REAL` components, or half of the total size.)
The legacy `DOUBLE PRECISION` intrinsic type is an alias for a kind of `REAL`
that has greater precision than the default `REAL`.

`COMPLEX` is a simple structure that comprises two `REAL` components.

`CHARACTER` data also have length, which may or may not be known at compilation
time.
`CHARACTER` variables are fixed-length strings and they get padded out
with space characters when not completely assigned.

User-defined ("derived") data types can be synthesized from the intrinsic
types and from previously-defined user types, much like a C `struct`.
Derived types can be parameterized with integer values that either have
to be constant at compilation time ("kind" parameters) or deferred to
execution ("len" parameters).

Derived types can inherit ("extend") from at most one other derived type.
They can have user-defined destructors (`FINAL` procedures).
They can specify default initial values for their components.
With some work, one can also specify a general constructor function,
since Fortran allows a generic interface to have the same name as that
of a derived type.

Last, there are "typeless" binary constants (BOZ literals such as `Z'FF'`)
that can be used in a few situations, like static data initialization or
immediate conversion, where type is not necessary.

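As a rough illustration of kinds, `CHARACTER` padding, and derived types with default component values, consider the sketch below. The names `data_types_demo`, `dp`, `point`, and `padded` are hypothetical. Asking `SELECTED_REAL_KIND` for a precision is one common way to avoid hard-coding a kind number; it is shown here as an option, not as the only approach.

```fortran
module data_types_demo
  implicit none
  ! Request a REAL kind by precision rather than by a magic number.
  integer, parameter :: dp = selected_real_kind(15)  ! >= 15 decimal digits

  type :: point                  ! a derived type, much like a C struct
    real(kind=dp) :: x = 0.0_dp  ! default initial values for components
    real(kind=dp) :: y = 0.0_dp
  end type point
contains
  function padded() result(name)
    character(len=8) :: name
    name = 'abc'                 ! stored as 'abc' followed by five spaces
  end function padded
end module data_types_demo
```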
Arrays
------
Arrays are not types in Fortran.
Being an array is a property of an object or function, not of a type.
Unlike C, one cannot have an array of arrays or an array of pointers,
although one can have an array of a derived type that has arrays or
pointers as components.
Arrays are multidimensional, and the number of dimensions is called
the _rank_ of the array.
In storage, arrays are stored such that the last subscript has the
largest stride in memory, e.g. `A(1,1)` is followed by `A(2,1)`, not `A(1,2)`.
And yes, the default lower bound on each dimension is 1, not 0.

Expressions can manipulate arrays as multidimensional values, and
the compiler will create the necessary loops.

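The storage order and whole-array expressions described above can be sketched as follows. The module and function names are hypothetical. `storage_order` fills a 2x3 array so that each element records its own subscripts, then flattens it in storage order; `scaled_sum` uses an array expression in place of explicit loops.

```fortran
module arrays_demo
  implicit none
contains
  ! Returns the elements of a 2x3 array in the order they sit in memory.
  function storage_order() result(flat)
    integer :: flat(6)
    integer :: a(2,3), i, j
    do j = 1, 3
      do i = 1, 2
        a(i,j) = 10*i + j        ! a(2,3) holds 23, and so on
      end do
    end do
    flat = reshape(a, [6])       ! the first subscript varies fastest
  end function storage_order

  ! An array expression: the compiler generates the loops.
  function scaled_sum(x, y) result(z)
    real, intent(in) :: x(:,:), y(:,:)
    real :: z(size(x,1), size(x,2))
    z = 2.0*x + y                ! elementwise over both dimensions
  end function scaled_sum
end module arrays_demo
```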
Allocatables
------------
Modern Fortran programs use `ALLOCATABLE` data extensively.
Such variables and derived type components are allocated dynamically.
They are automatically deallocated when they go out of scope, much
like C++'s `std::vector<>` class template instances are.
The array bounds, derived type `LEN` parameters, and even the
type of an allocatable can all be deferred to run time.
(If you really want to learn all about modern Fortran, I suggest
that you study everything that can be done with `ALLOCATABLE` data,
and follow up all the references that are made in the documentation
from the description of `ALLOCATABLE` to other topics; it's a feature
that interacts with much of the rest of the language.)

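A small sketch of an allocatable array with deferred bounds follows. The names `allocatable_demo` and `build` are hypothetical. It assumes a compiler that implements the Fortran 2003 rule that assignment to an allocatable array reallocates it when the shape changes, which is the default in current compilers as far as I know.

```fortran
module allocatable_demo
  implicit none
contains
  subroutine build(n, total, length)
    integer, intent(in)  :: n
    integer, intent(out) :: total, length
    integer, allocatable :: v(:)   ! rank is fixed; bounds are deferred
    integer :: i
    allocate(v(n))                 ! bounds chosen at run time
    v = [(i*i, i = 1, n)]          ! implied DO fills v with squares
    v = [v, 0]                     ! assignment reallocates v to n+1 elements
    total = sum(v)
    length = size(v)
    ! No DEALLOCATE is needed: v is freed automatically on return.
  end subroutine build
end module allocatable_demo
```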
I/O
---
Fortran's input/output features are built into the syntax of the language,
rather than being defined by library interfaces as in C and C++.
There are means for raw binary I/O and for "formatted" transfers to
character representations.
There are means for random-access I/O using fixed-size records as well as for
sequential I/O.
One can scan data from or format data into `CHARACTER` variables via
"internal" formatted I/O.
I/O from and to files uses a scheme of integer "unit" numbers that is
similar to the open file descriptors of UNIX; i.e., one opens a file
and assigns it a unit number, then uses that unit number in subsequent
`READ` and `WRITE` statements.

Formatted I/O relies on format specifications to map values to fields of
characters, similar to the format strings used with C's `printf` family
of standard library functions.
These format specifications can appear in `FORMAT` statements and
be referenced by their labels, in character literals directly in I/O
statements, or in character variables.

One can also use compiler-generated formatting in "list-directed" I/O,
in which the compiler derives reasonable default formats based on
data types.

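Internal I/O is probably the easiest of these features to try without touching any files. The sketch below uses hypothetical names (`internal_io_demo`, `format_int`, `parse_int`). It formats an integer into a `CHARACTER` variable with an explicit format given as a character literal, and scans one back with list-directed input. Returning 0 for unparseable text is an arbitrary choice made for this example.

```fortran
module internal_io_demo
  implicit none
contains
  ! Roughly snprintf(text, 12, "%d", n)
  function format_int(n) result(text)
    integer, intent(in) :: n
    character(len=12) :: text
    write(text, '(I0)') n             ! format given as a character literal
  end function format_int

  ! Roughly sscanf(text, "%d", &n)
  function parse_int(text) result(n)
    character(len=*), intent(in) :: text
    integer :: n, status
    read(text, *, iostat=status) n    ! list-directed: format chosen for us
    if (status /= 0) n = 0            ! fallback chosen for this sketch only
  end function parse_int
end module internal_io_demo
```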
Subprograms
-----------
Fortran has both `FUNCTION` and `SUBROUTINE` subprograms.
They share the same name space, but functions cannot be called as
subroutines or vice versa.
Subroutines are called with the `CALL` statement, while functions are
invoked with function references in expressions.

There is one level of subprogram nesting.
A function, subroutine, or main program can have functions and subroutines
nested within it, but these "internal" procedures cannot themselves have
their own internal procedures.
As is the case with C++ lambda expressions, internal procedures can
reference names from their host subprograms.

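To make the distinction between the two kinds of subprogram concrete, here is a sketch with hypothetical names. `sum_scaled` is a function that hosts one internal subroutine and one internal function; both read or update variables of the host (`total` and `factor`) directly.

```fortran
module subprogram_demo
  implicit none
contains
  function sum_scaled(values, factor) result(total)
    real, intent(in) :: values(:), factor
    real :: total
    integer :: i
    total = 0.0
    do i = 1, size(values)
      call accumulate(values(i))       ! subroutines need CALL
    end do
  contains
    subroutine accumulate(v)           ! internal procedure
      real, intent(in) :: v
      total = total + times_factor(v)  ! functions appear in expressions
    end subroutine accumulate

    function times_factor(v) result(p)
      real, intent(in) :: v
      real :: p
      p = v * factor                   ! FACTOR comes from the host
    end function times_factor
  end function sum_scaled
end module subprogram_demo
```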
Modules
-------
Modern Fortran has good support for separate compilation and namespace
management.
The *module* is the basic unit of compilation, although independent
subprograms still exist, of course, as well as the main program.
Modules define types, constants, interfaces, and nested
subprograms.

Objects from a module are made available for use in other compilation
units via the `USE` statement, which has options for limiting the objects
that are made available as well as for renaming them.
All references to objects in modules are done with direct names or
aliases that have been added to the local scope, as Fortran has no means
of qualifying references with module names.

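The `USE` options mentioned above look roughly like this. Both modules and all of their contents (`shapes_demo`, `client_demo`, `area`, `circle_area`, `two_circles`) are hypothetical. `ONLY` limits what is imported, and `=>` gives the imported name a local alias.

```fortran
module shapes_demo
  implicit none
  real, parameter :: pi = 3.14159265
contains
  function area(r) result(a)
    real, intent(in) :: r
    real :: a
    a = pi * r**2
  end function area
end module shapes_demo

module client_demo
  ! Import a single name and rename it locally; PI is not imported.
  use shapes_demo, only: circle_area => area
  implicit none
contains
  function two_circles(r) result(a)
    real, intent(in) :: r
    real :: a
    a = 2.0 * circle_area(r)   ! there is no "shapes_demo::area" spelling
  end function two_circles
end module client_demo
```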
Arguments
---------
Functions and subroutines have "dummy" arguments that are dynamically
associated with actual arguments during calls.
Essentially, all argument passing in Fortran is by reference, not value.
One may restrict access to argument data by declaring that dummy
arguments have `INTENT(IN)`, but that corresponds to the use of
a `const` reference in C++ and does not imply that the data are
copied; use `VALUE` for that.

When it is not possible to pass a reference to an object, or a sparse
regular array section of an object, as an actual argument, Fortran
compilers must allocate temporary space to hold the actual argument
across the call.
This is always guaranteed to happen when an actual argument is enclosed
in parentheses.

The compiler is free to assume that any aliasing between dummy arguments
and other data is harmless.
In other words, if some object can be written to under one name, it's
never going to be read or written using some other name in that same
scope.
```
  SUBROUTINE FOO(X,Y,Z)
  X = 3.14159
  Y = 2.71828   ! Y IS ASSUMED NOT TO ALIAS X
  Z = 2 * X     ! CAN BE FOLDED AT COMPILE TIME
  END
```
This is the opposite of the assumptions under which a C or C++ compiler must
labor when trying to optimize code with pointers.

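The difference between the default by-reference association, `INTENT(IN)`, and `VALUE` can be sketched as below. The procedure names are hypothetical. Only `bump_copy` works on a private copy, so only it leaves the caller's variable unchanged after modifying its dummy argument.

```fortran
module argument_demo
  implicit none
contains
  subroutine bump_ref(n)
    integer, intent(inout) :: n   ! associated with the caller's variable
    n = n + 1                     ! visible to the caller
  end subroutine bump_ref

  subroutine bump_copy(n)
    integer, value :: n           ! a copy, as with a C by-value parameter
    n = n + 1                     ! invisible to the caller
  end subroutine bump_copy

  function peek(n) result(r)
    integer, intent(in) :: n      ! read-only, like const T&; not a copy
    integer :: r
    r = n                         ! assigning to N here would be rejected
  end function peek
end module argument_demo
```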
Overloading
-----------
Fortran supports a form of overloading via its interface feature.
By default, an interface is a means for specifying prototypes for a
set of subroutines and functions.
But when an interface is named, that name becomes a *generic* name
for its specific subprograms, and calls via the generic name are
mapped at compile time to one of the specific subprograms based
on the types, kinds, and ranks of the actual arguments.
A similar feature can be used for generic type-bound procedures.

This feature can be used to overload the built-in operators and some
I/O statements, too.

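A named interface in its simplest form might look like the sketch below. The generic name `describe` and the two specific functions are hypothetical. The call `describe(1)` resolves to `describe_int` and `describe(1.0)` to `describe_real`, both at compile time.

```fortran
module overload_demo
  implicit none
  interface describe               ! the generic name
    module procedure describe_int, describe_real
  end interface describe
contains
  function describe_int(x) result(s)
    integer, intent(in) :: x       ! used only to select this specific
    character(len=7) :: s
    s = 'integer'
  end function describe_int

  function describe_real(x) result(s)
    real, intent(in) :: x          ! used only to select this specific
    character(len=7) :: s
    s = 'real'                     ! padded with blanks to length 7
  end function describe_real
end module overload_demo
```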
Polymorphism
------------
Fortran code can be written to accept data of some derived type or
any extension thereof using `CLASS`, deferring the actual type to
execution, rather than the usual `TYPE` syntax.
This is somewhat similar to the use of `virtual` functions in C++.

Fortran's `SELECT TYPE` construct is used to distinguish between
possible specific types dynamically, when necessary.  It's a
little like C++17's `std::visit()` on a discriminated union.

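Here is a sketch of `CLASS` and `SELECT TYPE` working together. The type names `figure`, `circle`, and `square` and the function `area` are hypothetical. The dummy argument is declared with `CLASS(figure)`, so it accepts `figure` or any type that extends it, and the dynamic type is inspected at run time.

```fortran
module polymorphism_demo
  implicit none
  type :: figure                    ! base type with no components
  end type figure
  type, extends(figure) :: circle
    real :: radius = 1.0
  end type circle
  type, extends(figure) :: square
    real :: side = 1.0
  end type square
contains
  function area(f) result(a)
    class(figure), intent(in) :: f  ! FIGURE or any extension of it
    real :: a
    select type (f)
    type is (circle)
      a = 3.14159265 * f%radius**2  ! F is known to be a CIRCLE here
    type is (square)
      a = f%side**2
    class default
      a = 0.0                       ! any other extension of FIGURE
    end select
  end function area
end module polymorphism_demo
```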
Pointers
--------
Pointers are objects in Fortran, not data types.
Pointers can point to data, arrays, and subprograms.
A pointer can only point to data that has the `TARGET` attribute.
Outside of the pointer assignment statement (`P=>X`) and some intrinsic
functions and cases with pointer dummy arguments, pointers are implicitly
dereferenced, and the use of their name is a reference to the data to which
they point instead.

Unlike C, a pointer cannot point to a pointer *per se*, nor can it be
used to implement a level of indirection to the management structure of
an allocatable.
If you assign to a Fortran pointer to make it point at another pointer,
you are making the pointer point to the data (if any) to which the other
pointer points.
Similarly, if you assign to a Fortran pointer to make it point to an allocatable,
you are making the pointer point to the current content of the allocatable,
not to the metadata that manages the allocatable.

Unlike allocatables, pointers do not deallocate their data when they go
out of scope.

A legacy feature, "Cray pointers", implements dynamic base addressing of
one variable using an address stored in another.

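The pointer rules above, in particular the absence of pointers to pointers, can be sketched as follows. The names are hypothetical. After `q => p`, `q` designates `x` itself, so retargeting `p` afterwards has no effect on `q`.

```fortran
module pointer_demo
  implicit none
contains
  function demo() result(ok)
    logical :: ok
    integer, target  :: x, y   ! only TARGET data can be pointed at
    integer, pointer :: p, q
    x = 1
    y = 2
    p => x                     ! pointer assignment: P designates X
    q => p                     ! Q designates X (P's target), not P itself
    p => y                     ! retargeting P leaves Q alone
    q = 10                     ! ordinary assignment dereferences: X = 10
    ok = x == 10 .and. y == 2 .and. associated(q, x) .and. associated(p, y)
  end function demo
end module pointer_demo
```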
Preprocessing
-------------
There is no standard preprocessing feature, but every real Fortran implementation
has some support for passing Fortran source code through a variant of
the standard C source preprocessor.
Since Fortran is very different from C at the lexical level (e.g., line
continuations, Hollerith literals, no reserved words, fixed form), using
a stock modern C preprocessor on Fortran source can be difficult.
Preprocessing behavior varies across implementations and one should not depend on
much portability.
Preprocessing is typically requested by the use of a capitalized filename
suffix (e.g., "foo.F90") or a compiler command line option.
(Since the F18 compiler always runs its built-in preprocessing stage,
no special option or filename suffix is required.)

"Object Oriented" Programming
-----------------------------
Fortran doesn't have member functions (or subroutines) in the sense
that C++ does, in which a function has immediate access to the members
of a specific instance of a derived type.
But Fortran does have an analog to C++'s `this` via *type-bound
procedures*.
This is a means of binding a particular subprogram name to a derived
type, possibly with aliasing, in such a way that the subprogram can
be called as if it were a component of the type (e.g., `X%F(Y)`)
and receive the object to the left of the `%` as an additional actual argument,
exactly as if the call had been written `F(X,Y)`.
The object is passed as the first argument by default, but that can be
changed; indeed, the same specific subprogram can be used for multiple
type-bound procedures by choosing different dummy arguments to serve as
the passed object.
The equivalent of a `static` member function is also available by saying
that no argument is to be associated with the object via `NOPASS`.

There's a lot more that can be said about type-bound procedures (e.g., how they
support overloading) but this should be enough to get you started with
the most common usage.

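A minimal sketch of type-bound procedures follows, with a hypothetical `counter` type. `add` receives the object as its first dummy argument, which must be declared with `CLASS`, while `limit` is bound with `NOPASS` and so receives no object at all. With a variable `c` of type `counter`, the calls `call c%add(5)` and `call add(c, 5)` are equivalent.

```fortran
module counter_demo
  implicit none
  type :: counter
    integer :: total = 0
  contains
    procedure :: add             ! object is passed as the first argument
    procedure, nopass :: limit   ! like a C++ static member function
  end type counter
contains
  subroutine add(self, n)
    class(counter), intent(inout) :: self   ! the passed object
    integer, intent(in) :: n
    self%total = min(self%total + n, limit())
  end subroutine add

  function limit() result(m)
    integer :: m
    m = 100                      ! arbitrary cap chosen for this sketch
  end function limit
end module counter_demo
```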
Pitfalls
--------
Variable initializers, e.g. `INTEGER :: J=123`, are _static_ initializers!
They imply that the variable is stored in static storage, not on the stack,
and the initialized value lasts only until the variable is assigned.
One must use an assignment statement to implement a dynamic initializer
that will apply to every fresh instance of the variable.
Be especially careful when using initializers in the newish `BLOCK` construct,
which perpetuates the interpretation as static data.
(Derived type component initializers, however, do work as expected.)

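The static initializer pitfall is easy to demonstrate. In this sketch (hypothetical names), `calls_static` looks like it resets its counter on every call, as the equivalent C code would, but the initializer runs only once; `calls_fresh` uses an assignment statement and behaves the way a C programmer would expect.

```fortran
module static_init_demo
  implicit none
contains
  function calls_static() result(n)
    integer :: n
    integer :: counter = 0   ! static initializer: applied once, implies SAVE
    counter = counter + 1    ! the value persists across calls
    n = counter
  end function calls_static

  function calls_fresh() result(n)
    integer :: n
    integer :: counter
    counter = 0              ! assignment: executed on every call
    counter = counter + 1
    n = counter
  end function calls_fresh
end module static_init_demo
```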
If you see an assignment to an array that's never been declared as such,
it's probably a definition of a *statement function*, which is like
a parameterized macro definition, e.g. `A(X)=SQRT(X)**3`.
In the original Fortran language, this was the only means for user
function definitions.
Today, of course, one should use an external or internal function instead.

Fortran expressions don't bind exactly like C's do.
Watch out for exponentiation with `**`, which of course C lacks; it
binds more tightly than negation does (e.g., `-2**2` is -4),
and it binds to the right, unlike what any other Fortran and most
C operators do; e.g., `2**2**3` is 256, not 64.
Logical values must be compared with special logical equivalence
relations (`.EQV.` and `.NEQV.`) rather than the usual equality
operators.

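These binding rules can be checked with named constants, as in the sketch below (all names are hypothetical). The comments give the implicit grouping for each expression.

```fortran
module operator_demo
  implicit none
  integer, parameter :: neg_square = -2**2    ! parsed as -(2**2), i.e. -4
  integer, parameter :: tower      = 2**2**3  ! parsed as 2**(2**3), i.e. 256
  integer, parameter :: forced     = (2**2)**3  ! parentheses give 64
contains
  function same(a, b) result(r)
    logical, intent(in) :: a, b
    logical :: r
    r = a .eqv. b          ! standard Fortran has no == for LOGICAL operands
  end function same
end module operator_demo
```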
A Fortran compiler is allowed to short-circuit expression evaluation,
but not required to do so.
If one needs to protect a use of an `OPTIONAL` argument or possibly
disassociated pointer, use an `IF` statement, not a logical `.AND.`
operation.
In fact, Fortran can remove function calls from expressions if their
values are not required to determine the value of the expression's
result; e.g., if there is a `PRINT` statement in function `F`, it
may or may not be executed by the assignment statement `X=0*F()`.
(Well, it probably will be, in practice, but compilers always reserve
the right to optimize better.)

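A sketch of the recommended guard for an `OPTIONAL` argument follows; the function `scaled` and its arguments are hypothetical. The nested `IF` guarantees that `factor` is referenced only when it is present, which a single `.AND.` expression would not.

```fortran
module optional_demo
  implicit none
contains
  function scaled(x, factor) result(y)
    real, intent(in) :: x
    real, intent(in), optional :: factor
    real :: y
    y = x
    ! Not safe: IF (PRESENT(factor) .AND. factor /= 0.0) ...
    ! because both operands of .AND. may be evaluated.
    if (present(factor)) then
      if (factor /= 0.0) y = x * factor   ! FACTOR is known to be present
    end if
  end function scaled
end module optional_demo
```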