# Conversion to the LLVM Dialect

Conversion from the Standard dialect to the [LLVM Dialect](Dialects/LLVM.md) can
be performed by the specialized dialect conversion pass by running:

```shell
mlir-opt -convert-std-to-llvm <filename.mlir>
```

It performs type and operation conversions for a subset of operations from the
Standard dialect (operations on scalars and vectors, control flow operations) as
described in this document. We use the terminology defined by the
[LLVM IR Dialect description](Dialects/LLVM.md) throughout this document.

[TOC]
## Type Conversion

### Scalar Types

Scalar types are converted to their LLVM counterparts if they exist. The
following conversions are currently implemented:

-   `i*` converts to `!llvm.i*`
-   `f16` converts to `!llvm.half`
-   `f32` converts to `!llvm.float`
-   `f64` converts to `!llvm.double`

Note: the `bf16` type is not supported by LLVM IR and cannot be converted.
### Index Type

The index type is converted to a wrapped LLVM IR integer whose bitwidth equals
the pointer size specified by the
[data layout](https://llvm.org/docs/LangRef.html#data-layout) of the LLVM module
[contained](Dialects/LLVM.md#context-and-module-association) in the LLVM Dialect
object. For example, on x86-64 CPUs it converts to `!llvm.i64`.
### Vector Types

LLVM IR only supports *one-dimensional* vectors, unlike MLIR where vectors can
be multi-dimensional. Vector types cannot be nested in either IR. In the
one-dimensional case, MLIR vectors are converted to LLVM IR vectors of the same
size with the element type converted using these conversion rules. In the
n-dimensional case, MLIR vectors are converted to (n-1)-dimensional array types
of one-dimensional vectors.

For example, `vector<4 x f32>` converts to `!llvm<"<4 x float>">` and
`vector<4 x 8 x 16 x f32>` converts to `!llvm<"[4 x [8 x <16 x float>]]">`.
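
The nesting rule above can be sketched in a few lines of C++. The helper
`convertVectorType` is hypothetical (it is not part of MLIR) and works on the
textual form of LLVM IR types only; it assumes the element type has already been
converted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not an MLIR API: builds the textual LLVM IR type for an
// MLIR vector with the given shape and an already-converted element type.
std::string convertVectorType(const std::vector<std::int64_t> &shape,
                              const std::string &llvmElementType) {
  assert(!shape.empty() && "a vector has at least one dimension");
  // The innermost dimension becomes a one-dimensional LLVM IR vector.
  std::string result =
      "<" + std::to_string(shape.back()) + " x " + llvmElementType + ">";
  // Every remaining dimension, innermost first, wraps the result in an array.
  for (std::size_t i = shape.size() - 1; i-- > 0;)
    result = "[" + std::to_string(shape[i]) + " x " + result + "]";
  return result;
}
```

Calling `convertVectorType({4, 8, 16}, "float")` yields the string
`[4 x [8 x <16 x float>]]`, matching the second example above.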
### Memref Types

Memref types in MLIR have both static and dynamic information associated with
them. The dynamic information comprises the buffer pointer as well as the sizes
and strides of any dynamically-sized dimensions. Memref types are normalized and
converted to a descriptor that depends only on the rank of the memref. The
descriptor contains:

1.  the pointer to the data buffer, followed by
2.  the pointer to the properly aligned data payload that the memref indexes,
    followed by
3.  a lowered `index`-type integer containing the distance between the beginning
    of the buffer and the first element to be accessed through the memref,
    followed by
4.  an array containing as many `index`-type integers as the rank of the memref:
    the array represents the size, in number of elements, of the memref along
    the given dimension, followed by
5.  a second array containing as many `index`-type integers as the rank of the
    memref: the second array represents the "stride" (in the tensor abstraction
    sense), i.e. the number of consecutive elements of the underlying buffer
    one needs to step over to reach the next logically indexed element along
    the given dimension.

For constant memref dimensions, the corresponding size entry is a constant whose
runtime value must match the static value. This normalization serves as an ABI
for the memref type to interoperate with externally linked functions. In the
particular case of rank `0` memrefs, the size and stride arrays are omitted,
resulting in a struct containing two pointers and the offset.

Examples:

```mlir
memref<f32> -> !llvm<"{ float*, float*, i64 }">
memref<1 x f32> -> !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
memref<? x f32> -> !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
memref<10x42x42x43x123 x f32> -> !llvm<"{ float*, float*, i64, [5 x i64], [5 x i64] }">
memref<10x?x42x?x123 x f32> -> !llvm<"{ float*, float*, i64, [5 x i64], [5 x i64] }">

// Memref types can have vectors as element types
memref<1x? x vector<4xf32>> -> !llvm<"{ <4 x float>*, <4 x float>*, i64, [1 x i64], [1 x i64] }">
```
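
To make the five descriptor fields concrete, here is a minimal C++ sketch that
fills a descriptor for a densely packed, row-major buffer. `RankedDescriptor`
and `makeContiguousDescriptor` are hypothetical names introduced for
illustration; the struct mirrors the field order listed above but is not claimed
to be ABI-exact (in particular for rank 0), and the sketch assumes 64-bit
`index` values and an already aligned buffer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical C++ mirror of the ranked descriptor, in the field order listed
// above. Not an MLIR API.
template <typename T, std::size_t N>
struct RankedDescriptor {
  T *allocated;                         // 1. pointer to the data buffer
  T *aligned;                           // 2. pointer to the aligned payload
  std::int64_t offset;                  // 3. distance to the first element
  std::array<std::int64_t, N> sizes;    // 4. size of each dimension
  std::array<std::int64_t, N> strides;  // 5. stride of each dimension
};

// Builds a descriptor for a densely packed, row-major buffer: the innermost
// dimension has stride 1 and each outer stride is the product of the sizes of
// all dimensions nested inside it.
template <typename T, std::size_t N>
RankedDescriptor<T, N> makeContiguousDescriptor(
    T *buffer, const std::array<std::int64_t, N> &sizes) {
  RankedDescriptor<T, N> descriptor;
  descriptor.allocated = buffer;
  descriptor.aligned = buffer;  // assumption: the buffer is already aligned
  descriptor.offset = 0;
  descriptor.sizes = sizes;
  std::int64_t runningStride = 1;
  for (std::size_t i = N; i-- > 0;) {
    assert(sizes[i] >= 0 && "sizes are non-negative");
    descriptor.strides[i] = runningStride;
    runningStride *= sizes[i];
  }
  return descriptor;
}
```

For a buffer viewed as `memref<2x3xf32>`, this produces sizes `[2, 3]` and
strides `[3, 1]`.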
If the rank of the memref is unknown at compile time, the memref is converted to
an unranked descriptor that contains:

1.  a 64-bit integer representing the dynamic rank of the memref, followed by
2.  a pointer to a ranked memref descriptor with the contents listed above.

Unranked memrefs should be used only to pass arguments to external library
calls that expect a unified memref type. The called functions can parse any
unranked memref descriptor by reading the rank and parsing the enclosed ranked
descriptor pointer.

Example:

```mlir
// unranked descriptor
memref<*xf32> -> !llvm<"{i64, i8*}">
```
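
As an illustration of how an external library function might parse an unranked
descriptor, the following C++ sketch reads the rank and then walks the size
array of the enclosed ranked descriptor. `UnrankedDescriptor` and `numElements`
are hypothetical names; the byte offsets assume that the ranked descriptor is
laid out as two pointers, a 64-bit offset and two arrays of 64-bit integers
without padding.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical C++ mirror of the unranked descriptor `{i64, i8*}`.
struct UnrankedDescriptor {
  std::int64_t rank;  // dynamic rank of the memref
  void *descriptor;   // type-erased pointer to the ranked descriptor
};

// Returns the number of elements addressed by the memref, for any rank, by
// multiplying the entries of the size array of the ranked descriptor.
std::int64_t numElements(const UnrankedDescriptor &unranked) {
  assert(unranked.rank >= 0 && "rank is non-negative");
  const char *base = static_cast<const char *>(unranked.descriptor);
  // Skip the allocated pointer, the aligned pointer and the offset.
  const char *sizes = base + 2 * sizeof(void *) + sizeof(std::int64_t);
  std::int64_t count = 1;
  for (std::int64_t i = 0; i < unranked.rank; ++i) {
    std::int64_t size;
    std::memcpy(&size, sizes + i * sizeof(std::int64_t), sizeof(size));
    count *= size;
  }
  return count;
}
```

Because the element type only affects the pointee type of the first two fields,
the same function works for memrefs of any element type under the stated layout
assumption.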
**In function signatures,** `memref` is passed as a _pointer_ to the structure
defined above to comply with the calling convention.

Example:

```mlir
// A function type with memref as argument
(memref<?xf32>) -> ()
// is transformed into the LLVM function with pointer-to-structure argument.
!llvm<"void({ float*, float*, i64, [1 x i64], [1 x i64]}*)">
```
### Function Types

Function types get converted to LLVM function types. The arguments are converted
individually according to these rules. The result types need to accommodate the
fact that LLVM IR functions always have a return type, which may be the `void`
type. The converted function always has a single result type. If the original
function type had no results, the converted function will have one result of the
wrapped `void` type. If the original function type had one result, the converted
function will also have one result converted using these rules. Otherwise, the
result type will be a wrapped LLVM IR structure type where each element of the
structure corresponds to one of the results of the original function, converted
using these rules. In higher-order functions, function-typed arguments and
results are converted to a wrapped LLVM IR function pointer type (since LLVM IR
does not allow passing functions to functions without indirection) with the
pointee type converted using these rules.

Examples:

```mlir
// zero-ary function type with no results.
() -> ()
// is converted to a zero-ary function with `void` result
!llvm<"void ()">

// unary function with one result
(i32) -> (i64)
// has its argument and result type converted, before creating the LLVM IR function type
!llvm<"i64 (i32)">

// binary function with one result
(i32, f32) -> (i64)
// has its arguments handled separately
!llvm<"i64 (i32, float)">

// binary function with two results
(i32, f32) -> (i64, f64)
// has its result aggregated into a structure type
!llvm<"{i64, double} (i32, float)">

// function-typed arguments or results in higher-order functions
(() -> ()) -> (() -> ())
// are converted into pointers to functions
!llvm<"void ()* (void ()*)">
```
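
The result-type rule (no results give `void`, one result is kept, several
results are aggregated into a structure) can be sketched as string manipulation
in C++. `convertFunctionType` and `joinTypes` are hypothetical helpers, not MLIR
APIs, and they assume that every argument and result type has already been
converted to its LLVM IR spelling.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Joins type names with ", ". Hypothetical helper.
std::string joinTypes(const std::vector<std::string> &types) {
  std::string joined;
  for (std::size_t i = 0; i < types.size(); ++i) {
    assert(!types[i].empty() && "type names are non-empty");
    if (i != 0)
      joined += ", ";
    joined += types[i];
  }
  return joined;
}

// Hypothetical helper: builds the textual LLVM IR function type from
// already-converted argument and result types.
std::string convertFunctionType(const std::vector<std::string> &arguments,
                                const std::vector<std::string> &results) {
  std::string resultType;
  if (results.empty())
    resultType = "void";  // no results: wrapped `void` type
  else if (results.size() == 1)
    resultType = results.front();  // one result: kept as is
  else
    resultType = "{" + joinTypes(results) + "}";  // several: structure type
  return resultType + " (" + joinTypes(arguments) + ")";
}
```

For instance, `convertFunctionType({"i32", "float"}, {"i64", "double"})` returns
`{i64, double} (i32, float)`, which is the type shown for the two-result example
above.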
## Calling Convention

### Function Signature Conversion

LLVM IR functions are defined by a custom operation. The function itself has a
wrapped LLVM IR function type converted as described above. The function
definition operation uses MLIR syntax.

Examples:

```mlir
// zero-ary function type with no results.
func @foo() -> ()
// gets LLVM type void().
llvm.func @foo() -> ()

// function with one result
func @bar(i32) -> (i64)
// gets converted to LLVM type i64(i32).
llvm.func @bar(!llvm.i32) -> !llvm.i64

// function with two results
func @qux(i32, f32) -> (i64, f64)
// has its result aggregated into a structure type
llvm.func @qux(!llvm.i32, !llvm.float) -> !llvm<"{i64, double}">

// function-typed arguments or results in higher-order functions
func @quux(() -> ()) -> (() -> ())
// are converted into pointers to functions
llvm.func @quux(!llvm<"void ()*">) -> !llvm<"void ()*">
// the call flow is handled by the LLVM dialect `call` operation supporting both
// direct and indirect calls
```
### Result Packing

In case of multi-result functions, the returned values are inserted into a
structure-typed value before being returned and extracted from it at the call
site. This transformation is a part of the conversion and is transparent to the
definitions and uses of the values being returned.

Example:

```mlir
func @foo(%arg0: i32, %arg1: i64) -> (i32, i64) {
  return %arg0, %arg1 : i32, i64
}
func @bar() {
  %0 = constant 42 : i32
  %1 = constant 17 : i64
  %2:2 = call @foo(%0, %1) : (i32, i64) -> (i32, i64)
  "use_i32"(%2#0) : (i32) -> ()
  "use_i64"(%2#1) : (i64) -> ()
  return
}

// is transformed into

llvm.func @foo(%arg0: !llvm.i32, %arg1: !llvm.i64) -> !llvm<"{i32, i64}"> {
  // insert the values into a structure
  %0 = llvm.mlir.undef : !llvm<"{i32, i64}">
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm<"{i32, i64}">
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm<"{i32, i64}">

  // return the structure value
  llvm.return %2 : !llvm<"{i32, i64}">
}
llvm.func @bar() {
  %0 = llvm.mlir.constant(42 : i32) : !llvm.i32
  %1 = llvm.mlir.constant(17 : i64) : !llvm.i64

  // call and extract the values from the structure
  %2 = llvm.call @foo(%0, %1) : (!llvm.i32, !llvm.i64) -> !llvm<"{i32, i64}">
  %3 = llvm.extractvalue %2[0] : !llvm<"{i32, i64}">
  %4 = llvm.extractvalue %2[1] : !llvm<"{i32, i64}">

  // use as before
  "use_i32"(%3) : (!llvm.i32) -> ()
  "use_i64"(%4) : (!llvm.i64) -> ()
  llvm.return
}
```
### Calling Convention for `memref`

Function _arguments_ of `memref` type, ranked or unranked, are _expanded_ into a
list of arguments of non-aggregate types that the memref descriptor defined
above comprises. That is, the outer struct type and the inner array types are
replaced with individual arguments.

This convention is implemented in the conversion of `std.func` and `std.call` to
the LLVM dialect. The former packs the individual argument values back into a
descriptor on function entry, so as to make it transparently usable by other
operations, and the latter unpacks the descriptor into a set of individual
values at the call site. Conversions from other dialects should take this
convention into account.

This specific convention is motivated by the necessity to specify alignment and
aliasing attributes on the raw pointers underpinning the memref.

Examples:

```mlir
func @foo(%arg0: memref<?xf32>) -> () {
  "use"(%arg0) : (memref<?xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @foo(%arg0: !llvm<"float*">,   // Allocated pointer.
               %arg1: !llvm<"float*">,   // Aligned pointer.
               %arg2: !llvm.i64,         // Offset.
               %arg3: !llvm.i64,         // Size in dim 0.
               %arg4: !llvm.i64) {       // Stride in dim 0.
  // Populate memref descriptor structure.
  %0 = llvm.mlir.undef : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %5 = llvm.insertvalue %arg4, %4[4, 0] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">

  // Descriptor is now usable as a single value.
  "use"(%5) : (!llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">) -> ()
  llvm.return
}
```
```mlir
func @bar() {
  %0 = "get"() : () -> (memref<?xf32>)
  call @foo(%0) : (memref<?xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @bar() {
  %0 = "get"() : () -> !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">

  // Unpack the memref descriptor.
  %1 = llvm.extractvalue %0[0] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %2 = llvm.extractvalue %0[1] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %3 = llvm.extractvalue %0[2] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %4 = llvm.extractvalue %0[3, 0] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %5 = llvm.extractvalue %0[4, 0] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">

  // Pass individual values to the callee.
  llvm.call @foo(%1, %2, %3, %4, %5) : (!llvm<"float*">, !llvm<"float*">, !llvm.i64, !llvm.i64, !llvm.i64) -> ()
  llvm.return
}
```
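
The same pack and unpack pattern, transcribed to C++ for a rank-1 memref of
`f32`, may help when writing conversions for other dialects. `Descriptor1D`,
`fooExpanded` and `callFoo` are hypothetical names that only mimic the roles of
the converted `@foo` and `@bar` above; the callee returns the packed descriptor
purely so that the round trip can be inspected.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical C++ mirror of the descriptor of `memref<?xf32>`.
struct Descriptor1D {
  float *allocated;
  float *aligned;
  std::int64_t offset;
  std::int64_t sizes[1];
  std::int64_t strides[1];
};

// Callee side (role of the converted `@foo`): receives the expanded values and
// packs them into a descriptor that the body can use as a single value.
Descriptor1D fooExpanded(float *allocated, float *aligned, std::int64_t offset,
                         std::int64_t size0, std::int64_t stride0) {
  assert(size0 >= 0 && "sizes are non-negative");
  Descriptor1D descriptor;
  descriptor.allocated = allocated;
  descriptor.aligned = aligned;
  descriptor.offset = offset;
  descriptor.sizes[0] = size0;
  descriptor.strides[0] = stride0;
  return descriptor;
}

// Caller side (role of the converted `@bar`): unpacks the descriptor and
// passes the individual values to the callee.
Descriptor1D callFoo(const Descriptor1D &descriptor) {
  return fooExpanded(descriptor.allocated, descriptor.aligned,
                     descriptor.offset, descriptor.sizes[0],
                     descriptor.strides[0]);
}
```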
For **unranked** memrefs, the list of function arguments always contains two
elements, the same as the unranked memref descriptor: an integer rank and a
type-erased (`!llvm<"i8*">`) pointer to the ranked memref descriptor. Note that
while the _calling convention_ does not require stack allocation, _casting_ to
unranked memref does since one cannot take the address of an SSA value
containing the ranked memref. The caller is in charge of ensuring thread safety
and of eventually removing unnecessary stack allocations in cast operations.

Examples:

```mlir
func @foo(%arg0: memref<*xf32>) -> () {
  "use"(%arg0) : (memref<*xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @foo(%arg0: !llvm.i64,        // Rank.
               %arg1: !llvm<"i8*">) {   // Type-erased pointer to descriptor.
  // Pack the unranked memref descriptor.
  %0 = llvm.mlir.undef : !llvm<"{ i64, i8* }">
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm<"{ i64, i8* }">
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm<"{ i64, i8* }">

  "use"(%2) : (!llvm<"{ i64, i8* }">) -> ()
  llvm.return
}
```
```mlir
func @bar() {
  %0 = "get"() : () -> (memref<*xf32>)
  call @foo(%0): (memref<*xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @bar() {
  %0 = "get"() : () -> (!llvm<"{ i64, i8* }">)

  // Unpack the memref descriptor.
  %1 = llvm.extractvalue %0[0] : !llvm<"{ i64, i8* }">
  %2 = llvm.extractvalue %0[1] : !llvm<"{ i64, i8* }">

  // Pass individual values to the callee.
  llvm.call @foo(%1, %2) : (!llvm.i64, !llvm<"i8*">) -> ()
  llvm.return
}
```

*This convention may or may not apply if the conversion of MemRef types is
overridden by the user.*
### C-compatible wrapper emission

In practical cases, it may be desirable to have externally-facing functions with
a single argument corresponding to each MemRef argument. When interfacing with
LLVM IR produced from C, the code needs to respect the corresponding calling
convention. The conversion to the LLVM dialect provides an option to generate
wrapper functions that take memref descriptors as pointers-to-struct compatible
with data types produced by Clang when compiling C sources. The generation of
such wrapper functions can additionally be controlled at a function granularity
by setting the `llvm.emit_c_interface` unit attribute.

More specifically, a memref argument is converted into a pointer-to-struct
argument of type `{T*, T*, i64, [N x i64], [N x i64]}*` in the wrapper function,
where `T` is the converted element type and `N` is the memref rank. This type is
compatible with that produced by Clang for the following C++ structure template
instantiations or their equivalents in C.

```cpp
#include <cstddef>  // size_t
#include <cstdint>  // intptr_t

template<typename T, size_t N>
struct MemRefDescriptor {
  T *allocated;
  T *aligned;
  intptr_t offset;
  intptr_t sizes[N];
  intptr_t strides[N];
};
```
If enabled, the option will do the following. For _external_ functions declared
in the MLIR module:

1.  Declare a new function `_mlir_ciface_<original name>` where memref arguments
    are converted to pointer-to-struct and the remaining arguments are converted
    as usual.
1.  Add a body to the original function (making it non-external) that
    1.  allocates a memref descriptor,
    1.  populates it, and
    1.  passes the pointer to it into the newly declared interface function,
        then
    1.  collects the result of the call and returns it to the caller.

For (non-external) functions defined in the MLIR module:

1.  Define a new function `_mlir_ciface_<original name>` where memref arguments
    are converted to pointer-to-struct and the remaining arguments are converted
    as usual.
1.  Populate the body of the newly defined function with IR that
    1.  loads descriptors from pointers;
    1.  unpacks descriptors into individual non-aggregate values;
    1.  passes these values into the original function;
    1.  collects the result of the call and returns it to the caller.

Examples:

```mlir

func @qux(%arg0: memref<?x?xf32>)

// Gets converted into the following.

// Function with unpacked arguments.
llvm.func @qux(%arg0: !llvm<"float*">, %arg1: !llvm<"float*">, %arg2: !llvm.i64,
               %arg3: !llvm.i64, %arg4: !llvm.i64, %arg5: !llvm.i64,
               %arg6: !llvm.i64) {
  // Populate memref descriptor (as per calling convention).
  %0 = llvm.mlir.undef : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %5 = llvm.insertvalue %arg5, %4[4, 0] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %6 = llvm.insertvalue %arg4, %5[3, 1] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">

  // Store the descriptor in a stack-allocated space.
  %8 = llvm.mlir.constant(1 : index) : !llvm.i64
  %9 = llvm.alloca %8 x !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
       : (!llvm.i64) -> !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">
  llvm.store %7, %9 : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">

  // Call the interface function.
  llvm.call @_mlir_ciface_qux(%9) : (!llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">) -> ()

  // The stored descriptor will be freed on return.
  llvm.return
}

// Interface function.
llvm.func @_mlir_ciface_qux(!llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">)
```
```mlir
func @foo(%arg0: memref<?x?xf32>) {
  return
}

// Gets converted into the following.

// Function with unpacked arguments.
llvm.func @foo(%arg0: !llvm<"float*">, %arg1: !llvm<"float*">, %arg2: !llvm.i64,
               %arg3: !llvm.i64, %arg4: !llvm.i64, %arg5: !llvm.i64,
               %arg6: !llvm.i64) {
  llvm.return
}

// Interface function callable from C.
llvm.func @_mlir_ciface_foo(%arg0: !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">) {
  // Load the descriptor.
  %0 = llvm.load %arg0 : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">

  // Unpack the descriptor as per calling convention.
  %1 = llvm.extractvalue %0[0] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %2 = llvm.extractvalue %0[1] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %3 = llvm.extractvalue %0[2] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %4 = llvm.extractvalue %0[3, 0] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %5 = llvm.extractvalue %0[3, 1] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %6 = llvm.extractvalue %0[4, 0] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %7 = llvm.extractvalue %0[4, 1] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  llvm.call @foo(%1, %2, %3, %4, %5, %6, %7)
    : (!llvm<"float*">, !llvm<"float*">, !llvm.i64, !llvm.i64, !llvm.i64,
       !llvm.i64, !llvm.i64) -> ()
  llvm.return
}
```
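
On the C++ side, the external `@qux` from the first example could be provided by
defining the interface function with the descriptor template shown earlier. The
following is a sketch under assumptions: the body (doubling every element in
place) is purely illustrative, and the element addressing follows the index
linearization described later in this document.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same layout as the structure template shown earlier in this section.
template <typename T, std::size_t N>
struct MemRefDescriptor {
  T *allocated;
  T *aligned;
  std::intptr_t offset;
  std::intptr_t sizes[N];
  std::intptr_t strides[N];
};

// Illustrative implementation of the interface function for
// `func @qux(%arg0: memref<?x?xf32>)`: doubles every element in place.
extern "C" void _mlir_ciface_qux(MemRefDescriptor<float, 2> *memref) {
  assert(memref != nullptr && "expected a valid descriptor pointer");
  float *data = memref->aligned + memref->offset;
  for (std::intptr_t i = 0; i < memref->sizes[0]; ++i)
    for (std::intptr_t j = 0; j < memref->sizes[1]; ++j)
      data[i * memref->strides[0] + j * memref->strides[1]] *= 2.0f;
}
```

The symbol uses C linkage so that its name matches the
`_mlir_ciface_<original name>` declaration emitted by the conversion.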
Rationale: Introducing auxiliary functions for C-compatible interfaces is
preferred to modifying the calling convention since it minimizes the effect of C
compatibility on intra-module calls or calls between MLIR-generated functions.
In particular, when calling external functions from an MLIR module in a
(parallel) loop, storing a memref descriptor on the stack can lead to stack
exhaustion and/or concurrent access to the same address. The auxiliary interface
function serves as an allocation scope in this case. Furthermore, when targeting
accelerators with separate memory spaces such as GPUs, stack-allocated
descriptors passed by pointer would have to be transferred to the device memory,
which introduces significant overhead. In such situations, auxiliary interface
functions are executed on the host and only pass the values through the device
function invocation mechanism.
## Repeated Successor Removal

Since the goal of the LLVM IR dialect is to reflect LLVM IR in MLIR, the dialect
and the conversion procedure must account for the differences between block
arguments and LLVM IR PHI nodes. In particular, LLVM IR disallows PHI nodes with
different values coming from the same source. Therefore, the LLVM IR dialect
disallows operations that have identical successors accepting arguments, which
would lead to invalid PHI nodes. The conversion process resolves the potential
PHI source ambiguity by injecting dummy blocks if the same block is used more
than once as a successor in an operation. These dummy blocks branch
unconditionally to the original successors, pass them the original operands
(available in the dummy block because it is dominated by the original block) and
are used instead of them in the original terminator operation.

Example:

```mlir
  cond_br %0, ^bb1(%1 : i32), ^bb1(%2 : i32)
^bb1(%3 : i32):
  "use"(%3) : (i32) -> ()
```

leads to a new basic block being inserted,

```mlir
  cond_br %0, ^bb1(%1 : i32), ^dummy
^bb1(%3 : i32):
  "use"(%3) : (i32) -> ()
^dummy:
  br ^bb1(%2 : i32)
```

before the conversion to the LLVM IR dialect:

```mlir
  llvm.cond_br %0, ^bb1(%1 : !llvm.i32), ^dummy
^bb1(%3 : !llvm.i32):
  "use"(%3) : (!llvm.i32) -> ()
^dummy:
  llvm.br ^bb1(%2 : !llvm.i32)
```
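
The rewrite can be sketched on a toy control-flow graph in C++. The `Edge` and
`Block` types and the function `removeRepeatedSuccessors` are hypothetical and
far simpler than the MLIR data structures; the sketch only demonstrates the
redirection of repeated successors through dummy blocks, under the assumption
that blocks and values are identified by name.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// One outgoing edge of a terminator: the target block and the operands passed
// to its block arguments.
struct Edge {
  std::string target;
  std::vector<std::string> operands;
};

// A block reduced to its name and the successor edges of its terminator.
struct Block {
  std::string name;
  std::vector<Edge> successors;
};

// Ensures that no block appears twice among the successors of blocks[index].
// Every repeated edge is redirected to a fresh dummy block that branches
// unconditionally to the original target and forwards the original operands.
// Returns the number of dummy blocks appended to `blocks`.
std::size_t removeRepeatedSuccessors(std::vector<Block> &blocks,
                                     std::size_t index) {
  assert(index < blocks.size() && "block index out of range");
  std::set<std::string> seen;
  std::vector<Block> dummies;
  for (Edge &edge : blocks[index].successors) {
    if (seen.insert(edge.target).second)
      continue;  // first use of this target: keep the edge unchanged
    Block dummy;
    dummy.name =
        blocks[index].name + "_dummy" + std::to_string(dummies.size());
    dummy.successors.push_back(edge);  // forward target and operands
    edge.target = dummy.name;          // the terminator now uses the dummy
    edge.operands.clear();             // the dummy block takes no arguments
    dummies.push_back(dummy);
  }
  // Append after the loop so that references into `blocks` stay valid.
  blocks.insert(blocks.end(), dummies.begin(), dummies.end());
  return dummies.size();
}
```

Applied to a block whose terminator targets `bb1` twice with operands `%1` and
`%2`, the sketch keeps the first edge and routes the second one through a new
block that branches to `bb1` with `%2`, as in the example above.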
## Default Memref Model

### Memref Descriptor

Within a converted function, a `memref`-typed value is represented by a memref
_descriptor_, the type of which is the structure type obtained by converting
from the memref type. This descriptor holds all the necessary information to
produce an address of a specific element. In particular, it holds dynamic values
for static sizes, and they are expected to match at all times.

It is created by the allocation operation and is updated by the conversion
operations that may change static dimensions into dynamic dimensions and vice
versa.

**Note**: LLVM IR conversion does not support `memref`s with layouts that are
not amenable to the strided form.
### Index Linearization

Accesses to a memref element are transformed into an access to an element of the
buffer pointed to by the descriptor. The position of the element in the buffer
is calculated by linearizing memref indices in row-major order (the lexically
first index is the slowest varying, similar to C, but accounting for strides).
The computation of the linear address is emitted as arithmetic operations in the
LLVM IR dialect. Strides are extracted from the memref descriptor.

Accesses to zero-dimensional memrefs (which are interpreted as pointers to the
element type) are directly converted into `llvm.load` or `llvm.store` without
any pointer manipulations.

Examples:

An access to a zero-dimensional memref is converted into a plain load:

```mlir
// before
%0 = load %m[] : memref<f32>

// after
%0 = llvm.load %m : !llvm<"float*">
```

An access to a memref with indices:

```mlir
%0 = load %m[1,2,3,4] : memref<10x?x13x?xf32>
```

is transformed into the equivalent of the following code:
```mlir
// Compute the linearized index from strides. Each block below extracts one
// stride from the descriptor, multiplies it with the index and accumulates
// the total offset.
%stride1 = llvm.extractvalue %m[4, 0] : !llvm<"{float*, float*, i64, [4 x i64], [4 x i64]}">
%idx1 = llvm.mlir.constant(1 : index) : !llvm.i64
%addr1 = llvm.mul %stride1, %idx1 : !llvm.i64

%stride2 = llvm.extractvalue %m[4, 1] : !llvm<"{float*, float*, i64, [4 x i64], [4 x i64]}">
%idx2 = llvm.mlir.constant(2 : index) : !llvm.i64
%addr2 = llvm.mul %stride2, %idx2 : !llvm.i64
%addr3 = llvm.add %addr1, %addr2 : !llvm.i64

%stride3 = llvm.extractvalue %m[4, 2] : !llvm<"{float*, float*, i64, [4 x i64], [4 x i64]}">
%idx3 = llvm.mlir.constant(3 : index) : !llvm.i64
%addr4 = llvm.mul %stride3, %idx3 : !llvm.i64
%addr5 = llvm.add %addr3, %addr4 : !llvm.i64

%stride4 = llvm.extractvalue %m[4, 3] : !llvm<"{float*, float*, i64, [4 x i64], [4 x i64]}">
%idx4 = llvm.mlir.constant(4 : index) : !llvm.i64
%addr6 = llvm.mul %stride4, %idx4 : !llvm.i64
%addr7 = llvm.add %addr5, %addr6 : !llvm.i64

// Add the linear offset to the address.
%offset = llvm.extractvalue %m[2] : !llvm<"{float*, float*, i64, [4 x i64], [4 x i64]}">
%addr8 = llvm.add %addr7, %offset : !llvm.i64

// Obtain the aligned pointer.
%aligned = llvm.extractvalue %m[1] : !llvm<"{float*, float*, i64, [4 x i64], [4 x i64]}">

// Get the address of the element.
%ptr = llvm.getelementptr %aligned[%addr8]
    : (!llvm<"float*">, !llvm.i64) -> !llvm<"float*">

// Perform the actual load.
%0 = llvm.load %ptr : !llvm<"float*">
```
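
The arithmetic emitted above amounts to the following computation, written here
as a small C++ function. `linearize` is a hypothetical name, and the sketch
assumes that the offset, strides and indices are available as 64-bit integers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Computes offset + sum(indices[i] * strides[i]), i.e. the position of the
// accessed element relative to the aligned pointer of the descriptor.
template <std::size_t N>
std::int64_t linearize(std::int64_t offset,
                       const std::array<std::int64_t, N> &strides,
                       const std::array<std::int64_t, N> &indices) {
  std::int64_t linear = 0;
  for (std::size_t i = 0; i < N; ++i) {
    assert(indices[i] >= 0 && "indices are non-negative");
    linear += indices[i] * strides[i];
  }
  return linear + offset;
}
```

As a worked case, suppose the two dynamic sizes of `memref<10x?x13x?xf32>` are
11 and 7 and the buffer is densely packed, so that the strides are
`[1001, 91, 7, 1]`. With a zero offset, the indices `[1, 2, 3, 4]` linearize to
`1001 + 182 + 21 + 4 = 1208`.
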
For stores, the address computation code is identical and only the actual store
operation is different.

Note: the conversion does not perform any sort of common subexpression
elimination when emitting memref accesses.