diff mlir/docs/ConversionToLLVMDialect.md @ 150:1d019706d866
| | |
|---|---|
| branch | LLVM10 |
| author | anatofuz |
| date | Thu, 13 Feb 2020 15:10:13 +0900 |
| parents | |
| children | 0572611fdcc8 |
# Conversion to the LLVM Dialect

Conversion from the Standard to the [LLVM Dialect](Dialects/LLVM.md) can be
performed by the specialized dialect conversion pass by running

```shell
mlir-opt -convert-std-to-llvm <filename.mlir>
```

It performs type and operation conversions for a subset of operations from the
standard dialect (operations on scalars and vectors, control flow operations) as
described in this document. We use the terminology defined by the
[LLVM IR Dialect description](Dialects/LLVM.md) throughout this document.

[TOC]

## Type Conversion

### Scalar Types

Scalar types are converted to their LLVM counterparts if they exist. The
following conversions are currently implemented.

- `i*` converts to `!llvm.i*`
- `f16` converts to `!llvm.half`
- `f32` converts to `!llvm.float`
- `f64` converts to `!llvm.double`

Note: the `bf16` type is not supported by LLVM IR and cannot be converted.

### Index Type

The `index` type is converted to a wrapped LLVM IR integer with a bitwidth equal
to the bitwidth of the pointer size as specified by the
[data layout](https://llvm.org/docs/LangRef.html#data-layout) of the LLVM module
[contained](Dialects/LLVM.md#context-and-module-association) in the LLVM Dialect
object. For example, on x86-64 CPUs it converts to `!llvm.i64`.

### Vector Types

LLVM IR only supports *one-dimensional* vectors, unlike MLIR where vectors can
be multi-dimensional. Vector types cannot be nested in either IR. In the
one-dimensional case, MLIR vectors are converted to LLVM IR vectors of the same
size with the element type converted using these conversion rules. In the
n-dimensional case, MLIR vectors are converted to (n-1)-dimensional array types
of one-dimensional vectors.

For example, `vector<4 x f32>` converts to `!llvm<"<4 x float>">` and
`vector<4 x 8 x 16 x f32>` converts to `!llvm<"[4 x [8 x <16 x float>]]">`.

### Memref Types

Memref types in MLIR have both static and dynamic information associated with
them. The dynamic information comprises the buffer pointer as well as sizes and
strides of any dynamically-sized dimensions. Memref types are normalized and
converted to a descriptor that is only dependent on the rank of the memref. The
descriptor contains:

1. the pointer to the data buffer, followed by
2. the pointer to the properly aligned data payload that the memref indexes,
   followed by
3. a lowered `index`-type integer containing the distance between the beginning
   of the buffer and the first element to be accessed through the memref,
   followed by
4. an array containing as many `index`-type integers as the rank of the memref:
   the array represents the size, in number of elements, of the memref along
   the given dimension. For constant memref dimensions, the corresponding size
   entry is a constant whose runtime value must match the static value,
   followed by
5. a second array containing as many 64-bit integers as the rank of the memref:
   the second array represents the "stride" (in tensor abstraction sense), i.e.
   the number of elements of the underlying buffer between two elements that
   are adjacent along the given dimension.

This normalization serves as an ABI for the memref type to interoperate with
externally linked functions.
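As an illustration, the descriptor for a rank-2 memref can be pictured as the
following C++ struct (a sketch only, assuming a 64-bit target; the struct and
field names are hypothetical and the actual descriptor is an LLVM IR structure
type):

```cpp
#include <cstdint>

// Sketch of the descriptor for memref<3x4xf32> with the default row-major
// layout; the fields mirror items 1-5 above.
struct Rank2FloatMemRefDescriptor {
  float *allocated;   // 1. pointer to the data buffer
  float *aligned;     // 2. pointer to the aligned data payload
  int64_t offset;     // 3. distance to the first accessed element, in elements
  int64_t sizes[2];   // 4. {3, 4} for memref<3x4xf32>
  int64_t strides[2]; // 5. {4, 1}: element (i, j) lives at offset + i*4 + j
};
```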
In the particular case of rank-0 memrefs, the size and stride arrays are
omitted, resulting in a structure containing two pointers and an offset.

Examples:

```mlir
memref<f32> -> !llvm<"{ float*, float*, i64 }">
memref<1 x f32> -> !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
memref<? x f32> -> !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
memref<10x42x42x43x123 x f32> -> !llvm<"{ float*, float*, i64, [5 x i64], [5 x i64] }">
memref<10x?x42x?x123 x f32> -> !llvm<"{ float*, float*, i64, [5 x i64], [5 x i64] }">

// Memref types can have vectors as element types.
memref<1x? x vector<4xf32>> -> !llvm<"{ <4 x float>*, <4 x float>*, i64, [1 x i64], [1 x i64] }">
```

If the rank of the memref is unknown at compile time, the memref is converted to
an unranked descriptor that contains:

1. a 64-bit integer representing the dynamic rank of the memref, followed by
2. a pointer to a ranked memref descriptor with the contents listed above.

Unranked memrefs should be used only to pass arguments to external library
calls that expect a unified memref type. The called functions can parse any
unranked memref descriptor by reading the rank and parsing the enclosed ranked
descriptor pointer.

Examples:

```mlir
// unranked descriptor
memref<*xf32> -> !llvm<"{i64, i8*}">
```
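At the C level, this unranked descriptor corresponds to the following sketch
(the struct and field names are hypothetical):

```cpp
#include <cstdint>

// Sketch of the unranked descriptor from the two items above.
struct UnrankedMemRefDescriptor {
  int64_t rank; // dynamic rank of the memref
  void *ptr;    // type-erased pointer to a ranked descriptor
};
```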
**In function signatures,** `memref` is passed as a _pointer_ to the structure
defined above to comply with the calling convention.

Example:

```mlir
// A function type with memref as argument
(memref<?xf32>) -> ()
// is transformed into the LLVM function with pointer-to-structure argument.
!llvm<"void ({ float*, float*, i64, [1 x i64], [1 x i64] }*)">
```

### Function Types

Function types get converted to LLVM function types. The arguments are converted
individually according to these rules. The result types need to accommodate the
fact that LLVM IR functions always have a return type, which may be a Void type.
The converted function always has a single result type. If the original function
type had no results, the converted function will have one result of the wrapped
`void` type. If the original function type had one result, the converted
function will have one result converted using these rules. Otherwise, the result
type will be a wrapped LLVM IR structure type where each element of the
structure corresponds to one of the results of the original function, converted
using these rules. In higher-order functions, function-typed arguments and
results are converted to a wrapped LLVM IR function pointer type (since LLVM IR
does not allow passing functions to functions without indirection) with the
pointee type converted using these rules.

Examples:

```mlir
// zero-ary function type with no results
() -> ()
// is converted to a zero-ary function with `void` result
!llvm<"void ()">

// unary function with one result
(i32) -> (i64)
// has its argument and result type converted, before creating the LLVM IR
// function type
!llvm<"i64 (i32)">

// binary function with one result
(i32, f32) -> (i64)
// has its arguments handled separately
!llvm<"i64 (i32, float)">

// binary function with two results
(i32, f32) -> (i64, f64)
// has its results aggregated into a structure type
!llvm<"{i64, double} (i32, float)">

// function-typed arguments or results in higher-order functions
(() -> ()) -> (() -> ())
// are converted into pointers to functions
!llvm<"void ()* (void ()*)">
```

## Calling Convention

### Function Signature Conversion

LLVM IR functions are defined by a custom operation. The function itself has a
wrapped LLVM IR function type converted as described above. The function
definition operation uses MLIR syntax.

Examples:

```mlir
// zero-ary function type with no results
func @foo() -> ()
// gets LLVM type void().
llvm.func @foo() -> ()

// function with one result
func @bar(i32) -> (i64)
// gets converted to LLVM type i64(i32).
llvm.func @bar(!llvm.i32) -> !llvm.i64

// function with two results
func @qux(i32, f32) -> (i64, f64)
// has its results aggregated into a structure type
llvm.func @qux(!llvm.i32, !llvm.float) -> !llvm<"{i64, double}">

// function-typed arguments or results in higher-order functions
func @quux(() -> ()) -> (() -> ())
// are converted into pointers to functions
llvm.func @quux(!llvm<"void ()*">) -> !llvm<"void ()*">
// the call flow is handled by the LLVM dialect `call` operation supporting both
// direct and indirect calls
```

### Result Packing

In the case of multi-result functions, the returned values are inserted into a
structure-typed value before being returned and extracted from it at the call
site. This transformation is a part of the conversion and is transparent to the
defines and uses of the values being returned.

Example:

```mlir
func @foo(%arg0: i32, %arg1: i64) -> (i32, i64) {
  return %arg0, %arg1 : i32, i64
}
func @bar() {
  %0 = constant 42 : i32
  %1 = constant 17 : i64
  %2:2 = call @foo(%0, %1) : (i32, i64) -> (i32, i64)
  "use_i32"(%2#0) : (i32) -> ()
  "use_i64"(%2#1) : (i64) -> ()
  return
}

// is transformed into

llvm.func @foo(%arg0: !llvm.i32, %arg1: !llvm.i64) -> !llvm<"{i32, i64}"> {
  // insert the values into a structure
  %0 = llvm.mlir.undef : !llvm<"{i32, i64}">
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm<"{i32, i64}">
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm<"{i32, i64}">

  // return the structure value
  llvm.return %2 : !llvm<"{i32, i64}">
}
llvm.func @bar() {
  %0 = llvm.mlir.constant(42 : i32) : !llvm.i32
  %1 = llvm.mlir.constant(17 : i64) : !llvm.i64

  // call and extract the values from the structure
  %2 = llvm.call @foo(%0, %1) : (!llvm.i32, !llvm.i64) -> !llvm<"{i32, i64}">
  %3 = llvm.extractvalue %2[0] : !llvm<"{i32, i64}">
  %4 = llvm.extractvalue %2[1] : !llvm<"{i32, i64}">

  // use as before
  "use_i32"(%3) : (!llvm.i32) -> ()
  "use_i64"(%4) : (!llvm.i64) -> ()
  llvm.return
}
```

### Calling Convention for `memref`

Function _arguments_ of `memref` type, ranked or unranked, are _expanded_ into a
list of arguments of the non-aggregate types that the memref descriptor defined
above comprises. That is, the outer struct type and the inner array types are
replaced with individual arguments.
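Written as an equivalent C prototype (an illustration only; the parameter names
are hypothetical), a rank-1 float memref argument expands as follows:

```cpp
#include <cstdint>

// Equivalent C prototype of the expanded signature for (memref<?xf32>) -> ():
// one scalar argument per descriptor field.
void foo(float *allocated, float *aligned, int64_t offset,
         int64_t size0, int64_t stride0);
```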
This convention is implemented in the conversion of `std.func` and `std.call` to
the LLVM dialect, with the former unpacking the descriptor into a set of
individual values and the latter packing those values back into a descriptor so
as to make it transparently usable by other operations. Conversions from other
dialects should take this convention into account.

This specific convention is motivated by the necessity to specify alignment and
aliasing attributes on the raw pointers underpinning the memref.

Examples:

```mlir
func @foo(%arg0: memref<?xf32>) -> () {
  "use"(%arg0) : (memref<?xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @foo(%arg0: !llvm<"float*">,  // Allocated pointer.
               %arg1: !llvm<"float*">,  // Aligned pointer.
               %arg2: !llvm.i64,        // Offset.
               %arg3: !llvm.i64,        // Size in dim 0.
               %arg4: !llvm.i64) {      // Stride in dim 0.
  // Populate memref descriptor structure.
  %0 = llvm.mlir.undef : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %5 = llvm.insertvalue %arg4, %4[4, 0] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">

  // Descriptor is now usable as a single value.
  "use"(%5) : (!llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">) -> ()
  llvm.return
}
```

```mlir
func @bar() {
  %0 = "get"() : () -> (memref<?xf32>)
  call @foo(%0) : (memref<?xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @bar() {
  %0 = "get"() : () -> !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">

  // Unpack the memref descriptor.
  %1 = llvm.extractvalue %0[0] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %2 = llvm.extractvalue %0[1] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %3 = llvm.extractvalue %0[2] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %4 = llvm.extractvalue %0[3, 0] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">
  %5 = llvm.extractvalue %0[4, 0] : !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }">

  // Pass individual values to the callee.
  llvm.call @foo(%1, %2, %3, %4, %5) : (!llvm<"float*">, !llvm<"float*">, !llvm.i64, !llvm.i64, !llvm.i64) -> ()
  llvm.return
}
```

For **unranked** memrefs, the list of function arguments always contains two
elements, same as the unranked memref descriptor: an integer rank, and a
type-erased (`!llvm<"i8*">`) pointer to the ranked memref descriptor. Note that
while the _calling convention_ does not require stack allocation, _casting_ to
unranked memref does, since one cannot take the address of an SSA value
containing the ranked memref. The caller is in charge of ensuring thread safety
and of eventually removing unnecessary stack allocations in cast operations.
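The equivalent C prototype for an unranked memref argument is then simply
(again an illustration with hypothetical names):

```cpp
#include <cstdint>

// Equivalent C prototype of the expanded signature for (memref<*xf32>) -> ():
// just the rank and a type-erased pointer to a ranked descriptor.
void foo(int64_t rank, void *rankedDescriptor);
```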
Example:

```mlir
func @foo(%arg0: memref<*xf32>) -> () {
  "use"(%arg0) : (memref<*xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @foo(%arg0: !llvm.i64,       // Rank.
               %arg1: !llvm<"i8*">) {  // Type-erased pointer to descriptor.
  // Pack the unranked memref descriptor.
  %0 = llvm.mlir.undef : !llvm<"{ i64, i8* }">
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm<"{ i64, i8* }">
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm<"{ i64, i8* }">

  "use"(%2) : (!llvm<"{ i64, i8* }">) -> ()
  llvm.return
}
```

```mlir
func @bar() {
  %0 = "get"() : () -> (memref<*xf32>)
  call @foo(%0) : (memref<*xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @bar() {
  %0 = "get"() : () -> (!llvm<"{ i64, i8* }">)

  // Unpack the memref descriptor.
  %1 = llvm.extractvalue %0[0] : !llvm<"{ i64, i8* }">
  %2 = llvm.extractvalue %0[1] : !llvm<"{ i64, i8* }">

  // Pass individual values to the callee.
  llvm.call @foo(%1, %2) : (!llvm.i64, !llvm<"i8*">) -> ()
  llvm.return
}
```

*This convention may or may not apply if the conversion of MemRef types is
overridden by the user.*

### C-compatible wrapper emission

In practical cases, it may be desirable to have externally-facing functions with
a single argument corresponding to a memref argument. When interfacing with LLVM
IR produced from C, the code needs to respect the corresponding calling
convention. The conversion to the LLVM dialect provides an option to generate
wrapper functions that take memref descriptors as pointers-to-struct compatible
with the data types produced by Clang when compiling C sources.

More specifically, a memref argument is converted into a pointer-to-struct
argument of type `{T*, T*, i64, i64[N], i64[N]}*` in the wrapper function, where
`T` is the converted element type and `N` is the memref rank. This type is
compatible with that produced by Clang for the following C++ structure template
instantiations or their equivalents in C.

```cpp
template<typename T, size_t N>
struct MemRefDescriptor {
  T *allocated;
  T *aligned;
  intptr_t offset;
  intptr_t sizes[N];
  intptr_t strides[N];
};
```

If enabled, the option will do the following. For _external_ functions declared
in the MLIR module:

1. Declare a new function `_mlir_ciface_<original name>` where memref arguments
   are converted to pointer-to-struct and the remaining arguments are converted
   as usual.
1. Add a body to the original function (making it non-external) that
   1. allocates a memref descriptor,
   1. populates it,
   1. passes the pointer to it into the newly declared interface function, and
   1. collects the result of the call and returns it to the caller.

For (non-external) functions defined in the MLIR module:

1. Define a new function `_mlir_ciface_<original name>` where memref arguments
   are converted to pointer-to-struct and the remaining arguments are converted
   as usual.
1. Populate the body of the newly defined function with IR that
   1. loads descriptors from pointers;
   1. unpacks descriptors into individual non-aggregate values;
   1. passes these values into the original function;
   1. collects the result of the call and returns it to the caller.
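On the C side, the interface function generated for a `memref<?x?xf32>`
argument can be declared as follows (a hypothetical declaration reusing the
`MemRefDescriptor` template above; the function name matches `@qux` in the
example below):

```cpp
#include <cstddef>
#include <cstdint>

template <typename T, size_t N>
struct MemRefDescriptor; // as defined in the C++ snippet above

// Hypothetical C++ declaration of the wrapper emitted for
// func @qux(%arg0: memref<?x?xf32>).
extern "C" void _mlir_ciface_qux(MemRefDescriptor<float, 2> *);
```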
Examples:

```mlir
func @qux(%arg0: memref<?x?xf32>)

// Gets converted into the following.

// Function with unpacked arguments.
llvm.func @qux(%arg0: !llvm<"float*">, %arg1: !llvm<"float*">, %arg2: !llvm.i64,
               %arg3: !llvm.i64, %arg4: !llvm.i64, %arg5: !llvm.i64,
               %arg6: !llvm.i64) {
  // Populate memref descriptor (as per calling convention).
  %0 = llvm.mlir.undef : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %5 = llvm.insertvalue %arg5, %4[4, 0] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %6 = llvm.insertvalue %arg4, %5[3, 1] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">

  // Store the descriptor in a stack-allocated space.
  %8 = llvm.mlir.constant(1 : index) : !llvm.i64
  %9 = llvm.alloca %8 x !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
     : (!llvm.i64) -> !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">
  llvm.store %7, %9 : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">

  // Call the interface function.
  llvm.call @_mlir_ciface_qux(%9) : (!llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">) -> ()

  // The stored descriptor will be freed on return.
  llvm.return
}

// Interface function.
llvm.func @_mlir_ciface_qux(!llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">)
```

```mlir
func @foo(%arg0: memref<?x?xf32>) {
  return
}

// Gets converted into the following.

// Function with unpacked arguments.
llvm.func @foo(%arg0: !llvm<"float*">, %arg1: !llvm<"float*">, %arg2: !llvm.i64,
               %arg3: !llvm.i64, %arg4: !llvm.i64, %arg5: !llvm.i64,
               %arg6: !llvm.i64) {
  llvm.return
}

// Interface function callable from C.
llvm.func @_mlir_ciface_foo(%arg0: !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">) {
  // Load the descriptor.
  %0 = llvm.load %arg0 : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">

  // Unpack the descriptor as per calling convention.
  %1 = llvm.extractvalue %0[0] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %2 = llvm.extractvalue %0[1] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %3 = llvm.extractvalue %0[2] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %4 = llvm.extractvalue %0[3, 0] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %5 = llvm.extractvalue %0[3, 1] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %6 = llvm.extractvalue %0[4, 0] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  %7 = llvm.extractvalue %0[4, 1] : !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }">
  llvm.call @foo(%1, %2, %3, %4, %5, %6, %7)
    : (!llvm<"float*">, !llvm<"float*">, !llvm.i64, !llvm.i64, !llvm.i64,
       !llvm.i64, !llvm.i64) -> ()
  llvm.return
}
```
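A C++ caller of this interface function might look as follows (a hypothetical
sketch; `callFoo` and its parameters are illustrative, and a dense row-major
buffer is assumed):

```cpp
#include <cstddef>
#include <cstdint>

template <typename T, size_t N>
struct MemRefDescriptor {
  T *allocated;
  T *aligned;
  intptr_t offset;
  intptr_t sizes[N];
  intptr_t strides[N];
};

// Wrapper generated for func @foo(%arg0: memref<?x?xf32>) above.
extern "C" void _mlir_ciface_foo(MemRefDescriptor<float, 2> *);

// Wraps a dense row-major m x n buffer in a descriptor and calls the
// MLIR-generated function.
void callFoo(float *data, intptr_t m, intptr_t n) {
  MemRefDescriptor<float, 2> desc{data, data, 0, {m, n}, {n, 1}};
  _mlir_ciface_foo(&desc);
}
```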
Rationale: Introducing auxiliary functions for C-compatible interfaces is
preferred to modifying the calling convention since it will minimize the effect
of C compatibility on intra-module calls or calls between MLIR-generated
functions. In particular, when calling external functions from an MLIR module in
a (parallel) loop, the fact of storing a memref descriptor on the stack can lead
to stack exhaustion and/or concurrent access to the same address. The auxiliary
interface function serves as an allocation scope in this case. Furthermore, when
targeting accelerators with separate memory spaces such as GPUs, stack-allocated
descriptors passed by pointer would have to be transferred to the device memory,
which introduces significant overhead. In such situations, auxiliary interface
functions are executed on the host and only pass the values through the device
function invocation mechanism.

## Repeated Successor Removal

Since the goal of the LLVM IR dialect is to reflect LLVM IR in MLIR, the dialect
and the conversion procedure must account for the differences between block
arguments and LLVM IR PHI nodes. In particular, LLVM IR disallows PHI nodes with
different values coming from the same source. Therefore, the LLVM IR dialect
disallows operations that have identical successors accepting arguments, which
would lead to invalid PHI nodes. The conversion process resolves the potential
PHI source ambiguity by injecting dummy blocks if the same block is used more
than once as a successor in an instruction. These dummy blocks branch
unconditionally to the original successors, pass them the original operands
(available in the dummy block because it is dominated by the original block) and
are used instead of them in the original terminator operation.

Example:

```mlir
  cond_br %0, ^bb1(%1 : i32), ^bb1(%2 : i32)
^bb1(%3 : i32):
  "use"(%3) : (i32) -> ()
```

leads to a new basic block being inserted,

```mlir
  cond_br %0, ^bb1(%1 : i32), ^dummy
^bb1(%3 : i32):
  "use"(%3) : (i32) -> ()
^dummy:
  br ^bb1(%2 : i32)
```

before the conversion to the LLVM IR dialect:

```mlir
  llvm.cond_br %0, ^bb1(%1 : !llvm.i32), ^dummy
^bb1(%3 : !llvm.i32):
  "use"(%3) : (!llvm.i32) -> ()
^dummy:
  llvm.br ^bb1(%2 : !llvm.i32)
```

## Default Memref Model

### Memref Descriptor

Within a converted function, a `memref`-typed value is represented by a memref
_descriptor_, the type of which is the structure type obtained by converting
from the memref type. This descriptor holds all the necessary information to
produce an address of a specific element. In particular, it holds dynamic values
for static sizes, and they are expected to match at all times.

It is created by the allocation operation and is updated by the conversion
operations that may change static dimensions into dynamic ones and vice versa.

**Note**: LLVM IR conversion does not support `memref`s with layouts that are
not amenable to the strided form.

### Index Linearization

Accesses to a memref element are transformed into an access to an element of the
buffer pointed to by the descriptor. The position of the element in the buffer
is calculated by linearizing memref indices in row-major order (lexically first
index is the slowest varying, similar to C, but accounting for strides). The
computation of the linear address is emitted as arithmetic operations in the
LLVM IR dialect. Strides are extracted from the memref descriptor.

Accesses to zero-dimensional memrefs (which are interpreted as pointers to the
element type) are directly converted into `llvm.load` or `llvm.store` without
any pointer manipulations.

Examples:

An access to a zero-dimensional memref is converted into a plain load:

```mlir
// before
%0 = load %m[] : memref<f32>

// after
%0 = llvm.load %m : !llvm<"float*">
```
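Before looking at the emitted IR for an indexed access, the address computation
can be summarized in C terms as follows (a hypothetical helper; `offset` and
`strides` correspond to the descriptor fields described earlier):

```cpp
#include <cstdint>

// C equivalent of the linearization the conversion emits:
// addr = offset + sum over d of indices[d] * strides[d].
int64_t linearIndex(int64_t offset, const int64_t *strides,
                    const int64_t *indices, int64_t rank) {
  int64_t addr = offset;
  for (int64_t d = 0; d < rank; ++d)
    addr += indices[d] * strides[d];
  return addr;
}
```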
An access to a memref with indices:

```mlir
%0 = load %m[1,2,3,4] : memref<10x?x13x?xf32>
```

is transformed into the equivalent of the following code:

```mlir
// Compute the linearized index from strides. Each block below extracts one
// stride from the descriptor, multiplies it with the index and accumulates
// the total offset.
%stride1 = llvm.extractvalue %m[4, 0] : !llvm<"{ float*, float*, i64, [4 x i64], [4 x i64] }">
%idx1 = llvm.mlir.constant(1 : index) : !llvm.i64
%addr1 = llvm.mul %stride1, %idx1 : !llvm.i64

%stride2 = llvm.extractvalue %m[4, 1] : !llvm<"{ float*, float*, i64, [4 x i64], [4 x i64] }">
%idx2 = llvm.mlir.constant(2 : index) : !llvm.i64
%addr2 = llvm.mul %stride2, %idx2 : !llvm.i64
%addr3 = llvm.add %addr1, %addr2 : !llvm.i64

%stride3 = llvm.extractvalue %m[4, 2] : !llvm<"{ float*, float*, i64, [4 x i64], [4 x i64] }">
%idx3 = llvm.mlir.constant(3 : index) : !llvm.i64
%addr4 = llvm.mul %stride3, %idx3 : !llvm.i64
%addr5 = llvm.add %addr3, %addr4 : !llvm.i64

%stride4 = llvm.extractvalue %m[4, 3] : !llvm<"{ float*, float*, i64, [4 x i64], [4 x i64] }">
%idx4 = llvm.mlir.constant(4 : index) : !llvm.i64
%addr6 = llvm.mul %stride4, %idx4 : !llvm.i64
%addr7 = llvm.add %addr5, %addr6 : !llvm.i64

// Add the linear offset to the address.
%offset = llvm.extractvalue %m[2] : !llvm<"{ float*, float*, i64, [4 x i64], [4 x i64] }">
%addr8 = llvm.add %addr7, %offset : !llvm.i64

// Obtain the aligned pointer.
%aligned = llvm.extractvalue %m[1] : !llvm<"{ float*, float*, i64, [4 x i64], [4 x i64] }">

// Get the address of the accessed element.
%ptr = llvm.getelementptr %aligned[%addr8] : (!llvm<"float*">, !llvm.i64) -> !llvm<"float*">

// Perform the actual load.
%0 = llvm.load %ptr : !llvm<"float*">
```

For stores, the address computation code is identical and only the actual store
operation is different.

Note: the conversion does not perform any sort of common subexpression
elimination when emitting memref accesses.
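As a final C-level illustration (hypothetical, reusing `linearIndex` from
above): a load followed by a store to the same element each recompute the full
address, matching the note on common subexpression elimination.

```cpp
#include <cstdint>

int64_t linearIndex(int64_t offset, const int64_t *strides,
                    const int64_t *indices, int64_t rank);

// Each access recomputes the linear address; the conversion itself performs
// no CSE (later LLVM passes may clean this up).
void incrementElem(float *aligned, int64_t offset, const int64_t strides[4]) {
  const int64_t indices[4] = {1, 2, 3, 4};
  float v = aligned[linearIndex(offset, strides, indices, 4)];  // load
  aligned[linearIndex(offset, strides, indices, 4)] = v + 1.f;  // store
}
```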