Implementing Continuation based language in Clang and LLVM |
Kaito Tokumori, Shinji Kono University of the Ryukyus |
A thread of code segments has a context as a meta data segment
__code f() { goto g(); } __code g() { goto h(); } |
|
data segments are arguments of a code segment and a goto statement.
__code code1(struct Allocate* allocate, struct Element* element) { element ->value = 10; struct List* list = (struct List *)malloc(sizeof( struct List)); goto append(allocate,list, element); } __code append(struct Allocate* allocate, struct List* list, struct Element* element) { if(list->head) { list->tail->next = element; } else { list->head = element; } list->tail = element; list->tail->next = 0; goto code2(allocate,list, element); } |
add meta part to both code and data, keep object level computation
In cons cell, pointer part is visible to object level.
Hide pointer part in meta data segment.
Handle pointer operation in meta code segment.
define fastcc void @factorial(i32 %x) #0 { entry: tail call fastcc void @factorial0(i32 1, i32 %x) ret void } |
original input code | Clang generates it internally |
__code code1(int a, int b) { : goto code2(a,b); } __code code2(int a, int b){ : } | __code code2(int a, int b); __code code1(int a, int b) { : goto code2(a,b); } __code code2(int a, int b){ : } |
original input code | Clang generates it |
__code code1() { : goto code2(); } | void code1() { : code2(); return; } |
Tail call elimination pass is added in CodeGen.
Ensure TCE in SelectionDAGISel.
These are the reason why we need __code type for code segments.
How to return to C from a code segment?
|
|
Several ways to implementation
__code caller(int x) { goto code1(1, x); // should be jmp } |
_caller: ## @factorial .cfi_startproc ## BB#0: ## %entry subq $24, %rsp Ltmp5: .cfi_def_cfa_offset 32 movl $1, %eax movl %edi, 20(%rsp) ## 4-byte Spill movl %eax, %edi movl 20(%rsp), %esi ## 4-byte Reload addq $24, %rsp jmp _code1 ## TAILCALL .cfi_endproc |
CbC can be used as a hardware description language (RTL level)
CbC can be used as a parallel programming language
Name | Description |
clang AST | Abstract Syntax Tree. It is a representation of the structure source codes. |
LLVM IR | The main intermediate representation of LLVM. It has three different forms: as an in-memory compiler IR, as an on-disk bitcode representation, and as a human readable assembly language representation. |
SelectionDAG | Directed Acyclic Graph. Its nodes indicate what operation the node performs and the operands to the operation. |
Machine Code | This representation is designed to support both an SSA representation for machine code, as well as register allocated, non-SSA form. |
MC Layer | It is used to represent and process code at the raw machine code level. User can some kinds of file (.s, .o, .ll, a.out) by same API. |
Name | Description |
ccc |
C calling conbentions. This calling convention supports varargs function calls and tolerates some mismatch in the declared prototype and implemented declaration of the function. |
cc10 | This calling convention has been implemented specifically for use by the Glasgow Haskell Compiler (GHC). |
cc11 | This calling convention has been implemented specifically for use by the High-Performance Erlang (HiPE) compiler. |
fastcc | This calling convention attempts to make calls as fast as possible. |
Argument 1 | Argument 2 | Argument 3 | |
Micro-C | 6.875 | 2.4562 | 3.105 |
GCC -O2 | 2.9438 | 0.955 | 1.265 |
LLVM and Clang -O0 | 5.835 | 4.1887 | 5.0625 |
LLVM and Clang -O2 | 3.3875 | 2.29 | 2.5087 |
unit : seconds |