Mercurial > hg > Papers > 2015 > kaito-lola
view presentation/presen.html @ 16:c3d20ec1ec4b
spell check
author | Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp> |
---|---|
date | Sat, 04 Jul 2015 23:36:52 +0900 |
parents | bbbeecda034d |
children | 889696aa5018 |
line wrap: on
line source
<!DOCTYPE html> <html> <head> <meta charset='utf-8'> <title>Presentation</title> <!-- style sheet links --> <link rel="stylesheet/less" href="themes/blank/projection.css.less" media="screen,projection"> <link rel="stylesheet/less" href="themes/blank/screen.css.less" media="screen"> <link rel="stylesheet/less" href="themes/blank/print.css.less" media="print"> <link rel="stylesheet/less" href="blank.css.less" media="screen,projection"> <!-- add js libs (less, jquery) --> <script src="js/less-1.1.4.min.js"></script> <script src="js/jquery-1.7.min.js"></script> <!-- S6 JS --> <script src="js/jquery.slideshow.js"></script> <script src="js/jquery.slideshow.counter.js"></script> <script src="js/jquery.slideshow.controls.js"></script> <script src="js/jquery.slideshow.footer.js"></script> <script src="js/jquery.slideshow.autoplay.js"></script> <script> $(document).ready( function() { Slideshow.init(); // Example 2: Start Off in Outline Mode // Slideshow.init( { mode: 'outline' } ); // Example 3: Use Custom Transition // Slideshow.transition = transitionScrollUp; // Slideshow.init(); // Example 4: Start Off in Autoplay Mode with Custom Transition // Slideshow.transition = transitionScrollUp; // Slideshow.init( { mode: 'autoplay' } ); } ); </script> </head> <body> <div class="layout"> <div id="header"></div> <div id="footer"> <div align="right"> <img src="images/concurrency.png" width="200"> </div> </div> </div> <div class="presentation"> <div class='slide cover'> <table width="90%" height="90%" border="0" align="center"> <tr> <td><div align="center"> <h1><font color="#808db5">Implementating Continuation based language in Clang and LLVM</font></h1> </div></td> </tr> <tr> <td><div align="left"> Kaito Tokumori, Shinji Kono <br> University of the Ryukyus <script> document.write("<br>July 5, 2015"); </script> <hr style="color:#ffcc00;background-color:#ffcc00;text-align:left;border:none;width:300%;height:0.2em;"> </div></td> </tr> </table> </div> <div class='slide'> <h2>Objective</h2> <ul> <li>Achieve Reliable computation <li>Extract Concurrent Execution Automatically <li>Modify and Improve software in a Reliable way <li>Get more Reusablity </ul> <h3>Introducing new units of programming</h3> </div> <div class='slide'> <h2>Traditional units of programming</h2> <ul> <li>Machine instruction <li>Statements of programming language <li>Function call / Method <li>Module / Class / Interface <li>Thread / Process <li>Object <li>Record / Table </ul> </div> <div class='slide'> <h2>What we want to do with programming units?</h2> <ul> <li>Divide large functions into small parts. <li>Add hidden arguments without code modification. <li>Add meta computation. <li>Extract concurrency from programming units. </ul> <h3>It is not easy to do this in the traditional units.</h3> </div> <div class='slide'> <h2>New programming units</h2> <ul> <li>Units of programming: code segments, data segments. <li>Code segments are units of calculation. <li>Data segments are sets of typed data. </ul> </div> <div class='slide'> <h2>Code segments</h2> <ul> <li>Function from input data segments to output data segments. <li>Code segments have no states. <li>Access in typed data in the data segments by name. <li>Specify code segments to be executed using goto. </ul> <h3>It is easy to divide or combine.</h3> </div> <div class='slide'> <h2>Data segments</h2> <ul> <li>Set of typed data. <li>Type signatures are in meta data segments. <li>Variable and extendable data structure. <li>Data segments are dominated by connected code segments. <li>Code segments atomically access connected data segments. </ul> <h3>It is easy to divide or combine.</h3> </div> <div class='slide'> <h2>Meta code / data segments</h2> <p>A thread of code segments has a context as a meta data segment</p> <ul> <li>Execution contexts: Thread <li>Type signatures of data segments. <li>Data segment linkages: Pointer <li>Machine code of code segments </ul> </div> <div class='slide'> <h2>Continuation based C (CbC)</h2> <ul> <li>An implementation of code segments. <li>CbC stands for Continuation based C. <li>Basic syntax is the same as the C, except __code and goto. <li>__code is a type of code segment <li>Code segments end with parameterized goto. <li>Data segments are implemented as C structures. </ul> </div> <div class='slide'> <h2>code segment syntax</h2> <table border='1' align='center' width='80%'> <tr><td width='50%'> <pre class='small_code'> __code f() { goto g(); } __code g() { goto h(); } </pre> </td><td valign='top'> <ul> <li>code segment transition is goto. <li>Code segments have no return statement. <li>There are no return values. </ul> </td></tr> </table> </div> <div class='slide'> <h2>code segment syntax with data segments</h2> <table border='1' align='center' width='80%'> <tr><td width='50%'> <pre class='small_code'> __code code1(struct Allocate* allocate, struct Element* element) { element ->value = 10; struct List* list = (struct List *)malloc(sizeof( struct List)); goto append(allocate,list, element); } __code append(struct Allocate* allocate, struct List* list, struct Element* element) { if(list->head) { list->tail->next = element; } else { list->head = element; } list->tail = element; list->tail->next = 0; goto code2(allocate,list, element); } </pre> </td><td valign='top'> <ul> <li>data segments are arguments of a code segment and a goto statement. </ul> </td></tr> </table> </div> <div class='slide'> <h2>CbC compilers</h2> <ul> <li>Micro-C(one pass standalone compiler) 2001 <li>on GCC(GNU Compiler Collection) 2008 <li>on LLVM and Clang (Compiler framework) 2014 <ul> <li><font color='red'>The latest!</font> </ul> </ul> </div> <div class='slide'> <h2>LLVM and Clang's compilation flow</h2> <ul> <li>AST : Abstract Syntax Tree (C++ object) <li>LLVM IR is Intermediate Representation (bit code). <li>Clang translate C/C++/Obj-C into LLVM IR. <li>SelectionDAG : Code generator internal <li>Machine Code : LLVM Machine code </ul> <div align="center"><img src="fig/clang_llvm_structure.svg" width="45%"></div> </div> <div class='slide'> <h2>Advantage of LLVM implementation</h2> <ul> <li>Apple supported (working on OS X). <li>OS X default compiler. <li>LLVM IR is well documented. <li>better than GCC </ul> </div> <div class='slide'> <h2>CbC implementation strategy</h2> <ul> <li>define special type __code for code segments <li>no code segment prototyping (otherwise it becomes quite messy ) <li>code segments are implemented as tail call force functions <li>Do not modify IR (Intermediate representations ) <li>goto statement is actually a function call with following return statement <br> goto f() --> { f() ; return; } <li>allow mixing code segments and normal function calls ( goto with environment ) </ul> </div> <div class='slide'> <h2>LLVM IR</h2> <ul> <li>Intermediate Representation (bit code). <li>Three forms: in-memory IR, bitcode stream, human readable language. <li>it has precise type, data size, alignment <li>function call flags : tail, fastcc, cc10, cc11, Erlang, ghc </ul> <table width='100%'> <tr> <td style="border: double;"> <pre class='code'> define fastcc void @factorial(i32 %x) #0 { entry: tail call fastcc void @factorial0(i32 1, i32 %x) ret void } </pre> </td> </tr> </table> </div> <div class='slide'> <h2>CbC implementation strategy</h2> <ul> <li>Code segments are implemented by C functions with return-type __code. <li>Data segments are implemented by C structs. <li>Goto statement is implemented by setting tail call flag. <li>Goto with environment is implemented by setjmp and longjmp. </ul> </div> <div class='slide'> <h2>Prototype declaration generating</h2> <ul> <li>In CbC, programmer write a lot of code segments. <li>Automatically prototype declarator support it. <!-- <li>When parser meet a code segment call, it stop current parsing and search called code segment declaration.--> <li>If the declaration was not found, search definition and generate declaration. <ul> <li>Of course you can write declaration yourself too. </ul> </ul> <table border='1' width='80%' align='center'> <tr> <td>original input code <td>Clang generates it internally </tr> <tr> <td><pre class='small_code'> __code code1(int a, int b) { : goto code2(a,b); } __code code2(int a, int b){ : } </pre> <td><pre class='small_code'> <font color='red'>__code code2(int a, int b);</font> __code code1(int a, int b) { : goto code2(a,b); } __code code2(int a, int b){ : } </pre> </tr> </table> </div> <div class='slide'> <h2>goto syntax for transition</h2> <ul> <li>Add return statement after goto transition. <li>It is one the requirement force to tail call elimination. </ul> <table border='1' width='80%' align='center'> <tr> <td>original input code <td>Clang generates it </tr> <tr> <td><pre class='small_code'> __code code1() { : goto code2(); } </pre> <td><pre class='small_code'> void code1() { : code2(); <font color='red'>return;</font> } </pre> </tr> </table> </div> <div class='slide'> <h2>Forcing Tail Call Elimination</h2> <p>Tail call flag is set in CodeGen.</p> <p>Ensure TCE in SelectionDAGISel.</p> <div align='center'><img src="fig/clang_llvm_slide_cg_DAG.svg" width="60%"></div> </div> <div class='slide'> <!-- <h2>Jmp instruction based transition</h2> --> <h2>What is tail call elimination?</h2> <ul> <li>Tail call is a function call immediately followed by return. <li>If stack frame have to be unchanged, <li>Tail call elimination replace tail call's call instructions with jmp instructions. <li>that is tail call function should have the same arguments and the same return type </ul> <div align='center'><img src="fig/TCE.svg" width="40%"></div> </div> <div class='slide'> <h2>Tail Call Elimination requirements</h2> <ul> <li>Set tail flag at the code segments call. <li>Tailcallopt path is enabled in the compiler. <li>The caller and callee's calling conventions must be the same and their types should be cc10, cc11 or fastcc. <li>Return value type has to be the same as the caller's. </ul> </div> <div class='slide'> <h2>Forcing Tail Call Elimination</h2> <ul> <li>Always add tail call elimination pass. <li>Tailcallopt is enabled in CbC. <li>Fast cc is used consistently in code segments call. <li>All the code segments return value type is void. </ul> <p>This is the reason why we need __code type for code segments.</p> </div> <div class='slide'> <h2>goto a code segment from a normal C function</h2> <ul> <li>Assume we have code segment g and normal function f <li>simply goto g() in C function f() <li>goto g() never returns to function f </ul> <p>How to return to C from a code segment?</p> </div> <div class='slide'> <h2>Goto with environment</h2> <p> We want provide continuation of function f.<br> The continuation is not a simple code segment,<br> because code segment has not state.<br> We represent the continuation with a code segment (__return) and a meta data segment (__environment).<br> </ul> </div> <div class='slide'> <h2>Syntax of Goto with environment</h2> <table width='100%'> <tr><td valign='top'> <ul> <li>Use new keywords __return and __environment. <li>__return is a code segment pointer for C functions. <li>__environment is a environment for C functions. <li>Code1 use a continuation with environments to return main function. </ul> <td style="border: double;"> <pre class='small_code'><div class='highlight'>__code code1(int n,__code(*exit_code)(int,void *),void *exit_env){ printf("code1 : code entry1\n"); goto exit_code(n,exit_env); } int caller(){ printf("caller : main1 entry\n"); __code (*__ret)(int, void *) = <font color='red'>__return</font>; struct __CbC_env *__env = <font color='red'>__environment</font>; goto code1(1, __ret, __env); return 0; } int main(){ int n; n = caller(); printf("return = %d\n",n); return 0; } </div></pre> </tr> </table> </div> <div class='slide'> <h2>Implementation of goto with environment</h2> <p>Several ways to implementation</p> <ul> <li>setjmp/longjmp (LLVM) <li>nested function closure with thread safe variable (GCC) <li>direct manipulation of frame pointer and return value (Micro-C) </ul> </div> <div class='slide'> <h2>LLVM Implementation of goto with environment</h2> <ul> <li>Include setjmp.h internally. <li>Generate C struct for saving environment. <ul> <li>This struct is __environment. </ul> <li>Insert setjmp in C function, when __return is used. <li>Generate longjmp code segment as __return. </ul> </div> <div class='slide'> <h2>Compiling result</h2> <table width='100%' align='center' border='1'> <tr> <td valign='top'> <pre class='small_code'> __code caller(int x) { goto code1(1, x); // should be jmp } </pre> <td> <pre class='small_code'> _caller: ## @factorial .cfi_startproc ## BB#0: ## %entry subq $24, %rsp Ltmp5: .cfi_def_cfa_offset 32 movl $1, %eax movl %edi, 20(%rsp) ## 4-byte Spill movl %eax, %edi movl 20(%rsp), %esi ## 4-byte Reload addq $24, %rsp <font color='red'>jmp</font> _code1 ## TAILCALL .cfi_endproc </pre> </tr> </table> <ul> <li>Code1 should called by jmp instruction. <li>In assembly code, code1 called by jmp instruction. <li>Tail call elimination was forced. <li>If tail call elimination was failed, compiler output error messages. </ul> </div> <div class='slide'> <h2>Usage of CbC : as an instruction description</h2> <p>CbC can be used as a hardware description language (RTL level)</p> <ul> <li>VU (Vector unit) in PS2 <li>SPU in PS3 </ul> </div> <div class='slide'> <h2>Usage of CbC : Parallel Task representation</h2> <p>CbC can be used as a parallel programming language</p> <ul> <li>run on GPU or Many Core <li>a code segment is a kernel in Open CL <li>a code segment execution is atomic <li>during an execution of code, data segments are owned by the code <li>task structure is a meta data segment <li>task manager is a meta code segment </ul> </div> <div class='slide'> <h2>Usage of CbC : OS API description</h2> <ul> <li>detailed description of open/read/write/select <li>we can implement kernel in CbC </ul> </div> <div class='slide'> <h2>Usage of CbC : meta computation</h2> <ul> <li>call meta code segment during goto <li>thread context is a meta data segment <li>it can be seen as a monadic meta computation </ul> </div> <div class='slide'> <h2>Usage of CbC : Model checking</h2> <ul> <li>call meta code segment during goto <li>try all possible non deterministic computation <li>keep track data segment state <li>Do the model checking without modifying the code <li>cf. Java Pathfinder (Model checking by replacing JVM) </ul> </div> <div class='slide'> <h2>Conclusion</h2> <ul> <li>CbC compiler on LLVM and Clang is implemented. <li>LLVM IR is not modified. <li>goto with environment is implemented by setjmp and longjmp. <li>Automatic prototype generating. <li>Various application of CbC. </ul> </div> <div class='slide'> <h2>Future works</h2> <ul> <li>Write operating system in CbC. <ul> <li>Gears OS </ul> <li>Meta computation syntax. <li>More user friendly syntax. <li>Automatic data segment generator. <li>Signature for data segment. <li>Dependent type and implicit parameter </ul> </div> <div class='slide'> <h2>LLVM and Clang's intermediate representations</h2> <table border='1' align='center' width='80%'> <tr><td width='25%'> Name </td><td> Description </td></tr> <tr><td> clang AST </td><td> Abstract Syntax Tree. It is a representation of the structure source codes. </td></tr> <tr><td> LLVM IR </td><td> The main intermediate representation of LLVM. It has three different forms: as an in-memory compiler IR, as an on-disk bitcode representation, and as a human readable assembly language representation. </td></tr> <tr><td> SelectionDAG </td><td> Directed Acyclic Graph. Its nodes indicate what operation the node performs and the operands to the operation. </td></tr> <tr><td> Machine Code </td><td> This representation is designed to support both an SSA representation for machine code, as well as register allocated, non-SSA form. </td></tr> <tr><td> MC Layer </td><td> It is used to represent and process code at the raw machine code level. User can some kinds of file (.s, .o, .ll, a.out) by same API. </td></tr> </table> </div> </div> <!-- presentation --> <div class='slide'> <h2>Execution Result</h2> <ul> <li>Conv1 program. <ul> <li>Repeat calculation program. <li>Stack is defined in the program. </ul> <li>Select execution code by arguments. <ul> <li>1: not optimized. <li>2,3: optimized stack operation. </ul> <li>Inline optimization is omitted. </ul> <table width='80%' align='center' border='1'> <tr> <td width='30%'> <td>Argument 1 <td>Argument 2 <td>Argument 3 </tr> <tr> <td>Micro-C <td>6.875 <td>2.4562 <td>3.105 </tr> <tr> <td>GCC -O2 <td>2.9438 <td>0.955 <td>1.265 </tr> <tr> <td>LLVM and Clang -O0 <td>5.835 <td>4.1887 <td>5.0625 </tr> <tr> <td>LLVM and Clang -O2 <td>3.3875 <td>2.29 <td>2.5087 </tr> </table> <table width='80%' align='center' border='0'> <tr><td align='right'>unit : seconds</tr> </table> <ul> <li>LLVM and Clang compilers are faster than Micro-C when optimize is enabled. <li>CbC gets benefits from LLVM optimizations. <li>LLVM can compile CbC examples. </ul> </div> </body> </html>