<!-- author: Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>, date: Wed, 01 Jul 2015 19:06:07 +0900 -->

<!DOCTYPE html>
<html>
  <head>
    <meta charset='utf-8'>
    <title>Presen</title>
    <!-- style sheet links -->
    <link rel="stylesheet/less" href="themes/blank/projection.css.less"  media="screen,projection">
    <link rel="stylesheet/less" href="themes/blank/screen.css.less"      media="screen">
    <link rel="stylesheet/less" href="themes/blank/print.css.less"       media="print">

    <link rel="stylesheet/less" href="blank.css.less"    media="screen,projection">

    <!-- add js libs (less, jquery) -->
    <script src="js/less-1.1.4.min.js"></script>
    <script src="js/jquery-1.7.min.js"></script>

    <!-- S6 JS -->
    <script src="js/jquery.slideshow.js"></script>
    <script src="js/jquery.slideshow.counter.js"></script>
    <script src="js/jquery.slideshow.controls.js"></script>
    <script src="js/jquery.slideshow.footer.js"></script>
    <script src="js/jquery.slideshow.autoplay.js"></script>
    <script>
      $(document).ready( function() {
      Slideshow.init();
      
      // Example 2: Start Off in Outline Mode
      // Slideshow.init( { mode: 'outline' } );
      
      // Example 3: Use Custom Transition
      // Slideshow.transition = transitionScrollUp;
      // Slideshow.init();

      // Example 4: Start Off in Autoplay Mode with Custom Transition
      // Slideshow.transition = transitionScrollUp;
      // Slideshow.init( { mode: 'autoplay' } );
      } );
    </script>
  </head>
  <body>

    <div class="layout">
      <div id="header"></div>
      <div id="footer">
        <div align="right">
          <img src="images/concurrency.png" width="200">
        </div>
      </div>
    </div>

    <div class="presentation">

      <div class='slide cover'>
        <table width="90%" height="90%" border="0" align="center">
          <tr>
            <td><div align="center">
                <h1><font color="#808db5">Implementing a Continuation-based Language in Clang and LLVM</font></h1>
            </div></td>
          </tr>
          <tr>
            <td><div align="left">
                Kaito Tokumori, Shinji Kono
                <script>
                  document.write("<br>July 4, 2015");
                </script>
                <hr style="color:#ffcc00;background-color:#ffcc00;text-align:left;border:none;width:300%;height:0.2em;">
            </div></td>
          </tr>
        </table>
      </div>
      
      <div class='slide'>
        <h2>Objective</h2>
        <ul>
          <li>Reliable computation
          <li>Concurrent execution
          <li>Reliable improvement
          <li>Reusability
        </ul>
        <h3>Introducing new units of programming</h3>
      </div>


      <div class='slide'>
        <h2>Traditional units of programming</h2>
        <ul>
          <li>Machine instruction
          <li>Statements of programming language
          <li>Function call / Method
          <li>Module / Class / Interface
          <li>Thread / Process
          <li>Object
          <li>Record / Table
        </ul>
      </div>

      <div class='slide'>
        <h2>What do we want to do with programming units?</h2>
        <ul>
          <li>Divide large functions into small parts.
          <li>Add hidden arguments without code modification.
          <li>Add meta computation.
          <li>Extract concurrency from programming units.
        </ul>
        <h3>This is not easy with the traditional units.</h3>
      </div>

      <div class='slide'>
        <h2>New programming units</h2>
        <ul>
          <li>Units of programming: code segments, data segments.
          <li>Code segments are units of calculation.
          <li>Data segments are sets of typed data.
        </ul>
      </div>

      <div class='slide'>
        <h2>Code segments</h2>
        <ul>
          <li>Function from input data segments to output data segments.
          <li>Code segments have no states.
          <li>Access typed data in the data segments by name.
          <li>Specify the code segment to be executed next using goto.
        </ul>
        <h3>It is easy to divide or combine.</h3>
      </div>

      <div class='slide'>
        <h2>Data segments</h2>
        <ul>
          <li>Set of typed data.
          <li>Type signatures are in meta data segments.
          <li>Variable and extendable data structure.
          <li>Data segments are dominated by connected code segments.
          <li>Code segments atomically access connected data segments.
        </ul>
        <h3>It is easy to divide or combine.</h3>
      </div>

      <div class='slide'>
        <h2>Meta code / data segments</h2>
        <ul>
          <li>Execution contexts: Thread
          <li>Type signatures of data segments.
          <li>Data segment linkages: Pointer
          <li>Machine code
        </ul>
        <h3>Meta code segments are executed right after the goto.</h3>
        <h3>Meta data segments are kinds of process data.</h3>
      </div>

      <div class='slide'>
        <h2>Continuation based C (CbC)</h2>
        <ul>
          <li>An implementation of code segments.
          <li>CbC stands for Continuation based C.
          <li>The basic syntax is the same as C.
          <li>Code segments are sets of C statements with goto.
          <li>Data segments are implemented as C structures.
        </ul>
      </div>

      <div class='slide'>
        <h2>Continuation based C (CbC)</h2>
        <ul>
          <li>CbC uses goto for code segment transitions.
            <ul>
              <li>Code segments don't keep state.
            </ul>
          <li>CbC is compatible with C.
          <li>The feature for returning to C functions from code segments is called continuation with environment.
          <li>CbC can use C function calls, but you can replace them with CbC goto by rewriting loop syntax as recursive continuation.
          <li>A continuation uses a jmp instruction instead of a call instruction.
          <li>CbC uses data segments for typed data structures.
            <ul>
              <li>They have type information for meta computing.
              <li>You can use normal arguments too.
            </ul>
        </ul>
      </div>
      
      <div class='slide'>
        <h2>CbC sample (with normal arguments)</h2>
        <table border='1' align='center' width='80%'>
          <tr><td width='50%'>
              <pre class='small_code'>
<div class="highlight"><font color='red'>__code</font> code1(int n,__code(*exit_code)(int,void *),void *exit_env){
  printf("code1 : code entry1\n");
  <font color='red'>goto exit_code(n,exit_env);</font>
}</div>

int caller(){
  printf("caller : main1 entry\n");
  __code (*__ret)(int, void *) = __return;
  struct __CbC_env *__env = __environment;
  goto code1(1, __ret, __env);
  return 0;
}

int main(){
  int n;
  n = caller();
  printf("return = %d\n",n);
  return 0;
}      </pre>
            </td><td valign='top'>
              <ul>
                <li>We can write code segments like C functions.
                <li>A CbC transition is a goto, so a code segment does not return to the previous one.
                <li>Code segments have no return values.
              </ul>
          </td></tr>
        </table>
      </div>

<!--
      <div class='slide'>
        <h2>CbC sample (continuation with environments)</h2>
        <table border='1' align='center' width='80%'>
          <tr><td width='50%'>
              <pre class='small_code'>
__code code1(int n,__code(*exit_code)(int,void *),void *exit_env){
  printf("code1 : code entry1\n");
  goto exit_code(n,exit_env);
}

<div class="highlight">int caller(){
  printf("caller : main1 entry\n");
  __code (*__ret)(int, void *) = <font color='red'>__return</font>;
  struct __CbC_env *__env = <font color='red'>__environment</font>;
  goto code1(1, __ret, __env);
  return 0;
}</div>

int main(){
  int n;
  n = caller();
  printf("return = %d\n",n);
  return 0;
}      </pre>
            </td><td valign='top'>
              <ul>
                <li>The feature for return to C functions from code segments.
                <li>__return is a code segment pointer for return C functions.
                <li>__environment is a envitonment for return C functions.
                <li>Code1 use a continuation with environments to return main function.
          </td></tr>
        </table>
      </div>
-->

<div class='slide'>
  <h2>CbC sample (with data segments)</h2>
  <table border='1' align='center' width='80%'>
    <tr><td width='50%'>
        <pre class='small_code'>
__code code1(Data1 data){
  goto code2(data);
}

__code code2(Data2 data){
  goto code3(data);
}

int main(){
  goto start_code(context, Code1);
}      </pre>
            </td><td valign='top'>
              <ul>
                <li>
                <li>
                <li>
              </ul>
          </td></tr>
        </table>
      </div>

      <div class='slide'>
        <h2>CbC compilers</h2>
        <ul>
          <li>Micro-C (one-pass standalone compiler)
          <li>GCC (GNU Compiler Collection)
          <li>LLVM and Clang
            <ul>
              <li>The latest one!
            </ul>
        </ul>
      </div>

      <div class='slide'>
        <h2>What are LLVM and Clang?</h2>
        <ul>
          <li>LLVM is a compiler framework.
          <li>LLVM has an intermediate language called LLVM IR, also known as the LLVM language or LLVM bitcode.
          <li>LLVM translates LLVM IR to assembly language.
          <li>LLVM has many kinds of optimizations.
          <li>Clang is a compiler frontend for C, C++ and Objective-C.
          <li>Clang uses LLVM as its compiler backend.
        </ul>
      </div>

      <div class='slide'>
        <h2>Why LLVM?</h2>
        <ul>
          <li>Supported by Apple.
          <li>The default compiler on OS X.
          <li>LLVM IR's documentation is useful and readable.
          <li>LLVM and Clang have readable documentation of their source code.
        </ul>
      </div>

      <div class='slide'>
        <h2>LLVM and Clang's compilation flow</h2>
        <ul>
          <li>Source code is translated into a Clang AST by the parser.
          <li>The Clang AST is translated into LLVM IR by the code generator.
          <li>LLVM IR is translated into machine code by SelectionDAGISel.
          <li>The machine code is optimized and then translated into assembly code.
        </ul>
        <div align="center"><img src="fig/clang_llvm_structure.svg" width="45%"></div>
      </div>

      <div class='slide'>
        <h2>LLVM and Clang's intermediate representations</h2>
        <table border='1' align='center' width='80%'>
          <tr><td width='25%'>
              Name
            </td><td>
              Description
          </td></tr>
          <tr><td>
              clang AST
            </td><td>
              Abstract Syntax Tree. It represents the structure of the source code.
          </td></tr>
          <tr><td>
              LLVM IR
            </td><td>
              The main intermediate representation of LLVM. It has three different forms: an in-memory compiler IR, an on-disk bitcode representation, and a human-readable assembly language representation.
          </td></tr>
          <tr><td>
              SelectionDAG
            </td><td>
              Directed Acyclic Graph. Each node indicates the operation it performs and the operands of that operation.
          </td></tr>
          <tr><td>
              Machine Code
            </td><td>
              This representation is designed to support both an SSA representation of machine code and a register-allocated, non-SSA form.
          </td></tr>
          <tr><td>
              MC Layer
            </td><td>
              It is used to represent and process code at the raw machine code level. Users can handle several kinds of files (.s, .o, .ll, a.out) through the same API.
          </td></tr>
        </table>
        <br>
        <p align='center' class='step emphasize'>LLVM's intermediate representations are not modified.</p>
      </div>

      <div class='slide'>
        <h2>Abstract Syntax Tree</h2>
        <ul>
          <li>You can view it by passing the '-Xclang -ast-dump' options to clang.
          <li>Each node is a Decl (declaration), Stmt (statement) or Expr (expression).
        </ul>
        <table border='1'>
          <tr>
            <td>source code
            <td>AST
          </tr>
          <tr>
            <td valign='top' width='20%'>
              <pre class='small_code'>
__code code1(int n,__code(*exit_code)(int,void *),void *exit_env){
  printf("code1 : code entry1\n");
  goto exit_code(n,exit_env);
}
</pre>
            <td><img src="fig/clangAST_char.svg" width="100%">
          </tr>
        </table>
        <p>We modified Clang so that it generates this AST when it encounters CbC syntax.</p>
      </div>

      <div class='slide'>
        <h2>Implementation problems</h2>
        <ul>
          <li>How to implement code segments and data segments?
          <li>How to implement jmp instruction based transition?
          <li>How to implement goto with environment syntax?
        </ul>
      </div>

      <div class='slide'>
        <h2>Basic implementation strategy</h2>
        <ul>
          <li>Code segments are implemented by C functions.
          <li>Data segments are implemented by C structs.
          <li>Transition is implemented by forced tail call elimination.
          <li>Goto with environment is implemented by setjmp and longjmp.
        </ul>
      </div>

      <div class='slide'>
        <h2>Implementing the CbC compiler in LLVM and Clang</h2>
        <h3>Implemented</h3>
        <ul>
          <li>Add the __code type for code segments.
          <li>Translate Clang's __code type into LLVM's __code type.
          <li>Add goto syntax for transitions.
          <li>Force tail call elimination.
          <li>Goto with environment.
          <li>Automatic prototype declaration generation.
          <!--<li>connect code segments with meta code segments -->
        </ul>
        <h3>Implementing now</h3>
        <ul>
          <li>Connect code segments with meta code segments.
          <li>Generate data segments automatically.
          <li>Syntax for accessing data segment types.
       </ul>
      </div>

      <div class='slide'>
        <h2>__code type</h2>
        <p>Modify the parser.</p>
        <div align='center'><img src="fig/clang_llvm_slide_parse.svg" width="70%"></div>
      </div>

      <div class='slide'>
        <h2>__code type</h2>
        <table width='100%'>
          <tr><td>
              <ul>
                <li>Clang and LLVM handle code segments as __code type functions.
                <li>Code segments have no return value, so they are handled like void functions.
                <li>Clang and LLVM use different classes to handle types, so we have to modify both.
                <li>In Clang, we create a type keyword, an ID, and a Type class.
                <li>In LLVM, we create a Type class and an ID.
                <li>The following code is where Clang parses a __code type.
                <li>DS.SetTypeSpecType() sets the __code type for the AST nodes.
              </ul>
          </tr>
          <tr>
            <td style="border: double;">
              <pre class='code'>
  case tok::kw___code: {
    LangOptions* LOP;
    LOP = const_cast&lt;LangOptions*&gt;(&amp;getLangOpts());
    LOP->HasCodeSegment = 1;
    isInvalid = <font color='red'>DS.SetTypeSpecType(DeclSpec::TST___code, Loc, PrevSpec, DiagID);</font>
    break;
  }</pre>
          </tr>
        </table>
      </div>

      <div class='slide'>
        <h2>Translating Clang's __code type into LLVM's __code type</h2>
        <p>Clang's types are translated in CodeGen.</p>
        <div align='center'><img src="fig/clang_llvm_slide_cg.svg" width="70%"></div>
      </div>

      <div class='slide'>
        <h2>Translating Clang's __code type into LLVM's __code type</h2>
        <table width='100%'>
          <tr><td>
              <ul>
                <li>The following code is where the translation takes place.
                <li>Code segments have no return value, so the __code type is handled like the void type.
              </ul>
          </tr>
          <tr>
            <td style="border: double;">
              <pre class='code'>
case ABIArgInfo::Ignore:
#ifndef noCbC
  if (FI.getReturnType().getTypePtr()->is__CodeType())
    resultType = llvm::Type::get__CodeTy(getLLVMContext());
  else
    resultType = llvm::Type::getVoidTy(getLLVMContext());
#else
  resultType = llvm::Type::getVoidTy(getLLVMContext());
#endif
  break;</pre>
          </tr>
        </table>
      </div>

      <div class='slide'>
        <h2>goto syntax for transition</h2>
        <p>Modify the parser.</p>
        <div align='center'><img src="fig/clang_llvm_slide_parse.svg" width="70%"></div>
      </div>

      <div class='slide'>
        <h2>goto syntax for transition</h2>
        <table width='100%'>
          <tr><td>
              <ul>
                <li>Add new goto syntax for transitions.
                <li>Jmp-instruction-based transition is enabled by tail call elimination.
                <li>In this part, Clang creates the AST for a normal function call; tail call elimination is forced later.
                <li>The following code is where a CbC goto is parsed.
                <li>If the goto does not match C syntax, we treat it as CbC syntax.
              </ul>
          </tr>
          <tr>
            <td style="border: double;">
              <pre class='code'>
case tok::kw_goto:
#ifndef noCbC
  if (!(NextToken().is(tok::identifier) && PP.LookAhead(1).is(tok::semi)) &&
    NextToken().isNot(tok::star)) {
      SemiError = "goto code segment";
      return ParseCbCGotoStatement(Attrs, Stmts);
    }
#endif
  Res = ParseGotoStatement();
  SemiError = "goto";
  break;</pre>
          </tr>
        </table>
      </div>

      <div class='slide'>
        <h2>goto syntax for transition</h2>
        <ul>
          <li>Add a return statement after the goto transition.
          <li>This is one of the requirements for forcing tail call elimination.
        </ul>
        <table border='1' width='80%' align='center'>
          <tr>
            <td>original input code
            <td>code generated by Clang
          </tr>
          <tr>
            <td><pre class='small_code'>
__code code1() {
     :
  goto code2();
}
              </pre>
            <td><pre class='small_code'>
void code1() {
     :
  code2();
  <font color='red'>return;</font>
}
              </pre>
          </tr>
        </table>
      </div>

      <div class='slide'>
        <h2>Jmp instruction based transition</h2>
        <ul>
          <li>It is implemented by Tail Call Elimination (TCE).
          <li>TCE is an optimization.
          <li>A function call that is immediately followed by return is a tail call.
          <li>TCE replaces the call instruction of a tail call with a jmp instruction.
          <li>Code segment transitions are implemented by forced tail call elimination.
        </ul>
        <div align='center'><img src="fig/TCE.svg" width="40%"></div>
      </div>
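
      <div class='slide'>
        <h2>Tail call sketch in plain C</h2>
        <p>The tail call definition can be illustrated with a minimal sketch in plain C (not CbC). The names sum_to and sum_to_not_tail are hypothetical and do not come from the CbC sources. A C compiler may or may not turn the first call into a jump, depending on its optimization settings; the forced elimination described in these slides is what makes it reliable for code segments.</p>
<pre class='small_code'>
```c
#include <assert.h>

/* Adds 1 + 2 + ... + n to acc.
 * The recursive call is the last action and its result is returned
 * unchanged, so it is a tail call that TCE can replace with a jump. */
long sum_to(long n, long acc) {
    assert(n >= 0);                  /* negative n would never reach 0 */
    if (n == 0) {
        return acc;
    }
    return sum_to(n - 1, acc + n);   /* tail call: nothing runs after it */
}

/* Same result, but the addition happens after the call returns,
 * so this call is not a tail call and needs a real call instruction. */
long sum_to_not_tail(long n) {
    assert(n >= 0);
    if (n == 0) {
        return 0;
    }
    return n + sum_to_not_tail(n - 1);
}
```
</pre>
        <p>sum_to(10, 0) and sum_to_not_tail(10) both give 55; only the first has the call-then-return shape that TCE rewrites as a jmp.</p>
      </div>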

      <div class='slide'>
        <h2>Forcing Tail Call Elimination</h2>
        <p>TCE is enabled at CodeGen.</p>
        <p>TCE is performed at SelectionDAGISel.</p>
        <div align='center'><img src="fig/clang_llvm_slide_cg_DAG.svg" width="70%"></div>
      </div>

      <div class='slide'>
        <h2>Forcing Tail Call Elimination</h2>
        <p>LLVM IR has function call flags, which pass information about a call to LLVM. We use them to force tail call elimination.</p>
        <p>The following requirements have to be met.</p>
        <ul>
          <li>The tail flag is set on the code segment call.
          <li>tailcallopt is enabled.
          <li>The caller and callee use the same calling convention, which should be cc10, cc11 or fastcc.
          <li>The callee's return type is the same as the caller's.
        </ul>
        <br>
        <p>We meet them in the following ways.</p>
        <ul>
          <li>Always add the tail call elimination pass and set the tail flag on code segment calls.
          <li>If the input code contains a code segment, tailcallopt is enabled automatically.
          <li>fastcc is used consistently for code segment calls.
          <li>The return type of every code segment is void.
        </ul>
      </div>

      <div class='slide'>
        <h2>Goto with environment</h2>
        <p>Goto with environment is enabled by modifying parser.</p>
        <div align='center'><img src="fig/clang_llvm_slide_parse.svg" width="70%"></div>
      </div>

      <div class='slide'>
        <h2>What is a Goto with environment?</h2>
        <ul>
          <li>Code segments do not have an environment, but functions do.
          <li>Usually, code segments can't return to functions.
          <li>Goto with environment makes this possible.
          <li>In GCC, it is implemented with nested functions.
          <li>In LLVM and Clang, it is implemented with setjmp and longjmp.
        </ul>
      </div>

      <div class='slide'>
        <h2>Sample code of Goto with environment</h2>
        <table width='100%'>
          <tr><td valign='top'>
              <ul>
                <li>Use new keywords __return and __environment.
                <li>__return is a code segment pointer for returning to C functions.
                <li>__environment is an environment for returning to C functions.
                <li>code1 uses a continuation with environment to return to the main function.
              </ul>
            <td style="border: double;">
              <pre class='small_code'><div class='highlight'>__code code1(int n,__code(*exit_code)(int,void *),void *exit_env){
  printf("code1 : code entry1\n");
  goto exit_code(n,exit_env);
}

int caller(){
  printf("caller : main1 entry\n");
  __code (*__ret)(int, void *) = <font color='red'>__return</font>;
  struct __CbC_env *__env = <font color='red'>__environment</font>;
  goto code1(1, __ret, __env);
  return 0;
}

int main(){
  int n;
  n = caller();
  printf("return = %d\n",n);
  return 0;
}      </div></pre>
          </tr>
        </table>
      </div>

      <div class='slide'>
        <h2>Implementing goto with environment</h2>
        <ul>
          <li>Always include setjmp.h.
          <li>Generate a C struct for saving the environment.
            <ul>
              <li>This struct is __environment.
            </ul>
          <li>Insert setjmp into the C function.
          <li>Generate a longjmp code segment that acts as the return.
            <ul>
              <li>This code segment is pointed to by __return.
            </ul>
        </ul>
      </div>
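
      <div class='slide'>
        <h2>setjmp / longjmp sketch in plain C</h2>
        <p>The steps for goto with environment can be sketched in plain C. This is an illustration under assumptions, not the code Clang generates: struct env, return_segment and the layout of the saved value are hypothetical stand-ins for __environment and __return, and code1 is an ordinary C function here rather than a code segment.</p>
<pre class='small_code'>
```c
#include <setjmp.h>

/* Hypothetical stand-in for the generated environment struct. */
struct env {
    jmp_buf      buf;   /* saved execution context of the C function */
    volatile int ret;   /* value to return; volatile because it changes
                           between setjmp and longjmp */
};

/* Hypothetical stand-in for the code segment that __return points to:
 * it stores the value and jumps back into the saved context. */
static void return_segment(int n, void *exit_env) {
    struct env *e = exit_env;
    e->ret = n;
    longjmp(e->buf, 1);
}

/* Plays the role of code1: it hands control to exit_code and
 * never returns to its caller in the normal way. */
static void code1(int n, void (*exit_code)(int, void *), void *exit_env) {
    exit_code(n, exit_env);
}

/* Plays the role of caller(): setjmp marks the point to come back to. */
int caller(void) {
    struct env e;
    if (setjmp(e.buf) != 0) {
        return e.ret;               /* reached through longjmp */
    }
    code1(1, return_segment, &e);
    return 0;                       /* not reached */
}
```
</pre>
        <p>In this sketch caller() returns 1, the value passed through code1, rather than the 0 in its own return statement.</p>
      </div>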

      <div class='slide'>
        <h2>Prototype declaration generation</h2>
        <p>Modify the parser.</p>
        <div align='center'><img src="fig/clang_llvm_slide_parse.svg" width="70%"></div>
      </div>

      <div class='slide'>
        <h2>Prototype declaration generation</h2>
        <ul>
          <li>In CbC, programmers write a lot of code segments.
          <li>When a function pointer's arguments are omitted, TCE sometimes fails.
          <li>Automatic prototype declaration generation saves a lot of effort.
          <li>When the parser meets a code segment call, it suspends the current parsing and searches for the declaration of the called code segment.
          <li>If the declaration is not found, it searches for the definition and generates a declaration from it.
            <ul>
              <li>Of course, you can also write the declaration yourself.
            </ul>
<!--          <li>This feature is important to code segment transition.-->
        </ul>
        <table border='1' width='80%' align='center'>
          <tr>
            <td>original input code
            <td>code generated by Clang
          </tr>
          <tr>
            <td><pre class='small_code'>
__code code1(int a, int b) {
     :
  goto code2(a,b);
}

__code code2(int a, int b){
     :
}
              </pre>
            <td><pre class='small_code'>
<font color='red'>__code code2(int a, int b);</font>
__code code1(int a, int b) {
     :
  goto code2(a,b);
}

__code code2(int a, int b){
     :
}
              </pre>
          </tr>
        </table>
      </div>
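
      <div class='slide'>
        <h2>Prototype sketch in plain C</h2>
        <p>The effect of the generated declaration can be seen in a minimal plain C sketch. The prototype is written by hand here, whereas the slides describe the CbC compiler producing the equivalent line for code segments. As an assumption made only so the result can be observed, code1 and code2 are ordinary C functions that return int instead of __code code segments.</p>
<pre class='small_code'>
```c
/* Hand-written prototype; it gives the compiler the full argument
 * types of code2 before the call in code1 is parsed. */
int code2(int a, int b);

int code1(int a, int b) {
    return code2(a, b);   /* call appears before the definition of code2 */
}

int code2(int a, int b) {
    return a + b;
}
```
</pre>
        <p>Without the first line, a C compiler sees the call to code2 before it knows the argument types; the generated prototype removes the need to write that line by hand.</p>
      </div>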

      <div class='slide'>
        <h2>Connect code segments with meta code segments</h2>
        <ul>
          <li>Every code segment transitions to the next one via a meta code segment.
<!--          <li>Meta code segments calculate meta computation like a memory allocation, exception, scheduling.-->
          <li>Normal-level code segments do not need to know about meta code segments.
          <li>When a code segment transitions to the next code segment, the compiler connects it with a meta code segment.
          <li>Meta code segments use a context that holds code segment pointers and names.
          <li>The compiler adds the context to the arguments.
          <li>You can omit meta computation if you do not need it.
            <ul>
              <li>In this case, code segments transition to the next one via the default meta code segment.
              <li>The default meta code segment gets the next code segment from the context and transitions to it.
            </ul>
        </ul>
        <h3>code segments view</h3>
        <div align='center'><img src="fig/cs_meta_csview.svg" width="35%"></div>
        <h3>actual code segments transition</h3>
        <div align='center'><img src="fig/cs_metacs.svg" width="35%"></div>
      </div>

      <div class='slide'>
        <h2>Connect code segments with meta code segments</h2>
        <table border='1' width='80%' align='center'>
          <tr>
            <td>original input code
            <td>code generated by Clang
          </tr>
          <tr>
            <td><pre class='small_code'>
__code code1() {
  goto code2();
}

__code code2(){
  goto code3();
}
              </pre>
            <td><pre class='small_code'>
__code code1(struct Context* context) {
  <font color='red'>goto meta(context,Code2);</font>
}

__code code2(struct Context* context){
  <font color='red'>goto meta(context,Code3);</font>
}

__code meta(struct Context* context, enum Code next) {
  goto (context->code[next])(context);
}
              </pre>
          </tr>
        </table>
      </div>
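
      <div class='slide'>
        <h2>Meta dispatch sketch in plain C</h2>
        <p>The generated code can be approximated in plain C to show how the context table drives the transitions. This is a sketch under assumptions: ordinary function calls stand in for CbC goto (so the stack grows, unlike real code segments), and the trace array, the record helper and run_chain are hypothetical additions that exist only to make the order of execution observable.</p>
<pre class='small_code'>
```c
#include <assert.h>

enum Code { Code1, Code2, Code3, CodeCount };

struct Context;
typedef void (*CodeSegment)(struct Context *);

/* Hypothetical context: a table of code segments plus a trace. */
struct Context {
    CodeSegment code[CodeCount];
    int trace[8];
    int steps;
};

/* Default meta code segment: look up the next code segment and move to it. */
static void meta(struct Context *context, enum Code next) {
    assert(next < CodeCount);
    context->code[next](context);
}

/* Helper for this sketch only: remember which code segment ran. */
static void record(struct Context *context, int id) {
    assert(context->steps < 8);
    context->trace[context->steps++] = id;
}

static void code3(struct Context *context) { record(context, 3); }  /* end of chain */
static void code2(struct Context *context) { record(context, 2); meta(context, Code3); }
static void code1(struct Context *context) { record(context, 1); meta(context, Code2); }

/* Builds the context and starts the chain at code1 through meta. */
struct Context run_chain(void) {
    struct Context context = { { code1, code2, code3 }, { 0 }, 0 };
    meta(&context, Code1);
    return context;
}
```
</pre>
        <p>After run_chain() the trace holds 1, 2, 3: each code segment names only the next entry in the table, and meta performs the actual transfer.</p>
      </div>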

    </div> <!-- presentation -->
  </body>
</html>