Mercurial > hg > Papers > 2012 > aplas

-title: Recursive type syntax in Continuation based C

\newcommand{\rectype}{{\tt \_\_rectype}}
\newcommand{\code}{{\tt \_\_code}}

--author: Shinji Kono, Nobuyasu Oshiro

--abstract:
We have implemented Continuation based C (CbC).
CbC is an extension of C, which has parameterized goto statement.
It is useful for finite state automaton or many core tasks.
Goto statement is a way to force tail call elimination.
The destination of goto statement is called Code Segment, which is actually a normal function of C.
To represent recursive function call, the type system of C is not enough, because
it has no recursive types.
We introduce \rectype keyword for recursive type, and it is implemented in GCC-4.6.0.
We will compare the conventional methods, \rectype keyword and a method using C structure.
Also we show usage of CbC and it's benchmark.


--Continuation based C

CbC's basic programming unit is a code segment. It is not a subroutine, but it
looks like a function, because it has input and output. We can use C struct
as input and output interfaces.

   struct interface1  { int i; };
   struct interface2  { int o; };

   __code f(struct interface1 a) {
       struct interface2 b; b.o=a.i;
       goto g(b);
   }

In this example, a code segment
\verb+f+ has \verb+input a+ and sends \verb+output b+ to a code segment \verb+g+.
There is no return from code segment \verb+b+, \verb+b+ should call another
continuation using \verb+goto+. Any control structure in C is allowed in CwC
language, but in case of CbC, we restrict ourselves to use \verb+if+ statement
only, because it is sufficient to implement C to CbC translation. In this case,
code segment has one input interface and several output interfaces (fig.\ref{code}).

\includegraphics[width=6cm]{
\begin{figure}[htb]
\begin{center}
\includegraphics[width=6cm]{figure/code.pdf}
\caption{code}
\end{center}
\label{code}
\end{figure}


\verb+__code+ and parameterized global goto statement is an extension of
Continuation based C. Unlike \verb+C--+ \cite{cminusminus}'s parameterized goto,
we cannot goto into normal C function.

--Intermix with C

In CwC, we can go to a code segment from a C function and we can call C functions
in a code segment. So we don't have to shift completely from C to CbC. The later
one is straight forward, but the former one needs further extensions.

   void *env;
   __code (*exit)(int);

   __code h(char *s) {
        printf(s);
        goto (*exit)(0),env;
   }

   int main() {
        env  = __environment;
        exit = __return;
        goto h("hello World\n");
   }

In this hello world example, the environment of \verb+main()+
and its continuation is kept in global variables. The environment
and the continuation can be get using \verb+__environment+,
and \verb+__return+. Arbitrary mixture of code segments and functions
are allowed (in CwC). The continuation of \verb+goto+ statement
never returns to original function, but it goes to caller of original
function. In this case, it returns result 0 to the operating system.


--What's good?

CbC is a kind of high level assembler language. It can do several
original C language cannot do. For examples,

{\small
\begin{verbatim}
    Thread Scheduler
    Context Switch
    Synchronization Primitives
    I/O wait semantics

\end{verbatim}
}

are impossible to write in C. Usually it requires some help of
assembler language such as \verb+__asm+ statement extension which is
of course not portable.

--Scheduler example

We can easily write these things in CbC, because
CbC has no hidden information behind the stack frame of C.
A thread simply go to the scheduler,

    goto scheduler(self, task_list);


and the scheduler simply pass the control to the next
thread in the task queue.

    code scheduler(Thread self,TaskPtr list)
    {
        TaskPtr t = list;
        TaskPtr e;
        list = list->next;
        goto list->thread->next(list->thread,list);
    }

Of course it is a simulator, but it is an implementation
also. If we have a CPU resource API, we can write real multi
CPU scheduler in CbC.

This is impossible in C, because we cannot access the hidden
stack which is necessary to switch in the scheduler. In CbC,
everything is visible, so we can switch threads very easily.

This means we can use CbC as an executable  specification
language of OS API.

--Self Verification

Since we can write a scheduler in CbC, we can also enumerate
all possible interleaving of a concurrent program. We have
implement a model checker in CwC. CbC can be a self verifiable
language\cite{kono08a}.

SPIN\cite{holzmann97model} is a very reliable model checker, but it have to
use special specification language PROMELA. We cannot directly
use PROMELA as an implementation language, and it is slightly
difficult to study its concurrent execution semantics including
communication ports.

There are another kind of model checker for real programming
language, such as Java PathFinder\cite{havelund98model}. Java PathFinder use
Java Virtual Machine (JVM) for state space enumeration which
is very expensive some time.

In CbC, state enumerator itself is written in CbC, and its concurrency
semantics is written in CbC itself. Besides it is very close
to the implementation. Actually we can use CbC as an implementation
language. Since enumerator is written in the application itself, we
can perform abstraction or approximation in the application specific
way, which is a little difficult in Java PathFinder. It is possible
to handle JVM API for the purpose, although.

We can use CPS transformed CbC source code for verification, but
we don't have to transform all of the source code, because CwC
supports all C constructs. (But not in C++... Theoretically it is
possible with using cfront converter, it should be difficult).


--As a target language

Now we have GCC implementation of CbC, it runs very fast. Many
popular languages are implemented on top of C. Some of them
uses very large switch statement for the byte code interpreter.
We don't have to use these hacks, when we use CbC as an implementation
language.

CbC is naturally similar to the state charts. It means it is very
close to UML diagrams. Although CbC does not have Object Oriented
feature such as message passing nor inheritance, which is not
crucial in UML.


--Recursive type syntax

CbC's program pass next pointer of code segment on argument.
It is passed as follows.

   __code csA( __code (*p)() ) {
       goto p(csB);
   }

p is next pointer of codesegment.
But, This declarationd is not right.
Because p have arguments.
We wanted to the same type of p's arguments as type of csA's arguments.
Right declaration is as follows.

    __code csA( __code (*p)( __code (*)( __code (*)( __code *)))) {
       goto p(csB);
    }

The syntax of C Must be declared recursively.
The following declaration if it may be that the type checking of p.

    __code csA( __code (*p)( __code )) {
       goto p(csB);
    }

However this declaration is long.
Therefore we implemented \rectype syntax in CbC on GCC.

\rectype syntax is declare a recursive type.
This example is using \rectype syntax.

    __code csA( __rectype *p) {
       goto p(csB);
    }

*p represent pointer of csA at \ref{code:rectype} .
p's argument type is same csA that function pointer.


--Problems with implementation of \rectype.
Segmentation fault has occurred in the following program on compile.

    __code csA(__rectype *p) {
       goto p(3);
    }

The above code is the wrong argument of p.
The p's argument is converted by GCC.
3 of type int is converted to a pionter type of code segment.
At this time, GCC looks at the type of the argument.
p's argument is pointer of csA.
csA's argument is p.
GCC is also an infinite recursion happens to see the type of tye argument of the argument.

We Solve this problem that does not check the arguments if the \rectype flag is true.
The following program become part was fixed gcc/c-family/c-pretty-print.c.

    static void
    pp_c_abstract_declarator (c_pretty_printer *pp, tree t)
    {
      if (TREE_CODE (t) == POINTER_TYPE)
        {
          if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE
          || TREE_CODE (TREE_TYPE (t)) == FUNCTION_TYPE)
        pp_c_right_paren (pp);
    #ifndef noCbC
          if (IS_RECTYPE(t)) return;
    #endif
          t = TREE_TYPE (t);
        }
      pp_direct_abstract_declarator (pp, t);
    }


Variable t have information of p's argument.
If t is POINTER_TYPE, t is assigned type of POINTER_TYPE(\verb+t = TREE_TYPE (t);+).
We have added code \verb+if (IS_RECTYPE(t)) return;+
Thereby we have solved type checking and infinite recursion problem.


--How to implement \rectype
\rectype syntx is implemented overriding AST.
First, \rectype syntax make Tree same \code(\ref{fig:tree1}).
Second, tree was created to be rectype flag.
Thrid, to override AST(\ref{fig:tree2}).

\begin{figure}[htpb]
  \begin{minipage}{0.5\hsize}
  \begin{center}
\scalebox{0.35}{\includegraphics{figure/tree1.pdf}}
  \end{center}
  \caption{AST of function pointer}
  \label{fig:tree1}
  \end{minipage}
  \begin{minipage}{0.2\hsize}
  \begin{center}
\scalebox{0.35}{\includegraphics{figure/tree2.pdf}}
  \end{center}
  \caption{AST of \rectype}
  \label{fig:tree2}
\end{minipage}
\end{figure}

Above AST(\ref{fig:tree2}) is made by syntax of \verb+__code csA(__rectype *p)+ .
TREE_LIST have infomation of argument.
First TREE_LIST represent that argument is function pointer(\verb+__code (*p)()+) .
Second TREE_LIST represent that csA is Fixed-length argument.
First TREE_LIST is connected with POINTER_TYPE.
POINTER_TYPE have pointer of function(FUNCTION_TYPE).
We have to override it in the pointer of csA.


--


--Method other than \rectype

The recursively program of C's syntax can be solved using struct syntax.
For example, if we write

    struct interface {
       __code (*next)(struct interface);
    };

    __code csA(struct interface p) {
        struct interface ds;
	ds.next = csB;
        goto p.next(ds);
    }

    int main() {
        struct interface ds;
	ds = print;
        goto csA(ds);
        return 0;
    }

there is no need to write recursively.
Because the struct syntax wrapped in a function pointer.
Code segment does not receive function pointer in arguments.
Recursively program does not occur.


\section{Comparision}

Here is our bench mark program.
{\small
    f0(int i) {
        int k,j;
        k = 3+i;
        j = g0(i+3);
        return k+4+j;
    }

    g0(int i) {
        return h0(i+4)+i;
    }

    h0(int i) {
        return i+4;
    }
}

It is written in C, we perform CPS transformation in several
steps by hands. There are several optimization is possible.

{\small
    /* straight conversion case (1) */

    typedef char *stack;

    struct cont_interface {
      // General Return Continuation
        __code (*ret)();
    };

    __code f(int i,stack sp) {
        int k,j;
        k = 3+i;
        goto f_g0(i,k,sp);
    }

    struct f_g0_interface {
       // Specialized Return Continuation
        __code (*ret)();
        int i_,k_,j_;
    };

    __code f_g1(int j,stack sp);

    __code f_g0(int i,int k,stack sp) { // Caller
        struct f_g0_interface *c =
            (struct f_g0_interface *)(
         sp -= sizeof(struct f_g0_interface));

        c->ret = f_g1;
        c->k_ = k;
        c->i_ = i;

        goto g(i+3,sp);
    }

    __code f_g1(int j,stack sp) {  // Continuation
        struct f_g0_interface *c =
          (struct f_g0_interface *)sp;
        int k = c->k_;
        sp+=sizeof(struct f_g0_interface);
        c = (struct f_g0_interface *)sp;
        goto (c->ret)(k+4+j,sp);
    }

    __code g_h1(int j,stack sp);

    __code g(int i,stack sp) { // Caller
        struct f_g0_interface *c =
            (struct f_g0_interface *)(
           sp -= sizeof(struct f_g0_interface));

        c->ret = g_h1;
        c->i_ = i;

        goto h(i+3,sp);
    }

    __code g_h1(int j,stack sp) {
        // Continuation
        struct f_g0_interface *c =
          (struct f_g0_interface *)sp;
        int i = c->i_;
        sp+=sizeof(struct f_g0_interface);
        c = (struct f_g0_interface *)sp;
        goto (c->ret)(j+i,sp);
    }

    __code h(int i,stack sp) {
        struct f_g0_interface *c =
          (struct f_g0_interface *)sp;
        goto (c->ret)(i+4,sp);
    }

    struct main_continuation {
        // General Return Continuation
        __code (*ret)();
        __code (*main_ret)();
        void *env;
    };

    __code main_return(int i,stack sp) {
        if (loop-->0)
            goto f(233,sp);
        printf("#0103:%d\n",i);
        goto (( (struct main_continuation *)sp)->main_ret)(0),
               ((struct main_continuation *)sp)->env;
    }
}

This is awfully long, but it is straight forward. Several forward
prototyping is necessary, and we find strict prototyping is
painful in CbC, because we have to use many code segments to
perform simple thing.  CbC is not a language for human, but for
automatic generation, verification or IDE directed programming.

We can shorten the result in this way.
{\small
    /* little optimized case (3) */

    __code f2_1(int i,char *sp) {
        int k,j;
        k = 3+i;
        goto g2_1(k,i+3,sp);
    }

    __code g2_1(int k,int i,char *sp) {
        goto h2_11(k,i+4,sp);
    }

    __code f2_0_1(int k,int j,char *sp);
    __code h2_1_1(int i,int k,int j,char *sp) {
        goto f2_0_1(k,i+j,sp);
    }

    __code h2_11(int i,int k,char *sp) {
        goto h2_1_1(i,k,i+4,sp);
    }

    __code f2_0_1(int k,int j,char *sp) {
        goto (( (struct cont_interface *)
           sp)->ret)(k+4+j,sp);
    }

    __code main_return2_1(int i,stack sp) {
        if (loop-->0)
            goto f2_1(233,sp);
        printf("#0165:%d\n",i);
        goto (( (struct main_continuation *)sp)->main_ret)(0),
               ((struct main_continuation *)sp)->env;
    }
}

In this example, CPS transformed source is faster than
original function call form. There are not so much area for the
optimization in function call form, because function call API
have to be strict.
CbC does not need standard call API other than interface which
is simply a struct and there are no need for register save. (This
bench mark is designed to require the register save).

Here is the result in x86_64 architecture (Table.\ref{tab:gcc,compare}).
is function call. \verb+conv1 2+, \verb+conv1 3+ is optimized CPS transformed
source.

\begin{table}[htpb]
\centering
\small
\begin{tabular}{|l|r|r|r|} \hline
(unit: s) & ./conv1 1 & ./conv1 2 &  ./conv1 3 \\ \hline
GCC -O3(32bit)       & 2.52 & 2.34 & 1.53 \\ \hline
GCC -O3(64bit)       & 1.80 & 1.20 & 1.44 \\ \hline
\end{tabular}
\caption{Micro-C, GCC bench mark (in sec)}
\label{tab:gcc,compare}
\end{table}


--Conclusion
author	Nobuyasu Oshiro <dimolto@cr.ie.u-ryukyu.ac.jp>
date	Fri, 15 Jun 2012 17:33:28 +0900
parents	f1839aae06dc
children	17ce985c1cb3