Mercurial > hg > Papers > 2015 > kaito-lola
annotate cfopm.tex @ 11:4cff1ef8fbf6
fix
author | Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp> |
---|---|
date | Sat, 04 Jul 2015 17:25:27 +0900 |
parents | 117760bfcae9 |
children | e13520c327f1 |
rev | line source |
---|---|
0 | 1 \documentclass[conference]{IEEEtran} |
2 | |
3 \usepackage[cmex10]{amsmath} | |
4 \usepackage{url} | |
5 \usepackage{listings} | |
6 \usepackage[dvipdfmx]{graphicx} | |
7 | |
8 \lstset{ | |
9 frame=single, | |
10 keepspaces=true, | |
11 stringstyle={\ttfamily}, | |
12 commentstyle={\ttfamily}, | |
13 identifierstyle={\ttfamily}, | |
14 keywordstyle={\ttfamily}, | |
15 basicstyle={\ttfamily}, | |
16 breaklines=true, | |
17 xleftmargin=0zw, | |
18 xrightmargin=0zw, | |
19 framerule=.2pt, | |
20 columns=[l]{fullflexible}, | |
21 numbers=left, | |
22 stepnumber=1, | |
23 numberstyle={\scriptsize}, | |
24 numbersep=1em, | |
25 language=c, | |
26 tabsize=4, | |
27 lineskip=-0.5zw, | |
28 escapechar={@}, | |
29 } | |
30 | |
31 \ifCLASSINFOpdf | |
32 % \usepackage[pdftex]{graphicx} | |
33 % declare the path(s) where your graphic files are | |
34 % \graphicspath{{../pdf/}{../jpeg/}} | |
35 % and their extensions so you won't have to specify these with | |
36 % every instance of \includegraphics | |
37 % \DeclareGraphicsExtensions{.pdf,.jpeg,.png} | |
38 \else | |
39 % or other class option (dvipsone, dvipdf, if not using dvips). graphicx | |
40 % will default to the driver specified in the system graphics.cfg if no | |
41 % driver is specified. | |
42 % \usepackage[dvips]{graphicx} | |
43 % declare the path(s) where your graphic files are | |
44 % \graphicspath{{../eps/}} | |
45 % and their extensions so you won't have to specify these with | |
46 % every instance of \includegraphics | |
47 % \DeclareGraphicsExtensions{.eps} | |
48 \fi | |
49 | |
50 | |
51 % correct bad hyphenation here | |
52 \hyphenation{op-tical net-works semi-conduc-tor} | |
53 | |
54 | |
55 \begin{document} | |
56 % | |
57 % paper title | |
58 % Titles are generally capitalized except for words such as a, an, and, as, | |
59 % at, but, by, for, in, nor, of, on, or, the, to and up, which are usually | |
60 % not capitalized unless they are the first or last word of the title. | |
61 % Linebreaks \\ can be used within to get better formatting as desired. | |
62 % Do not put math or special symbols in the title. | |
63 \title{Implementing Continuation based language in LLVM and Clang} | |
64 | |
65 | |
66 % author names and affiliations | |
67 % use a multiple column layout for up to three different | |
68 % affiliations | |
69 \author{ | |
70 \IEEEauthorblockN{Kaito TOKUMORI} | |
71 \IEEEauthorblockA{University of the Ryukyus \\ Email: kaito@cr.ie.u-ryukyu.ac.jp} | |
72 \and | |
73 \IEEEauthorblockN{Shinji KONO} | |
74 \IEEEauthorblockA{University of the Ryukyus \\ Email: kono@ie.u-ryukyu.ac.jp} | |
75 } | |
76 | |
77 % make the title area | |
78 \maketitle | |
79 | |
80 % As a general rule, do not put math, special symbols or citations | |
81 % in the abstract | |
82 \begin{abstract} | |
7 | 83 The programming paradigm which use data segments and code segments |
84 are proposed. | |
85 This paradigm uses Continuation based C (CbC), | |
86 which a slight modified C language. | |
87 Code segments are units of calculation and | |
88 Data segments are sets of typed data. | |
89 We use these segments as units of computation and meta computation. | |
90 In this paper we show the implementation of CbC on LLVM and Clang 3.7. | |
0 | 91 \end{abstract} |
92 | |
93 % no keywords | |
94 | |
95 | |
96 | |
97 | |
98 % For peer review papers, you can put extra information on the cover | |
99 % page as needed: | |
100 % \ifCLASSOPTIONpeerreview | |
101 % \begin{center} \bfseries EDICS Category: 3-BBND \end{center} | |
102 % \fi | |
103 % | |
104 % For peerreview papers, this IEEEtran command inserts a page break and | |
105 % creates the second title. It will be ignored for other modes. | |
106 \IEEEpeerreviewmaketitle | |
107 | |
108 \section{A Practical Continuation based Language} | |
2 | 109 The proposed units of programming are named code segments and data segments. |
110 Code segments are units of calculation which have no state. | |
111 Data segments are sets of typed data. | |
112 Code segments are connected to data segments by a meta data segment called a context. | |
113 After the execution of a code segment and its context, the next code segment (continuation) is executed. | |
0 | 114 |
3
c50a033e6635
create presentation slide
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
2
diff
changeset
|
115 Continuation based C (CbC) \cite{DBLP:journals/corr/abs-1109-4048}, hereafter referred to as CbC, is a slight modified C which supports code segments. It is compatible with C and has continuation as a goto statement. |
0 | 116 |
2 | 117 Code segments and data segments are low level enough to represent computational details, |
118 and are architecture independent. They can be used as architecture independent assemblers. | |
0 | 119 |
7 | 120 % CbC has stand alone compiler and GCC version. Here we report new partial implementation of CbC compiler based on LLVM and Clang 3.7. |
121 CbC was first implemented on micro-C one path compiler. | |
122 GCC based CbC compiler is developed in 2008\cite{DBLP:journals/corr/abs-1109-4048}. | |
123 GCC is GNU Compiler Collection \cite{gcc}. | |
124 In GCC version, nested function is used to implement goto with environment in 2011\cite{CbC2011}. | |
125 In this study, we report a latest CbC compiler which is implemented | |
126 in LLVM and Clang 3.7. | |
127 | |
128 \verb+C--+ \cite{cminusminus} is also known as a lower level language of C. It has precise type specification and goto statement with parameters. | |
129 CbC introduces \verb+__code++ type for code segment which makes clear separation of functions and code segments. | |
0 | 130 |
131 \section{Continuation based C} | |
2 | 132 CbC's basic programming unit is the code segment. These are not subroutines, but they look like functions because they take input and produce output. Both input and output should be data segments. Table \ref{src:example} details the definition of the data segment. |
0 | 133 |
134 \begin{table}[html] | |
135 \begin{lstlisting} | |
136 __code f(Allocate allocate){ | |
137 allocate.size = 0; | |
138 goto g(allocate); | |
139 } | |
140 | |
141 // data segment definition | |
142 // (generated automatically) | |
143 union Data { | |
144 struct Allocate { | |
145 long size; | |
146 } allocate; | |
147 }; | |
148 \end{lstlisting} | |
149 \caption{CbC Example} | |
150 \label{src:example} | |
151 \end{table} | |
152 | |
7 | 153 In this example, the code segment {\bf f} takes the input data segment {\bf allocate} (allocate is the data segments identifier) and sends f's output to the code segment {\bf g}. The CbC compiler generates the data segment definition automatically, so writing it is unnecessary. There is no return from code segment {\bf g}. {\bf G} should call another continuation using {\bf goto}. Code segments have input data segments and output data segments. Data segments have two kind of dependency with code segments. |
3
c50a033e6635
create presentation slide
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
2
diff
changeset
|
154 First, Code segments access the contents of data segments using field names. So data segments should have the named fields. |
c50a033e6635
create presentation slide
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
2
diff
changeset
|
155 The second dependency is a data dependency, that is all input data segments should be ready before their execution. |
2 | 156 |
0 | 157 \begin{figure}[htp] |
158 \begin{center} | |
159 \scalebox{0.5}{\includegraphics{fig/csds.pdf}} | |
160 \end{center} | |
1 | 161 \caption{Code Segments and Data Segments on CbC} |
0 | 162 \label{fig:csds} |
163 \end{figure} | |
164 | |
2 | 165 Shifting completely, from C to CbC is unnecessary as in CbC we can go to code segments from C functions and call C functions in code segments The latter is straightforward but the former needs further extensions. |
0 | 166 |
167 \begin{table}[html] | |
168 \begin{lstlisting} | |
169 int main() { | |
170 goto hello("Hello World\n", __return, __environment); | |
171 } | |
172 | |
173 __code hello(char *s, __code(*ret)(int, void*), void *env) { | |
174 printf(s); | |
3
c50a033e6635
create presentation slide
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
2
diff
changeset
|
175 goto (*ret)(123); |
0 | 176 } |
177 \end{lstlisting} | |
178 \caption{Call C Functions in a Code Segment} | |
179 \label{src:example} | |
180 \end{table} | |
181 | |
3
c50a033e6635
create presentation slide
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
2
diff
changeset
|
182 In this hello world example, the environment of {\bf main}() and its continuation is kept in the variable {\bf \_\_environment}. The environment and the continuation can be accessed using {\bf \_\_environment} and {\bf \_\_return}.The arbitrary mixing of code segments and functions is allowed. The continuation of a {\bf goto} statement never returns to the original function, but goes to the caller or the original function. In that case, it returns the result 123 to the operating system. This continuation is called {\bf goto with environment}. |
0 | 183 |
184 \section{LLVM and Clang} | |
7 | 185 The LLVM Project is a collection of modular and reusable compilers and tool chain technologies, and the LLVM Core libraries provide a modern source and target independent optimizer, along with code generation support for many popular CPUs. Clang is an LLVM native C/C++/Objective-C compiler. Figure \ref{fig:structure} shows Clang and LLVM's compilation flow. |
0 | 186 |
187 \begin{figure}[htp] | |
188 \begin{center} | |
189 \scalebox{0.25}{\includegraphics{fig/clang_llvm_structure.pdf}} | |
190 \end{center} | |
191 \caption{LLVM and Clang structure} | |
192 \label{fig:structure} | |
193 \end{figure} | |
194 | |
2 | 195 LLVM has an intermediate representation which is called LLVM IR\cite{LLVMIR}. This part remains unmodified so that the optimization part does not need to be modified. |
0 | 196 |
197 \section{Implementation in LLVM and Clang} | |
2 | 198 The CbC compiler is implemented in LLVM and Clang using the following ideas. |
0 | 199 |
200 \begin{itemize} | |
201 \item Code segments are implemented by C functions. | |
202 \item Transition is implemented by forced tail call. | |
203 \item Goto with environment is implemented by setjmp and longjmp. | |
204 \end{itemize} | |
205 | |
2 | 206 {\bf \_\_code} is implemented as a new type keyword in LLVM and Clang. {\bf \_\_code} is similar to an attribute of a function, which means that the function can only be called in tail call elimination. Because of this implementation, code segments can actually be called as C function calls. |
0 | 207 |
2 | 208 Forcing a tail call requires many conditions be met. For example, there should not be a statement after a tail call, the caller and callee's calling conventions must be the same and their types should be cc10, cc11 or fastcc and the callee's return value type has to be the same as the caller's. |
0 | 209 |
2 | 210 All code segments have the void return type and writing statements after continuation is not allowed. As a result, type problems and after statement problems are solved. |
0 | 211 |
2 | 212 Tail call elimination passes are enabled in {\bf BackendUtil.cpp}. In Clang, when the optimization level is two or more, tail call elimination passing is enable. Here it has been modified to be enabled anytime, however if the optimization level is one or less, tail call elimination passes only work for code segments. |
3
c50a033e6635
create presentation slide
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
2
diff
changeset
|
213 A calling convention problem was also solved. fastcc was selected for a code segment's calling convention. In Clang, calling conventions are managed by the CGFunctionInfo class and its information is set in {\bf CGCall.ccp} ( a part of CodeGen ), which is where code segments calling conventions were set to fastcc. |
0 | 214 |
2 | 215 Goto with environment is implemented by code rearranging. If the {\bf \_\_environment} or {\bf \_\_return} is declared, the CbC compiler rearranges the code for goto with environment. Setjmp and longjmp are used to do this. setjmp to save the environment before continuation and longjmp to restore it. |
0 | 216 |
217 \section{Result} | |
2 | 218 Table \ref{src:example} shows the benchmark program. |
0 | 219 |
220 \begin{table}[html] | |
221 \begin{lstlisting} | |
222 int f0(int i) { | |
223 int k,j; | |
224 k = 3+i; | |
225 j = g0(i+3); | |
226 return k+4+j; | |
227 } | |
228 | |
229 int g0(int i) { | |
230 return h0(i+4)+i; | |
231 } | |
232 | |
233 int h0(int i) { | |
234 return i+4; | |
235 } | |
236 \end{lstlisting} | |
7 | 237 \caption{benchmark program in C} |
0 | 238 \label{src:example} |
239 \end{table} | |
240 | |
7 | 241 Fig.\ref{src:example} is a normal C program. We can rewrite this program into CbC in several way, \verb+conv1,conv2,conv3+. |
242 Basicaly function call is emulated by goto statement with explicit stack. \verb+conv2,conv3+ uses extra | |
243 argument to eliminate these stacks. | |
244 The CbC \verb+conv3+ source is shown in fig.\ref{src:cbc} | |
245 Using this benchmark, function call overhead become visible. | |
246 In order to see the overhead, inline function expansion is prohibited. | |
2 | 247 The benchmark results are shown in TABLE \ref{result}. |
0 | 248 |
7 | 249 \begin{table}[html] |
250 \begin{lstlisting} | |
251 | |
252 struct cont_interface { // General Return Continuation | |
253 __code (*ret)(); | |
254 }; | |
255 | |
256 __code f2_1(int i,char *sp) { | |
257 int k,j; | |
258 k = 3+i; | |
259 goto g2_1(k,i+3,sp); | |
260 } | |
261 | |
262 __code g2_1(int k,int i,char *sp) { | |
263 goto h2_11(k,i+4,sp); | |
264 } | |
265 | |
266 __code h2_1_1(int i,int k,int j,char *sp) { | |
267 goto f2_0_1(k,i+j,sp); | |
268 } | |
269 | |
270 __code h2_11(int i,int k,char *sp) { | |
271 goto h2_1_1(i,k,i+4,sp); | |
272 } | |
273 | |
274 __code f2_0_1(int k,int j,char *sp) { | |
275 goto (( (struct cont_interface *)sp)->ret)(k+4+j,sp); | |
276 } | |
277 | |
278 \end{lstlisting} | |
279 \caption{benchmark program conv3 in CbC} | |
280 \label{src:cbc} | |
281 \end{table} | |
282 | |
0 | 283 \begin{table}[htpb] |
284 \centering | |
285 \begin{tabular}{|l|r|r|r|} \hline | |
7 | 286 & conv1 & conv2 & conv3 \\ \hline |
0 | 287 Micro-C & 6.875 & 2.4562 & 3.105 \\ \hline |
288 GCC -O2 & 2.9438 & 0.955 & 1.265 \\ \hline | |
289 LLVM/clang -O0 & 5.835 & 4.1887 & 5.0625 \\ \hline | |
290 LLVM/clang -O2 & 3.3875 & 2.29 & 2.5087 \\ \hline | |
291 \end{tabular} | |
292 \caption{Execution time(s)} | |
293 \label{result} | |
294 \end{table} | |
295 | |
3
c50a033e6635
create presentation slide
Kaito Tokumori <e105711@ie.u-ryukyu.ac.jp>
parents:
2
diff
changeset
|
296 LLVM and Clang compilers are faster than Micro-C when optimization is enabled. This means CbC get benefits from LLVM optimizations. The LLVM and Clang complier is slower than GCC, but GCC cannot compile safely without optimization. This means LLVM can compile more reliably than GCC. |
0 | 297 |
298 \section{Conclusion} | |
2 | 299 This Continuation based language has been designed and implemented for practical use. CbC has been partially implemented using LLVM and Clang 3.7. |
300 CbC can use LLVM's optimization. LLVM IR was not modified to implement CbC's compiler. | |
0 | 301 |
2 | 302 In the future, data segments, meta code segments and meta data segments for meta computation will be designed and implemented. |
0 | 303 \nocite{opac-b1092711, LLVMIR, LLVM, clang} |
304 \bibliographystyle{IEEEtran} | |
305 \bibliography{IEEEabrv,reference} | |
306 | |
307 | |
308 | |
309 % that's all folks | |
310 \end{document} |