changeset 32:ebcf093795f3

Add twice examples
author Tatsuki IHA <innparusu@cr.ie.u-ryukyu.ac.jp>
date Sat, 03 Feb 2018 20:56:35 +0900
parents 8793903e4a0d
children 6da0158a17a1
files paper/evaluation.tex paper/fig/bitonicNetwork.pdf paper/fig/bitonicNetwork.xbb paper/fig/twice.pdf paper/fig/twice.svg paper/fig/twice.xbb paper/gpu.tex paper/master_paper.pdf paper/src/cudaTwice.cu
diffstat 9 files changed, 315 insertions(+), 0 deletions(-)
--- a/paper/evaluation.tex	Fri Feb 02 04:14:20 2018 +0900
+++ b/paper/evaluation.tex	Sat Feb 03 20:56:35 2018 +0900
@@ -1,3 +1,99 @@
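 \chapter{Evaluation of Gears OS}
+
+\section{Experimental Environment}
+We measure Gears OS on two examples, Twice and BitonicSort, each in a CPU environment and a GPU environment.
+
+The CPU environment is shown in \tabref{powerEdge} and the GPU environment in \tabref{gtx1070}.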
+
+\begin{table}[htbp]
+    \begin{center}
+        \begin{tabular}{|l|l|} \hline
+            Model & Dell PowerEdge R630 \\ \hline
+            OS    & CentOS 7.4.1708 \\ \hline
+            Memory & 768GB \\ \hline
+            CPU & 2 x 18-Core Intel Xeon 2.30GHz \\ \hline
+        \end{tabular}
+         \caption{Execution environment}
+         \label{tab:powerEdge}
+    \end{center}
+\end{table}
+
+\begin{table}[htbp]
+    \begin{center}
+        \begin{tabular}{|l||l|} \hline
+            GPU & GeForce GTX 1070 \\ \hline
+            Cores & 1920 \\ \hline
+            Clock Speed & 1683MHz \\ \hline
+            Memory Size & 8GB GDDR5 \\ \hline
+            Memory Bandwidth & 256GB/s \\ \hline
+        \end{tabular}
+        \caption{GPU environment}
+        \label{tab:gtx1070}
+    \end{center}
+\end{table}
+
 \section{Twice}
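+Twice is an example that doubles every element of a given integer array.
+
+For Task generation in Twice, on the CPU the array is divided into ranges of a certain size and one Task is generated per range.
+This is because generating one Task per element would spend time creating a Context for each of them.
+
+Execution on the GPU uses data parallelism.
+When data-parallel execution runs on the GPU, no per-Task Context creation or copying occurs, so one thread is assigned to each element.
+
+Twice has no dependencies between its parallel executions and is a problem well suited to data-parallel execution.
+Therefore, if communication time is ignored, the GPU, which has more cores than the CPU, has the advantage.
+
+The results of running Twice on data with $2^{23}$ elements are shown in \tabref{twice} and \figref{twice}.
+For CPU execution, the $2^{23}$ elements are divided into 64 Tasks that are executed in parallel.
+Here, ``GPU'' is the time including data transfer between the CPU and the GPU, and ``GPU(kernel only)'' is the execution time of the kernel alone.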
+
+\begin{table}[htbp]
+    \begin{center}
+        \begin{tabular}{|l||l|} \hline
+            Processor & Time(ms) \\ \hline
+            1 CPU & 147.946 \\ \hline
+            2 CPUs & 80.773\\ \hline
+            4 CPUs & 40.527\\ \hline
+            8 CPUs & 20.267\\ \hline
+            16 CPUs & 10.936\\ \hline
+            32 CPUs & 5.878\\ \hline
+            GPU & 542.816\\ \hline
+            GPU(kernel only)& 0.755\\ \hline
+        \end{tabular}
+        \caption{Twice on data with $2^{23}$ elements}
+        \label{tab:twice}
+    \end{center}
+\end{table}
+
+\begin{figure}[htbp]
+    \begin{center}
+        \includegraphics[scale=0.6]{./fig/twice.pdf}
+    \end{center}
+    \caption{Twice on data with $2^{23}$ elements}
+    \label{fig:twice}
+\end{figure}
+
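+Going from 1 CPU to 32 CPUs gives a speedup of about 25.2 times ($147.946 / 5.878 \approx 25.2$).
+This indicates that execution scales reasonably well with the number of CPUs.
+
+On the GPU, the kernel-only execution time is about 7.8 times faster than 32 CPUs.
+However, when communication time is included, the GPU is markedly slower than even 1 CPU (about 3.7 times slower).
+This shows that communication between the CPU and the GPU is the bottleneck.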
+
 \section{BitonicSort}
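+BitonicSort is a sorting algorithm designed for parallel processing.
+QuickSort, a representative sorting algorithm, can also be parallelized, but its degree of parallelism varies during the sort, so it is hard to obtain a speedup from additional processors.
+BitonicSort, in contrast, keeps the same degree of parallelism from start to finish.
+\figref{bitonicNetwork} shows the BitonicSort sorting network for data with 8 elements.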
+
+\begin{figure}[htbp]
+    \begin{center}
+        \includegraphics[scale=0.6]{./fig/bitonicNetwork.pdf}
+    \end{center}
+    \caption{BitonicNetwork for 8 elements}
+    \label{fig:bitonicNetwork}
+\end{figure}
+
+BitonicSort sorts by exchanging, in each stage, the elements at fixed pairs of positions in parallel.
+
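The stage-by-stage exchange described for BitonicSort can be sketched sequentially in C. This is a minimal illustration, not the Gears OS implementation: the names `compare_exchange` and `bitonic_sort` are hypothetical, the length is assumed to be a power of two, and the innermost loop stands in for the work that a parallel implementation would distribute, since every pair in one stage is independent.

```c
#include <assert.h>
#include <stddef.h>

/* Order a[i] and a[j] (i < j) in the requested direction.
 * Hypothetical helper, not taken from the Gears OS sources. */
static void compare_exchange(int *a, size_t i, size_t j, int ascending) {
    if ((a[i] > a[j]) == ascending) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}

/* Sequential sketch of a bitonic sorting network (ascending result).
 * n is assumed to be a power of two. */
static void bitonic_sort(int *a, size_t n) {
    assert(n != 0 && (n & (n - 1)) == 0);
    for (size_t k = 2; k <= n; k <<= 1) {          /* size of bitonic blocks */
        for (size_t j = k >> 1; j > 0; j >>= 1) {  /* one stage per distance j */
            /* All pairs (i, i ^ j) in this stage are independent,
             * so this loop is the part that can run in parallel. */
            for (size_t i = 0; i < n; i++) {
                size_t partner = i ^ j;
                if (partner > i) {
                    int ascending = ((i & k) == 0);
                    compare_exchange(a, i, partner, ascending);
                }
            }
        }
    }
}
```

Calling `bitonic_sort(data, 8)` on an 8-element array runs the six stages of an 8-input network; the number of pairs per stage stays at `n / 2` throughout, which is the constant degree of parallelism the text refers to.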
Binary file paper/fig/bitonicNetwork.pdf has changed
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/paper/fig/bitonicNetwork.xbb	Sat Feb 03 20:56:35 2018 +0900
@@ -0,0 +1,8 @@
+%%Title: fig/bitonicNetwork.pdf
+%%Creator: extractbb 20170318
+%%BoundingBox: 0 0 591 422
+%%HiResBoundingBox: 0.000000 0.000000 591.000000 422.000000
+%%PDFVersion: 1.3
+%%Pages: 1
+%%CreationDate: Sat Feb  3 19:57:43 2018
+
Binary file paper/fig/twice.pdf has changed
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/paper/fig/twice.svg	Sat Feb 03 20:56:35 2018 +0900
@@ -0,0 +1,194 @@
+<?xml version="1.0" encoding="utf-8"  standalone="no"?>
+<svg 
+ width="600" height="480"
+ viewBox="0 0 600 480"
+ xmlns="http://www.w3.org/2000/svg"
+ xmlns:xlink="http://www.w3.org/1999/xlink"
+>
+
+<title>Gnuplot</title>
+<desc>Produced by GNUPLOT 5.2 patchlevel 2 </desc>
+
+<g id="gnuplot_canvas">
+
+<rect x="0" y="0" width="600" height="480" fill="none"/>
+<defs>
+
+	<circle id='gpDot' r='0.5' stroke-width='0.5'/>
+	<path id='gpPt0' stroke-width='0.222' stroke='currentColor' d='M-1,0 h2 M0,-1 v2'/>
+	<path id='gpPt1' stroke-width='0.222' stroke='currentColor' d='M-1,-1 L1,1 M1,-1 L-1,1'/>
+	<path id='gpPt2' stroke-width='0.222' stroke='currentColor' d='M-1,0 L1,0 M0,-1 L0,1 M-1,-1 L1,1 M-1,1 L1,-1'/>
+	<rect id='gpPt3' stroke-width='0.222' stroke='currentColor' x='-1' y='-1' width='2' height='2'/>
+	<rect id='gpPt4' stroke-width='0.222' stroke='currentColor' fill='currentColor' x='-1' y='-1' width='2' height='2'/>
+	<circle id='gpPt5' stroke-width='0.222' stroke='currentColor' cx='0' cy='0' r='1'/>
+	<use xlink:href='#gpPt5' id='gpPt6' fill='currentColor' stroke='none'/>
+	<path id='gpPt7' stroke-width='0.222' stroke='currentColor' d='M0,-1.33 L-1.33,0.67 L1.33,0.67 z'/>
+	<use xlink:href='#gpPt7' id='gpPt8' fill='currentColor' stroke='none'/>
+	<use xlink:href='#gpPt7' id='gpPt9' stroke='currentColor' transform='rotate(180)'/>
+	<use xlink:href='#gpPt9' id='gpPt10' fill='currentColor' stroke='none'/>
+	<use xlink:href='#gpPt3' id='gpPt11' stroke='currentColor' transform='rotate(45)'/>
+	<use xlink:href='#gpPt11' id='gpPt12' fill='currentColor' stroke='none'/>
+	<path id='gpPt13' stroke-width='0.222' stroke='currentColor' d='M0,1.330 L1.265,0.411 L0.782,-1.067 L-0.782,-1.076 L-1.265,0.411 z'/>
+	<use xlink:href='#gpPt13' id='gpPt14' fill='currentColor' stroke='none'/>
+	<filter id='textbox' filterUnits='objectBoundingBox' x='0' y='0' height='1' width='1'>
+	  <feFlood flood-color='white' flood-opacity='1' result='bgnd'/>
+	  <feComposite in='SourceGraphic' in2='bgnd' operator='atop'/>
+	</filter>
+	<filter id='greybox' filterUnits='objectBoundingBox' x='0' y='0' height='1' width='1'>
+	  <feFlood flood-color='lightgrey' flood-opacity='1' result='grey'/>
+	  <feComposite in='SourceGraphic' in2='grey' operator='atop'/>
+	</filter>
+</defs>
+<g fill="none" color="white" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M71.9,422.4 L80.9,422.4 M575.0,422.4 L566.0,422.4  '/>	<g transform="translate(63.6,426.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="end">
+		<text><tspan font-family="Arial" > 0</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M71.9,361.0 L80.9,361.0 M575.0,361.0 L566.0,361.0  '/>	<g transform="translate(63.6,364.9)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="end">
+		<text><tspan font-family="Arial" > 100</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M71.9,299.6 L80.9,299.6 M575.0,299.6 L566.0,299.6  '/>	<g transform="translate(63.6,303.5)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="end">
+		<text><tspan font-family="Arial" > 200</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M71.9,238.2 L80.9,238.2 M575.0,238.2 L566.0,238.2  '/>	<g transform="translate(63.6,242.1)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="end">
+		<text><tspan font-family="Arial" > 300</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M71.9,176.9 L80.9,176.9 M575.0,176.9 L566.0,176.9  '/>	<g transform="translate(63.6,180.8)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="end">
+		<text><tspan font-family="Arial" > 400</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M71.9,115.5 L80.9,115.5 M575.0,115.5 L566.0,115.5  '/>	<g transform="translate(63.6,119.4)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="end">
+		<text><tspan font-family="Arial" > 500</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M71.9,54.1 L80.9,54.1 M575.0,54.1 L566.0,54.1  '/>	<g transform="translate(63.6,58.0)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="end">
+		<text><tspan font-family="Arial" > 600</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M103.3,422.4 L103.3,413.4 M103.3,54.1 L103.3,63.1  '/>	<g transform="translate(103.3,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="middle">
+		<text><tspan font-family="Arial" >1 cpu</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M166.2,422.4 L166.2,413.4 M166.2,54.1 L166.2,63.1  '/>	<g transform="translate(166.2,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="middle">
+		<text><tspan font-family="Arial" >2 cpus</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M229.1,422.4 L229.1,413.4 M229.1,54.1 L229.1,63.1  '/>	<g transform="translate(229.1,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="middle">
+		<text><tspan font-family="Arial" >4 cpus</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M292.0,422.4 L292.0,413.4 M292.0,54.1 L292.0,63.1  '/>	<g transform="translate(292.0,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="middle">
+		<text><tspan font-family="Arial" >8 cpus</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M354.9,422.4 L354.9,413.4 M354.9,54.1 L354.9,63.1  '/>	<g transform="translate(354.9,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="middle">
+		<text><tspan font-family="Arial" >16 cpus</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M417.8,422.4 L417.8,413.4 M417.8,54.1 L417.8,63.1  '/>	<g transform="translate(417.8,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="middle">
+		<text><tspan font-family="Arial" >32 cpus</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M480.7,422.4 L480.7,413.4 M480.7,54.1 L480.7,63.1  '/>	<g transform="translate(480.7,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="middle">
+		<text><tspan font-family="Arial" >gpu</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M543.6,422.4 L543.6,413.4 M543.6,54.1 L543.6,63.1  '/>	<g transform="translate(543.6,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="middle">
+		<text><tspan font-family="Arial" >gpu(only)</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M71.9,54.1 L71.9,422.4 L575.0,422.4 L575.0,54.1 L71.9,54.1 Z  '/></g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<g transform="translate(16.3,238.3) rotate(270)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="middle">
+		<text><tspan font-family="Arial" >time(ms)</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<g transform="translate(323.4,471.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="middle">
+		<text><tspan font-family="Arial" >CPUs</tspan></text>
+	</g>
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<g transform="translate(323.4,31.0)" stroke="none" fill="black" font-family="Arial" font-size="12.00"  text-anchor="middle">
+		<text><tspan font-family="Arial" >twice benchmark</tspan></text>
+	</g>
+</g>
+	<g id="gnuplot_plot_1" ><title>gnuplot_plot_1</title>
+<g fill="none" color="white" stroke="black" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter">
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<g stroke='none' shape-rendering='crispEdges'>
+		<polygon fill = 'rgb(  0, 255, 255)' points = '87.6,422.4 119.2,422.4 119.2,331.5 87.6,331.5 '/>
+	</g>
+	<path stroke='rgb(  0,   0,   0)'  d='M87.6,422.4 L87.6,331.6 L119.1,331.6 L119.1,422.4 L87.6,422.4 Z  '/></g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<g stroke='none' shape-rendering='crispEdges'>
+		<polygon fill = 'rgb(  0, 255, 255)' points = '150.5,422.4 182.1,422.4 182.1,372.7 150.5,372.7 '/>
+	</g>
+	<path stroke='rgb(  0,   0,   0)'  d='M150.5,422.4 L150.5,372.8 L182.0,372.8 L182.0,422.4 L150.5,422.4 Z  '/></g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<g stroke='none' shape-rendering='crispEdges'>
+		<polygon fill = 'rgb(  0, 255, 255)' points = '213.4,422.4 244.9,422.4 244.9,397.4 213.4,397.4 '/>
+	</g>
+	<path stroke='rgb(  0,   0,   0)'  d='M213.4,422.4 L213.4,397.5 L244.8,397.5 L244.8,422.4 L213.4,422.4 Z  '/></g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<g stroke='none' shape-rendering='crispEdges'>
+		<polygon fill = 'rgb(  0, 255, 255)' points = '276.3,422.4 307.8,422.4 307.8,409.9 276.3,409.9 '/>
+	</g>
+	<path stroke='rgb(  0,   0,   0)'  d='M276.3,422.4 L276.3,410.0 L307.7,410.0 L307.7,422.4 L276.3,422.4 Z  '/></g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<g stroke='none' shape-rendering='crispEdges'>
+		<polygon fill = 'rgb(  0, 255, 255)' points = '339.2,422.4 370.7,422.4 370.7,415.6 339.2,415.6 '/>
+	</g>
+	<path stroke='rgb(  0,   0,   0)'  d='M339.2,422.4 L339.2,415.7 L370.6,415.7 L370.6,422.4 L339.2,422.4 Z  '/></g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<g stroke='none' shape-rendering='crispEdges'>
+		<polygon fill = 'rgb(  0, 255, 255)' points = '402.1,422.4 433.6,422.4 433.6,418.7 402.1,418.7 '/>
+	</g>
+	<path stroke='rgb(  0,   0,   0)'  d='M402.1,422.4 L402.1,418.8 L433.5,418.8 L433.5,422.4 L402.1,422.4 Z  '/></g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<g stroke='none' shape-rendering='crispEdges'>
+		<polygon fill = 'rgb(  0, 255, 255)' points = '464.9,422.4 496.5,422.4 496.5,89.1 464.9,89.1 '/>
+	</g>
+	<path stroke='rgb(  0,   0,   0)'  d='M464.9,422.4 L464.9,89.2 L496.4,89.2 L496.4,422.4 L464.9,422.4 Z  '/></g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<g stroke='none' shape-rendering='crispEdges'>
+		<polygon fill = 'rgb(  0, 255, 255)' points = '527.8,422.4 559.4,422.4 559.4,421.8 527.8,421.8 '/>
+	</g>
+	<path stroke='rgb(  0,   0,   0)'  d='M527.8,422.4 L527.8,421.9 L559.3,421.9 L559.3,422.4 L527.8,422.4 Z  '/></g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter">
+</g>
+	</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter">
+</g>
+<g fill="none" color="black" stroke="black" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+</g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+	<path stroke='black'  d='M71.9,54.1 L71.9,422.4 L575.0,422.4 L575.0,54.1 L71.9,54.1 Z  '/></g>
+<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter">
+</g>
+</g>
+</svg>
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/paper/fig/twice.xbb	Sat Feb 03 20:56:35 2018 +0900
@@ -0,0 +1,8 @@
+%%Title: fig/twice.pdf
+%%Creator: extractbb 20170318
+%%BoundingBox: 0 0 600 480
+%%HiResBoundingBox: 0.000000 0.000000 600.000000 480.000000
+%%PDFVersion: 1.3
+%%Pages: 1
+%%CreationDate: Sat Feb  3 18:15:19 2018
+
--- a/paper/gpu.tex	Fri Feb 02 04:14:20 2018 +0900
+++ b/paper/gpu.tex	Sat Feb 03 20:56:35 2018 +0900
@@ -37,6 +37,8 @@
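 The number of threads in a block can be obtained from the built-in variable blockDim, which is also a three-component vector type.
 In CUDA, each thread uses these built-in variables to determine the data it is responsible for and performs data-parallel processing.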
 
+\newpage
+
 \section{CUDAWorker}
 CUDAWorker obtains Tasks for CUDA sent from the TaskManager and executes them.
 
@@ -85,6 +87,8 @@
 
 The part that actually calls the cuLaunchKernel function is shown in \coderef{cuLaunchKernel}.
 
+\newpage
+
 \lstinputlisting[caption=Launching the kernel, label=code:cuLaunchKernel]{./src/cuLaunchKernel.cbc}
 
 In Gears OS, a data-parallel Task has an Iterator Interface, and the arguments to cuLaunchKernel are determined from the length and number of dimensions specified there (lines 11-18 of \coderef{cuLaunchKernel}).
Binary file paper/master_paper.pdf has changed
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/paper/src/cudaTwice.cu	Sat Feb 03 20:56:35 2018 +0900
@@ -0,0 +1,5 @@
+extern "C" {
+    __global__ void twice(int* array) {
+        array[blockIdx.x*blockDim.x+threadIdx.x] *= 2; // one element per thread; the undeclared i and prefix are dropped (assumed i = 0, prefix = 1)
+    }
+}
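The global index `blockIdx.x*blockDim.x+threadIdx.x` used in the kernel gives each thread its own array element. The following C sketch emulates that arithmetic on the host with two nested loops. It is an illustration only: `twice_emulated`, `block_dim`, and `grid_dim` are hypothetical names, and the bounds check is an added assumption for lengths that are not a multiple of the block size.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side emulation of the one-thread-per-element index arithmetic.
 * block_dim plays the role of blockDim.x; grid_dim is the number of
 * blocks needed to cover n elements. Both are illustrative parameters. */
static void twice_emulated(int *array, size_t n, size_t block_dim) {
    assert(block_dim > 0);
    size_t grid_dim = (n + block_dim - 1) / block_dim;  /* round up */
    for (size_t block_idx = 0; block_idx < grid_dim; block_idx++) {
        for (size_t thread_idx = 0; thread_idx < block_dim; thread_idx++) {
            /* Same formula as blockIdx.x * blockDim.x + threadIdx.x */
            size_t index = block_idx * block_dim + thread_idx;
            if (index < n) {  /* skip the padding threads of the last block */
                array[index] *= 2;
            }
        }
    }
}
```

With `n = 10` and `block_dim = 4`, three blocks are emulated and the last two iterations of the third block are skipped, so each of the ten elements is doubled exactly once.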