changeset 32:ebcf093795f3
Add twice examples
author | Tatsuki IHA <innparusu@cr.ie.u-ryukyu.ac.jp> |
---|---|
date | Sat, 03 Feb 2018 20:56:35 +0900 |
parents | 8793903e4a0d |
children | 6da0158a17a1 |
files | paper/evaluation.tex paper/fig/bitonicNetwork.pdf paper/fig/bitonicNetwork.xbb paper/fig/twice.pdf paper/fig/twice.svg paper/fig/twice.xbb paper/gpu.tex paper/master_paper.pdf paper/src/cudaTwice.cu |
diffstat | 9 files changed, 315 insertions(+), 0 deletions(-) |
--- a/paper/evaluation.tex Fri Feb 02 04:14:20 2018 +0900
+++ b/paper/evaluation.tex Sat Feb 03 20:56:35 2018 +0900
@@ -1,3 +1,99 @@
 \chapter{Evaluation of Gears OS}
+
+\section{Experimental Environment}
+We measure Gears OS with two examples, Twice and BitonicSort, in both CPU and GPU environments.
+
+\tabref{powerEdge} shows the machine used for the experiments, and \tabref{gtx1070} shows the GPU environment.
+
+\begin{table}[htbp]
+    \begin{center}
+        \begin{tabular}{|l|l|} \hline
+            Model & Dell PowerEdgeR630 \\ \hline
+            OS & CentOS 7.4.1708 \\ \hline
+            Memory & 768GB \\ \hline
+            CPU & 2 x 18-Core Intel Xeon 2.30GHz \\ \hline
+        \end{tabular}
+        \caption{Execution environment}
+        \label{tab:powerEdge}
+    \end{center}
+\end{table}
+
+\begin{table}[htbp]
+    \begin{center}
+        \begin{tabular}{|l||l|} \hline
+            GPU & GeForce GTX 1070 \\ \hline
+            Cores & 1920 \\ \hline
+            Clock Speed & 1683MHz \\ \hline
+            Memory Size & 8GB GDDR5 \\ \hline
+            Memory Bandwidth & 256GB/s \\ \hline
+        \end{tabular}
+        \caption{GPU environment}
+        \label{tab:gtx1070}
+    \end{center}
+\end{table}
+
 \section{Twice}
+Twice is an example that doubles every element of a given integer array.
+
+As the Task-generation policy for Twice, on the CPU the array is split into ranges of a certain size, and one Task is generated per range.
+This is because generating a Task per element would spend time creating a Context for each of those Tasks.
+
+On the GPU, Twice is executed using data parallelism.
+With data-parallel execution on the GPU, per-Task costs such as copying a Context do not arise, so one thread is assigned to each element.
+
+Twice has no dependencies between its parallel executions, which makes it well suited to data-parallel execution.
+Therefore, if communication time is ignored, the GPU, which has more cores than the CPU, has the advantage.
+
+\tabref{twice} and \figref{twice} show the results of running Twice on data with $2^{23}$ elements.
+For CPU execution, the $2^{23}$ elements are divided into 64 Tasks that run in parallel.
+Here, ``GPU'' is the time including data transfer between the CPU and the GPU, and ``GPU (kernel only)'' is the execution time of the kernel alone.
+
+\begin{table}[htbp]
+    \begin{center}
+        \begin{tabular}{|l||l|} \hline
+            Processor & Time (ms) \\ \hline
+            1 CPU & 147.946 \\ \hline
+            2 CPUs & 80.773 \\ \hline
+            4 CPUs & 40.527 \\ \hline
+            8 CPUs & 20.267 \\ \hline
+            16 CPUs & 10.936 \\ \hline
+            32 CPUs & 5.878 \\ \hline
+            GPU & 542.816 \\ \hline
+            GPU (kernel only) & 0.755 \\ \hline
+        \end{tabular}
+        \caption{Twice on $2^{23}$ elements}
+        \label{tab:twice}
+    \end{center}
+\end{table}
+
+\begin{figure}[htbp]
+    \begin{center}
+        \includegraphics[scale=0.6]{./fig/twice.pdf}
+    \end{center}
+    \caption{Twice on $2^{23}$ elements}
+    \label{fig:twice}
+\end{figure}
+
+Going from 1 CPU to 32 CPUs gives a speedup of about 25.2 times (147.946 ms versus 5.878 ms).
+This indicates a reasonable degree of scaling with the number of CPUs.
+
+On the GPU, the kernel-only execution time is about 7.8 times faster than 32 CPUs.
+However, once communication time is included, the GPU is markedly slower than even 1 CPU.
+This shows that communication between the CPU and the GPU is the bottleneck.
+
 \section{BitonicSort}
+BitonicSort is a sorting algorithm designed for parallel processing.
+Quick Sort, a representative sorting algorithm, can also be parallelized, but its degree of parallelism varies during the sort, so it is hard to obtain scaling with the number of CPUs.
+BitonicSort, in contrast, keeps the same degree of parallelism from start to finish.
+\figref{bitonicNetwork} shows the BitonicSort sorting network for data with 8 elements.
+
+\begin{figure}[htbp]
+    \begin{center}
+        \includegraphics[scale=0.6]{./fig/bitonicNetwork.pdf}
+    \end{center}
+    \caption{BitonicNetwork for 8 elements}
+    \label{fig:bitonicNetwork}
+\end{figure}
+
+BitonicSort sorts by exchanging, in parallel, the elements at fixed pairs of positions in each stage.
+
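The CPU-side Task splitting described in the Twice section of evaluation.tex can be illustrated with a small C sketch. It is not taken from the Gears OS sources: the names `TwiceRange`, `twice_range`, `twice_task`, and `twice_all` are hypothetical, and a sequential loop stands in for the scheduler that would run the Tasks in parallel. It only shows one way an array could be divided into contiguous ranges so that a Task (and its Context) is created per range rather than per element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one CPU Task: a half-open index range. */
typedef struct {
    size_t begin;
    size_t end;
} TwiceRange;

/* Range handled by task `task_id` when `length` elements are split
 * into `num_tasks` nearly equal contiguous chunks. */
TwiceRange twice_range(size_t length, size_t num_tasks, size_t task_id) {
    assert(num_tasks > 0 && task_id < num_tasks);
    size_t base = length / num_tasks;   /* minimum chunk size */
    size_t rest = length % num_tasks;   /* leftover elements */
    /* The first `rest` tasks each take one extra element. */
    size_t begin = task_id * base + (task_id < rest ? task_id : rest);
    size_t size = base + (task_id < rest ? 1 : 0);
    TwiceRange r = { begin, begin + size };
    return r;
}

/* Body of one Task: double every element in its range. */
void twice_task(int *array, TwiceRange r) {
    for (size_t i = r.begin; i < r.end; i++) {
        array[i] *= 2;
    }
}

/* Sequential stand-in for the scheduler (assumption: in a real runtime
 * each iteration would be an independent Task run by a worker). */
void twice_all(int *array, size_t length, size_t num_tasks) {
    for (size_t t = 0; t < num_tasks; t++) {
        twice_task(array, twice_range(length, num_tasks, t));
    }
}
```

With `length = 1 << 23` and `num_tasks = 64`, as in the measurement, each range covers 131072 elements, so only 64 Contexts would be needed instead of one per element.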
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/paper/fig/bitonicNetwork.xbb Sat Feb 03 20:56:35 2018 +0900
@@ -0,0 +1,8 @@
+%%Title: fig/bitonicNetwork.pdf
+%%Creator: extractbb 20170318
+%%BoundingBox: 0 0 591 422
+%%HiResBoundingBox: 0.000000 0.000000 591.000000 422.000000
+%%PDFVersion: 1.3
+%%Pages: 1
+%%CreationDate: Sat Feb 3 19:57:43 2018
+
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/paper/fig/twice.svg Sat Feb 03 20:56:35 2018 +0900 @@ -0,0 +1,194 @@ +<?xml version="1.0" encoding="utf-8" standalone="no"?> +<svg + width="600" height="480" + viewBox="0 0 600 480" + xmlns="http://www.w3.org/2000/svg" + xmlns:xlink="http://www.w3.org/1999/xlink" +> + +<title>Gnuplot</title> +<desc>Produced by GNUPLOT 5.2 patchlevel 2 </desc> + +<g id="gnuplot_canvas"> + +<rect x="0" y="0" width="600" height="480" fill="none"/> +<defs> + + <circle id='gpDot' r='0.5' stroke-width='0.5'/> + <path id='gpPt0' stroke-width='0.222' stroke='currentColor' d='M-1,0 h2 M0,-1 v2'/> + <path id='gpPt1' stroke-width='0.222' stroke='currentColor' d='M-1,-1 L1,1 M1,-1 L-1,1'/> + <path id='gpPt2' stroke-width='0.222' stroke='currentColor' d='M-1,0 L1,0 M0,-1 L0,1 M-1,-1 L1,1 M-1,1 L1,-1'/> + <rect id='gpPt3' stroke-width='0.222' stroke='currentColor' x='-1' y='-1' width='2' height='2'/> + <rect id='gpPt4' stroke-width='0.222' stroke='currentColor' fill='currentColor' x='-1' y='-1' width='2' height='2'/> + <circle id='gpPt5' stroke-width='0.222' stroke='currentColor' cx='0' cy='0' r='1'/> + <use xlink:href='#gpPt5' id='gpPt6' fill='currentColor' stroke='none'/> + <path id='gpPt7' stroke-width='0.222' stroke='currentColor' d='M0,-1.33 L-1.33,0.67 L1.33,0.67 z'/> + <use xlink:href='#gpPt7' id='gpPt8' fill='currentColor' stroke='none'/> + <use xlink:href='#gpPt7' id='gpPt9' stroke='currentColor' transform='rotate(180)'/> + <use xlink:href='#gpPt9' id='gpPt10' fill='currentColor' stroke='none'/> + <use xlink:href='#gpPt3' id='gpPt11' stroke='currentColor' transform='rotate(45)'/> + <use xlink:href='#gpPt11' id='gpPt12' fill='currentColor' stroke='none'/> + <path id='gpPt13' stroke-width='0.222' stroke='currentColor' d='M0,1.330 L1.265,0.411 L0.782,-1.067 L-0.782,-1.076 L-1.265,0.411 z'/> + <use xlink:href='#gpPt13' id='gpPt14' fill='currentColor' stroke='none'/> + <filter id='textbox' filterUnits='objectBoundingBox' x='0' y='0' 
height='1' width='1'> + <feFlood flood-color='white' flood-opacity='1' result='bgnd'/> + <feComposite in='SourceGraphic' in2='bgnd' operator='atop'/> + </filter> + <filter id='greybox' filterUnits='objectBoundingBox' x='0' y='0' height='1' width='1'> + <feFlood flood-color='lightgrey' flood-opacity='1' result='grey'/> + <feComposite in='SourceGraphic' in2='grey' operator='atop'/> + </filter> +</defs> +<g fill="none" color="white" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M71.9,422.4 L80.9,422.4 M575.0,422.4 L566.0,422.4 '/> <g transform="translate(63.6,426.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="end"> + <text><tspan font-family="Arial" > 0</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M71.9,361.0 L80.9,361.0 M575.0,361.0 L566.0,361.0 '/> <g transform="translate(63.6,364.9)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="end"> + <text><tspan font-family="Arial" > 100</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M71.9,299.6 L80.9,299.6 M575.0,299.6 L566.0,299.6 '/> <g transform="translate(63.6,303.5)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="end"> + <text><tspan font-family="Arial" > 200</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M71.9,238.2 L80.9,238.2 M575.0,238.2 L566.0,238.2 '/> <g transform="translate(63.6,242.1)" stroke="none" fill="black" font-family="Arial" font-size="12.00" 
text-anchor="end"> + <text><tspan font-family="Arial" > 300</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M71.9,176.9 L80.9,176.9 M575.0,176.9 L566.0,176.9 '/> <g transform="translate(63.6,180.8)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="end"> + <text><tspan font-family="Arial" > 400</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M71.9,115.5 L80.9,115.5 M575.0,115.5 L566.0,115.5 '/> <g transform="translate(63.6,119.4)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="end"> + <text><tspan font-family="Arial" > 500</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M71.9,54.1 L80.9,54.1 M575.0,54.1 L566.0,54.1 '/> <g transform="translate(63.6,58.0)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="end"> + <text><tspan font-family="Arial" > 600</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M103.3,422.4 L103.3,413.4 M103.3,54.1 L103.3,63.1 '/> <g transform="translate(103.3,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="middle"> + <text><tspan font-family="Arial" >1 cpu</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M166.2,422.4 L166.2,413.4 M166.2,54.1 L166.2,63.1 '/> <g transform="translate(166.2,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="middle"> + <text><tspan 
font-family="Arial" >2 cpus</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M229.1,422.4 L229.1,413.4 M229.1,54.1 L229.1,63.1 '/> <g transform="translate(229.1,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="middle"> + <text><tspan font-family="Arial" >4 cpus</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M292.0,422.4 L292.0,413.4 M292.0,54.1 L292.0,63.1 '/> <g transform="translate(292.0,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="middle"> + <text><tspan font-family="Arial" >8 cpus</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M354.9,422.4 L354.9,413.4 M354.9,54.1 L354.9,63.1 '/> <g transform="translate(354.9,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="middle"> + <text><tspan font-family="Arial" >16 cpus</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M417.8,422.4 L417.8,413.4 M417.8,54.1 L417.8,63.1 '/> <g transform="translate(417.8,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="middle"> + <text><tspan font-family="Arial" >32 cpus</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M480.7,422.4 L480.7,413.4 M480.7,54.1 L480.7,63.1 '/> <g transform="translate(480.7,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="middle"> + <text><tspan font-family="Arial" 
>gpu</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M543.6,422.4 L543.6,413.4 M543.6,54.1 L543.6,63.1 '/> <g transform="translate(543.6,444.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="middle"> + <text><tspan font-family="Arial" >gpu(only)</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M71.9,54.1 L71.9,422.4 L575.0,422.4 L575.0,54.1 L71.9,54.1 Z '/></g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <g transform="translate(16.3,238.3) rotate(270)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="middle"> + <text><tspan font-family="Arial" >time(ms)</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <g transform="translate(323.4,471.3)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="middle"> + <text><tspan font-family="Arial" >CPUs</tspan></text> + </g> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <g transform="translate(323.4,31.0)" stroke="none" fill="black" font-family="Arial" font-size="12.00" text-anchor="middle"> + <text><tspan font-family="Arial" >twice benchmark</tspan></text> + </g> +</g> + <g id="gnuplot_plot_1" ><title>gnuplot_plot_1</title> +<g fill="none" color="white" stroke="black" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter"> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" 
stroke-linecap="butt" stroke-linejoin="miter"> + <g stroke='none' shape-rendering='crispEdges'> + <polygon fill = 'rgb( 0, 255, 255)' points = '87.6,422.4 119.2,422.4 119.2,331.5 87.6,331.5 '/> + </g> + <path stroke='rgb( 0, 0, 0)' d='M87.6,422.4 L87.6,331.6 L119.1,331.6 L119.1,422.4 L87.6,422.4 Z '/></g> +<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter"> + <g stroke='none' shape-rendering='crispEdges'> + <polygon fill = 'rgb( 0, 255, 255)' points = '150.5,422.4 182.1,422.4 182.1,372.7 150.5,372.7 '/> + </g> + <path stroke='rgb( 0, 0, 0)' d='M150.5,422.4 L150.5,372.8 L182.0,372.8 L182.0,422.4 L150.5,422.4 Z '/></g> +<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter"> + <g stroke='none' shape-rendering='crispEdges'> + <polygon fill = 'rgb( 0, 255, 255)' points = '213.4,422.4 244.9,422.4 244.9,397.4 213.4,397.4 '/> + </g> + <path stroke='rgb( 0, 0, 0)' d='M213.4,422.4 L213.4,397.5 L244.8,397.5 L244.8,422.4 L213.4,422.4 Z '/></g> +<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter"> + <g stroke='none' shape-rendering='crispEdges'> + <polygon fill = 'rgb( 0, 255, 255)' points = '276.3,422.4 307.8,422.4 307.8,409.9 276.3,409.9 '/> + </g> + <path stroke='rgb( 0, 0, 0)' d='M276.3,422.4 L276.3,410.0 L307.7,410.0 L307.7,422.4 L276.3,422.4 Z '/></g> +<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter"> + <g stroke='none' shape-rendering='crispEdges'> + <polygon fill = 'rgb( 0, 255, 255)' points = '339.2,422.4 370.7,422.4 370.7,415.6 339.2,415.6 '/> + </g> + <path stroke='rgb( 0, 0, 0)' d='M339.2,422.4 L339.2,415.7 L370.6,415.7 L370.6,422.4 L339.2,422.4 Z '/></g> +<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter"> + <g stroke='none' 
shape-rendering='crispEdges'> + <polygon fill = 'rgb( 0, 255, 255)' points = '402.1,422.4 433.6,422.4 433.6,418.7 402.1,418.7 '/> + </g> + <path stroke='rgb( 0, 0, 0)' d='M402.1,422.4 L402.1,418.8 L433.5,418.8 L433.5,422.4 L402.1,422.4 Z '/></g> +<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter"> + <g stroke='none' shape-rendering='crispEdges'> + <polygon fill = 'rgb( 0, 255, 255)' points = '464.9,422.4 496.5,422.4 496.5,89.1 464.9,89.1 '/> + </g> + <path stroke='rgb( 0, 0, 0)' d='M464.9,422.4 L464.9,89.2 L496.4,89.2 L496.4,422.4 L464.9,422.4 Z '/></g> +<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter"> + <g stroke='none' shape-rendering='crispEdges'> + <polygon fill = 'rgb( 0, 255, 255)' points = '527.8,422.4 559.4,422.4 559.4,421.8 527.8,421.8 '/> + </g> + <path stroke='rgb( 0, 0, 0)' d='M527.8,422.4 L527.8,421.9 L559.3,421.9 L559.3,422.4 L527.8,422.4 Z '/></g> +<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter"> +</g> + </g> +<g fill="none" color="black" stroke="currentColor" stroke-width="2.00" stroke-linecap="butt" stroke-linejoin="miter"> +</g> +<g fill="none" color="black" stroke="black" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> +</g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> + <path stroke='black' d='M71.9,54.1 L71.9,422.4 L575.0,422.4 L575.0,54.1 L71.9,54.1 Z '/></g> +<g fill="none" color="black" stroke="currentColor" stroke-width="1.00" stroke-linecap="butt" stroke-linejoin="miter"> +</g> +</g> +</svg> +
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/paper/fig/twice.xbb Sat Feb 03 20:56:35 2018 +0900
@@ -0,0 +1,8 @@
+%%Title: fig/twice.pdf
+%%Creator: extractbb 20170318
+%%BoundingBox: 0 0 600 480
+%%HiResBoundingBox: 0.000000 0.000000 600.000000 480.000000
+%%PDFVersion: 1.3
+%%Pages: 1
+%%CreationDate: Sat Feb 3 18:15:19 2018
+
--- a/paper/gpu.tex Fri Feb 02 04:14:20 2018 +0900
+++ b/paper/gpu.tex Sat Feb 03 20:56:35 2018 +0900
@@ -37,6 +37,8 @@
 In addition, the number of threads in a block is available through the built-in variable blockDim, which is also a three-dimensional vector type.
 In CUDA, each thread uses these built-in variables to work out which data it corresponds to and performs data-parallel processing.
 
+\newpage
+
 \section{CUDAWorker}
 CUDAWorker receives the CUDA Tasks sent from the TaskManager and executes them.
 
@@ -85,6 +87,8 @@
 
 \coderef{cuLaunchKernel} shows the part that actually calls the cuLaunchKernel function.
 
+\newpage
+
 \lstinputlisting[caption=Launching the kernel, label=code:cuLaunchKernel]{./src/cuLaunchKernel.cbc}
 
 For data-parallel Tasks, Gears OS provides an Iterator Interface, and the arguments of cuLaunchKernel are determined from the length and number of dimensions specified there (lines 11-18 of \coderef{cuLaunchKernel}).
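The gpu.tex context above says that each thread derives its data index from built-in variables such as blockDim, and that the launch arguments follow from the Iterator's length. The contents of cuLaunchKernel.cbc are not shown in this changeset, so the following C sketch is only an assumption about the usual one-dimensional pattern, emulated on the host: `grid_size` and `emulate_twice_kernel` are hypothetical names, and the nested loops stand in for `blockIdx.x` and `threadIdx.x`.

```c
#include <assert.h>
#include <stddef.h>

/* Number of blocks needed so that blocks * threads_per_block >= length
 * (rounded-up integer division). */
size_t grid_size(size_t length, size_t threads_per_block) {
    assert(threads_per_block > 0);
    return (length + threads_per_block - 1) / threads_per_block;
}

/* Host-side emulation of a 1-D data-parallel Twice kernel.
 * block_idx and thread_idx play the roles of blockIdx.x and threadIdx.x,
 * and block_dim plays the role of blockDim.x. */
void emulate_twice_kernel(int *array, size_t length, size_t block_dim) {
    size_t grid_dim = grid_size(length, block_dim);
    for (size_t block_idx = 0; block_idx < grid_dim; block_idx++) {
        for (size_t thread_idx = 0; thread_idx < block_dim; thread_idx++) {
            /* Global element index handled by this (block, thread) pair. */
            size_t i = block_idx * block_dim + thread_idx;
            /* Guard: the last block may extend past the end of the array. */
            if (i < length) {
                array[i] *= 2;
            }
        }
    }
}
```

The bounds check matters whenever the length is not a multiple of the block size: for example, 1000 elements with 256 threads per block need 4 blocks, and the last 24 threads must do nothing.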