annotate paper/c5.tex @ 67:9c16f6b18100

add result
author Masataka Kohagura <kohagura@cr.ie.u-ryukyu.ac.jp>
date Tue, 16 Feb 2016 17:54:07 +0900
parents 5defec0399f9
children c01a514d33f7
\chapter{Benchmarks}
The experiments in this chapter were run in the following environment.
\begin{itemize}
\item Mac OS X 10.10.5
\item 2 $\times$ 2.66 GHz 6-Core Intel Xeon
\item Memory: 16 GB 1333 MHz DDR3
\item 1 TB HDD
\end{itemize}

We compared the Word Count implemented on Cerium with Mac wc, and our regular expression implementation with Mac egrep.
Each comparison also includes results for the parallel-processing-oriented I/O that we implemented.

\section{Word Count}
The input file is about 500 MB and contains about 6.5 million lines and about 83 million words.

Table~\ref{table:IOwordcount} shows the Word Count results including file reading.
Mac wc takes 10.59 seconds to process this file. In contrast, Cerium Word Count is faster than Mac wc in every configuration, with both mmap and Blocked Read.
With Blocked Read on 12 CPUs, Cerium Word Count finishes in 7.82 seconds, about 1.4 times faster than Mac wc.

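As a check on the factor quoted above, the speedup can be recomputed from the 12 CPU Blocked Read entry of Table~\ref{table:IOwordcount}:
\[
  \frac{10.59}{7.82} \approx 1.35,
\]
which rounds to the factor of about 1.4 given in the text.
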
With mmap, reading is controlled by the OS and cannot be controlled by the programmer.
In addition, file access becomes random while Word Count is running.
mmap is presumably not designed for random access, which we believe causes the variation in its results.
With Blocked Read, reading is controlled by the programmer, and the file is read sequentially from the beginning.
We therefore expect the results that include reading to vary less.

\begin{tiny}
\begin{table}[ht]
\begin{center}
\begin{tabular}[t]{|r|r|r|r|}
\hline
Number of CPUs & Mac wc & mmap & Blocked Read\\
\hline
1 & 10.59 & 9.96 & 9.33 \\
\hline
4 & --- & 8.63 & 8.52 \\
\hline
8 & --- & 10.35 & 8.04 \\
\hline
12 & --- & 9.26 & 7.82 \\
\hline
\end{tabular}
\caption{Word Count including file reading (execution time in seconds)}
\label{table:IOwordcount}
\end{center}
\end{table}
\end{tiny}

\newpage
Table~\ref{fig:wordcount} shows the Word Count results without file reading.

Without file reading, Mac wc takes 4.08 seconds to process this file. In contrast, Cerium Word Count takes 3.70 seconds on 1 CPU and 0.40 seconds on 12 CPUs.

On 1 CPU, Cerium Word Count is about 1.1 times faster than Mac wc, and on 12 CPUs it is about 10.2 times faster.
Comparing Cerium Word Count on 1 CPU and on 12 CPUs, the 12 CPU run is about 9.25 times faster.

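These ratios follow directly from the times in Table~\ref{fig:wordcount}. Writing $T_{\mathrm{wc}}$ for the Mac wc time and $T_{n}$ for the Cerium Word Count time on $n$ CPUs (notation introduced here only for this calculation):
\[
  \frac{T_{\mathrm{wc}}}{T_{1}} = \frac{4.08}{3.70} \approx 1.10, \qquad
  \frac{T_{\mathrm{wc}}}{T_{12}} = \frac{4.08}{0.40} = 10.2, \qquad
  \frac{T_{1}}{T_{12}} = \frac{3.70}{0.40} = 9.25.
\]
The last ratio corresponds to a parallel efficiency of about $9.25 / 12 \approx 0.77$ on 12 CPUs.
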
Compared with the results that include file reading, the runs without file reading are roughly 6 to 7 seconds faster.
This shows that, for string processing that reads its input from a file, file reading accounts for about 60\% of the processing time on 1 CPU and for over 90\% on 12 CPUs.

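The share of file reading can be estimated from the two tables, under the assumption (made here for illustration) that the time spent reading is the difference between the Blocked Read time in Table~\ref{table:IOwordcount} and the time without file reading in Table~\ref{fig:wordcount}:
\[
  \frac{9.33 - 3.70}{9.33} \approx 0.60 \quad \mbox{(1 CPU)}, \qquad
  \frac{7.82 - 0.40}{7.82} \approx 0.95 \quad \mbox{(12 CPUs)}.
\]
Under this assumption the share is about 60\% on 1 CPU and exceeds 90\% on 12 CPUs, because the computation shrinks as CPUs are added while the reading time stays nearly constant.
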
\begin{tiny}
\begin{table}[ht]
\begin{center}
\begin{tabular}[t]{|r|r|}
\hline
Method & Execution time (s)\\
\hline
Mac wc & 4.08 \\
\hline
Cerium Word Count (1 CPU) & 3.70\\
\hline
Cerium Word Count (4 CPUs) & 1.00\\
\hline
Cerium Word Count (8 CPUs) & 0.52\\
\hline
Cerium Word Count (12 CPUs) & 0.40\\
\hline
\end{tabular}
\caption{Word Count without file reading}
\label{fig:wordcount}
\end{center}
\end{table}
\end{tiny}

\section{Regular Expressions}
In this experiment we compare Mac egrep, CGrep (a sequential implementation in C that steps through the DFA state transitions), and CeriumGrep (which runs in parallel on Cerium).

Table~\ref{table:AZaz} shows the results of matching the regular expression '[A-Z][A-Za-z0-9]*s' against a 500 MB file (about 85 million words) and a 1 GB file (about 170 million words).

\begin{tiny}
\begin{table}[ht]
\begin{center}
\begin{tabular}[t]{|c|r|r|}
\hline
Method / file size (matches) & 500 MB (5.36 million) & 1 GB (10.72 million) \\
\hline
CGrep & 20.62 & 40.10\\
\hline
CeriumGrep (12 CPUs), mmap & 18.00 & 26.96\\
\hline
CeriumGrep (12 CPUs), Blocked Read & 12.48 & 21.14\\
\hline
egrep & 59.51 & 119.23\\
\hline
\end{tabular}
\caption{Results of each grep for different file sizes (execution time in seconds)}
\label{table:AZaz}
\end{center}
\end{table}
\end{tiny}

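For reference, the ratios of the egrep time to the 12 CPU CeriumGrep Blocked Read time in Table~\ref{table:AZaz} can be computed as follows (a simple calculation from the table entries, not an additional measurement):
\[
  \frac{59.51}{12.48} \approx 4.8 \quad \mbox{(500 MB)}, \qquad
  \frac{119.23}{21.14} \approx 5.6 \quad \mbox{(1 GB)}.
\]
The corresponding ratios relative to CGrep are $20.62 / 12.48 \approx 1.7$ and $40.10 / 21.14 \approx 1.9$.
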
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Table~\ref{table:metachar} shows the results of matching the regular expression '[A-Z][A-Za-z0-9]*s' against the 500 MB file (about 85 million words).
It compares results that include file reading with results that do not.
Because egrep reads the file on every run, no measurement without file reading was taken for it.
\begin{tiny}
\begin{table}[ht]
\begin{center}
\begin{tabular}[t]{|c|r|r|}
\hline
Method & With file reading & Without file reading\\
\hline
CGrep & 21.171 & 16.150\\
\hline
CeriumGrep (2 CPUs) & 27.061 & 15.401\\
\hline
CeriumGrep (12 CPUs) & 12.48 & 7.386\\
\hline
egrep & 59.51 & --- \\
\hline
\end{tabular}
\caption{Results of each grep with and without file reading (execution time in seconds)}
\label{table:metachar}
\end{center}
\end{table}
\end{tiny}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Table~\ref{table:abab} shows the results of varying the number of states in the regular expression, using a file of about 500 MB (about 23 million words) that consists mostly of the characters a and b.
These results include file reading; CeriumGrep was run with Blocked Read on 12 CPUs.

\begin{tiny}
\begin{table}[ht]
\begin{center}
\begin{tabular}[t]{|l|r|r|r|}
\hline
Regular expression & Matches & CeriumGrep time (s) & egrep time (s)\\
\hline
'(a \textbar b)*a(a \textbar b)(a \textbar b)' & about 19.5 million & 38.67 & 86.66 \\
\hline
'(a \textbar b)*a(a \textbar b)(a \textbar b)(a \textbar b)' & about 16.4 million & 38.72 & 94.25 \\
\hline
'(a \textbar b)*a(a \textbar b)(a \textbar b)(a \textbar b)(a \textbar b)' & about 16.4 million & 39.59 & 100.98 \\
\hline
'(a \textbar b)*a(a \textbar b)(a \textbar b)(a \textbar b)(a \textbar b)(a \textbar b)' & about 15.5 million & 38.68 & 104.82 \\
\hline
\end{tabular}
\caption{Results of grep as the number of states in the regular expression increases}
\label{table:abab}
\end{center}
\end{table}
\end{tiny}

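As background on why these patterns vary the number of states: when matched against a whole string over the alphabet $\{a, b\}$, the expression $(a|b)^{*}a(a|b)^{k}$ accepts exactly the strings whose $(k+1)$-th character from the end is $a$, and the minimal DFA for that language is known to have
\[
  2^{k+1}
\]
states. The four patterns in Table~\ref{table:abab} correspond to $k = 2, 3, 4, 5$, that is, to minimal DFAs with 8, 16, 32, and 64 states. This is a standard textbook property stated here for orientation only; the automata actually constructed by CeriumGrep and egrep are not described in this chapter and may differ in size.
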
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Table~\ref{table:nomatch} shows the results of pattern matching with the regular expression '(W \textbar w)ord', which never matches, against the file of about 500 MB (about 23 million words) that consists mostly of the characters a and b.

\begin{tiny}
\begin{table}[ht]
\begin{center}
\begin{tabular}[t]{|c|r|}
\hline
Method & Time (s)\\
\hline
CGrep & 27.130\\
\hline
CeriumGrep (12 CPUs), mmap & 21.576\\
\hline
CeriumGrep (12 CPUs), Blocked Read & 19.986\\
\hline
egrep & 28.332\\
\hline
\end{tabular}
\caption{Results of grep with a pattern that never matches}
\label{table:nomatch}
\end{center}
\end{table}
\end{tiny}

% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Table~\ref{table:abab}
%
% \begin{tiny}
% \begin{table}[ht]
% \begin{center}
% \begin{tabular}[t]{|r|r|r|r|}
% \hline
% Number of CPUs & egrep & mmap & Blocked Read\\
% \hline
% 1 & 83.09 & 57.65 & 40.49 \\
% \hline
% 2 & --- & 43.96 & 33.72 \\
% \hline
% 4 & --- & 33.37 & 34.26 \\
% \hline
% 8 & --- & 35.48 & 32.46 \\
% \hline
% \end{tabular}
% \caption{abab}
% \label{table:abab}
% \end{center}
% \end{table}
% \end{tiny}
%