annotate paper/chapter3.tex @ 43:c153591bb3f7

add some image files
author Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
date Mon, 17 Feb 2014 22:51:58 +0900
parents 6bb9cae3f7e4
children cd95400bcaec
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
36
03c8790e34b5 add OUTLINE chapter3
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 11
diff changeset
1 \chapter{Cerium Task Manager を使用した例題}
0
9bf2694ed231 add some files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents:
diff changeset
2 \label{chap:poordirection}
9bf2694ed231 add some files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents:
diff changeset
3
11
5e67750b1c4f write chapter label
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 6
diff changeset
4 \section{File Read}
43
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
5 テキストファイルをある一定のサイズに分割して読み込むプログラムである。このプログラムでは、pread という関数で実装した。
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
6 pread 関数は UNIX 標準に関するヘッダファイル、unistd.h に含まれている関数である。(表\ref{table:pread})
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
7 読み込んだテキストファイルはバッファに格納されるが、その格納先は TaskManager の API でメモリを確保する。
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
8 \begin{tiny}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
9 \begin{table}[ht]
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
10 \begin{center}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
11 \label{table:pread}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
12 \small
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
13 ssize\_t pread(int fd, void *buf, size\_t nbyte, off\_t offset);
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
14 \begin{tabular}[t]{c|l}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
15 \hline
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
16 int fd & 読み込むファイルディスクリプタ \\
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
17 \hline
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
18 void *buf & 予め用意したバッファへの書き込み \\
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
19 \hline
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
20 size\_t nbyte & 読み込むサイズ \\
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
21 \hline
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
22 off\_t offset& ファイルの先頭からのオフセット \\
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
23 \hline
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
24 \end{tabular}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
25 \caption{pread 関数の概要}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
26 \end{center}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
27 \end{table}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
28 \end{tiny}
37
ce985cabf699 add OUTLINE chapter4
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 36
diff changeset
29
43
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
30 1GB のテキストファイルを分割して読み込み終わるまでの時間を表\ref{table:preaddata}に示す。
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
31 \begin{tiny}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
32 \begin{table}[ht]
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
33 \begin{center}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
34 \label{table:preaddata}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
35 \small
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
36 \begin{tabular}[t]{c|l}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
37 \hline
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
38 分割サイズ & 読み込み速度(s)\\
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
39 \hline
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
40 16KB & XX.XXX \\
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
41 \hline
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
42 16MB & XX.XXX \\
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
43 \hline
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
44 256MB & XX.XXX \\
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
45 \hline
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
46 \end{tabular}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
47 \caption{file read の実行結果}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
48 \end{center}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
49 \end{table}
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
50 \end{tiny}
37
ce985cabf699 add OUTLINE chapter4
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 36
diff changeset
51
43
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
52 分割サイズを大きくすると、pread の呼ばれる回数が少なくなるので読み込むことが速くなる。
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
53 しかし、ある一定以上の大きさになると I/O ネックが勝ってしまい、読み込み速度が変わらない。
37
ce985cabf699 add OUTLINE chapter4
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 36
diff changeset
54
36
03c8790e34b5 add OUTLINE chapter3
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 11
diff changeset
55 \section{Boyer-Moore String Search Algorithm}
43
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
56 読み込んだテキストファイルに対して文字列検索を行う。文字列検索のアルゴリズムとして、Boyer-Moore String Search を実装した。
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
57 このアルゴリズムは 1977 年に Robert S. Boyer と J Strother Moore が開発した効率的なアルゴリズムである。
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
58
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
59 Boyer-Moore String Search を紹介する前に、文字列検索で比較的単純なアルゴリズムである力任せ法を紹介する。
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
60 なお、テキストファイルに含まれている文字列を text、検索する文字列を pattern と定義する。
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
61
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
62 力任せ法(総当り法とも呼ばれる)は、text と pattern を先頭から比較していき、pattern と一致しなければ pattern を1文字分だけ後ろにずらして再度比較をしていくアルゴリズムである。
37
ce985cabf699 add OUTLINE chapter4
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 36
diff changeset
63
ce985cabf699 add OUTLINE chapter4
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 36
diff changeset
64 ・ IO を含む Task なので、ここで説明
ce985cabf699 add OUTLINE chapter4
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 36
diff changeset
65
43
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
66 ・ BM Search の概要
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
67
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
68 BM Search は XXX が XXX 年に発表した文字列探索アルゴリズム
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
69
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
70 ほかにもいろいろ
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
71
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
72 力まかせ法についても少し書くか
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
73
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
74 [image]BM Search の skip table の説明と図
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
75
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
76 [image]BM Search の実行時の説明と図
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
77
c153591bb3f7 add some image files
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 42
diff changeset
78 ・ BM Search をテキストにかけて、検索文字列が含まれてい数をカウントする。
42
6bb9cae3f7e4 add images
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 37
diff changeset
79
6bb9cae3f7e4 add images
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 37
diff changeset
80 図\ref{fig:includeio}
6bb9cae3f7e4 add images
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 37
diff changeset
81
6bb9cae3f7e4 add images
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 37
diff changeset
82
6bb9cae3f7e4 add images
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 37
diff changeset
83 \begin{figure}[htbp]
6bb9cae3f7e4 add images
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 37
diff changeset
84 \begin{center}
6bb9cae3f7e4 add images
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 37
diff changeset
85 \includegraphics[width=1.0\textwidth]{fig/includeio.pdf}
6bb9cae3f7e4 add images
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 37
diff changeset
86 \end{center}
6bb9cae3f7e4 add images
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 37
diff changeset
87 \caption{[image]IO を含む Task の図}
6bb9cae3f7e4 add images
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 37
diff changeset
88 \label{fig:includeio}
6bb9cae3f7e4 add images
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 37
diff changeset
89 \end{figure}
37
ce985cabf699 add OUTLINE chapter4
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 36
diff changeset
90
ce985cabf699 add OUTLINE chapter4
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 36
diff changeset
91
ce985cabf699 add OUTLINE chapter4
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 36
diff changeset
92 ・ [image]分割で読み込むときに分割サイズを間違えると正しい結果が返ってこない。
ce985cabf699 add OUTLINE chapter4
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 36
diff changeset
93
36
03c8790e34b5 add OUTLINE chapter3
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 11
diff changeset
94 ・ 実験結果を載っけたい
37
ce985cabf699 add OUTLINE chapter4
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 36
diff changeset
95
36
03c8790e34b5 add OUTLINE chapter3
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 11
diff changeset
96 実験結果どうしよう
37
ce985cabf699 add OUTLINE chapter4
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 36
diff changeset
97
36
03c8790e34b5 add OUTLINE chapter3
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 11
diff changeset
98 文字列検索して数を数えるプログラムだが、それに対応する Mac のコマンドが、grep -c [検索文字列]
37
ce985cabf699 add OUTLINE chapter4
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 36
diff changeset
99
36
03c8790e34b5 add OUTLINE chapter3
Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
parents: 11
diff changeset
100 比較対象をどうしようかしら