# HG changeset patch # User masa # Date 1455800382 -32400 # Node ID 94ec38d9bdc6b805f101f5cd8b4fc0af1d33cff0 # Parent f58bbc4a42f8146b99b1e93d89c64e0dd5da1fbe add abstruct eng diff -r f58bbc4a42f8 -r 94ec38d9bdc6 paper/abstract.tex --- a/paper/abstract.tex Thu Feb 18 21:39:33 2016 +0900 +++ b/paper/abstract.tex Thu Feb 18 21:59:42 2016 +0900 @@ -8,31 +8,26 @@ それぞれの例題によって結果の整合性を取る必要があるが、どのように整合性を取るかは問題によって考慮する必要がある。 本研究で実装した正規表現は、正規表現から正規表現木を生成し、その正規表現木に状態を割り振りながらNFA を生成する。 -もし NFA が生成した場合は Subset Construction で DFA に変換する。 -そして、DFA の生成後、ファイルとマッチングさせる。 +NFA を Subset Construction で DFA に変換する。 +そして、ファイルとマッチングさせる。 +マッチングはマッチの Range と次の状態の組の配列(tState)を生成し、tState を状態遷移させておこなう。 +並列処理はブロック単位で行うが、ブロックにまたがる Pattern はあとでやり直すというシンプルな方法を採用した。 それぞれの例題と Mac に built-in されている wc、egrep と比較し測定を行なった。 \end{abstract} \begin{abstract_eng} Cerium is a parallel programming framework, which developed by our laboratory. -We implemented file read with mmap in parallel string search. -It is easy to use mmap system call to read from file, but current implementation of mmap sometimes does work well. -So we implement `Blocked Read', that is reading file separated by a certain size. - -Examples of parallel string search are included in Cerium. -They are Word Count, Boyer-Moore String Search and Regular Explession. - -It needs to check for correct result in devided file every examples. +We implemented parallel file processing using both mmap and multi threaded read. +The multi-threaded read uses Blocked Read. The parallel proccesing is performed for each block as a map task. +The results of map tasks are summed in a reduce task. -Implemented Regular Explession sequential execute. -At first, convert Regular Explession to binary tree. -Second, allocated state to binary tree. -Third, convert state with Subset Construction. -Finally, implement parallel regular expression. +Examples of parallel string proccesing in Cerium are including Word Count, Boyer-Moore Search and Regular Expression matching. +Regular expression matching uses DFA generation (Subset Construction). +The matching is performed by array of tState, which are pair of matching range of characters and next tState. +The parallel proccesing of the matching is per-block based. +In case of matching patterns which acrosses blocks, the matching processes are executed in the reduce task. -We evaluate examples of Cerium, wc of Mac built-in, and egrep. - - +We compare examples of Cerium with UNIX wc and egrep. \end{abstract_eng}