Mercurial > hg > Members > masakoha > masa

<!DOCTYPE html>

<!--
  Google HTML5 slide template

  Authors: Luke Mahé (code)
           Marcin Wichary (code and design)

           Dominic Mazzoni (browser compatibility)
           Charles Chen (ChromeVox support)

  URL: http://code.google.com/p/html5slides/
-->

<html>
  <head>
    <title>2013-05-21</title>

    <meta charset='utf-8'>
    <script
      src='http://html5slides.googlecode.com/svn/trunk/slides.js'></script>
  </head>

  <style>
    /* Your individual styles here, or just use inline styles if that’s
       what you want. */
  .slides article { background-image: none !important; background-color: white; }


  </style>

  <body style='display: none'>

    <section class='slides layout-regular template-default'>

      <!-- Your slides (<article>s) go here. Delete or comment out the
           slides below.-->

      <article>
        <h1>
          Ceriumによる
          <br>
          正規表現マッチャの実装
        </h1>
        <p>
          Masataka Kohagura
          <br>
          14th May , 2013
        </p>
      </article>

      <article>
        <h3>
        研究目的
        </h3>
        <p>
        本研究室では、Cell用に作られたCeriumにて並列プログラミングを行なっている。様々な例題を実装することにより、どのような問題でも並列処理ができることを証明する。
        </p>
        <p>
        現在は文字列サーチを実装している段階で、ボイヤームーア法を実装している。
        セミグループという、分割したファイルに対して並列処理をさせるような手法によって、既存の文字列サーチと処理速度を比較し、どれだけ速くなるのかを測定する。
        </p>
      </article>

      <article class='smaller'>
        <h3>
        ボイヤームーア法とは(1)
        </h3>
        <p>
        </p>
		<div align="center">
		<IMG SRC="BM1.jpg" ALT="BM1">
		</div>
		<p>
        検索したい文字列(pattern data)の長さをmとする。
		</p>
        <p>
        pattarn dataとtext dataを、pattern dataの末尾から比較を行なっていく。<br>比較したtext dataの文字列がpattern dataの要素に含まれていない場合は、pattern dataをm個ずらし、再比較を行う。
        </p>
      </article>


      <article class='smaller'>
        <h3>
        ボイヤームーア法とは(2)
        </h3>
        <p>
        </p>
		<div align="center">
		<IMG SRC="BM2.jpg" ALT="BM1">
		</div>
		<p>
        比較したtext dataの文字がpattern dataの要素に含まれている場合は、pattern dataの末尾からその要素がどれだけ離れているかで決定される。この図であれば、末尾から2個離れているので、pattern dataを2個ずらし、最比較。
		</p>
      </article>

      <article class='smaller'>
        <h3>
        BM法をCで実装
        </h3>
        <section>
        <pre>
int BM_method(char *text,char *pattern,unsigned long long *match_string){
    int skip[256];
    int i,j,text_len,pattern_len;
    int k = 0;
    text_len = strlen(text);
    pattern_len = strlen(pattern);

    for (i = 0; i < 256; ++i){
        skip[i] = pattern_len;
    }
    for (i = 0; i < pattern_len-1 ; ++i){
        skip[(int)pattern[i]] = pattern_len - i - 1;
    }

    i = pattern_len - 1;
    while ( i < text_len){
        j = pattern_len - 1;
        while (text[i] == pattern[j]){
            if (j == 0){
                match_string[k] = text[i];
                k++;
            }
            --i,--j;
        }
        i = i + max((int)skip[(int)text[i]],pattern_len - j);
    }
    return -1;
}
        </pre>
      </article>

      <article class>
        <h3>
        分割時の処理について
        </h3>
        <p>現在実装している処理</p>
		<div align="center">
		<IMG SRC="search_idata.jpg" ALT="sid">
		</div>
		<p>
        i_dataを1文字だけ多く取ってきて、abが分割されたときでも結果が返ってきてくれるように設定している。
		</p>
      </article>

      <article class>
		<h3>
		現在の問題点
		</h3>
		<p>
        分割回りで結果が返ってこない。(上手く処理されていない）
		</p>
        <p>
        taskが2個以上になるとき、cpuの個数を2個以上に設定しないと結果が返ってこない。
        </p>
      </article>

  </body>
</html>
author	Masataka Kohagura <e085726@ie.u-ryukyu.ac.jp>
date	Mon, 03 Feb 2014 17:47:55 +0900
parents	e4748bca1eb3
children