annotate software/Torque.md @ 120:d03247694a4b

backup 2023-05-09
author autobackup
date Tue, 09 May 2023 00:10:03 +0900
parents b6c284fd5ae4
children
# Torque
## What is Torque?
Torque is a job scheduler. A job scheduler manages the jobs that are continually submitted to a cluster, using queues and similar mechanisms to schedule them.
When you run experiments on a cluster, you need to go through the job scheduler.
Otherwise, if other cluster users run their own processes at the same time, the jobs compete for CPUs and other resources, and results such as the speedup gained from adding machines cannot be measured accurately.

First, [install Torque](./Torque/install).
Now let's look at how to use Torque.

## Log in to the server where Torque is installed

```
ssh ie-user@tino-vm1.ads.ie.u-ryukyu.ac.jp
```

Ask the system administrators, or someone else, to bring this VM up for you (in my case I asked a senior student in the Nakamura (名嘉村) lab).

## Creating a queue (server side)
As a test, create a queue named "tqueue".
```
Qmgr: create queue tqueue queue_type = execution
Qmgr: set queue tqueue enabled = true
Qmgr: set queue tqueue started = true
Qmgr: set server default_queue = tqueue
Qmgr: set server scheduling = true
Qmgr: set queue tqueue resources_max.ncpus = 2
Qmgr: set queue tqueue resources_max.nodes = 2
```
The meaning of each command is as follows.
- `create queue <queue name> queue_type = <queue type>`
  - Creates a queue with the given name and type.
  - Specifying `E` or `execution` as the type creates an execution queue; `R` or `route` creates a routing queue.
- `set queue <queue name> <Attributes>`
  - Sets the given Attributes on the given queue.
  - Attributes set in the example:
    - `enabled = <true or false>`
      - When true, the queue accepts new jobs.
    - `started = <true or false>`
      - When true, jobs in the queue are allowed to run.
    - `resources_max.ncpus = <number of CPUs>`
      - Sets the number of CPUs allotted to the queue (the upper limit a job in the queue may request).
    - `resources_max.nodes = <number of nodes>`
      - Sets the number of nodes allotted to the queue (the upper limit a job in the queue may request).
- `set server <Attributes>`
  - Sets the given Attributes on the server.
  - Attributes set in the example:
    - `default_queue = <queue name>`
      - Makes the given queue the default queue.
    - `scheduling = <true or false>`
      - When true, enables scheduling.
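
The queue setup described above can be collected into a small script so that it is repeatable. The following is a minimal sketch, not part of the original notes: `print_queue_setup` is a hypothetical helper that only prints the qmgr directives from the example, so the output can be reviewed before it is given to qmgr.

```sh
#!/bin/sh
# Hypothetical helper (not part of Torque): prints the qmgr directives used in
# the example above for a given queue name, CPU limit and node limit.
print_queue_setup() {
    queue=$1
    ncpus=$2
    nodes=$3
    cat <<EOF
create queue $queue queue_type = execution
set queue $queue enabled = true
set queue $queue started = true
set server default_queue = $queue
set server scheduling = true
set queue $queue resources_max.ncpus = $ncpus
set queue $queue resources_max.nodes = $nodes
EOF
}

# Example: write the directives for the test queue to a file for review.
# print_queue_setup tqueue 2 2 > tqueue.qmgr
```

Since qmgr reads directives from standard input, the reviewed file could then be passed to it by redirection as a privileged user.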

Now check the settings made so far.

```
Qmgr: print server
```
This shows the server configuration. If the queue you created is listed, everything is working. You can also inspect a queue with the following command.

```
Qmgr: list queue <queue name>
```

### Saving the settings
Redirect the output to a file as follows.

`# echo "p s" | <Torque install directory>/bin/qmgr > <file name>`

Feeding this file back in makes configuration easier next time.

The file is also supplied by redirection.

`# <Torque install directory>/bin/qmgr < <file name>`
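
The save and restore steps can be wrapped in two small functions. This is a sketch under assumptions: the `QMGR` variable, the function names and the dated default file name are all hypothetical and only illustrate the redirections shown above.

```sh
#!/bin/sh
# QMGR is an assumed variable name; point it at the qmgr binary of your
# installation, e.g. <Torque install directory>/bin/qmgr.
QMGR=${QMGR:-qmgr}

# Save the current server settings ("p s" = print server) to a file.
# Prints the name of the file that was written.
save_qmgr_settings() {
    outfile=${1:-qmgr-settings-$(date +%Y%m%d).txt}
    echo "p s" | "$QMGR" > "$outfile" || return 1
    echo "$outfile"
}

# Feed a previously saved settings file back to qmgr by redirection.
restore_qmgr_settings() {
    "$QMGR" < "$1"
}
```

Both functions need to be run by a user who is allowed to administer the server, as in the commands above.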

# Torque Tutorial


## Before using Torque
Torque can be used on naha.ie.u-ryukyu.ac.jp.
To run experiments, you must set up key-based authentication so that you can ssh without a password. For the maui user this is already configured, so Torque can be used right away.

```
% ssh ie-user@tino-vm1.ads.ie.u-ryukyu.ac.jp
```

The password is managed by the server team, so ask them for it.

After logging in, you will find a Project folder in your home directory. Create a folder named after your student ID inside it and work there.

```
% mkdir student/eXX57XX
% cd student/eXX57XX
```

Addendum:
The Project folder is not shared across the distributed environment, so if you want to share files it is better to create a workspace under /mnt/data/* and work there.


## Running jobs on Torque
### Preparing to run a job
Create jobs.sh.

```
#!/bin/sh
# jobs.sh
echo hello
hostname
```

### Running the job
Torque jobs are submitted with the qsub command.

`% qsub jobs.sh`

When submitted this way, the job is processed on only one machine of the cluster.

### Job results
When the script finishes, two files are generated: jobs.sh.oXXX and jobs.sh.eXXX, where XXX is the job number assigned by Torque.
They contain the standard output and the standard error output, respectively.

```
% cat jobs.sh.oXXX
hello
cls001.cs.ie.u-ryukyu.ac.jp
```
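
The output file names can be derived from the identifier that qsub prints when a job is submitted. The sketch below assumes that identifier has the form `<number>.<server>`; `job_output_files` is a hypothetical helper, not a Torque command.

```sh
#!/bin/sh
# Hypothetical helper: derive the stdout/stderr file names from the job name
# and the identifier printed by qsub (assumed form: <number>.<server>).
job_output_files() {
    name=$1
    jobid=$2
    seq=${jobid%%.*}          # keep only the leading job number
    echo "$name.o$seq"
    echo "$name.e$seq"
}

# Example (requires a working Torque installation):
# jobid=$(qsub jobs.sh)
# job_output_files jobs.sh "$jobid"
```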

### Experiments using multiple nodes
To run an experiment on 10 machines, submit the job as follows.

```
% qsub -l nodes=10 jobs.sh
% cat jobs.sh.oXXX
hello
cls010.cs.ie.u-ryukyu.ac.jp
```

Specifying 10 nodes only makes 10 nodes available to the job; it does not mean the script ran on all 10.
Here cls010.cs is the parent node. You need to write the job so that this parent node (cls010.cs) issues commands to the other nodes (cls001-cls009).

For example, you need something like the following.

```
#!/bin/sh
#PBS -N ExampleJob
#PBS -l nodes=10,walltime=00:01:00
# jobs.sh: run hostname on every allocated node in parallel,
# then wait for all of the background ssh processes to finish.
for serv in $(cat "$PBS_NODEFILE")
do
    ssh "$serv" hostname &
done
wait
```

Comment lines that begin with #PBS are recognized as options to the qsub command.
- -N: names the job, so that output files are named like ExampleJob.oXXX.
- -l: job resource options, set in the form nodes=<number of nodes>, walltime=<time limit>.

The shell on the parent node is given an environment variable named $PBS_NODEFILE.

Example:
```
% echo $PBS_NODEFILE
/var/spool/torque/aux/XXX.naha.ie.u-ryukyu.ac.jp
% cat $PBS_NODEFILE
cls010.cs.ie.u-ryukyu.ac.jp
cls009.cs.ie.u-ryukyu.ac.jp
... (omitted) ...
cls001.cs.ie.u-ryukyu.ac.jp
```

The host on the first line of $PBS_NODEFILE is the parent node.
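
Because the parent node is simply the first line of the file, a job script can separate it from the remaining nodes with standard tools. The helper names below are hypothetical; this is a minimal sketch rather than a Torque feature.

```sh
#!/bin/sh
# Hypothetical helpers: split a node file into the parent node (first line)
# and the remaining nodes. The file defaults to $PBS_NODEFILE.
parent_node() {
    head -n 1 "${1:-$PBS_NODEFILE}"
}

other_nodes() {
    tail -n +2 "${1:-$PBS_NODEFILE}"
}

# Inside a job script this could be used as:
# for serv in $(other_nodes); do ssh "$serv" hostname & done; wait
```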

Running this script produced the following.

```
% qsub jobs.sh
% cat ExampleJob.oXXX
cls003.cs.ie.u-ryukyu.ac.jp
cls009.cs.ie.u-ryukyu.ac.jp
cls008.cs.ie.u-ryukyu.ac.jp
cls010.cs.ie.u-ryukyu.ac.jp
cls007.cs.ie.u-ryukyu.ac.jp
cls006.cs.ie.u-ryukyu.ac.jp
cls004.cs.ie.u-ryukyu.ac.jp
cls005.cs.ie.u-ryukyu.ac.jp
cls002.cs.ie.u-ryukyu.ac.jp
cls001.cs.ie.u-ryukyu.ac.jp
```

As you can see, the output shows the result of running the hostname command on each of the 10 nodes.
Alternatively, MPI can be used to distribute work to the other nodes.

## Experiments using 2 CPUs per node
The cluster machines have Core Duo processors, so up to 2 CPUs can be used per node. In other words, one node can run two processes at once.
In that case, submit the job as follows.
```
% qsub -l nodes=10:ppn=2 jobs.sh
```
In this case, each host is listed twice in $PBS_NODEFILE.
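
One way to confirm how many slots each host received is to count the repeated lines in the node file. The function name below is hypothetical; it is a small sketch built only on sort, uniq and awk.

```sh
#!/bin/sh
# Hypothetical helper: print "<count> <host>" for each host in a node file,
# which defaults to $PBS_NODEFILE.
slots_per_host() {
    sort "${1:-$PBS_NODEFILE}" | uniq -c | awk '{ print $1, $2 }'
}

# Inside a job submitted with ppn=2, each host should be reported with 2.
# slots_per_host
```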

## Running multiple tasks in one job
A single job can also run several instances of the same task.
In that case, submit the job as follows.
```
% qsub -t 1-3 -l nodes=10 jobs.sh
```
In this case, output is written separately for each of the 3 tasks.
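
To make each task do different work, the job script needs to know which task it is. The sketch below assumes Torque exports the task index of a `-t` job in the `PBS_ARRAYID` environment variable; the `input-N.txt` naming and the fallback index 1 are purely illustrative assumptions.

```sh
#!/bin/sh
# Hypothetical helper: choose a per-task input file from the array index.
# PBS_ARRAYID is assumed to be set by Torque for jobs submitted with -t;
# when it is unset (e.g. a local test run), index 1 is used instead.
task_input() {
    echo "input-${PBS_ARRAYID:-1}.txt"
}

# Inside jobs.sh this could be used as:
# ./experiment "$(task_input)"
```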


## Other useful commands
```
#!/bin/sh
#PBS -N ExampleJob
#PBS -l nodes=1,walltime=00:01:00
# jobs.sh: sleep first so the job stays visible long enough to observe.
sleep 10
echo hello
```

```
% qsub jobs.sh
```
The sleep keeps the job running longer for this test.
```
% qstat
Job id      Name         User   Time Use  S  Queue
----------  -----------  -----  --------  -  -----
XXX.naha    ExampleJob   maui          0  R  batch
```

You can check the state of a job this way.
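
When a script needs the state of one particular job, the listing can be filtered with awk. This is a minimal sketch: `job_state` is a hypothetical helper, and it assumes the state is the second-to-last field of each row, as in the listing above.

```sh
#!/bin/sh
# Hypothetical helper: read qstat's listing on standard input and print the
# state column (S) for the given job id. The state is assumed to be the
# second-to-last field of the matching row.
job_state() {
    awk -v id="$1" '$1 == id { print $(NF - 1) }'
}

# Example (requires a working Torque installation):
# qstat | job_state XXX.naha
```

An empty result means the job id was not found in the listing.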

## Error troubleshooting notes
When the number of nodes is set to a value from 4 to 9, the job fails for some reason with an error like the following.
```
qsub: Job exceeds queue resource limits MSG=cannot satisfy queue max nodes requirement
```

The workaround is to prefix a single-digit node count with 0, for example 05.
```
% qsub -l nodes=05 test.sh
```

This becomes necessary once the total number of registered nodes reaches two digits.
The error above can also appear even when the node count is correct. In that case, check the configuration of the queue you are using.
```
% qmgr -c "p s"
:
create queue cqueue
set queue cqueue queue_type = Execution
set queue cqueue resources_max.ncpus = 184
set queue cqueue resources_max.nodes = 46
set queue cqueue enabled = True
set queue cqueue started = True
```

Check the ncpus and nodes values. It may be better to set resources_max.nodect in addition to resources_max.nodes.
```
% sudo qmgr
set queue cqueue resources_max.nodect = 46
```
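
To check only the limits of one queue, the `qmgr -c "p s"` output can be filtered. The following is a sketch: `queue_limits` is a hypothetical helper that relies on the `set queue <name> resources_max.<resource> = <value>` line format shown above.

```sh
#!/bin/sh
# Hypothetical helper: read `qmgr -c "p s"` output on standard input and print
# the resources_max.* limits of one queue as "<resource> <value>".
queue_limits() {
    awk -v q="$1" '
        $1 == "set" && $2 == "queue" && $3 == q && $4 ~ /^resources_max\./ {
            sub(/^resources_max\./, "", $4)
            print $4, $6
        }'
}

# Example (requires a working Torque installation):
# qmgr -c "p s" | queue_limits cqueue
```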


## Using both the cr and cs Torque clusters as one
This describes how to use the Torque cluster built on cr together with the one on cs.

It takes the following two configuration changes:
- Change the client-side server_name file and config file to point at mass00.cs (the parent).
- Add the cr cluster's information to the nodes configuration of the parent Torque.

Stop pbs before making these changes.
```
% sudo /etc/init.d/torque stop
```
b6c284fd5ae4 backup 2020-12-16
autobackup
parents: 0
diff changeset
274 ### クライアント側のserver_nameファイルとconfigファイルをmass00.cs(親)の設定に変更する
0
e12992dca4a0 init from Growi
anatofuz <anatofuz@cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
275 変更するファイルは以下の2つ
2
b6c284fd5ae4 backup 2020-12-16
autobackup
parents: 0
diff changeset
276 ```
0
e12992dca4a0 init from Growi
anatofuz <anatofuz@cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
277 /var/spool/torque/server_name
e12992dca4a0 init from Growi
anatofuz <anatofuz@cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
278 /var/spool/torque/mom_priv/config
2
b6c284fd5ae4 backup 2020-12-16
autobackup
parents: 0
diff changeset
279 ```
0
e12992dca4a0 init from Growi
anatofuz <anatofuz@cr.ie.u-ryukyu.ac.jp>
parents:
diff changeset
280 それぞれの中身をみてみる
```
mass01(cr) % cat /var/spool/torque/server_name
mass00.cr.ie.u-ryukyu.ac.jp
mass01(cr) % cat /var/spool/torque/mom_priv/config
$logevent 0x1fff
$max_load 1.2
$ideal_load 1.0
$pbsserver mass00.cr.ie.u-ryukyu.ac.jp
$restricted mass00.cr.ie.u-ryukyu.ac.jp%
```
In both files, change mass00.cr shown above to mass00.cs.
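
The hostname swap described above could be scripted roughly as follows. This is a minimal sketch, not the procedure the cluster actually uses: the function name `switch_server`, the three variables, and the full parent hostname `mass00.cs.ie.u-ryukyu.ac.jp` are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: point a Torque client at another pbs_server by rewriting the
# hostname in server_name and mom_priv/config.
# TORQUE_HOME, OLD_SERVER and NEW_SERVER are assumed values; adjust them.
TORQUE_HOME="${TORQUE_HOME:-/var/spool/torque}"
OLD_SERVER="${OLD_SERVER:-mass00.cr.ie.u-ryukyu.ac.jp}"
NEW_SERVER="${NEW_SERVER:-mass00.cs.ie.u-ryukyu.ac.jp}"

switch_server() {
    # Escape the dots so sed treats the old hostname literally.
    old_re=$(printf '%s' "$OLD_SERVER" | sed 's/\./\\./g')
    for f in "$TORQUE_HOME/server_name" "$TORQUE_HOME/mom_priv/config"; do
        # Keep a backup copy, then write the rewritten file in its place.
        cp -p "$f" "$f.bak" || return 1
        sed "s/$old_re/$NEW_SERVER/g" "$f.bak" > "$f" || return 1
    done
}
```

Calling `switch_server` as root on a client, with Torque stopped, would rewrite both files and leave `.bak` copies next to them.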

### Add the cr cluster's information to the nodes settings of the parent's Torque
The file to change is `/var/spool/torque/server_priv/nodes`.

It lists the domain name of each client server and the number of CPUs to use.
```
% sudo cat /var/spool/torque/server_priv/nodes
mass02.cs.ie.u-ryukyu.ac.jp np=4
mass03.cs.ie.u-ryukyu.ac.jp np=4
mass04.cs.ie.u-ryukyu.ac.jp np=4
: (omitted)
mass46.cs.ie.u-ryukyu.ac.jp np=4
mass47.cs.ie.u-ryukyu.ac.jp np=4
```

Add the cr-side clients, such as mass01.cr.ie.u-ryukyu.ac.jp, to this file.
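
Appending a client to the nodes file could look like the sketch below. The helper name `add_node` is hypothetical, and using `np=4` for the cr machines is an assumption copied from the cs entries; the source does not state the CPU count of the cr nodes.

```shell
# Sketch: append "<host> np=<n>" to a Torque nodes file unless the host is
# already listed. add_node is a hypothetical helper, not a Torque command.
add_node() {
    nodes_file=$1   # e.g. /var/spool/torque/server_priv/nodes
    host=$2         # e.g. mass01.cr.ie.u-ryukyu.ac.jp
    np=$3           # processors to use on that host (assumed value)
    # Exact match on the first field, so dots in the hostname are literal.
    if awk -v h="$host" '$1 == h { found = 1 } END { exit !found }' \
        "$nodes_file" 2>/dev/null; then
        return 0
    fi
    printf '%s np=%s\n' "$host" "$np" >> "$nodes_file"
}
```

On the parent this would be run as root, for example `add_node /var/spool/torque/server_priv/nodes mass01.cr.ie.u-ryukyu.ac.jp 4`. Running it twice for the same host leaves a single entry.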

Once the configuration is done, run:
```
% sudo /etc/init.d/torque start
```
This starts Torque again.
Doing this by hand on every client is tedious, so use capistrano.
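
The capistrano configuration itself is not shown here. As a plain-shell stand-in for it, the same command can be sent to each client in a loop over ssh. In this sketch the function name `run_on_clients` and the `DRY_RUN` switch are hypothetical, and the host list is whatever you pass in.

```shell
# Sketch: run one command on every listed client over ssh.
# This is a stand-in for a capistrano task, not the capistrano setup itself.
run_on_clients() {
    cmd=$1; shift               # remaining arguments are the client hosts
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # Only print what would be executed.
            echo "ssh $host $cmd"
        else
            ssh "$host" "$cmd" || echo "failed on $host" >&2
        fi
    done
}
```

For example, `DRY_RUN=1; run_on_clients 'sudo /etc/init.d/torque stop' mass01.cr.ie.u-ryukyu.ac.jp` prints the ssh command instead of running it, which is a cheap way to check the host list first.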

## How to create a workspace
### Brute force (forcing it through as root)

For example, suppose you try to create a directory on ie-user@tino-vm1.ads.ie.u-ryukyu.ac.jp:
```
mkdir /mnt/data/fuga
```

This fails with a permission error. Creating the directory as root does not help either:
```
sudo mkdir /mnt/data/fuga
ls -l /mnt/data
合計 48
...
drwxr-xr-x 2 ie-user ie-user 4096 8月 9 2016 examples
drwxr-xr-x 2 nfsnobody nfsnobody 6 1月 24 06:08 fuga
...
```

As a result, the owner and group are wrong (nfsnobody instead of ie-user), so the other VMs cannot access fuga. (In fact, even on this VM it should be impossible to write files there without sudo.)

This directory is managed by tino-vm2, so log in there and change the ownership:
```
ssh tino-vm2.ads.ie.u-ryukyu.ac.jp
# /exports/data is the directory being mounted, so change fuga inside it
sudo chown -R ie-user:ie-user /exports/data/fuga
exit
```

Listing the directory again shows that the ownership has changed:
```
ls -l /mnt/data
合計 48
...
drwxr-xr-x 2 ie-user ie-user 4096 8月 9 2016 examples
drwxr-xr-x 2 ie-user ie-user 6 1月 24 06:08 fuga
...
```
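
Instead of reading the `ls -l` output by eye, the ownership check can be scripted. The following is a small sketch; the helper name `check_owner` is hypothetical, and it relies on GNU `stat` with the `-c` option, which is an assumption about the hosts.

```shell
# Sketch: report whether a directory is owned by the expected user and group.
# Relies on GNU stat (-c); check_owner is a hypothetical helper.
check_owner() {
    dir=$1
    want=$2                     # expected "user:group", e.g. ie-user:ie-user
    have=$(stat -c '%U:%G' "$dir") || return 2
    if [ "$have" = "$want" ]; then
        echo "ok: $dir is owned by $have"
    else
        echo "mismatch: $dir is owned by $have, expected $want" >&2
        return 1
    fi
}
```

Running `check_owner /mnt/data/fuga ie-user:ie-user` on each VM would confirm that the change made on tino-vm2 is visible through the mount.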

This is a brute-force approach, so there may be better ways. (Normally you should probably just ask the administrator.)

## Notes for Christie
Work inside @tinovm1:/mnt/data/christie-workspace/.
Whenever christie is updated, replace the jar file. (Apparently this can be automated with Jenkins.)

## References
http://docs.adaptivecomputing.com/torque/6-1-2/adminGuide/torque.htm#topics/torque/0-intro/introduction.htm%3FTocPath%3DChapter%25201%253A%2520Introduction%7C_____0