R.A. Epigonos et al.



[Memo] Trying to build a computer cluster



For now, just a list of keywords: openMosix, ClusterKnoppix, OpenSSI, Kerrighed, GridEngine, Rocks Clusters, SCore, MIKO GNYO/Linux, switching hub, SMP machine (a machine with multiple processors), Beowulf

  1. Misshi~'s Research Life: Building a Linux HPC Cluster (Part 2)
  2. Amazon.com: High Performance Linux Clusters with OSCAR, Rocks, OpenMosix, and MPI (Nutshell Handbooks): Joseph Sloan: Books
  3. Grid Engine load-balancing software
  4. rubyneko - Attended the 10th Kansai Debian Study Meeting
  5. PC Cluster Consortium
  6. Building a compute cluster with OpenMosix
  7. MIKO GNYO/Linux
  8. MIKO GNYO/Linux: Search results
  9. kuroyagi's notebook

SSI (Single System Image) environments

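From what I could find, the options appear to be the ones listed below. Several distributions are derived from them. ClusterKnoppix is a Knoppix that uses a kernel with openMosix built in. What matters to me is whether nodes can be added and removed while the cluster is running.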

  1. openMosix (LinuxPMI)
  2. Kerrighed
  3. OpenSSI

  1. openmosix|Kerrighed|OpenSSI|LinuxPMI - Google Search
  2. Open source cluster management systems - SourceForge.JP Magazine
  3. Linux.com :: A survey of open source cluster management systems
  4. Building a heterogeneous cluster with coLinux and openMosix
  5. Slashdot Japan | How about an HPC cluster with openMosix?


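For now I will try Kerrighed, the most recently updated of the three listed above. The environment is built on Debian etch. First, I compile the kernel while following "Installing Kerrighed 2.3.0 - Kerrighed". I have a feeling I installed a lot of unnecessary packages, though.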

$ su -
# apt-get install xmlto
# apt-get install lsb
# apt-get install rsync
# apt-get install pkg-config
# apt-get install libtool
# apt-get install gcc
# apt-get install bzip2
# cd /usr/src/
# wget http://kerrighed.gforge.inria.fr/kerrighed-latest.tar.gz
# wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.20.tar.bz2
# ls
kerrighed-2.3.0.tar.gz  linux-2.6.20.tar.bz2
# tar zxf kerrighed-2.3.0.tar.gz
# tar jxf linux-2.6.20.tar.bz2
# ls
kerrighed-2.3.0  kerrighed-2.3.0.tar.gz  linux-2.6.20  linux-2.6.20.tar.bz2
# cd kerrighed-2.3.0
# ./configure --with-kernel=/usr/src/linux-2.6.20
# make patch
# make defconfig
# make kernel
# make
# make kernel-install
# make install
# ls -l /boot/vmlinuz-2.6.20-krg
-rw-r--r-- 1 root root 2488432 Jan  4 12:38 /boot/vmlinuz-2.6.20-krg
# ls -l /boot/System.map
lrwxrwxrwx 1 root root 21 Jan  4 12:38 /boot/System.map -> System.map-2.6.20-krg
# ls -l /lib/modules/2.6.20-krg
total 52
lrwxrwxrwx 1 root root   21 Jan  4 12:38 build -> /usr/src/linux-2.6.20
drwxr-xr-x 2 root root 4096 Jan  4 12:49 extra
drwxr-xr-x 2 root root 4096 Jan  4 12:38 kernel
-rw-r--r-- 1 root root   45 Jan  4 12:49 modules.alias
-rw-r--r-- 1 root root   69 Jan  4 12:49 modules.ccwmap
-rw-r--r-- 1 root root   44 Jan  4 12:49 modules.dep
-rw-r--r-- 1 root root   73 Jan  4 12:49 modules.ieee1394map
-rw-r--r-- 1 root root  141 Jan  4 12:49 modules.inputmap
-rw-r--r-- 1 root root   81 Jan  4 12:49 modules.isapnpmap
-rw-r--r-- 1 root root   74 Jan  4 12:49 modules.ofmap
-rw-r--r-- 1 root root   99 Jan  4 12:49 modules.pcimap
-rw-r--r-- 1 root root   43 Jan  4 12:49 modules.seriomap
-rw-r--r-- 1 root root 3217 Jan  4 12:49 modules.symbols
-rw-r--r-- 1 root root  189 Jan  4 12:49 modules.usbmap
lrwxrwxrwx 1 root root   21 Jan  4 12:38 source -> /usr/src/linux-2.6.20
# ls -l /etc/default/kerrighed
-rwxr-xr-x 1 root root 327 Jan  4 12:49 /etc/default/kerrighed
# ls -lR /usr/local/share/man*
total 36
drwxr-sr-x 2 root staff 4096 Jan  4 12:49 man1
drwxr-sr-x 2 root staff 4096 Jan  4 12:49 man2
drwxr-sr-x 2 root staff 4096 Jan  4 12:49 man3
drwxr-sr-x 2 root staff 4096 Jan  4 12:49 man4
drwxr-sr-x 2 root staff 4096 Jan  4 12:49 man5
drwxr-sr-x 2 root staff 4096 Jan  4 12:49 man6
drwxr-sr-x 2 root staff 4096 Jan  4 12:49 man7
drwxr-sr-x 2 root staff 4096 Jan  4 12:49 man8
drwxr-sr-x 2 root staff 4096 Jan  4 12:49 man9
total 20
-rw-r--r-- 1 root staff  886 Jan  4 12:49 checkpoint.1
-rw-r--r-- 1 root staff 1314 Jan  4 12:49 krgadm.1
-rw-r--r-- 1 root staff 2334 Jan  4 12:49 krgcapset.1
-rw-r--r-- 1 root staff  813 Jan  4 12:49 migrate.1
-rw-r--r-- 1 root staff  894 Jan  4 12:49 restart.1
total 12
-rw-r--r-- 1 root staff 1322 Jan  4 12:49 krgcapset.2
-rw-r--r-- 1 root staff 1349 Jan  4 12:49 migrate.2
-rw-r--r-- 1 root staff 1248 Jan  4 12:49 migrate_self.2
total 0
total 0
total 4
-rw-r--r-- 1 root staff 1838 Jan  4 12:49 kerrighed_nodes.5
total 0
total 8
-rw-r--r-- 1 root staff 2055 Jan  4 12:49 kerrighed.7
-rw-r--r-- 1 root staff 2900 Jan  4 12:49 kerrighed_capabilities.7
total 0
total 0
node01:/usr/src/kerrighed-2.3.0# ls -l /usr/local/bin/krgadm
-rwxr-xr-x 1 root staff 21315 Jan  4 12:49 /usr/local/bin/krgadm
node01:/usr/src/kerrighed-2.3.0# ls -l /usr/local/bin/krgcapset
-rwxr-xr-x 1 root staff 21058 Jan  4 12:49 /usr/local/bin/krgcapset
node01:/usr/src/kerrighed-2.3.0# ls -l /usr/local/bin/migrate
-rwxr-xr-x 1 root staff 11358 Jan  4 12:49 /usr/local/bin/migrate
node01:/usr/src/kerrighed-2.3.0# ls -l /usr/local/lib/libkerrighed.*
-rw-r--r-- 1 root staff 36258 Jan  4 12:49 /usr/local/lib/libkerrighed.a
-rwxr-xr-x 1 root staff   843 Jan  4 12:49 /usr/local/lib/libkerrighed.la
lrwxrwxrwx 1 root staff    21 Jan  4 12:49 /usr/local/lib/libkerrighed.so -> libkerrighed.so.1.0.0
lrwxrwxrwx 1 root staff    21 Jan  4 12:49 /usr/local/lib/libkerrighed.so.1 -> libkerrighed.so.1.0.0
-rwxr-xr-x 1 root staff 28805 Jan  4 12:49 /usr/local/lib/libkerrighed.so.1.0.0
node01:/usr/src/kerrighed-2.3.0# ls -l /usr/local/include/kerrighed
total 56
-rw-r--r-- 1 root staff   810 Jan  4 12:49 capabilities.h
-rw-r--r-- 1 root staff   840 Jan  4 12:49 capability.h
-rw-r--r-- 1 root staff   601 Jan  4 12:49 checkpoint.h
-rw-r--r-- 1 root staff   197 Jan  4 12:49 comm.h
-rw-r--r-- 1 root staff  1054 Jan  4 12:49 hotplug.h
-rw-r--r-- 1 root staff   233 Jan  4 12:49 kerrighed.h
-rw-r--r-- 1 root staff 13742 Jan  4 12:49 kerrighed_tools.h
-rw-r--r-- 1 root staff  1163 Jan  4 12:49 krgnodemask.h
-rw-r--r-- 1 root staff  1459 Jan  4 12:49 proc.h
-rw-r--r-- 1 root staff   405 Jan  4 12:49 process_group_types.h
-rw-r--r-- 1 root staff  1494 Jan  4 12:49 types.h
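The long run of `ls -l` checks above could be condensed into a loop. This is only a minimal sketch: `check_paths` is a hypothetical helper of my own (not part of Kerrighed), and the path list is copied from the listings above, so it applies to this 2.3.0 / 2.6.20 build only.

```shell
#!/bin/sh
# check_paths: report which of the given paths exist.
# Returns 0 if all exist, 1 if any is missing.
# (Hypothetical helper, not shipped with Kerrighed.)
check_paths() {
    missing=0
    for p in "$@"; do
        if [ -e "$p" ]; then
            printf 'ok       %s\n' "$p"
        else
            printf 'MISSING  %s\n' "$p"
            missing=1
        fi
    done
    return "$missing"
}

# Paths taken from the listings above; adjust for other versions.
KRG_PATHS="/boot/vmlinuz-2.6.20-krg
/lib/modules/2.6.20-krg
/etc/default/kerrighed
/usr/local/bin/krgadm
/usr/local/bin/krgcapset
/usr/local/bin/migrate
/usr/local/lib/libkerrighed.so
/usr/local/include/kerrighed"

# Example (unquoted on purpose so the list splits on newlines):
#   check_paths $KRG_PATHS || echo "installation looks incomplete"
```

The call itself is left as a comment so that pasting the block has no effect until it is invoked on purpose.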
# mkinitramfs -o /boot/initrd.img-2.6.20-krg 2.6.20-krg
# vi /boot/grub/menu.lst
default 3
title           Debian GNU/Linux, kernel 2.6.20-krg
root            (hd0,0)
kernel          /boot/vmlinuz-2.6.20-krg root=/dev/hda1 ro session_id=1
initrd          /boot/initrd.img-2.6.20-krg
# ifconfig
# echo "session=1">> /etc/kerrighed_nodes
# echo "nbmin=1">> /etc/kerrighed_nodes
# echo "">> /etc/kerrighed_nodes
# cat /etc/kerrighed_nodes
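Re-running the `echo ... >>` lines above appends the same settings again. A minimal sketch of an idempotent variant follows; `append_once` is a hypothetical helper, and it writes to a scratch file by default so it can be tried without touching /etc. Only the `session=1` and `nbmin=1` lines from this memo are used; per-node entries are not shown because they do not appear above.

```shell
#!/bin/sh
# append_once FILE LINE: append LINE to FILE unless an identical line exists.
# (Hypothetical helper for illustration.)
append_once() {
    file=$1
    line=$2
    # -x: match the whole line, -F: fixed string, -q: quiet
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Defaults to a scratch file; set NODES_FILE=/etc/kerrighed_nodes (as root)
# to apply it for real.
NODES_FILE=${NODES_FILE:-$(mktemp)}
append_once "$NODES_FILE" "session=1"
append_once "$NODES_FILE" "nbmin=1"
append_once "$NODES_FILE" "session=1"   # second call is a no-op
cat "$NODES_FILE"
```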


  1. ocs/Howto/Kerrighed - Mandriva Community Wiki
  2. kerrighed installation how to | In da Wok ......
  3. Installing Kerrighed 2.2.0 - Kerrighed
  4. Main Page - Kerrighed
  5. grub menu.lst default - Google Search
  6. Setting the default boot OS when dual-booting with GRUB
  7. session_id kerrighed menu.lst - Google Search
  8. Tutorial: Kerrighed | Bioinformatics
  9. krg_DRBL - Grid Architecture - Trac
  10. Introduction to Linux Installation and Basic Administration

Sun Grid Engine


While I am at it, I will get the latest version, Sun Grid Engine 6.2. Downloading it requires a Sun account; older versions do not. I download the Linux version.

  1. Sun Grid Engine feature details
  2. How to use Sun Grid Engine (SGE) | Supercomputer | Human Genome Center
  3. gridengine: Home
  4. gridengine: Grid Engine HOWTOs



# mkdir -p /opt/sge62
# export SGE_ROOT=/opt/sge62
# useradd sgeadmin
# tar zxf ge62_lx24-x86.tar.gz

  1. Grid computing on Ubuntu - May the Source be with you

Installing Sun Grid Engine (second attempt)


$ ls

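There is a very helpful installation manual, so I follow it. It is in English but easy to understand. It is basically written as the procedure for installing software shipped on CD-ROM, so I adapt those parts as I go. The machine is x86 and I want to install with the tar method, so I unpack the file sge62u2_1_linux24-i586_targz.zip. This creates the sge6_2u2_1/ directory, which contains the common and architecture-dependent binary archives mentioned in the manual.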

$ unzip sge62u2_1_linux24-i586_targz.zip
$ ls sge6_2u2_1/
$ pwd


$ su -
# mkdir -p /opt/sge6-2
# cd /opt/sge6-2
# tar zxf /usr/src/sge6_2u2_1/sge-6_2u2-common.tar.gz
# tar zxf /usr/src/sge6_2u2_1/sge-6_2u2_1-bin-linux24-i586.tar.gz
# ls 
3rd_party  doc       include        install_qmaster  mpi   start_gui_installer
catman     dtrace    inst_sge       lib              pvm   util
ckpt       examples  install_execd  man              qmon


# export SGE_ROOT='/opt/sge6-2'
# printenv SGE_ROOT


# util/setfileperm.sh $SGE_ROOT


                    WARNING WARNING WARNING
We will set the the file ownership and permission to

   UserID:         0
   GroupID:        0
   In directory:   /opt/sge6-2

We will also install the following binaries as SUID-root:


Do you want to set the file permissions (yes/no) [NO] >> yes


Verifying and setting file permissions and owner in >3rd_party<
Verifying and setting file permissions and owner in >bin<
Verifying and setting file permissions and owner in >ckpt<
Verifying and setting file permissions and owner in >dtrace<
Verifying and setting file permissions and owner in >examples<
Verifying and setting file permissions and owner in >inst_sge<
Verifying and setting file permissions and owner in >install_execd<
Verifying and setting file permissions and owner in >install_qmaster<
Verifying and setting file permissions and owner in >lib<
Verifying and setting file permissions and owner in >mpi<
Verifying and setting file permissions and owner in >pvm<
Verifying and setting file permissions and owner in >qmon<
Verifying and setting file permissions and owner in >util<
Verifying and setting file permissions and owner in >utilbin<
Verifying and setting file permissions and owner in >catman<
Verifying and setting file permissions and owner in >doc<
Verifying and setting file permissions and owner in >include<
Verifying and setting file permissions and owner in >man<

Your file permissions were set

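Next, choose between the GUI installation and the command-line installation. Here I go with the command-line installation. I read the notes in the manual; for a fresh install on Linux there do not seem to be any problems. The manual's to-do list has two items.

  1. Run the installation script on the master host and on every execution host.
  2. Register the administration hosts and the hosts that submit jobs.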
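The first item could be sketched as a loop over hosts. This is a dry run that only prints the commands. The assumptions are mine: `node02` is a hypothetical host name, `$SGE_ROOT` is available at the same path on every host, and root can log in over ssh. Both installers are interactive, so in practice each command would be run from a terminal (for example with `ssh -t`).

```shell
#!/bin/sh
# Hypothetical host names; replace with the real master and execution hosts.
MASTER_HOST=master01
EXEC_HOSTS="node01 node02"
SGE_ROOT=${SGE_ROOT:-/opt/sge6-2}

# plan_install: print, one per line, the command to run on each host.
# install_qmaster runs once on the master, install_execd on every
# execution host. Nothing is executed here.
plan_install() {
    echo "ssh root@$MASTER_HOST 'cd $SGE_ROOT && ./install_qmaster'"
    for h in $EXEC_HOSTS; do
        echo "ssh root@$h 'cd $SGE_ROOT && ./install_execd'"
    done
}

plan_install
```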


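  1. Messages encrypted with the CSP protocol are exchanged between hosts.
  2. The secret key is exchanged using a public-key protocol.
  3. Encryption is performed transparently.
  4. An encrypted session is valid only for a certain time after the session starts.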


I also read "Installing SMF Services", but it appears to be a feature for Solaris 10, so I skip it.



# ./install_qmaster


Do you agree with that license? (y/n) [n] >> y


Welcome to the Grid Engine installation

Grid Engine qmaster host installation

Before you continue with the installation please read these hints:

   - Your terminal window should have a size of at least
     80x24 characters

   - The INTR character is often bound to the key Ctrl-C.
     The term >Ctrl-C< is used during the installation if you
     have the possibility to abort the installation

The qmaster installation procedure will take approximately 5-10 minutes.

Hit <RETURN> to continue >>


Unsupported local hostname

The current hostname is resolved as follows:

Hostname: localhost
Aliases:  hoge
Host Address(es):

It is not supported for a Grid Engine installation that the local hostname
contains the hostname "localhost" and/or the IP address "127.0.x.x" of the
loopback interface.
The "localhost" hostname should be reserved for the loopback interface
("") and the real hostname should be assigned to one of the
physical or logical network interfaces of this machine.

Installation failed.

Press <RETURN> to exit the installation procedure >>


# cat /etc/hosts
       localhost       hoge

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

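So I edit it. It is enough to give a name to the address assigned to the Ethernet adapter shown by ifconfig. Until now the hostname hoge was assigned (as an alias) to 127.0.0.1; I change it to be the hostname for 192.168.14.6. This machine has four eth adapters, so I add entries for the others as well.

      localhost   hoge    hoge1    hoge2    hoge3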

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
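Before re-running install_qmaster, it may be worth checking that the hostname no longer resolves to the loopback interface, since that is what the installer rejected above. A minimal sketch, assuming a glibc system where `getent` is available; `is_loopback` and `resolve_host` are hypothetical helper names.

```shell
#!/bin/sh
# is_loopback ADDR: succeed if ADDR is an IPv4 (127.x.x.x) or IPv6 (::1)
# loopback address.
is_loopback() {
    case $1 in
        127.*|::1) return 0 ;;
        *)         return 1 ;;
    esac
}

# resolve_host NAME: print the first address NAME resolves to (via getent).
resolve_host() {
    getent hosts "$1" | awk 'NR == 1 { print $1 }'
}

name=$(uname -n)
addr=$(resolve_host "$name" 2>/dev/null) || addr=""
if [ -z "$addr" ]; then
    echo "$name does not resolve"
elif is_loopback "$addr"; then
    echo "$name resolves to loopback ($addr); install_qmaster will reject this"
else
    echo "$name resolves to $addr"
fi
```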


# ./install_qmaster
Choosing Grid Engine admin user account

You may install Grid Engine that all files are created with the user id of an
unprivileged user.

This will make it possible to install and run Grid Engine in directories
where user >root< has no permissions to create and write files and directories.

   - Grid Engine still has to be started by user >root<

   - this directory should be owned by the Grid Engine administrator

Do you want to install Grid Engine
under an user id other than >root< (y/n) [y] >> y

Here I am supposed to enter the SGE admin user name, but I forgot to create the admin user beforehand, so I quit with Ctrl+C.

Choosing a Grid Engine admin user name

Please enter a valid user name >>


# adduser sgeadmin
Adding user `sgeadmin' ...
Adding new group `sgeadmin' (1001) ...
Adding new user `sgeadmin' (1001) with group `sgeadmin' ...
Creating home directory `/home/sgeadmin' ...
Copying files from `/etc/skel' ...
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Changing the user information for sgeadmin
Enter the new value, or press ENTER for the default
        Full Name []:
        Room Number []:
        Work Phone []:
        Home Phone []:
        Other []:
Is the information correct? [Y/n] Y


# ./install_qmaster
Choosing a Grid Engine admin user name

Please enter a valid user name >> sgeadmin

Installing Grid Engine as admin user >sgeadmin<

Hit <RETURN> to continue >>


Checking $SGE_ROOT directory

The Grid Engine root directory is:

   $SGE_ROOT = /opt/sge6-2

If this directory is not correct (e.g. it may contain an automounter
prefix) enter the correct path to this directory or hit <RETURN>
to use default [/opt/sge6-2] >>

Your $SGE_ROOT directory: /opt/sge6-2

Hit <RETURN> to continue >>


Grid Engine TCP/IP communication service

The port for sge_qmaster is currently set as service.

   sge_qmaster service set to port 6444

Now you have the possibility to set/change the communication ports by using the
>shell environment< or you may configure it via a network service, configured
in local >/etc/service<, >NIS< or >NIS+<, adding an entry in the form

    sge_qmaster <port_number>/tcp

to your services database and make sure to use an unused port number.

How do you want to configure the Grid Engine communication ports?

Using the >shell environment<:                           [1]

Using a network service like >/etc/service<, >NIS/NIS+<: [2]

(default: 2) >>

This means the sge_qmaster service is used as the means of communication with Grid Engine. Press Enter to move on.

Grid Engine TCP/IP service >sge_qmaster<

Using the service


for communication with Grid Engine.

Hit <RETURN> to continue >>


Grid Engine TCP/IP communication service

The port for sge_execd is currently set as service.

   sge_execd service set to port 6445

Now you have the possibility to set/change the communication ports by using the
>shell environment< or you may configure it via a network service, configured
in local >/etc/service<, >NIS< or >NIS+<, adding an entry in the form

    sge_execd <port_number>/tcp

to your services database and make sure to use an unused port number.

How do you want to configure the Grid Engine communication ports?

Using the >shell environment<:                           [1]

Using a network service like >/etc/service<, >NIS/NIS+<: [2]

(default: 2) >>


Grid Engine TCP/IP communication service

Using the service


for communication with Grid Engine.

Hit <RETURN> to continue >>
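The two screens above rely on `sge_qmaster` and `sge_execd` entries in the services database (on Linux, the file /etc/services), in the form `name port/tcp`. A minimal sketch for looking up what is registered; `service_port` is a hypothetical helper, and it reads only the local file, not NIS.

```shell
#!/bin/sh
# service_port NAME [FILE]: print the TCP port registered for NAME in a
# services(5)-format file (default /etc/services); prints nothing if absent.
service_port() {
    awk -v svc="$1" '
        { sub(/#.*/, "") }                 # strip trailing comments
        $1 == svc && $2 ~ /\/tcp$/ {       # line of the form "name port/tcp"
            split($2, a, "/")
            print a[1]
            exit
        }' "${2:-/etc/services}"
}

for svc in sge_qmaster sge_execd; do
    port=$(service_port "$svc" 2>/dev/null) || port=""
    echo "$svc: ${port:-not registered}"
done
```

With the values reported by the installer in this memo, the two lines would read 6444 and 6445.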


Grid Engine cells

Grid Engine supports multiple cells.

If you are not planning to run multiple Grid Engine clusters or if you don't
know yet what is a Grid Engine cell it is safe to keep the default cell name


If you want to install multiple cells you can enter a cell name now.

The environment variable


will be set for all further Grid Engine commands.

Enter cell name [default] >>

Using cell >default<.
Hit <RETURN> to continue >>


Unique cluster name

The cluster name uniquely identifies a specific Sun Grid Engine cluster.
The cluster name must be unique throughout your organization. The name
is not related to the SGE cell.

The cluster name must start with a letter ([A-Za-z]), followed by letters,
digits ([0-9]), dashes (-) or underscores (_).

Enter new cluster name or hit <RETURN>
to use default [p6444] >>
creating directory: /opt/sge6-2/default/common


Hit <RETURN> to continue >>


Grid Engine qmaster spool directory

The qmaster spool directory is the place where the qmaster daemon stores
the configuration and the state of the queuing system.

The admin user >sgeadmin< must have read/write access
to the qmaster spool directory.

If you will install shadow master hosts or if you want to be able to start
the qmaster daemon on other hosts (see the corresponding section in the
Grid Engine Installation and Administration Manual for details) the account
on the shadow master hosts also needs read/write access to this directory.

The following directory


will be used as qmaster spool directory by default!

Do you want to select another qmaster spool directory (y/n) [n] >>


Windows Execution Host Support

Are you going to install Windows Execution Hosts? (y/n) [n] >>


Verifying and setting file permissions

Did you install this version with >pkgadd< or did you already
verify and set the file permissions of your distribution (y/n) [y] >>


Verifying and setting file permissions

We may now verify and set the file permissions of your Grid Engine

This may be useful since due to unpacking and copying of your distribution
your files may be unaccessible to other users.

We will set the permissions of directories and binaries to

   755 - that means executable are accessible for the world

and for ordinary files to

   644 - that means readable for the world

Do you want to verify and set your file permissions (y/n) [y] >>

Apparently this step does the same thing as the util/setfileperm.sh $SGE_ROOT I ran at the beginning.

Verifying and setting file permissions and owner in >3rd_party<
Verifying and setting file permissions and owner in >bin<
Verifying and setting file permissions and owner in >ckpt<
Verifying and setting file permissions and owner in >dtrace<
Verifying and setting file permissions and owner in >examples<
Verifying and setting file permissions and owner in >inst_sge<
Verifying and setting file permissions and owner in >install_execd<
Verifying and setting file permissions and owner in >install_qmaster<
Verifying and setting file permissions and owner in >lib<
Verifying and setting file permissions and owner in >mpi<
Verifying and setting file permissions and owner in >pvm<
Verifying and setting file permissions and owner in >qmon<
Verifying and setting file permissions and owner in >util<
Verifying and setting file permissions and owner in >utilbin<
Verifying and setting file permissions and owner in >catman<
Verifying and setting file permissions and owner in >doc<
Verifying and setting file permissions and owner in >include<
Verifying and setting file permissions and owner in >man<

Your file permissions were set

Hit <RETURN> to continue >>


Select default Grid Engine hostname resolving method

Are all hosts of your cluster in one DNS domain? If this is
the case the hostnames

   >hostA< and >hostA.foo.com<

would be treated as equal, because the DNS domain name >foo.com<
is ignored when comparing hostnames.

Are all hosts of your cluster in a single DNS domain (y/n) [y] >> y

Ignoring domain name when comparing hostnames.

Hit <RETURN> to continue >>
Making directories

creating directory: /opt/sge6-2/default/spool/qmaster
creating directory: /opt/sge6-2/default/spool/qmaster/job_scripts
Hit <RETURN> to continue >>


Setup spooling
Your SGE binaries are compiled to link the spooling libraries
during runtime (dynamically). So you can choose between Berkeley DB
spooling and Classic spooling method.
Please choose a spooling method (berkeleydb|classic) [berkeleydb] >>


The Berkeley DB spooling method provides two configurations!

Local spooling:
The Berkeley DB spools into a local directory on this host (qmaster host)
This setup is faster, but you can't setup a shadow master host

Berkeley DB Spooling Server:
If you want to setup a shadow master host, you need to use
Berkeley DB Spooling Server!
In this case you have to choose a host with a configured RPC service.
The qmaster host connects via RPC to the Berkeley DB. This setup is more
failsafe, but results in a clear potential security hole. RPC communication
(as used by Berkeley DB) can be easily compromised. Please only use this
alternative if your site is secure or if you are not concerned about
security. Check the installation guide for further advice on how to achieve
failsafety without compromising security.

Do you want to use a Berkeley DB Spooling Server? (y/n) [n] >>

Hit <RETURN> to continue >>


Berkeley Database spooling parameters

Please enter the database directory now, even if you want to spool locally,
it is necessary to enter this database directory.

Default: [/opt/sge6-2/default/spool/spooldb] >>

creating directory: /opt/sge6-2/default/spool/spooldb
Dumping bootstrapping information
Initializing spooling database

Hit <RETURN> to continue >>


Grid Engine group id range

When jobs are started under the control of Grid Engine an additional group id
is set on platforms which do not support jobs. This is done to provide maximum
control for Grid Engine jobs.

This additional UNIX group id range must be unused group id's in your system.
Each job will be assigned a unique id during the time it is running.
Therefore you need to provide a range of id's which will be assigned
dynamically for jobs.

The range must be big enough to provide enough numbers for the maximum number
of Grid Engine jobs running at a single moment on a single host. E.g. a range
like >20000-20100< means, that Grid Engine will use the group ids from
20000-20100 and provides a range for 100 Grid Engine jobs at the same time
on a single host.

You can change at any time the group id range in your cluster configuration.

Please enter a range [20000-20100] >>

Using >20000-20100< as gid range. Hit <RETURN> to continue >>
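The installer requires the gid range to be unused. A minimal sketch for checking the default range against the local group file; `gids_in_range` is a hypothetical helper, and it looks only at /etc/group (groups served by NIS or a similar service would need `getent group` instead).

```shell
#!/bin/sh
# gids_in_range LOW HIGH [FILE]: print "gid name" for every group in a
# group(5)-format file (default /etc/group) whose gid lies in LOW..HIGH.
gids_in_range() {
    awk -F: -v lo="$1" -v hi="$2" \
        '$3 ~ /^[0-9]+$/ && $3 + 0 >= lo + 0 && $3 + 0 <= hi + 0 { print $3, $1 }' \
        "${3:-/etc/group}"
}

used=$(gids_in_range 20000 20100 2>/dev/null) || used=""
if [ -n "$used" ]; then
    echo "gids already in use inside 20000-20100:"
    echo "$used"
else
    echo "20000-20100 is free in /etc/group"
fi
```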


Grid Engine cluster configuration

Please give the basic configuration parameters of your Grid Engine


The pathname of the spool directory of the execution hosts. User >sgeadmin<
must have the right to create this directory and to write into it.

Default: [/opt/sge6-2/default/spool] >>


Grid Engine cluster configuration (continued)


The email address of the administrator to whom problem reports are sent.

It's is recommended to configure this parameter. You may use >none<
if you do not wish to receive administrator mail.

Please enter an email address in the form >user@foo.com<.

Default: [none] >> sgeadmin@localhost


The following parameters for the cluster configuration were configured:

   execd_spool_dir        /opt/sge6-2/default/spool
   administrator_mail     sgeadmin@localhost

Do you want to change the configuration parameters (y/n) [n] >>
Creating local configuration
Creating >act_qmaster< file
Adding default complex attributes
Adding default parallel environments (PE)
Adding SGE default usersets
Adding >sge_aliases< path aliases file
Adding >qtask< qtcsh sample default request file
Adding >sge_request< default submit options file
Creating >sgemaster< script
Creating >sgeexecd< script
Creating settings files for >.profile/.cshrc<

Hit <RETURN> to continue >>


qmaster startup script

We can install the startup script that will
start qmaster at machine boot (y/n) [y] >>

cp /opt/sge6-2/default/common/sgemaster /etc/init.d/sgemaster.p6444
/usr/sbin/update-rc.d sgemaster.p6444
 Adding system startup for /etc/init.d/sgemaster.p6444 ...
   /etc/rc0.d/K03sgemaster.p6444 -> ../init.d/sgemaster.p6444
   /etc/rc1.d/K03sgemaster.p6444 -> ../init.d/sgemaster.p6444
   /etc/rc6.d/K03sgemaster.p6444 -> ../init.d/sgemaster.p6444
   /etc/rc2.d/S95sgemaster.p6444 -> ../init.d/sgemaster.p6444
   /etc/rc3.d/S95sgemaster.p6444 -> ../init.d/sgemaster.p6444
   /etc/rc4.d/S95sgemaster.p6444 -> ../init.d/sgemaster.p6444
   /etc/rc5.d/S95sgemaster.p6444 -> ../init.d/sgemaster.p6444

Hit <RETURN> to continue >>


Grid Engine qmaster startup

Starting qmaster daemon. Please wait ...
   starting sge_qmaster
Hit <RETURN> to continue >>


Adding Grid Engine hosts

Please now add the list of hosts, where you will later install your execution
daemons. These hosts will be also added as valid submit hosts.

Please enter a blank separated list of your execution hosts. You may
press <RETURN> if the line is getting too long. Once you are finished
simply press <RETURN> without entering a name.

You also may prepare a file with the hostnames of the machines where you plan
to install Grid Engine. This may be convenient if you are installing Grid
Engine on many hosts.

Do you want to use a file which contains the list of hosts (y/n) [n] >>


Adding admin and submit hosts

Please enter a blank seperated list of hosts.

Stop by entering <RETURN>. You may repeat this step until you are
entering an empty list. You will see messages from Grid Engine
when the hosts are added.

Host(s): master01
Finished adding hosts. Hit <RETURN> to continue >>


If you want to use a shadow host, it is recommended to add this host
to the list of administrative hosts.

If you are not sure, it is also possible to add or remove hosts after the
installation with <qconf -ah hostname> for adding and <qconf -dh hostname>
for removing this host

Attention: This is not the shadow host installation
You still have to install the shadow host separately

Do you want to add your shadow host(s) now? (y/n) [y] >>


Adding Grid Engine shadow hosts

Please now add the list of hosts, where you will later install your shadow

Please enter a blank separated list of your execution hosts. You may
press <RETURN> if the line is getting too long. Once you are finished
simply press <RETURN> without entering a name.

You also may prepare a file with the hostnames of the machines where you plan
to install Grid Engine. This may be convenient if you are installing Grid
Engine on many hosts.

Do you want to use a file which contains the list of hosts (y/n) [n] >>


Adding admin hosts

Please enter a blank seperated list of hosts.

Stop by entering <RETURN>. You may repeat this step until you are
entering an empty list. You will see messages from Grid Engine
when the hosts are added.

Finished adding hosts. Hit <RETURN> to continue >>


Creating the default <all.q> queue and <allhosts> hostgroup

root@master01 added "@allhosts" to host group list
root@master01 added "all.q" to cluster queue list

Hit <RETURN> to continue >>


Scheduler Tuning

The details on the different options are described in the manual.

1) Normal
          Fixed interval scheduling, report limited scheduling information,
          actual + assumed load

2) High
          Fixed interval scheduling, report limited scheduling information,
          actual load

3) Max
          Immediate Scheduling, report no scheduling information,
          actual load

Enter the number of your preferred configuration and hit <RETURN>!
Default configuration is [1] >> 1

We're configuring the scheduler with >Normal< settings!
Do you agree? (y/n) [y] >>


Using Grid Engine

You should now enter the command:

   source /opt/sge6-2/default/common/settings.csh

if you are a csh/tcsh user or

   # . /opt/sge6-2/default/common/settings.sh

if you are a sh/ksh user.

This will set or expand the following environment variables:

   - $SGE_ROOT         (always necessary)
   - $SGE_CELL         (if you are using a cell other than >default<)
   - $SGE_CLUSTER_NAME (always necessary)
   - $SGE_QMASTER_PORT (if you haven't added the service >sge_qmaster<)
   - $SGE_EXECD_PORT   (if you haven't added the service >sge_execd<)
   - $PATH/$path       (to find the Grid Engine binaries)
   - $MANPATH          (to access the manual pages)

Hit <RETURN> to see where Grid Engine logs messages >>
Grid Engine messages

Grid Engine messages can be found at:

   /tmp/qmaster_messages (during qmaster startup)
   /tmp/execd_messages   (during execution daemon startup)

After startup the daemons log their messages in their spool directories.

   Qmaster:     /opt/sge6-2/default/spool/qmaster/messages
   Exec daemon: <execd_spool_dir>/<hostname>/messages

Grid Engine startup scripts

Grid Engine startup scripts can be found at:

   /opt/sge6-2/default/common/sgemaster (qmaster)
   /opt/sge6-2/default/common/sgeexecd (execd)

Do you want to see previous screen about using Grid Engine again (y/n) [n] >>
Your Grid Engine qmaster installation is now completed

Please now login to all hosts where you want to run an execution daemon
and start the execution host installation procedure.

If you want to run an execution daemon on this host, please do not forget
to make the execution host installation in this host as well.

All execution hosts must be administrative hosts during the installation.
All hosts which you added to the list of administrative hosts during this
installation procedure can now be installed.

You may verify your administrative hosts with the command

   # qconf -sh

and you may add new administrative hosts with the command

   # qconf -ah <hostname>

Please hit <RETURN> >>


# printenv
# . /opt/sge6-2/default/common/settings.sh
# printenv


# qconf -sh
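The two `printenv` calls around the settings.sh line are there to see what changed. A minimal sketch of a narrower check, based on the installer text that marks `$SGE_ROOT` and `$SGE_CLUSTER_NAME` as always necessary; `missing_sge_vars` is a hypothetical helper.

```shell
#!/bin/sh
# missing_sge_vars: print the names of required variables that are unset
# or empty. The installer output lists SGE_ROOT and SGE_CLUSTER_NAME as
# "always necessary".
missing_sge_vars() {
    for v in SGE_ROOT SGE_CLUSTER_NAME; do
        eval "val=\${$v:-}"
        [ -n "$val" ] || echo "$v"
    done
}

# Typical use after ". /opt/sge6-2/default/common/settings.sh":
missing=$(missing_sge_vars)
if [ -n "$missing" ]; then
    echo "not set:" $missing
else
    echo "SGE environment looks complete"
fi
```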


# ./install_execd
Welcome to the Grid Engine execution host installation

If you haven't installed the Grid Engine qmaster host yet, you must execute
this step (with >install_qmaster<) prior the execution host installation.

For a sucessfull installation you need a running Grid Engine qmaster. It is
also neccesary that this host is an administrative host.

You can verify your current list of administrative hosts with
the command:

   # qconf -sh

You can add an administrative host with the command:

   # qconf -ah <hostname>

The execution host installation will take approximately 5 minutes.

Hit <RETURN> to continue >>


Checking $SGE_ROOT directory

The Grid Engine root directory is:

   $SGE_ROOT = /opt/sge6-2

If this directory is not correct (e.g. it may contain an automounter
prefix) enter the correct path to this directory or hit <RETURN>
to use default [/opt/sge6-2] >>

Your $SGE_ROOT directory: /opt/sge6-2

Hit <RETURN> to continue >>


Grid Engine cells

Please enter cell name which you used for the qmaster
installation or press <RETURN> to use [default] >>

Using cell: >default<

Hit <RETURN> to continue >>


Grid Engine TCP/IP communication service

The port for sge_execd is currently set as service.

   sge_execd service set to port 6445

Hit <RETURN> to continue >>
Checking hostname resolving

This hostname is known at qmaster as an administrative host.

Hit <RETURN> to continue >>


Execd spool directory configuration

You defined a global spool directory when you installed the master host.
You can use that directory for spooling jobs from this execution host
or you can define a different spool directory for this execution host.

ATTENTION: For most operating systems, the spool directory does not have to
be located on a local disk. The spool directory can be located on a
network-accessible drive. However, using a local spool directory provides
better performance.

FOR WINDOWS USERS: On Windows systems, the spool directory MUST be located
on a local disk. If you install an execution daemon on a Windows system
without a local spool directory, the execution host is unusable.

The spool directory is currently set to:

Do you want to configure a different spool directory
for this host (y/n) [n] >>
Creating local configuration
sgeadmin@master01 modified "master01" in configuration list
Local configuration for host >master01< created.

Hit <RETURN> to continue >>


execd startup script

We can install the startup script that will
start execd at machine boot (y/n) [y] >> y

cp /opt/sge6-2/default/common/sgeexecd /etc/init.d/sgeexecd.p6444
/usr/sbin/update-rc.d sgeexecd.p6444
 Adding system startup for /etc/init.d/sgeexecd.p6444 ...
   /etc/rc0.d/K03sgeexecd.p6444 -> ../init.d/sgeexecd.p6444
   /etc/rc1.d/K03sgeexecd.p6444 -> ../init.d/sgeexecd.p6444
   /etc/rc6.d/K03sgeexecd.p6444 -> ../init.d/sgeexecd.p6444
   /etc/rc2.d/S95sgeexecd.p6444 -> ../init.d/sgeexecd.p6444
   /etc/rc3.d/S95sgeexecd.p6444 -> ../init.d/sgeexecd.p6444
   /etc/rc4.d/S95sgeexecd.p6444 -> ../init.d/sgeexecd.p6444
   /etc/rc5.d/S95sgeexecd.p6444 -> ../init.d/sgeexecd.p6444

Hit <RETURN> to continue >>
Grid Engine execution daemon startup

Starting execution daemon. Please wait ...
   starting sge_execd

Hit <RETURN> to continue >>
Adding a queue for this host

We can now add a queue instance for this host:

   - it is added to the >allhosts< hostgroup
   - the queue provides 1 slot(s) for jobs in all queues
     referencing the >allhosts< hostgroup

You do not need to add this host now, but before running jobs on this host
it must be added to at least one queue.

Do you want to add a default queue instance for this host (y/n) [y] >>

root@master01 modified "@allhosts" in host group list
root@master01 modified "all.q" in cluster queue list

Hit <RETURN> to continue >>
Using Grid Engine

You should now enter the command:

   source /opt/sge6-2/default/common/settings.csh

if you are a csh/tcsh user or

   # . /opt/sge6-2/default/common/settings.sh

if you are a sh/ksh user.

This will set or expand the following environment variables:

   - $SGE_ROOT         (always necessary)
   - $SGE_CELL         (if you are using a cell other than >default<)
   - $SGE_CLUSTER_NAME (always necessary)
   - $SGE_QMASTER_PORT (if you haven't added the service >sge_qmaster<)
   - $SGE_EXECD_PORT   (if you haven't added the service >sge_execd<)
   - $PATH/$path       (to find the Grid Engine binaries)
   - $MANPATH          (to access the manual pages)

Hit <RETURN> to see where Grid Engine logs messages >>
Grid Engine messages

Grid Engine messages can be found at:

   /tmp/qmaster_messages (during qmaster startup)
   /tmp/execd_messages   (during execution daemon startup)

After startup the daemons log their messages in their spool directories.

   Qmaster:     /opt/sge6-2/default/spool/qmaster/messages
   Exec daemon: <execd_spool_dir>/<hostname>/messages

Grid Engine startup scripts

Grid Engine startup scripts can be found at:

   /opt/sge6-2/default/common/sgemaster (qmaster)
   /opt/sge6-2/default/common/sgeexecd (execd)

Do you want to see previous screen about using Grid Engine again (y/n) [n] >>


# exit
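
The installer requires that this host already be an administrative host, which it says to check with `qconf -sh` and fix with `qconf -ah <hostname>`. A minimal sketch of that check follows. It assumes `qconf -sh` prints one host name per line; the function name `is_admin_host` is made up for this memo and is not part of Grid Engine.

```shell
# is_admin_host HOST
# Reads a host list (one name per line) from stdin and succeeds
# only if HOST appears as a complete line.
is_admin_host() {
    grep -qxF -e "$1"
}

# Intended use on the qmaster (not run here):
#   qconf -sh | is_admin_host "$(hostname)" || qconf -ah "$(hostname)"
```

The match is on the whole line, so a short name such as `hoge` will not match a fully qualified name. Pass the name in the same form that the qmaster reports.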


$ . /opt/sge6-2/default/common/settings.sh
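
After sourcing `settings.sh`, it may be worth confirming that the two variables the installer marks as "always necessary" (`$SGE_ROOT` and `$SGE_CLUSTER_NAME`) are actually set. This is only a sketch; `check_sge_env` is a name invented here.

```shell
# check_sge_env
# Prints one line per variable and returns non-zero if either is empty.
check_sge_env() {
    missing=0
    for name in SGE_ROOT SGE_CLUSTER_NAME; do
        # Indirect lookup of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        else
            echo "ok: $name=$value"
        fi
    done
    return "$missing"
}
```

The other variables in the installer's list (`$SGE_CELL`, `$SGE_QMASTER_PORT`, `$SGE_EXECD_PORT`) are set only under the conditions it describes, so they are left out of the check.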


$ qconf -sconf
execd_spool_dir              /opt/sge6-2/default/spool
mailer                       /bin/mail
xterm                        /usr/bin/X11/xterm
load_sensor                  none
prolog                       none
epilog                       none
shell_start_mode             posix_compliant
login_shells                 sh,ksh,csh,tcsh
min_uid                      0
min_gid                      0
user_lists                   none
xuser_lists                  none
projects                     none
xprojects                    none
enforce_project              false
enforce_user                 auto
load_report_time             00:00:40
max_unheard                  00:05:00
reschedule_unknown           00:00:00
loglevel                     log_warning
administrator_mail           sgeadmin@localhost
set_token_cmd                none
pag_cmd                      none
token_extend_time            none
shepherd_cmd                 none
qmaster_params               none
execd_params                 none
reporting_params             accounting=true reporting=false \
                             flush_time=00:00:15 joblog=false sharelog=00:00:00
finished_jobs                100
gid_range                    20000-20100
qlogin_command               builtin
qlogin_daemon                builtin
rlogin_command               builtin
rlogin_daemon                builtin
rsh_command                  builtin
rsh_daemon                   builtin
max_aj_instances             2000
max_aj_tasks                 75000
max_u_jobs                   0
max_jobs                     0
max_advance_reservations     0
auto_user_oticket            0
auto_user_fshare             0
auto_user_default_project    none
auto_user_delete_time        86400
delegated_file_staging       false
reprioritize                 0
jsv_url                      none
jsv_allowed_mod              ac,h,i,e,o,j,M,N,p,w
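
The `qconf -sconf` output above is a long list of name/value pairs, and `reporting_params` continues onto a second line with a trailing backslash. Below is a small sketch for pulling out a single value. It assumes the layout shown above (name, whitespace, value); `sconf_get` is a helper name made up for this memo.

```shell
# sconf_get NAME
# Reads `qconf -sconf` style text from stdin and prints the value of NAME.
# Lines ending in a backslash are joined with the line that follows.
sconf_get() {
    awk -v key="$1" '
    {
        if (pending != "") { $0 = pending " " $0; pending = "" }
        if ($0 ~ /\\$/) { sub(/[ \t]*\\$/, ""); pending = $0; next }
        if ($1 == key) {
            sub(/^[ \t]*[^ \t]+[ \t]+/, "")   # drop the parameter name
            gsub(/[ \t]+/, " ")               # collapse runs of blanks
            print
            exit
        }
    }'
}

# Intended use (not run here):
#   qconf -sconf | sconf_get execd_spool_dir
```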


$ rsh hoge date
Permission denied.
$ qsub $SGE_ROOT/examples/jobs/simple.sh


$ echo 'hoge sgeadmin' >> ~/.rhosts
$ rsh hoge date
2009年  4月 10日 金曜日 15:12:40 JST
$ date
2009年  4月 10日 金曜日 15:12:49 JST
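
In the transcript above, `rsh hoge date` was refused until `hoge sgeadmin` was appended to `~/.rhosts`. Running the `echo ... >>` line repeatedly would pile up duplicate entries, so here is a sketch of a version that can be run more than once. `add_rhosts_entry` is a hypothetical helper. The `chmod 600` is an assumption based on the common rshd behaviour of ignoring a `.rhosts` file that others can write to.

```shell
# add_rhosts_entry HOST USER [FILE]
# Appends "HOST USER" to FILE (default: ~/.rhosts) unless that exact
# line is already present, then restricts the file to its owner.
add_rhosts_entry() {
    file=${3:-$HOME/.rhosts}
    entry="$1 $2"
    touch "$file"
    grep -qxF -e "$entry" "$file" || printf '%s\n' "$entry" >> "$file"
    chmod 600 "$file"
}

# Intended use (not run here):
#   add_rhosts_entry hoge sgeadmin && rsh hoge date
```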


[Memo] hudson, TheSchwartz, and so on






  1. Posted: 2003-12-14T20:05:25+09:00
  2. Modified: 2003-12-14T14:13:14+09:00
  3. Generated: 2025-02-17T23:09:17+09:00