2016-05-31

starcluster and cfncluster, part 1

I noticed that my AWS coupon expires at the end of May (that is, today), so I hurriedly decided to try out starcluster and cfncluster.

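Both are tools written in Python that build EC2 clusters on AWS; the former is developed at MIT and the latter by Amazon.
starcluster seems to be the older of the two: the oldest commit on GitHub dates from 2009. It already appears to have been working at that point, so the project itself probably started earlier*1.
cfncluster, on the other hand, seems to have started in May 2014, so it has roughly two years of history.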

The features of starcluster are summarized on this page.
What is StarCluster? — StarCluster 0.95.6 documentation
In short, it provides AMIs with various software preinstalled, launches however many instances you ask for, and sets them up as a cluster.

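cfncluster has somewhat more infrastructure-oriented features only: it seems to handle little beyond launching instances and configuring the network.
In other words, its stance appears to be that you prepare your own AMI if you want software installed.
Looking at the page below, the integration with CloudWatch and Auto Scaling is quite elaborate, but I doubt it appeals much to people who just want to do HPC in the cloud with minimal effort*2.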
CfnCluster Processes — CfnCluster 1.2.1
Also, it leans on so many AWS services that it feels rather vendor-locked-in. :p

Installation

For now, let's install both. Note that cfncluster runs on Python 2.7.10 on Cygwin and starcluster on Python 2.7.10 on Windows 7. I had intended to do everything on Cygwin, but installing starcluster failed there, so I redid just that one in the native environment.

starcluster

The install tries to build pycrypto and fails, so following the page below I overwrote VS90COMNTOOLS to force it to use VS2015.
「Unable to find vcvarsall.bat」の対処法 | Regen Techlog
If you have a different version of VS installed, or no VS at all, use one of the other methods described on that page.

> python -m virtualenv --python=C:\Python27\python.exe starcluster
> cd starcluster
> Scripts\activate
> set VS90COMNTOOLS=%VS140COMNTOOLS%
> pip install starcluster
Collecting starcluster
Collecting iso8601>=0.1.8 (from starcluster)
  Using cached iso8601-0.1.11-py2.py3-none-any.whl
Collecting pycrypto>=2.5 (from starcluster)
  Using cached pycrypto-2.6.1.tar.gz
Collecting workerpool>=0.9.2 (from starcluster)
Collecting iptools>=0.6.1 (from starcluster)
  Using cached iptools-0.6.1-py2.py3-none-any.whl
Collecting scp>=0.7.1 (from starcluster)
  Using cached scp-0.10.2-py2.py3-none-any.whl
Collecting boto>=2.23.0 (from starcluster)
  Using cached boto-2.40.0-py2.py3-none-any.whl
Collecting Jinja2>=2.7 (from starcluster)
  Using cached Jinja2-2.8-py2.py3-none-any.whl
Collecting decorator>=3.4.0 (from starcluster)
  Using cached decorator-4.0.9-py2.py3-none-any.whl
Collecting paramiko>=1.12.1 (from starcluster)
  Using cached paramiko-2.0.0-py2.py3-none-any.whl
Collecting optcomplete>=1.2-devel (from starcluster)
Collecting six (from workerpool>=0.9.2->starcluster)
  Using cached six-1.10.0-py2.py3-none-any.whl
Collecting MarkupSafe (from Jinja2>=2.7->starcluster)
Collecting pyasn1>=0.1.7 (from paramiko>=1.12.1->starcluster)
  Using cached pyasn1-0.1.9-py2.py3-none-any.whl
Collecting cryptography>=1.1 (from paramiko>=1.12.1->starcluster)
  Using cached cryptography-1.3.2-cp27-none-win_amd64.whl
Requirement already satisfied (use --upgrade to upgrade): setuptools>=11.3 in c:\users\n_so5\onedrive\python\starcluster\lib\site-packages (from cryptography>=1.1->paramiko>=1.12.1->starcluster)
Collecting enum34 (from cryptography>=1.1->paramiko>=1.12.1->starcluster)
  Using cached enum34-1.1.6-py2-none-any.whl
Collecting ipaddress (from cryptography>=1.1->paramiko>=1.12.1->starcluster)
  Using cached ipaddress-1.0.16-py27-none-any.whl
Collecting idna>=2.0 (from cryptography>=1.1->paramiko>=1.12.1->starcluster)
  Using cached idna-2.1-py2.py3-none-any.whl
Collecting cffi>=1.4.1 (from cryptography>=1.1->paramiko>=1.12.1->starcluster)
  Using cached cffi-1.6.0-cp27-none-win_amd64.whl
Collecting pycparser (from cffi>=1.4.1->cryptography>=1.1->paramiko>=1.12.1->starcluster)
Building wheels for collected packages: pycrypto
  Running setup.py bdist_wheel for pycrypto ... done
  Stored in directory: C:\Users\n_so5\AppData\Local\pip\Cache\wheels\80\1f\94\f76e9746864f198eb0e304aeec319159fa41b082f61281ffce
Successfully built pycrypto
Installing collected packages: iso8601, pycrypto, six, workerpool, iptools, pyasn1, enum34, ipaddress, idna, pycparser, cffi, cryptography, paramiko, scp, boto, MarkupSafe, Jinja2, decorator, optcomplete, starcluster
Successfully installed Jinja2-2.8 MarkupSafe-0.23 boto-2.40.0 cffi-1.6.0 cryptography-1.3.2 decorator-4.0.9 enum34-1.1.6 idna-2.1 ipaddress-1.0.16 iptools-0.6.1 iso8601-0.1.11 optcomplete-1.2-devel paramiko-2.0.0 pyasn1-0.1.9 pycparser-2.14 pycrypto-2.6.1 scp-0.10.2 six-1.10.0 starcluster-0.95.6 workerpool-0.9.4
You are using pip version 8.0.2, however version 8.1.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

cfncluster

> python -m virtualenv cfncluster_cygwin
> cd cfncluster_cygwin/
> . bin/activate
> pip install cfncluster
Collecting cfncluster
  Downloading cfncluster-1.2.1.tar.gz
Collecting boto>=2.39 (from cfncluster)
  Downloading boto-2.40.0-py2.py3-none-any.whl (1.3MB)
    100% |████████████████████████████████| 1.4MB 660kB/s
Collecting awscli>=1.10.13 (from cfncluster)
  Downloading awscli-1.10.34-py2.py3-none-any.whl (938kB)
    100% |████████████████████████████████| 942kB 948kB/s
Collecting colorama<=0.3.3,>=0.2.5 (from awscli>=1.10.13->cfncluster)
  Downloading colorama-0.3.3.tar.gz
Collecting docutils>=0.10 (from awscli>=1.10.13->cfncluster)
  Downloading docutils-0.12.tar.gz (1.6MB)
    100% |████████████████████████████████| 1.6MB 465kB/s
Collecting rsa<=3.5.0,>=3.1.2 (from awscli>=1.10.13->cfncluster)
  Downloading rsa-3.4.2-py2.py3-none-any.whl (46kB)
    100% |████████████████████████████████| 51kB 1.9MB/s
Collecting botocore==1.4.24 (from awscli>=1.10.13->cfncluster)
  Downloading botocore-1.4.24-py2.py3-none-any.whl (2.3MB)
    100% |████████████████████████████████| 2.3MB 340kB/s
Collecting s3transfer==0.0.1 (from awscli>=1.10.13->cfncluster)
  Downloading s3transfer-0.0.1-py2.py3-none-any.whl
Collecting pyasn1>=0.1.3 (from rsa<=3.5.0,>=3.1.2->awscli>=1.10.13->cfncluster)
  Downloading pyasn1-0.1.9-py2.py3-none-any.whl
Collecting python-dateutil<3.0.0,>=2.1 (from botocore==1.4.24->awscli>=1.10.13->cfncluster)
  Downloading python_dateutil-2.5.3-py2.py3-none-any.whl (201kB)
    100% |████████████████████████████████| 204kB 1.5MB/s
Collecting jmespath<1.0.0,>=0.7.1 (from botocore==1.4.24->awscli>=1.10.13->cfncluster)
  Downloading jmespath-0.9.0-py2.py3-none-any.whl
Collecting futures<4.0.0,>=2.2.0; python_version == "2.6" or python_version == "2.7" (from s3transfer==0.0.1->awscli>=1.10.13->cfncluster)
  Downloading futures-3.0.5-py2-none-any.whl
Collecting six>=1.5 (from python-dateutil<3.0.0,>=2.1->botocore==1.4.24->awscli>=1.10.13->cfncluster)
  Downloading six-1.10.0-py2.py3-none-any.whl
Building wheels for collected packages: cfncluster, colorama, docutils
  Running setup.py bdist_wheel for cfncluster ... done
  Stored in directory: /home/n_so5/.cache/pip/wheels/26/c8/b0/3cf98bf7d72a9a63a358cbf094a50092641a70843de01ca155
  Running setup.py bdist_wheel for colorama ... done
  Stored in directory: /home/n_so5/.cache/pip/wheels/21/c5/cf/63fb92293f3ad402644ccaf882903cacdb8fe87c80b62c84df
  Running setup.py bdist_wheel for docutils ... done
  Stored in directory: /home/n_so5/.cache/pip/wheels/db/de/bd/b99b1e12d321fbc950766c58894c6576b1a73ae3131b29a151
Successfully built cfncluster colorama docutils
Installing collected packages: boto, colorama, docutils, pyasn1, rsa, six, python-dateutil, jmespath, botocore, futures, s3transfer, awscli, cfncluster
Successfully installed awscli-1.10.34 boto-2.40.0 botocore-1.4.24 cfncluster-1.2.1 colorama-0.3.3 docutils-0.12 futures-3.0.5 jmespath-0.9.0 pyasn1-0.1.9 python-dateutil-2.5.3 rsa-3.4.2 s3transfer-0.0.1 six-1.10.0

Launching the cluster

First, let's use starcluster to build a 4-node cluster.
I'll follow the tutorial here, tweaking things as needed.
Quick-Start — StarCluster 0.95.6 documentation

First, generate the config file.

> starcluster help
Options:
--------
[1] Show the StarCluster config template
[2] Write config template to C:\Users\n_so5\.babun\cygwin\home\n_so5\.starcluster\config
[q] Quit
Please enter your selection: 2  <== enter 2 to have the template file written out
>>> Config template written to C:\Users\n_so5\.babun\cygwin\home\n_so5\.starcluster\config
>>> Please customize the config template

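Next, edit the generated config file. I changed only the following six lines. starcluster uses boto internally, so the keys should also be settable through environment variables, but validation seems to fail unless the values are already present when the config file is read...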

AWS_ACCESS_KEY_ID =     #your aws access key id here
AWS_SECRET_ACCESS_KEY = #your secret aws access key here
AWS_USER_ID =           #your 12-digit aws user id here
AWS_REGION_NAME =  ap-northeast-1
CLUSTER_SIZE = 4
NODE_INSTANCE_TYPE = c3.large

Then generate a keypair.

>starcluster createkey mykey -o %HOME%\.ssh\mykey.rsa

Finally, start the cluster.

>starcluster start smallcluster
StarCluster - (http://star.mit.edu/cluster) (v. 0.95.6)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

>>> Using default cluster template: smallcluster
>>> Validating cluster template settings...
>>> Cluster template settings are valid
>>> Starting cluster...
>>> Launching a 4-node cluster...
>>> Creating security group @sc-smallcluster...
>>> Creating placement group @sc-smallcluster...
Reservation:r-cbaaa168
>>> Waiting for instances to propagate...
4/4 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for cluster to come up... (updating every 30s)
>>> Waiting for all nodes to be in a 'running' state...
4/4 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for SSH to come up on all nodes...
4/4 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for cluster to come up took 1.251 mins
>>> The master node is ec2-54-175-0-145.compute-1.amazonaws.com
>>> Configuring cluster...
>>> Running plugin starcluster.clustersetup.DefaultClusterSetup
>>> Configuring hostnames...
4/4 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Creating cluster user: sgeadmin (uid: 1001, gid: 1001)
4/4 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring scratch space for user(s): sgeadmin
4/4 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring /etc/hosts on each node
4/4 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Starting NFS server on master
>>> Configuring NFS exports path(s):
/home
>>> Mounting all NFS export path(s) on 3 worker node(s)
3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Setting up NFS took 0.285 mins
>>> Configuring passwordless ssh for root
>>> Configuring passwordless ssh for sgeadmin
>>> Running plugin starcluster.plugins.sge.SGEPlugin
>>> Configuring SGE...
>>> Configuring NFS exports path(s):
/opt/sge6
>>> Mounting all NFS export path(s) on 3 worker node(s)
3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Setting up NFS took 0.212 mins
>>> Installing Sun Grid Engine...
3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Creating SGE parallel environment 'orte'
4/4 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Adding parallel environment 'orte' to queue 'all.q'
>>> Configuring cluster took 2.328 mins
>>> Starting cluster took 3.701 mins

The cluster is now ready to use. To login to the master node
as root, run:

    $ starcluster sshmaster smallcluster

If you're having issues with the cluster you can reboot the
instances and completely reconfigure the cluster from
scratch using:

    $ starcluster restart smallcluster

When you're finished using the cluster and wish to terminate
it and stop paying for service:

    $ starcluster terminate smallcluster

Alternatively, if the cluster uses EBS instances, you can
use the 'stop' command to shutdown all nodes and put them
into a 'stopped' state preserving the EBS volumes backing
the nodes:

    $ starcluster stop smallcluster

WARNING: Any data stored in ephemeral storage (usually /mnt)
will be lost!

You can activate a 'stopped' cluster by passing the -x
option to the 'start' command:

    $ starcluster start -x smallcluster

This will start all 'stopped' nodes and reconfigure the
cluster.

After waiting just a few minutes, the cluster is ready!
Let's log in right away.

>starcluster sshmaster smallcluster
StarCluster - (http://star.mit.edu/cluster) (v. 0.95.6)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

>>> Starting Pure-Python SSH shell...
Line-buffered terminal emulation. Press F6 or ^Z to send EOF.

eval $(resize)
root@master:~# eval $(resize)
The program 'resize' is currently not installed. You can install it by typing:
apt-get install xterm

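It prints "Starting Pure-Python SSH shell", so apparently Paramiko is used for this. Unfortunately, no termcap or terminfo settings are in place, and the shell is extremely awkward to use from the Windows command prompt*3, so I add my key to the authorized_keys of the pre-created sgeadmin user and do the rest of the work with my usual ssh client.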

> starcluster put smallcluster %HOME%\.ssh\id_rsa.pub ./
> starcluster sshmaster smallcluster
> root@master:~# cat id_rsa.pub >> ~sgeadmin/.ssh/authorized_keys
> starcluster listclusters
 StarCluster - (http://star.mit.edu/cluster) (v. 0.95.6)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

-----------------------------------------------
smallcluster (security group: @sc-smallcluster)
-----------------------------------------------
Launch time: 2016-05-31 14:06:38
Uptime: 0 days, 00:23:07
VPC: vpc-9f9df4fa
Subnet: subnet-7907d752
Zone: us-east-1a
Keypair: mykey
EBS volumes: N/A
Cluster nodes:
     master running i-1a5d9286 ec2-54-175-0-145.compute-1.amazonaws.com
    node001 running i-1b5d9287 ec2-54-175-213-27.compute-1.amazonaws.com
    node002 running i-185d9284 ec2-54-89-100-190.compute-1.amazonaws.com
    node003 running i-195d9285 ec2-52-201-218-218.compute-1.amazonaws.com
Total nodes: 4

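starcluster put {cluster name} uploads a file (get downloads in the other direction). Use it to send your public key over and append it to the end of authorized_keys.
The terminal settings are so poor that pasting the key in by hand is not really practical.
Also, as in the last command above, starcluster listclusters shows the hostnames of the running machines, so log in via ssh to the node labeled master (ec2-54-175-0-145.compute-1.amazonaws.com).
Only at this point did I finally notice that the region setting apparently had not been picked up, and the instances had come up in the default us-east-1...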

I was going to run HPL next, but the setup looks odd: mpicc is the mpich2 one while mpirun is the Open MPI one, which is a puzzling state of affairs.

sgeadmin@master:~$ which mpicc
/usr/bin/mpicc
sgeadmin@master:~$ which mpirun
/usr/bin/mpirun
sgeadmin@master:~$ file /usr/bin/mpicc /usr/bin/mpirun
/usr/bin/mpicc:  symbolic link to `/etc/alternatives/mpicc'
/usr/bin/mpirun: symbolic link to `/etc/alternatives/mpirun'
sgeadmin@master:~$ file /etc/alternatives/mpicc /etc/alternatives/mpirun
/etc/alternatives/mpicc:  symbolic link to `/usr/bin/mpicc.mpich2'
/etc/alternatives/mpirun: symbolic link to `/usr/bin/mpirun.openmpi'
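The same mismatch can be spotted with a few lines of Python instead of reading the file output by eye. This is only a sketch: the helper names mpi_flavor and check_consistency are made up for illustration, and it assumes the naming seen above, where the fully resolved wrapper ends in .mpich2 or .openmpi.

```python
import os
import re


def mpi_flavor(resolved_path):
    """Return the suffix after the first dot of the file name, e.g. 'openmpi'.

    Returns None when the name has no such suffix (hypothetical helper).
    """
    name = os.path.basename(resolved_path)
    match = re.match(r"^[^.]+\.(.+)$", name)
    return match.group(1) if match else None


def check_consistency(resolved):
    """resolved maps a tool name to its fully resolved path.

    Returns (ok, flavors); ok is True only if every tool has the same flavor.
    """
    flavors = {tool: mpi_flavor(path) for tool, path in resolved.items()}
    values = set(flavors.values())
    ok = len(values) == 1 and None not in values
    return ok, flavors


if __name__ == "__main__":
    # realpath follows the whole symlink chain through /etc/alternatives.
    tools = ["/usr/bin/mpicc", "/usr/bin/mpirun"]
    resolved = {tool: os.path.realpath(tool) for tool in tools}
    ok, flavors = check_consistency(resolved)
    print(flavors, "consistent" if ok else "MISMATCH")
```

On a machine without these wrappers the script simply reports a mismatch, because no suffix can be extracted.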

I had never used alternatives before, but after some googling, the situation looks like this.

> update-alternatives --display mpi
mpi - auto mode
  link currently points to /usr/include/mpich2
/usr/include/mpich2 - priority 40
  slave libmpi++.so: /usr/lib/libmpichcxx.so
  slave libmpi.so: /usr/lib/libmpich.so
  slave libmpif77.so: /usr/lib/libfmpich.so
  slave libmpif90.so: /usr/lib/libmpichf90.so
  slave mpic++: /usr/bin/mpic++.mpich2
  slave mpic++.1.gz: /usr/share/man/man1/mpic++.mpich2.1.gz
  slave mpicc: /usr/bin/mpicc.mpich2
  slave mpicc.1.gz: /usr/share/man/man1/mpicc.mpich2.1.gz
  slave mpicxx: /usr/bin/mpicxx.mpich2
  slave mpicxx.1.gz: /usr/share/man/man1/mpicxx.mpich2.1.gz
  slave mpif77: /usr/bin/mpif77.mpich2
  slave mpif77.1.gz: /usr/share/man/man1/mpif77.mpich2.1.gz
  slave mpif90: /usr/bin/mpif90.mpich2
  slave mpif90.1.gz: /usr/share/man/man1/mpif90.mpich2.1.gz
/usr/lib/openmpi/include - priority 40
  slave libmpi++.so: /usr/lib/openmpi/lib/libmpi_cxx.so
  slave libmpi.so: /usr/lib/openmpi/lib/libmpi.so
  slave libmpif77.so: /usr/lib/openmpi/lib/libmpi_f77.so
  slave libmpif90.so: /usr/lib/openmpi/lib/libmpi_f90.so
  slave mpiCC: /usr/bin/mpic++.openmpi
  slave mpiCC.1.gz: /usr/share/man/man1/mpiCC.openmpi.1.gz
  slave mpic++: /usr/bin/mpic++.openmpi
  slave mpic++.1.gz: /usr/share/man/man1/mpic++.openmpi.1.gz
  slave mpicc: /usr/bin/mpicc.openmpi
  slave mpicc.1.gz: /usr/share/man/man1/mpicc.openmpi.1.gz
  slave mpicxx: /usr/bin/mpic++.openmpi
  slave mpicxx.1.gz: /usr/share/man/man1/mpicxx.openmpi.1.gz
  slave mpif77: /usr/bin/mpif77.openmpi
  slave mpif77.1.gz: /usr/share/man/man1/mpif77.openmpi.1.gz
  slave mpif90: /usr/bin/mpif90.openmpi
  slave mpif90.1.gz: /usr/share/man/man1/mpif90.openmpi.1.gz
Current 'best' version is '/usr/include/mpich2'.

I don't fully understand it, but I log back in as root and manually select Open MPI.*4

# update-alternatives --config  mpi
There are 2 choices for the alternative mpi (providing /usr/include/mpi).

  Selection    Path                      Priority   Status
------------------------------------------------------------
* 0            /usr/include/mpich2        40        auto mode
  1            /usr/include/mpich2        40        manual mode
  2            /usr/lib/openmpi/include   40        manual mode

Press enter to keep the current choice[*], or type selection number: 2
update-alternatives: using /usr/lib/openmpi/include to provide /usr/include/mpi (mpi) in manual mode

Log in again as sgeadmin and check that Open MPI is now being used.

sgeadmin@master:~$ file /usr/bin/mpicc /usr/bin/mpirun
/usr/bin/mpicc:  symbolic link to `/etc/alternatives/mpicc'
/usr/bin/mpirun: symbolic link to `/etc/alternatives/mpirun'
sgeadmin@master:~$ file /etc/alternatives/mpicc /etc/alternatives/mpirun
/etc/alternatives/mpicc:  symbolic link to `/usr/bin/mpicc.openmpi'
/etc/alternatives/mpirun: symbolic link to `/usr/bin/mpirun.openmpi'

Now that it finally works properly, I write a test program like this and run it.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv)
{
  int nproc;
  int myrank;
  /* MPI_Get_processor_name needs room for MPI_MAX_PROCESSOR_NAME chars */
  char hostname[MPI_MAX_PROCESSOR_NAME];
  int name_len;
  int ierr;
  int i;

  MPI_Init(&argc, &argv);

  ierr = MPI_Comm_size(MPI_COMM_WORLD, &nproc);
  if (ierr != MPI_SUCCESS) fprintf(stderr, "ierr from MPI_Comm_size = %d\n", ierr);
  ierr = MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
  if (ierr != MPI_SUCCESS) fprintf(stderr, "ierr from MPI_Comm_rank = %d\n", ierr);

  MPI_Get_processor_name(hostname, &name_len);

  /* Let one rank print per iteration so the output appears in rank order */
  for (i = 0; i < nproc; i++)
  {
    if (myrank == i)
    {
      fprintf(stderr, "myrank is %d of %d on %s\n", myrank, nproc, hostname);
    }
    MPI_Barrier(MPI_COMM_WORLD);
  }

  MPI_Finalize();
  return 0;
}

> mpicc tmp.c
> mpirun -np 8 -machinefile hostfile ./a.out
myrank is 0 of 8 on master
myrank is 1 of 8 on master
myrank is 2 of 8 on node001
myrank is 3 of 8 on node001
myrank is 4 of 8 on node002
myrank is 5 of 8 on node002
myrank is 6 of 8 on node003
myrank is 7 of 8 on node003

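It finally ran correctly. Incidentally, until I changed the alternatives setting, calling MPI_Comm_size reported that the number of processes was 0.
Since mpich2 and Open MPI were mixed, the cause is probably that the value of MPI_COMM_WORLD differs between the two libraries.

This is a common pitfall in environments that provide multiple MPI libraries. Several values defined by the MPI standard (MPI_COMM_WORLD, MPI_SUM, and so on) are in practice often implemented as macros defined in mpi.h, and the values usually differ from one library to another, so they are not compatible.
These predefined constants are replaced at compile time (strictly speaking, at preprocessing time) with the values defined by each library. So when, as here, the mpi.h used at compile time and the shared library called at run time belong to different implementations, the values no longer match: the caller believes it is passing MPI_COMM_WORLD, but the library sees an undefined communicator.*5
If an SE who doesn't understand this does careless work, users will come complaining "My program stopped working!", so be careful. lol*6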
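To make the mechanism concrete, here is a toy model in Python. It is not how either library is actually implemented: the handle values 1001 and 2002, the ToyLibrary class, and compiled_call are all invented for illustration. The only point is that a constant baked in from one header means nothing to a library that registered its communicator under a different value.

```python
# Toy model only: the numbers are made up and are NOT the real handle
# values of mpich2 or Open MPI.
HEADER_A = {"MPI_COMM_WORLD": 1001}  # value substituted by "header A"
HEADER_B = {"MPI_COMM_WORLD": 2002}  # value substituted by "header B"


class ToyLibrary:
    """Stands in for the shared library loaded at run time."""

    def __init__(self, header, world_size):
        # The library only knows the handle value from its own header.
        self._comm_sizes = {header["MPI_COMM_WORLD"]: world_size}

    def comm_size(self, handle):
        # An unknown handle yields None here; a real library's behavior
        # for an invalid communicator may differ.
        return self._comm_sizes.get(handle)


def compiled_call(header, library):
    """Mimic a program compiled against `header` but run with `library`."""
    handle = header["MPI_COMM_WORLD"]  # fixed at "compile time"
    return library.comm_size(handle)


if __name__ == "__main__":
    print(compiled_call(HEADER_A, ToyLibrary(HEADER_A, 8)))  # matching pair
    print(compiled_call(HEADER_A, ToyLibrary(HEADER_B, 8)))  # mixed pair
```

The matching pair returns the world size, while the mixed pair gets nothing useful back, which is the same kind of nonsense answer seen before the alternatives fix.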

Now, since we're here, let's run HPL too. First, download the HPL source.*7

HPL - A Portable Implementation of the High-Performance Linpack Benchmark for Distributed-Memory Computers

Send it over with starcluster put, the same way as the key earlier, then just build and run.

> starcluster put smallcluster hpl-2.2.tar.gz /tmp
sgeadmin@master:~$ tar xfz /tmp/hpl-2.2.tar.gz
sgeadmin@master:~$ cd hpl-2.2/
sgeadmin@master:~/hpl-2.2$ cp setup/Make.Linux_PII_FBLAS ./

Open Make.Linux_PII_FBLAS in an editor and set it as follows.

TOPdir       = $(HOME)/hpl-2.2
MPdir, MPinc, MPlib  -> comment out
LAdir=/usr/lib
LAlib= $(LAdir)/libblas.a
CC=mpicc
LINKER=mpicc

Then run make.

> make arch=Linux_PII_FBLAS 2>&1 |tee makelog

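If the build succeeds, an executable named xhpl and a configuration file named HPL.dat are generated under bin/Linux_PII_FBLAS.
By default Ns is too small, and the file is set up to repeat the measurement while sweeping several settings, so open HPL.dat in an editor and change the number at the left end of every line labeled "# of ..." to 1. Then set roughly Ns=10000, NBs=64, Ps=4, Qs=2 and submit the job through SGE.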
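Before submitting, it can help to sanity-check the numbers going into HPL.dat. The sketch below is my own helper, not part of HPL: it checks that the P x Q grid fits in the number of MPI processes and estimates the size of the N x N double-precision matrix. The 80% memory fraction is a common rule of thumb, and the 3.75 GiB per node figure is an assumption about the c3.large instance type.

```python
def hpl_matrix_bytes(n):
    """Bytes needed for the N x N coefficient matrix in double precision."""
    return 8 * n * n


def grid_fits(p, q, nprocs):
    """HPL needs at least P * Q MPI processes for a P x Q grid."""
    return p * q <= nprocs


def suggest_n(total_mem_bytes, fraction=0.8, nb=64):
    """Largest multiple of NB whose matrix fits in `fraction` of memory.

    The default fraction is a rule of thumb, not an HPL requirement.
    """
    n = int((fraction * total_mem_bytes / 8) ** 0.5)
    return n - n % nb


if __name__ == "__main__":
    nodes = 4
    mem_per_node = 3.75 * 2**30  # assumed memory of one c3.large, in bytes
    print("grid 4x2 fits 8 processes:", grid_fits(4, 2, 8))
    print("matrix for N=10000: %.2f GB" % (hpl_matrix_bytes(10000) / 1e9))
    print("suggested N:", suggest_n(nodes * mem_per_node))
```

Under these assumptions N=10000 uses only about 0.8 GB across the whole cluster, so there is plenty of room to increase it.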

> qsub -cwd -pe orte 8 -b y mpirun ./xhpl

On normal completion, standard output and standard error are written to files named mpirun.o* and mpirun.e*.

sgeadmin@master:~/hpl-2.2/bin/Linux_PII_FBLAS$ cat mpirun.o13
================================================================================
HPLinpack 2.2  --  High-Performance Linpack benchmark  --   February 24, 2016
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      :   10000
NB     :      64
PMAP   : Row-major process mapping
P      :       4
Q      :       2
PFACT  :    Left
NBMIN  :       2
NDIV   :       2
RFACT  :    Left
BCAST  :   1ring
DEPTH  :       0
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be               1.110223e-16
- Computational tests pass if scaled residuals are less than                16.0

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR00L2L2       10000    64     4     2              88.81              7.509e+00
HPL_pdgesv() start time Tue May 31 07:56:39 2016

HPL_pdgesv() end time   Tue May 31 07:58:08 2016

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0013934 ...... PASSED
================================================================================

Finished      1 tests with the following results:
              1 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================

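7.5 GFlops on 8 cores is quite low.
The CPU is reported as Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz, so the theoretical peak for 8 cores in total is 4*2*2.8*8 = 179.2 GFlops, which gives an efficiency of about 4.2%.
The problem size was evidently too small, so I increased it to N=40000 and measured again. Even so, it reaches only 27.4 GFlops, an efficiency of about 15%.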

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR00L2L2       40000    64     4     2            1557.25              2.740e+01
HPL_pdgesv() start time Tue May 31 08:17:37 2016

HPL_pdgesv() end time   Tue May 31 08:43:35 2016

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0007106 ...... PASSED
================================================================================
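The efficiency figures can be rechecked from the factors quoted above (4 x 2 flops per cycle, 2.8 GHz, 8 cores) and the Gflops columns of the two runs. The function names here are just for illustration, and treating all 8 slots as full cores follows the post's own assumption.

```python
def peak_gflops(simd_width, flops_per_lane, ghz, cores):
    """Theoretical peak: SIMD lanes x flops per lane per cycle x GHz x cores."""
    return simd_width * flops_per_lane * ghz * cores


def efficiency(measured_gflops, peak):
    """Fraction of the theoretical peak that was actually achieved."""
    return measured_gflops / peak


peak = peak_gflops(4, 2, 2.8, 8)

# (N, measured Gflops) taken from the two HPL result tables
for n, gflops in [(10000, 7.509), (40000, 27.40)]:
    print("N=%d: %.1f%% of %.1f GFlops peak"
          % (n, 100 * efficiency(gflops, peak), peak))
```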

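According to the documentation, the BLAS used by starcluster is ATLAS, so without rebuilding it this may be about the limit. If I have time, I'll follow up with something like the DGEMM test from HPCC.

This has gotten quite long, so cfncluster will be covered in the next post.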

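*1: I feel like I saw it at SC in 2008, but I may be imagining it.

*2: Whether such users actually exist is another matter. lol

*3: For example, pressing Ctrl-C abruptly kills the ssh session...

*4: If this is meant to work in auto mode, shouldn't the priority values really be changed...?

*5: I haven't checked whether MPI_Comm_size/rank are really allowed to behave like this when the communicator is undefined.

*6: That said, it is also true that many SEs struggle to understand configuration mistakes in MPI or job schedulers even when an end user points them out.

*7: Oh, it was updated this year... I wonder what they changed at this late stage.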

2016-03-24

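Ansible on Windows (Cygwin), part 3

I got it working with paramiko, but then noticed the fatal drawback that .ssh/config cannot be used, so I gave OpenSSH another try.

That said, however much I googled, all I found were articles saying, like the blog I found two posts ago, that specifying "-o ControlMaster=no" is enough.
blog.simonmetzger.de

With nothing to lose, I tried once more and wrote the following in ansible.cfg: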

[ssh_connection]
ssh_args = -o ControlMaster=no

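...and it just worked.

What's more, I reran it several times at intervals and it has succeeded 100% of the time.

Looking back at my own post from two posts ago, it says:

However, when I wrote it in ansible.cfg, -vvvv still showed ControlMaster=auto, so I added a command-line argument instead:
> ansible hoge.huga.com -m ping -i hosts --ssh-extra-args="-o ControlMaster=no"

However, that seems to have been some kind of mistake*1: this time, with the -vvvv option, ControlMaster=no was set.
My guess is that writing it in ansible.cfg changes the options added by default, whereas --ssh-extra-args passes additional options on top of them.*2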
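Since the earlier failure may well have been a simple typo in the config, a quick check of the file itself can rule that out. The sketch below only parses the INI text with Python's configparser; it says nothing about what Ansible finally passes to ssh (use -vvvv for that). The function names are made up for illustration.

```python
import configparser


def ssh_args_from_cfg(text):
    """Return the ssh_args value from ansible.cfg text, or None if absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser.get("ssh_connection", "ssh_args", fallback=None)


def control_master_disabled(ssh_args):
    """True if the argument string contains '-o ControlMaster=no'."""
    if not ssh_args:
        return False
    tokens = ssh_args.split()
    for i, token in enumerate(tokens):
        # Accept both "-o ControlMaster=no" and "-oControlMaster=no".
        if token == "-o" and i + 1 < len(tokens):
            if tokens[i + 1].lower() == "controlmaster=no":
                return True
        if token.lower() == "-ocontrolmaster=no":
            return True
    return False


if __name__ == "__main__":
    sample = "[ssh_connection]\nssh_args = -o ControlMaster=no\n"
    print(control_master_disabled(ssh_args_from_cfg(sample)))
```

To check a real file, read its contents and pass the text to ssh_args_from_cfg.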

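So I'm discarding the paramiko settings I made last time and will live with OpenSSH.

*1: I suspect a simple mistake on the level of misspelling something in ansible.cfg...

*2: Given the names ssh_args and ssh-extra-args, this behavior seems the more plausible one.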

2016-03-14

Ansible on Windows (Cygwin), part 2

The problem of ssh connections sometimes succeeding and sometimes failing is still unresolved, but

if OpenSSH won't work, why not use paramiko?

So I tried it with paramiko.

ansible hoge.huga.com -m ping -i hosts -c paramiko -k
SSH password:  <- enter the passphrase of the private key
hoge.huga.com | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

This one succeeded easily!
I tried several times and it connects 100% of the time.

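However, according to the article below, paramiko is said to be slower than OpenSSH, so I specify the following setting in ansible.cfg.

RHEL6系でansibleを使うならrecord_host_keysをFalseにすると速くなる - still deeper

record_host_keys=False

Also, since specifying -c paramiko and the like every time is tedious, I put it all together in ansible.cfg as follows.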

[defaults]
ask_pass=True
transport=paramiko
[paramiko]
record_host_keys=False

Now pinging once more:

ansible hoge.huga.com -m ping -i hosts -c paramiko -k
SSH password:  <- enter the passphrase of the private key
hoge.huga.com | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

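Good, no problems.
For performance reasons I may eventually want to switch from paramiko to OpenSSH, but I don't plan to manage a large number of nodes with this, so I'll go with it for the time being.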

2016-03-11

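Ansible on Windows (Cygwin)

A long time ago I tried to force-install ansible on a native Windows Python, but it depended so heavily on Unix-only standard library modules such as pwd and fcntl that I got nowhere and gave up. Recently I started using babun, a Cygwin derivative, so I decided to try again there.

Installing pip

The Python in babun is currently 2.7.10, so according to the pip web page it should already be installed: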

pip is already installed if you're using Python 2 >=2.7.9 or Python 3 >=3.4 downloaded from
https://pip.pypa.io/en/stable/installing/

but apparently it isn't: which pip returns the pip belonging to the native Windows Python.

> which pip
/cygdrive/c/ProgramData/chocolatey/bin/pip

According to the official English documentation:

The ensurepip package provides support for bootstrapping the pip installer into an existing Python installation or virtual environment. This bootstrapping approach reflects the fact that pip is an independent project with its own release cycle, and the latest available stable version is bundled with maintenance and feature releases of the CPython reference interpreter.
In most cases, end users of Python shouldn’t need to invoke this module directly (as pip should be bootstrapped by default), but it may be needed if installing pip was skipped when installing Python (or when creating a virtual environment) or after explicitly uninstalling pip.
https://docs.python.org/2.7/library/ensurepip.html

so let's install it with ensurepip.

> which python
/usr/bin/python
> python -m ensurepip
Ignoring indexes: https://pypi.python.org/simple
Collecting setuptools
Collecting pip
Installing collected packages: setuptools, pip
Successfully installed pip-6.1.1 setuptools-15.2
> which pip
/usr/bin/pip
> pip install -U pip
You are using pip version 6.1.1, however version 8.1.0 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Collecting pip
  Downloading pip-8.1.0-py2.py3-none-any.whl (1.2MB)
    100% |████████████████████████████████| 1.2MB 327kB/s
Installing collected packages: pip
  Found existing installation: pip 6.1.1
    Uninstalling pip-6.1.1:
      Successfully uninstalled pip-6.1.1
Successfully installed pip-8.1.0

Installing virtualenv

Install it with pip, following the Windows article I wrote earlier.
hpcmemo.hatenablog.com

> pip install virtualenv
Collecting virtualenv
  Downloading virtualenv-15.0.0-py2.py3-none-any.whl (1.8MB)
    100% |████████████████████████████████| 1.8MB 231kB/s
Installing collected packages: virtualenv
Successfully installed virtualenv-15.0.0

Installing ansible

Create an environment with virtualenv and install ansible into it.

> virtualenv ansible
New python executable in /cygdrive/c/Users/n_so5/OneDrive/Python/ansible/bin/python2.7
Also creating executable in /cygdrive/c/Users/n_so5/OneDrive/Python/ansible/bin/python
Installing setuptools, pip, wheel...done.
> cd ansible
> . bin/activate
(ansible)>  pip install ansible
Collecting ansible
  Using cached ansible-2.0.1.0.tar.gz
Collecting paramiko (from ansible)
  Downloading paramiko-1.16.0-py2.py3-none-any.whl (169kB)
    100% |████████████████████████████████| 174kB 706kB/s
Collecting jinja2 (from ansible)
  Downloading Jinja2-2.8-py2.py3-none-any.whl (263kB)
    100% |████████████████████████████████| 266kB 1.1MB/s
Collecting PyYAML (from ansible)
  Downloading PyYAML-3.11.tar.gz (248kB)
    100% |████████████████████████████████| 256kB 758kB/s
Requirement already satisfied (use --upgrade to upgrade): setuptools in /usr/lib/python2.7/site-packages (from ansible)
Collecting pycrypto>=2.6 (from ansible)
  Downloading pycrypto-2.6.1.tar.gz (446kB)
    100% |████████████████████████████████| 450kB 975kB/s
Collecting ecdsa>=0.11 (from paramiko->ansible)
  Downloading ecdsa-0.13-py2.py3-none-any.whl (86kB)
    100% |████████████████████████████████| 92kB 2.3MB/s
Collecting MarkupSafe (from jinja2->ansible)
  Downloading MarkupSafe-0.23.tar.gz
Installing collected packages: ecdsa, pycrypto, paramiko, MarkupSafe, jinja2, PyYAML, ansible
  Running setup.py install for pycrypto ... done
  Running setup.py install for MarkupSafe ... done
  Running setup.py install for PyYAML ... done
  Running setup.py install for ansible ... done
Successfully installed MarkupSafe-0.23 PyYAML-3.11 ansible-2.0.1.0 ecdsa-0.13 jinja2-2.8 paramiko-1.16.0 pycrypto-2.6.1

Checking that it works

As a first test, let's try a ping.

> echo hoge.huga.com > hosts
> ansible hoge.huga.com -m ping -i hosts
0 [main] python2.7 6852 child_info_fork::abort: address space needed by '_speedups.dll' (0x460000) is already occupied
ERROR! Unexpected Exception: 'NoneType' object has no attribute 'terminate'
to see the full traceback, use -vvv

Wondering what _speedups.dll was, I ran find:

> find ./ -name _speedups.dll
./lib/python2.7/site-packages/markupsafe/_speedups.dll

It appears to be a library included in MarkupSafe, which was installed alongside ansible a moment ago.

Guessing that this is the same problem as in the article below, I tried the same fix.
d.hatena.ne.jp

> find `pwd` -name '*.dll' -o -name '*.so' >~/rebase_list
(Exit the babun terminal, then run %USERPROFILE%\.babun\cygwin\bin\ash.exe)
$cd $HOME
$/bin/rebaseall -T rebase_list -v
(snip)
/usr/bin/cygattr-1.dll: new base = 6fe10000, new size = 10000
/usr/bin/cygatomic-1.dll: new base = 6fe20000, new size = 20000
/usr/bin/cygaspell-15.dll: new base = 6fe40000, new size = b0000
/usr/bin/cygarchive-13.dll: new base = 6fef0000, new size = b0000
/usr/bin/cygaprutil-1-0.dll: new base = 6ffa0000, new size = 30000
/usr/bin/cygapr-1-0.dll: new base = 6ffd0000, new size = 30000
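For reference, the find command that builds rebase_list can be mirrored in Python. This is just a sketch of the same file collection; the function name is made up, and it does not run rebaseall itself.

```python
import os


def find_native_libs(root, suffixes=(".dll", ".so")):
    """Collect files under root whose names end in .dll or .so."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


if __name__ == "__main__":
    # Print one path per line, the format expected for a list file.
    for path in find_native_libs(os.getcwd()):
        print(path)
```

Redirecting the output to ~/rebase_list gives the same kind of list as the find command above.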

Start babun again, and:

> ansible hoge.huga.com -m ping -i hosts
hoge.huga.com | UNREACHABLE! => {
    "changed": false,
    "msg": "SSH encountered an unknown error during the connection. We recommend you re-run the command using -vvvv, which will enable SSH debugging output to help diagnose the issue",
    "unreachable": true
}

The fork problem seems to be resolved, but it still cannot connect.
Running with -vvvv, I saw "Broken Pipe" near the end of the log, so I googled it and found this article.
blog.simonmetzger.de
However, when I wrote it in ansible.cfg, -vvvv still showed ControlMaster=auto, so I added a command-line argument instead:

> ansible hoge.huga.com -m ping -i hosts --ssh-extra-args="-o ControlMaster=no"


That still didn't change the UNREACHABLE result, so I kept googling for a while and found this page.
serverfault.com
Looking at the owner/group of the files under .ssh, the group was "None", so I tried

> chown <username>:Users ~/.ssh/*

and at last the connection succeeded.

> ansible hoge.huga.com -m ping -i hosts --ssh-extra-args="-o ControlMaster=no"
hoge.huga.com | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

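Rerunning it several times sometimes ends in failure, so it is still far from practical, but I'll stop here for now.

Summary so far

When using ansible from a babun environment:

After installing ansible, make a list of the dll files and run rebaseall
Set the owner/group of the files under .ssh/ correctly
Add --ssh-extra-args="-o ControlMaster=no"

Even then, it sometimes fails... orz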

2016-02-12

Intel's Python distribution

The other day, while digging through the VTune documentation, I found something interesting.

Python* Distribution | Intel® Developer Zone

From the documentation, it appears to be a Python that bundles MKL and numpy/scipy.*1

To download it you need to join the Technical Preview, and joining apparently also enrolls you in the Technical Preview of the latest VTune.

Python* Profiling | Intel® Developer Zone

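This seems to be the full-featured version rather than a Python-only limited edition*2, so there should be plenty to play with.

First I need to get a machine ready to install Python on.

*1: Isn't this the same as Enthought's paid version?

*2: When I tried to quietly install it on my work PC, it said "Newer version is already installed", so...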

2016-01-29

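Running tests at commit time with git hooks

Since starting work I have gone CVS -> subversion -> git and used a VCS for nearly ten years, yet I still haven't solved the elementary problem of forgetting to commit...

Write or modify the source
Build
Test
Commit

Assuming the cause is that, when working in this order, time passes during the build and test and the work stalls, I built a poka-yoke (mistake-proofing) mechanism using git hooks.

As a sample project, I wrote the good old hello world in C.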

#include <stdio.h>
int main(int argc, char *argv[])
{
  printf("hello world!\n");
  return 0;
}

The makefile for the build looks like this.

LM=hello
LINKER=$(CC)
LDFLAGS=$(CFLAGS)

SRCS_C=hello.c
OBJS_C=$(SRCS_C:%.c=%.o)

OBJS=$(OBJS_C)

all:$(LM)

$(LM):$(OBJS)
        $(LINKER) $(LDFLAGS) $(OBJS) -o $(LM)


.PHONY: clean

clean:
        -rm -rf *.o $(LM)

As the test script, place the following Python script as test.py.

#!/usr/bin/python
# -*- coding: utf-8 -*-

import subprocess
import sys

if __name__ == "__main__":
    output=subprocess.check_output("./hello")
    if output == "Hello world!\n":
        sys.exit(0)
    else:
        print("FAILED!!")
        print(output)
        sys.exit(1)

Once this is ready, prepare the hook.

Create the following shell script as ".git/hooks/commit-msg".

#!/bin/sh

# If the commit message is not empty,
# run the build and the tests
grep -v -e'^#' -e'^$' "$1" >/dev/null
if [ $? -eq 0 ];then
  make && python test.py
  exit $?
fi
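The grep line in the hook is a little cryptic, so here is the same test restated in Python as a sketch. The function name has_real_message is made up; like the grep, it treats the message as non-empty if any line is neither blank nor a comment starting with #.

```python
def has_real_message(text):
    """Mirror `grep -v -e'^#' -e'^$'`: is there a non-empty, non-comment line?"""
    for line in text.splitlines():
        # Same rule as the grep: skip empty lines and lines starting with '#'.
        if line and not line.startswith("#"):
            return True
    return False


if __name__ == "__main__":
    sample = "Initial commit\n\n# Please enter the commit message\n"
    print(has_real_message(sample))
```

When the message consists only of comments and blank lines, the hook skips the build and test and leaves it to git to abort the empty commit.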

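With this in place, make and test.py run automatically once you finish writing the message in git commit.
In the current state, hello prints "hello world!" while the test expects "Hello world!", so the test should fail and the commit should be refused.
Let's try it right away.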

> git add hello.c Makefile test.py
>  git commit -m'Initial commit'
cc    -c -o hello.o hello.c
cc  hello.o -o hello
FAILED!!
hello world!

>git log
fatal: bad default revision 'HEAD'

Indeed, nothing was committed!
After fixing hello.c so that it correctly prints "Hello world!":

 git commit -m'Initial commit'
cc    -c -o hello.o hello.c
cc  hello.o -o hello
[master (root-commit) c524d0b] Initial commit
 3 files changed, 42 insertions(+)
 create mode 100644 Makefile
 create mode 100644 hello.c
 create mode 100755 test.py

>git log
commit c524d0bd9c1ed94fa20b9eab29a222f6312237d3
Author: n_so5 <n_so5@localhost>
Date:   Fri Jan 29 14:10:47 2016 +0900

    Initial commit

It was committed successfully!

Now all that's needed is the willpower to always write tests... orz

2015-12-02

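Setting up cmder to use the Intel Compiler and VS

Numerical computing may conjure an image of logging into a supercomputer at a university and working on the Linux command line, but every now and then I meet people who develop on Windows with Visual Studio and the Intel compiler.*1

As a result, in contract development I am sometimes asked to do the work on Windows after all, but VS takes far too long to start, so I want to work in my usual style of writing code in vi and building from the command line.
However, the command prompt that the Intel compiler installer sets up is ultimately just the default Windows command prompt with some environment variables set, so it is terribly awkward to use.

That was a long preamble, but the goal this time is to tweak cmder's settings so that it launches a cmd.exe in which the Intel compiler is usable.
For reference, my environment is Intel Parallel Studio 2016 + Visual Studio 2015 + Windows 7, so adjust install paths and the like as appropriate.

First, check the existing settings

From the Start menu, follow "All Programs" -> "Intel Parallel Studio XE 2016" -> "Compiler and Performance Libraries" -> "Command Prompt with Intel Compiler 16.0" and right-click "Intel 64 Visual Studio 2015 environment".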

[Screenshot: f:id:n_so5:20151202104215p:plain]

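Once the Properties window is open, copy the string in the Target field, then start cmder.
In cmder's settings screen, go to "Startup" -> "Tasks", press the "+" button, and paste the string into the box at the lower right.
Then give the task a suitable name in the box near the top, and you're done.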

[Screenshot: f:id:n_so5:20151202111235p:plain]

In cmder, clicking the downward triangle next to the [+] button brings up a long menu; choose the task you just created, and a command prompt starts with icl on the path, like this.

[Screenshot: f:id:n_so5:20151202111651p:plain]

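Add vi + ctags and you can code comfortably.
Well, if I were free to change things, I'd switch the build to cmake and just leave a batch file there...

*1: A former colleague said he had only ever used VS, which was quite a culture shock for me when he joined the company.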

HPCメモ

Notes on things that may or may not be related to HPC (High Performance Computing)