Using the SAGA based Master/Worker abstraction on FutureGrid
============================================================

*FIRST* - please skim the README to understand which parts are covered
there, then return here.  This file *only* covers the FG specific parts!


Build/install the MasterWorker abstraction:
--------------------------------------------

On all FG machines, you can use the SAGA CSA installations to build the
master/worker library and examples, and any MW application you want to
write on your own.  The SAGA CSA installations are documented here:

  https://github.com/saga-project/saga-deployments/wiki

You can load the SAGA environment as in this example for
india.futuregrid.org:

  eval `grep export /N/soft/SAGA/README.saga-1.6.gcc-4.1.2.india`

The following steps will then install the master-worker components:

  # create installation tree (feel free to choose a different location)
  mkdir -p $HOME/install
  mkdir -p $HOME/install/lib
  mkdir -p $HOME/install/bin

  # build the MW library and examples
  make
  make -C example

  # install library and examples
  cp libsaga_pm_master_worker.*                  $HOME/install/lib/
  cp example/rsh/mw_rsh_master                   $HOME/install/bin/
  cp example/rsh/mw_rsh_worker                   $HOME/install/bin/
  cp example/hello_world/mw_helloworld_master    $HOME/install/bin/
  cp example/hello_world/mw_helloworld_worker    $HOME/install/bin/

  # make sure your environment is set so that the code can be run
  echo 'export PATH=$PATH:$HOME/install/bin/' >> $HOME/.bash_profile
  echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/install/lib/' >> $HOME/.bash_profile

The above preparations need to be repeated on all hosts which you intend
to use for the M/W ensemble.

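
Because the same preparations have to be repeated on every host, a small
check can help to spot a host where a step was missed.  The following is
only a sketch and not part of the MW distribution: the function names
'path_contains' and 'check_mw_install' are made up for this example, and
it assumes the install tree layout used above ($HOME/install by default).

```shell
# Hypothetical helpers, not shipped with the MW code.

# Succeed if the colon-separated list $1 contains the directory $2.
# A trailing slash on the directory or on the list entry is ignored.
path_contains() {
    dir=${2%/}
    case ":$1:" in
        *":$dir:"* | *":$dir/:"*) return 0 ;;
        *)                        return 1 ;;
    esac
}

# Check that the install tree $1 (default: $HOME/install) is usable:
# both directories are in the search paths, all four examples exist.
check_mw_install() {
    prefix=${1:-$HOME/install}
    status=0

    path_contains "$PATH" "$prefix/bin" \
        || { echo "PATH lacks $prefix/bin"; status=1; }
    path_contains "${LD_LIBRARY_PATH:-}" "$prefix/lib" \
        || { echo "LD_LIBRARY_PATH lacks $prefix/lib"; status=1; }

    for exe in mw_rsh_master mw_rsh_worker \
               mw_helloworld_master mw_helloworld_worker; do
        [ -x "$prefix/bin/$exe" ] \
            || { echo "missing $prefix/bin/$exe"; status=1; }
    done

    return $status
}

# usage, after 'source $HOME/.bash_profile':
#   check_mw_install && echo "MW install looks fine"
```

The check only looks at paths and file permissions; it does not verify
that the binaries were built against the SAGA version loaded above.
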

Master/Worker distribution
--------------------------

There are two distribution components to the MW abstraction:

  - running workers on remote machines
  - coordinating between master and worker (local to remote machine)


Running remote worker
---------------------

The two example codes contain the following lines of code:

  merzky@thinkie:~/saga/applications/master_worker$ grep wd.rm example/*/*cpp
  example/hello_world/mw_helloworld_master.cpp:  wd.rm = "fork://localhost/";
  example/rsh/mw_rsh_master.cpp:  wd.rm = input;

'wd.rm' specifies which resource URL is used to start the worker.  The
helloworld example has it fixed to "fork://localhost/", the rsh example
reads it from user input.  If you want to run the worker on a different
host, change "fork://localhost/" into, for example,
"ssh://sierra.futuregrid.org/".


Coordinating master / worker
----------------------------

The coordination of the MW components is done via SAGA's advert service,
which is a kind of central database used to exchange small pieces of
information.  The URL of that database is:

  advert://SAGA:SAGA_client@advert.cct.lsu.edu:8080/home/merzky/mw/run_1

In detail:

  advert://           - this is an advert database URL
  SAGA:SAGA_client    - username/password for accessing that database
  advert.cct.lsu.edu  - the database is hosted on that machine
  8080                - port the database listens on
  /home/merzky/...    - some path to separate information for each user

The Master picks that URL up from an environment setting, so the
following needs to be defined before running the master:

  export SAGA_MW_ADVERT_URL="advert://SAGA:SAGA_client@advert.cct.lsu.edu:8080/home/merzky/master_worker/"

For FutureGrid runs, please adjust the path element of the URL so that
it is unique for *you*.

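
One way to get a path element which is unique per user is to derive it
from the login name.  The sketch below assumes that the
/home/<user>/master_worker/ layout from the export line above is what
you want, and that your login name is unique among the users of the
advert server.  The helper name 'mw_advert_url' is made up for this
example; it is not part of SAGA or of the MW code.

```shell
# Hypothetical helper: print an advert URL whose path is specific to
# the given user (default: the current login name).
mw_advert_url() {
    user=${1:-$(id -un)}
    printf 'advert://SAGA:SAGA_client@advert.cct.lsu.edu:8080/home/%s/master_worker/\n' "$user"
}

# set the variable which the master reads on startup
SAGA_MW_ADVERT_URL=$(mw_advert_url)
export SAGA_MW_ADVERT_URL
```

These lines can also be added to $HOME/.bash_profile, so that the
setting is available in every new login shell.
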

Running the examples
--------------------

With the settings from above, you should be able to run the MW examples
on FG like this:

  # source $HOME/.bash_profile
  # mw_rsh_master
  advert root   : advert://localhost/home/merzky/master_worker//merzky/
  worker rm url : fork://localhost
  command       : /bin/date
  Thu Apr  5 10:27:22 CEST 2012
  command       : /bin/hostname
  thinkie
  command       : touch /tmp/hello_rsh_worker
  command       : ls -l /tmp/hello_rsh_worker
  -rw-r--r-- 1 merzky merzky 0 2012-04-05 10:27 /tmp/hello_rsh_worker
  command       : quit

The example above uses a local advert service (see README) and runs a
worker on localhost.
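
If the master does not start as shown, the two usual suspects are a
missing SAGA_MW_ADVERT_URL and a master binary which is not in PATH.
The following sketch checks both before starting the master.  The
function name 'mw_preflight' is made up for this example, and the check
of the URL scheme is only a plausibility test; it does not contact the
advert service.

```shell
# Hypothetical pre-flight check, not part of the MW code.
# Succeeds only if SAGA_MW_ADVERT_URL looks like an advert URL and
# mw_rsh_master can be found by the shell.
mw_preflight() {
    case "${SAGA_MW_ADVERT_URL:-}" in
        advert://*)
            ;;
        '')
            echo "SAGA_MW_ADVERT_URL is not set" >&2
            return 1
            ;;
        *)
            echo "SAGA_MW_ADVERT_URL is not an advert:// URL" >&2
            return 1
            ;;
    esac

    command -v mw_rsh_master >/dev/null 2>&1 || {
        echo "mw_rsh_master not found in PATH" >&2
        return 1
    }
}

# usage:
#   mw_preflight && mw_rsh_master
```
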