By now you have gotten your feet wet: installing HPX in source and out of source, getting a feel for how application and component development works, and running HPX programs with a variety of command line options. If you have experimented with multi-locality runs, you know that they are a bit of a pain to set up: you have to open as many terminals as you want localities and make sure the command line options are just right in each one. This is fine for 2 or 3 localities, but for a truly distributed run you might want as many as 64. How do you do this without opening 64 terminals? The answer is hpx_run.py.
Essentially, hpx_run.py automates the process of setting up each runtime instance, which leads to fewer human errors in multi-locality runs. If you have used MPI before, you might liken hpx_run.py to mpirun. There are indeed some similarities, but there are also a few "gotchas" that you will want to pay attention to throughout this tutorial.
The hpx_run.py script is located in your hpx checkout at $HPX_ROOT/tools/hpx_run.py.
If you haven't already, you should check out this directory and add it to your path (so you can easily run hpx_run.py).
The rest of this tutorial assumes that the HPX examples have been built and are in your path. By default, fibonacci is the only example that is built; you may need to edit HPX/examples/quickstart/CMakeLists.txt and uncomment the other examples.
Once you have hpx_run.py, it's time to start using it! Let's see what options are available by running hpx_run.py with the help flag.
    $ hpx_run.py -h
    usage: hpx_run.py [options] command

    options:
      -h, --help            show this help message and exit
      -a APP_LOGLEVEL, --app_logging=APP_LOGLEVEL
                            Enable application logging at specified level (default 0)
      -d, --debug           Put run in debug mode
      -g, --use_gdb         Execute local runtimes in GDB
      -o HPX_LOGLEVEL, --hpx_logging=HPX_LOGLEVEL
                            Enable HPX logging at specified level (default 0)
      -l LOCALITIES, --localities=LOCALITIES
                            Specify homogeneous locality layout
      -m MACHINE_FILE, --machinefile=MACHINE_FILE
                            Set resources based on machine file (e.g., PBS_NODEFILE)
      -p BASE_PORT, --port=BASE_PORT
                            Set base port (default 2222)
      -s, --shhh            Suppress most output
The bare minimum you need to run an HPX application with this script is the program name and the locality setup. For now we are going to use one locality and one thread. To do this run, all you need to type is:
    $ hpx_run.py -l 1:1 "fibonacci"
    System view: 1 nodes:
    Node 'node0' with 1 cores
    Locality set: 1 localities:
    Locality 'L0' with 1 threads
    Distributed runtime: 1 local instances:
    Runtime 'rts0' with 1 threads
    Runtime 'rts0' stdout:
    elapsed: 0.018269, result: 55
    Runtime 'rts0' quit
You have successfully run your first HPX program using hpx_run.py!
All you need to do to make this a multi-locality run is to change the locality layout.
When you specify -l you need to supply two values separated by a colon: the number of localities and the number of threads per locality. For example, -l 2:1 would run with two localities and 1 thread each. Specifying -l 4:8 would run with four localities and 8 threads each.
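To make the LOCALITIES:THREADS format concrete, here is a minimal Python sketch of how such a layout string could be split into its two numbers. The function name parse_layout is hypothetical, and this is not the code hpx_run.py itself uses; it only illustrates the format described above.

```python
def parse_layout(layout):
    """Split a '-l' style layout string such as '4:8' into two integers.

    Hypothetical helper for illustration only; hpx_run.py may parse its
    argument differently.
    """
    localities_text, sep, threads_text = layout.partition(":")
    if not sep:
        raise ValueError("expected LOCALITIES:THREADS, got %r" % layout)
    localities = int(localities_text)  # raises ValueError if not a number
    threads = int(threads_text)
    if localities < 1 or threads < 1:
        raise ValueError("both values must be at least 1")
    return localities, threads


if __name__ == "__main__":
    for text in ("2:1", "4:8"):
        localities, threads = parse_layout(text)
        print("%s -> %d localities, %d threads each" % (text, localities, threads))
```

Rejecting a string without a colon early gives a clearer error than letting a later step fail on a missing thread count.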
Now let's do a run with two localities and two threads each for fibonacci.
    $ hpx_run.py -l 2:2 "fibonacci"
    System view: 2 nodes:
    Node 'node1' with 2 cores
    Node 'node0' with 2 cores
    Locality set: 2 localities:
    Locality 'L0' with 2 threads
    Locality 'L1' with 2 threads
    Distributed runtime: 2 local instances:
    Runtime 'rts1' with 2 threads
    Runtime 'rts0' with 2 threads
    Runtime 'rts0' quit
    Runtime 'rts1' stdout:
    elapsed: 0.1188, result: 55
    Runtime 'rts1' quit
By now you might have noticed that we enclose the program in quotes. This lets us pass command line arguments to the program itself without interfering with the command line arguments to hpx_run.py. For example, we can change which Fibonacci number fibonacci calculates with the -v option.
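The effect of the quotes can be checked with Python's standard shlex module, which splits a command line into words the way a POSIX shell does. This sketch only shows how the shell groups the words; how hpx_run.py handles the quoted string internally is not shown here.

```python
import shlex

# With quotes, the program and its arguments reach hpx_run.py as ONE word,
# so '-v' cannot be mistaken for an hpx_run.py option.
quoted = shlex.split('hpx_run.py -l 2:2 "fibonacci -v 20"')
print(quoted)

# Without quotes, '-v' and '20' become separate words on hpx_run.py's
# own command line.
unquoted = shlex.split('hpx_run.py -l 2:2 fibonacci -v 20')
print(unquoted)

# The quoted word can be split again to recover the program's own arguments.
program_argv = shlex.split(quoted[-1])
print(program_argv)
```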
This time I'm going to run fibonacci4 with 4 localities and 8 threads each and a value of '20'.
    $ hpx_run.py -l 4:8 "fibonacci4 -v 20"
    System view: 4 nodes:
    Node 'node1' with 8 cores
    Node 'node0' with 8 cores
    Node 'node3' with 8 cores
    Node 'node2' with 8 cores
    Locality set: 4 localities:
    Locality 'L2' with 8 threads
    Locality 'L3' with 8 threads
    Locality 'L0' with 8 threads
    Locality 'L1' with 8 threads
    Distributed runtime: 4 local instances:
    Runtime 'rts3' with 8 threads
    Runtime 'rts2' with 8 threads
    Runtime 'rts1' with 8 threads
    Runtime 'rts0' with 8 threads
    Runtime 'rts0' quit
    Runtime 'rts1' quit
    Runtime 'rts2' quit
    Runtime 'rts3' stdout:
    elapsed: 1.83268, result: 6765
    Number of invocations of fib(): 9790
    Runtime 'rts3' quit
With just the -l option to hpx_run.py you have automated a ton of work for yourself!
Even if all you do is single locality runs, using this to automatically set up the runtime can be a huge timesaver.
Up until now we have been running locally, which is fine for testing whether your code works. However, the true benefit of localities is being able to use multiple physical resources in your computation. To this end, hpx_run.py lets you specify a machine file (usually $PBS_NODEFILE, but you can supply your own file if you want). It constructs its "System view" from that file and automatically allocates all of the listed resources to your computation. The only real difference is a change in the options to hpx_run.py, which I illustrate below.
    # This is the command to run inside of a PBS job
    $ hpx_run.py -m $PBS_NODEFILE "fibonacci4 -v 20"
You can optionally combine the -l option with a machine file to use only part of the resources allocated to you.
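As a rough picture of what building a "System view" from a machine file involves, the sketch below counts how often each host name appears in a PBS-style node file. It assumes the usual $PBS_NODEFILE layout: one host name per line, with a host repeated once for each core allocated on it. The helper name system_view and the sample host names are made up for illustration, and hpx_run.py's own parsing may differ.

```python
from collections import Counter


def system_view(machinefile_lines):
    """Map each host name to the number of times it is listed.

    Hypothetical helper; assumes one host name per line, repeated once per
    allocated core, as in a typical $PBS_NODEFILE.
    """
    hosts = (line.strip() for line in machinefile_lines)
    return Counter(host for host in hosts if host)  # skip blank lines


if __name__ == "__main__":
    # Made-up node file contents: two nodes with two cores each.
    sample = ["node0\n", "node0\n", "node1\n", "node1\n"]
    view = system_view(sample)
    print("System view: %d nodes:" % len(view))
    for host, cores in sorted(view.items()):
        print("Node '%s' with %d cores" % (host, cores))
```

An open file object can be passed in place of the sample list, since iterating over a file yields its lines.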
Well, those are the basics of using hpx_run.py for multi-locality runs. See the next installment in this tutorial for some problems you might run into, as well as for help mastering the other options.