I/O Modes
For a run on multiple processors, scalar, 1D, and 2D output is always
written by processor zero only: the required data from all other
processors is sent to processor zero, which then writes out all the gathered
data. For full-dimensional output of grid arrays this can become quite an expensive
operation, since output by a single processor is likely to create an
I/O bottleneck and delay further computation. For this reason Cactus offers
different I/O modes for such output, which are controlled by the
IO::out_mode parameter in combination with IO::out_unchunked
and IO::out_proc_every. These parameters allow I/O to be optimised for
your particular machine architecture and needs:
- IO::out_mode = "onefile"
As with the 1D and 2D I/O methods, only processor zero writes to file.
It gathers the output data from all other processors
and writes it to a single file. The gathered grid array data can
either be written in chunks (IO::out_unchunked = "no"), with each chunk
containing the data from a single processor, or be
collected into a single global array before writing (IO::out_unchunked = "yes"). The default is to write the data in chunks.
This can be changed per group or variable by adding an option string to the group/variable name(s)
in the out_vars parameter, with the key out_unchunked and an
associated string value "yes|no|true|false".
- IO::out_mode = "np"
Output is written in parallel by groups of processors. Each group
consists of IO::out_proc_every processors, one of which is assigned as the
group's I/O processor; it gathers the data from the group and writes it to file. The
chunked output therefore goes into one file per group, that is, the total number
of processors divided by IO::out_proc_every (rounded up).
The default number of processors in a group is eight.
- IO::out_mode = "proc"
This is the default output mode.
Every processor writes its own chunk of data into a separate output file.
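The three modes differ mainly in which processor ends up writing a given processor's data. The following Python sketch illustrates that mapping and the resulting number of output files. It is not taken from the Cactus sources: the function names are hypothetical, and it assumes that "np" groups are contiguous blocks of processor ranks, that the lowest rank in each block acts as the I/O processor, and that each I/O processor writes exactly one file.

```python
def io_processor(rank, mode, proc_every=8):
    """Return the rank that writes the data owned by processor `rank`.

    Hypothetical helper for illustration only; `proc_every` plays the
    role of IO::out_proc_every (default eight).
    """
    if mode == "onefile":
        return 0  # processor zero writes everything
    if mode == "np":
        # Assumption: groups are contiguous blocks of ranks and the
        # lowest rank of each block is the group's I/O processor.
        return rank - rank % proc_every
    if mode == "proc":
        return rank  # every processor writes its own file
    raise ValueError(f"unknown I/O mode: {mode!r}")


def output_file_count(nprocs, mode, proc_every=8):
    """Number of distinct writers (assumed: one file per writer)."""
    return len({io_processor(r, mode, proc_every) for r in range(nprocs)})


if __name__ == "__main__":
    for mode in ("onefile", "np", "proc"):
        print(mode, output_file_count(20, mode))
```

Under these assumptions, a 20-processor run with groups of eight has three writers in "np" mode (ranks 0, 8, and 16), the last group being only partly filled.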
The "proc" mode, in which every processor writes its own data, is probably the
most efficient output mode on machines with a fast I/O subsystem and many I/O
nodes (e.g. a Linux cluster with local disks attached to each node) because it
provides the highest degree of parallelism for outputting data. Note that on very
large numbers of processors you may have to fall back to "np" mode, in which only
one processor in every IO::out_proc_every does output, if the system limit on the
maximum number of open file descriptors is exceeded (this is
true for large jobs on a T3E).
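As a rough illustration of that fallback, the sketch below picks the smallest group size for which "np" mode stays within a given limit on simultaneously written files. The function names and the `max_files` argument are hypothetical; the actual limit depends on the machine and has to be supplied by the user. It also assumes that one file is written per group of processors.

```python
import math


def smallest_group_size(nprocs, max_files):
    """Smallest group size such that ceil(nprocs / size) <= max_files."""
    if nprocs < 1 or max_files < 1:
        raise ValueError("nprocs and max_files must be positive")
    return math.ceil(nprocs / max_files)


def suggest_mode(nprocs, max_files):
    """Return (mode, group_size); group_size is None for "proc" mode."""
    if nprocs <= max_files:
        # One file per processor still fits within the limit.
        return "proc", None
    # Otherwise group processors so that fewer files are written.
    return "np", smallest_group_size(nprocs, max_files)
```

For example, with 1000 processors and an assumed limit of 64 files, this suggests "np" with groups of 16 processors, which corresponds to 63 files.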
While the "np" and "proc" I/O modes are fast for outputting large
amounts of data from all or a group of processors in parallel, they have
the disadvantage of writing chunked files. These files then have to be
recombined during a postprocessing phase so that the final unchunked data can be
visualized by standard tools. For that purpose a recombiner utility program is
provided by the thorns offering parallel I/O methods.
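Conceptually, recombination places each processor's chunk at its position in the global array. The Python sketch below shows the idea for a one-dimensional array. It is only an illustration and not the recombiner utility itself: the chunk representation (an origin index plus a list of values) is an assumption made for this example, whereas the real utility operates on the output files of the respective I/O thorn.

```python
def recombine(chunks, global_size):
    """Assemble 1D chunks into one global list.

    `chunks` is an iterable of (origin, values) pairs, where `origin`
    is the global index of the first value in that chunk. This layout
    is hypothetical and chosen only for illustration.
    """
    result = [None] * global_size
    for origin, values in chunks:
        end = origin + len(values)
        if origin < 0 or end > global_size:
            raise ValueError("chunk lies outside the global array")
        result[origin:end] = values  # copy the chunk into place
    if any(v is None for v in result):
        raise ValueError("chunks do not cover the global array")
    return result
```

The chunks may arrive in any order, since each one carries its own origin; for instance, `recombine([(3, [3, 4, 5]), (0, [0, 1, 2])], 6)` yields the values 0 to 5 in order.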