
I/O Modes

For a run on multiple processors, scalar, 1D, and 2D output is always written by processor zero only (that is, the required data from all other processors is sent to processor zero, which then outputs all the gathered data). For full-dimensional output of grid arrays this can become quite expensive, since output by a single processor is likely to create an I/O bottleneck and delay further computation. For this reason Cactus offers different I/O modes for such output, controlled by the IO::out_mode parameter in combination with IO::out_unchunked and IO::out_proc_every. These parameters allow I/O to be optimised for your particular machine architecture and needs:

IO::out_mode = "onefile"
As for the 1D and 2D I/O methods, writing to file is performed only by processor zero. This processor gathers all the output data from the other processors and then writes to a single file. The gathered grid array data can either be written in chunks (IO::out_unchunked = "no"), with each chunk containing the data from a single processor, or be collected into a single global array before writing (IO::out_unchunked = "yes"). The default is to write the data in chunks. This setting can be overridden by adding an option string to the group/variable name(s) in the out_vars parameter, with the key out_unchunked and an associated string value of "yes", "no", "true", or "false".
IO::out_mode = "np"
Output is written in parallel by groups of processors. Each group consists of IO::out_proc_every processors and is assigned one I/O processor, which gathers the data from its group and writes it to file. The chunked output therefore goes into one file per group, that is, the total number of processors divided by IO::out_proc_every (rounded up). The default number of processors in a group is eight.
IO::out_mode = "proc"
This is the default output mode. Every processor writes its own chunk of data into a separate output file.
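
The three modes differ only in which processor writes the data owned by a given processor. The following Python sketch illustrates that mapping and the resulting number of output files. It is not Cactus code: the function names are hypothetical, and it assumes that groups in "np" mode are formed from consecutive processor ranks with the lowest rank in each group acting as the I/O processor, which the text above does not state.

```python
def io_processor(rank, nprocs, mode, out_proc_every=8):
    """Return the rank that writes the data owned by `rank`.

    Hypothetical helper for illustration only. Assumes "np" groups are
    contiguous rank ranges whose lowest rank is the I/O processor.
    """
    if not 0 <= rank < nprocs:
        raise ValueError("rank out of range")
    if mode == "onefile":
        # Processor zero gathers and writes everything.
        return 0
    if mode == "np":
        if out_proc_every < 1:
            raise ValueError("out_proc_every must be positive")
        # One writer per group of out_proc_every processors.
        return rank - rank % out_proc_every
    if mode == "proc":
        # Every processor writes its own chunk.
        return rank
    raise ValueError(f"unknown mode: {mode!r}")


def count_output_files(nprocs, mode, out_proc_every=8):
    """Number of distinct writers, i.e. files per output variable."""
    writers = {io_processor(r, nprocs, mode, out_proc_every)
               for r in range(nprocs)}
    return len(writers)
```

Under these assumptions, a 64-processor run produces one file in "onefile" mode, eight files in "np" mode with the default group size of eight, and 64 files in "proc" mode.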

The "proc" mode, in which every processor writes its own file, is probably the most efficient output mode on machines with a fast I/O subsystem and many I/O nodes (e.g. a Linux cluster with local disks attached to each node), because it provides the highest degree of parallelism for output. Note that on very large numbers of processors you may have to fall back to "np" mode, in which only one processor in every IO::out_proc_every writes output, if the system limit on the number of open file descriptors is exceeded (this is true for large jobs on a T3E).

While the "np" and "proc" I/O modes are fast for outputting large amounts of data from all or a group of processors in parallel, they have the disadvantage of writing chunked files. These files then have to be recombined during a postprocessing phase so that the final unchunked data can be visualized by standard tools. For that purpose a recombiner utility program is provided by the thorns offering parallel I/O methods.
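
To make the recombination step concrete, the following Python sketch assembles a 2D global array from per-processor chunks. It is only an illustration of the idea, not the recombiner utility itself: the chunk representation (an origin offset plus a block of rows) and the function name are assumptions, and details such as the actual file format and ghost zones are ignored.

```python
def recombine(global_shape, chunks):
    """Assemble a global 2D array from chunks.

    `chunks` is an iterable of ((row0, col0), block), where `block` is a
    list of rows and (row0, col0) is the block's offset in the global
    array. This layout is hypothetical, chosen for illustration.
    """
    nrows, ncols = global_shape
    out = [[None] * ncols for _ in range(nrows)]
    for (row0, col0), block in chunks:
        for i, row in enumerate(block):
            if row0 + i >= nrows or col0 + len(row) > ncols:
                raise ValueError("chunk exceeds the global array bounds")
            # Copy one row of the chunk into its place in the global array.
            out[row0 + i][col0:col0 + len(row)] = row
    if any(value is None for row in out for value in row):
        raise ValueError("chunks do not cover the global array")
    return out


# Two processors, each owning half of a 2 x 4 global array.
chunks = [
    ((0, 0), [[1, 2], [5, 6]]),
    ((0, 2), [[3, 4], [7, 8]]),
]
unchunked = recombine((2, 4), chunks)
```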
