17.2.77. MPI_Comm_spawn

MPI_Comm_spawn - Spawns a number of identical binaries.

17.2.77.1. Syntax

17.2.77.1.1. C Syntax

#include <mpi.h>

int MPI_Comm_spawn(const char *command, char *argv[], int maxprocs,
     MPI_Info info, int root, MPI_Comm comm,
     MPI_Comm *intercomm, int array_of_errcodes[])

17.2.77.1.2. Fortran Syntax

USE MPI
! or the older form: INCLUDE 'mpif.h'
MPI_COMM_SPAWN(COMMAND, ARGV, MAXPROCS, INFO, ROOT, COMM,
     INTERCOMM, ARRAY_OF_ERRCODES, IERROR)

     CHARACTER*(*) COMMAND, ARGV(*)
     INTEGER INFO, MAXPROCS, ROOT, COMM, INTERCOMM,
     ARRAY_OF_ERRCODES(*), IERROR

17.2.77.1.3. Fortran 2008 Syntax

USE mpi_f08
MPI_Comm_spawn(command, argv, maxprocs, info, root, comm, intercomm,
             array_of_errcodes, ierror)
     CHARACTER(LEN=*), INTENT(IN) :: command, argv(*)
     INTEGER, INTENT(IN) :: maxprocs, root
     TYPE(MPI_Info), INTENT(IN) :: info
     TYPE(MPI_Comm), INTENT(IN) :: comm
     TYPE(MPI_Comm), INTENT(OUT) :: intercomm
     INTEGER :: array_of_errcodes(*)
     INTEGER, OPTIONAL, INTENT(OUT) :: ierror

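17.2.77.2. Input Parameters

command: Name of the program to be spawned (string, significant only at root).
argv: Arguments to command (array of strings, significant only at root).
maxprocs: Maximum number of processes to start (integer, significant only at root).
info: A set of key-value pairs telling the runtime system where and how to start the processes (handle, significant only at root).
root: Rank of the process in which the preceding arguments are examined (integer).
comm: Intracommunicator containing the group of spawning processes (handle).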

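17.2.77.3. Output Parameters

intercomm: Intercommunicator between the original group and the newly spawned group (handle).
array_of_errcodes: One error code per process (array of integers).
ierror: Fortran only: Error status (integer).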

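17.2.77.4. Description

MPI_Comm_spawn tries to start maxprocs identical copies of the MPI program specified by command, establishes communication with them, and returns an intercommunicator. The spawned processes are referred to as children. The children have their own MPI_COMM_WORLD, which is separate from that of the parents. MPI_Comm_spawn is collective over comm, and may not return until MPI_Init has been called in the children. Similarly, MPI_Init in the children may not return until all parents have called MPI_Comm_spawn. In this sense, MPI_Comm_spawn in the parents and MPI_Init in the children form a collective operation over the union of parent and child processes. The intercommunicator returned by MPI_Comm_spawn contains the parent processes in the local group and the child processes in the remote group. The ordering of processes in the local and remote groups is the same as the ordering of the group of comm in the parents and of MPI_COMM_WORLD in the children, respectively. The children can obtain this intercommunicator through the function MPI_Comm_get_parent.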

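The MPI standard allows an implementation to use the MPI_UNIVERSE_SIZE attribute of MPI_COMM_WORLD to specify the number of processes that will be active in a program. Although this implementation of the MPI standard defines MPI_UNIVERSE_SIZE, it does not allow the user to set its value. If you try to set the value of MPI_UNIVERSE_SIZE, you will get an error message.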

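The command Argument

The command argument is a string containing the name of the program to be spawned. In C, the string is null-terminated. In Fortran, leading and trailing spaces are stripped. MPI looks for the file first in the working directory of the spawning process.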

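The argv Argument

argv is an array of strings containing the arguments that are passed to the program. The first element of argv is the first argument passed to command, not the command itself (as is conventional in some other contexts). The argument list is terminated by NULL in C and by an empty string in Fortran. Note that it is the MPI application's responsibility to ensure that the last entry of the argv array is an empty string; the compiler will not insert it automatically. In Fortran, leading and trailing spaces are always stripped, so a string consisting entirely of spaces is treated as an empty string. The constant MPI_ARGV_NULL may be used in both C and Fortran to indicate an empty argument list. In C, this constant is the same as NULL.

In C, the argv argument of MPI_Comm_spawn differs from the argv argument of main in two respects. First, it is shifted by one element. Specifically, argv[0] of main contains the name of the program (given by command), argv[1] of main corresponds to argv[0] of MPI_Comm_spawn, argv[2] of main corresponds to argv[1] of MPI_Comm_spawn, and so on. Second, the argv of MPI_Comm_spawn must be null-terminated so that its length can be determined. Passing MPI_ARGV_NULL as argv to MPI_Comm_spawn results in main receiving an argc of 1 and an argv whose element 0 is the name of the program.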
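The one-element shift described above can be made concrete with a small sketch in plain C. It does not call MPI; the helper name child_main_argv is hypothetical and only models how the command and the spawn argv would appear to the child's main, assuming the mapping stated in this section.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative helper (not part of MPI): build the argv that a child's
 * main() would see, given the command and the argv passed to
 * MPI_Comm_spawn. spawn_argv may be NULL, which is what MPI_ARGV_NULL
 * is in C. Returns a malloc'd, NULL-terminated array that borrows the
 * caller's strings; the caller frees only the array itself.
 */
char **child_main_argv(const char *command, char *spawn_argv[], int *argc_out)
{
    int nargs = 0;

    /* The spawn argv is NULL-terminated, so count up to the terminator. */
    if (spawn_argv != NULL) {
        while (spawn_argv[nargs] != NULL) {
            nargs++;
        }
    }

    /* One slot for the program name, nargs arguments, one terminator. */
    char **out = malloc((size_t)(nargs + 2) * sizeof *out);
    if (out == NULL) {
        return NULL;
    }

    out[0] = (char *)command;          /* main's argv[0] is the program name */
    for (int i = 0; i < nargs; i++) {
        out[i + 1] = spawn_argv[i];    /* spawn argv[i] becomes main argv[i + 1] */
    }
    out[nargs + 1] = NULL;

    *argc_out = nargs + 1;
    return out;
}
```

For a spawn argv of {"-n", "4", NULL} and a command of "worker", this model yields an argc of 3 with "worker" in element 0. For a NULL spawn argv it yields an argc of 1.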

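The maxprocs Argument

Open MPI tries to spawn maxprocs processes. If it is unable to spawn maxprocs processes, it raises an error of class MPI_ERR_SPAWN. If MPI is able to spawn the specified number of processes, MPI_Comm_spawn returns successfully and the number of spawned processes, m, is given by the size of the remote group of intercomm.

A spawn call with the default behavior is called hard. A spawn call for which fewer than maxprocs processes may be returned is called soft.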

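The info Argument

The info argument is an opaque handle of type MPI_Info in C and INTEGER in Fortran. It is a container for a number of user-specified (key,value) pairs. Both key and value are strings (null-terminated char * in C, character*(*) in Fortran). Routines to create and manipulate the info argument are described in Section 4.10 of the MPI-2 standard.

For the SPAWN calls, info provides additional, implementation-dependent instructions to MPI and the runtime system on how to start processes. An application may pass MPI_INFO_NULL in C or Fortran. Portable programs that do not require detailed control over process locations should use MPI_INFO_NULL.

The following info keys are recognized in Open MPI. (The reserved values mentioned in Section 5.3.4 of the MPI-2 standard are not implemented.)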

Key                    Type     Description
---                    ----     -----------

host                   char *   Host on which the process should be
                                spawned.  See the orte_host man
                                page for an explanation of how this
                                will be used.
hostfile               char *   Hostfile containing the hosts on which
                                the processes are to be spawned. See
                                the orte_hostfile man page for
                                an explanation of how this will be
                                used.
add-host               char *   Add the specified host to the list of
                                hosts known to this job and use it for
                                the associated process. This will be
                                used similarly to the -host option.
add-hostfile           char *   Hostfile containing hosts to be added
                                to the list of hosts known to this job
                                and use it for the associated
                                process. This will be used similarly
                                to the -hostfile option.
wdir                   char *   Directory where the executable is
                                located. If files are to be
                                pre-positioned, then this location is
                                the desired working directory at time
                                of execution - if not specified, then
                                it will automatically be set to
                                ompi_preload_files_dest_dir.
ompi_prefix            char *   Same as the --prefix command line
                                argument to mpirun.
ompi_preload_binary    bool     If set to true, pre-position the
                                specified executable onto the remote
                                host. A destination directory must
                                also be provided.
ompi_preload_files     char *   A comma-separated list of files that
                                are to be pre-positioned in addition
                                to the executable.  Note that this
                                option does not depend upon
                                ompi_preload_binary - files can
                                be moved to the target even if an
                                executable is not moved.
ompi_stdin_target      char *   Comma-delimited list of ranks to
                                receive stdin when forwarded.
ompi_non_mpi           bool     If set to true, launching a non-MPI
                                application; the returned communicator
                                will be MPI_COMM_NULL. Failure to set
                                this flag when launching a non-MPI
                                application will cause both the child
                                and parent jobs to "hang".
ompi_param             char *   Pass an OMPI MCA parameter to the
                                child job.  If that parameter already
                                exists in the environment, the value
                                will be overwritten by the provided
                                value.
mapper                 char *   Mapper to be used for this job
map_by                 char *   Mapping directive indicating how
                                processes are to be mapped (slot,
                                node, socket, etc.).
rank_by                char *   Ranking directive indicating how
                                processes are to be ranked (slot,
                                node, socket, etc.).
bind_to                char *   Binding directive indicating how
                                processes are to be bound (core, slot,
                                node, socket, etc.).
path                   char *   List of directories to search for
                                the executable
npernode               char *   Number of processes to spawn on
                                each node of the allocation
pernode                bool     Equivalent to npernode of 1
ppr                    char *   Spawn specified number of processes
                                on each of the identified object type
env                    char *   Newline-delimited list of envars to
                                be passed to the spawned procs
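Because the env key takes a newline-delimited list, an application has to assemble that value itself. The following is a minimal sketch in plain C; the helper name join_envars is hypothetical, and the NAME=value form of each entry is an assumption, since the table only says the list is newline-delimited.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: join `count` strings with '\n' into `out`.
 * Returns 0 on success, or -1 if `out` is too small to hold the result.
 * Assumption: each entry has the form NAME=value.
 */
int join_envars(const char *const vars[], size_t count,
                char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';

    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(vars[i]);
        size_t sep = (i > 0) ? 1 : 0;   /* newline goes between entries only */

        /* Need room for the separator, the entry, and the terminating NUL. */
        if (used + sep + len + 1 > out_size) {
            return -1;
        }
        if (sep) {
            out[used++] = '\n';
        }
        memcpy(out + used, vars[i], len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}
```

The resulting buffer could then be supplied as the value of the env key, for example with MPI_Info_set(info, "env", buf), before the info object is passed to MPI_Comm_spawn.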

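bool info keys are actually strings, but they are evaluated as follows: if the string value is a number, it is converted to an integer and cast to a boolean (that is, zero is false and any non-zero value is true). If the string value is (case-insensitive) "yes" or "true", the boolean is true. If the string value is (case-insensitive) "no" or "false", the boolean is false. All other string values are unrecognized and are therefore treated as false.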
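The evaluation rules above can be sketched as a small function in plain C. This is not Open MPI's own code: the names info_value_to_bool and equals_ignore_case are hypothetical, and parsing numbers in base 10 with strtol is an assumption.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: case-insensitive string equality. */
static bool equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return false;
        }
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Sketch of the rules described above; base-10 parsing is an assumption. */
bool info_value_to_bool(const char *value)
{
    char *end = NULL;
    long n = strtol(value, &end, 10);

    /* Entire string is a number: zero is false, non-zero is true. */
    if (end != value && *end == '\0') {
        return n != 0;
    }
    if (equals_ignore_case(value, "yes") || equals_ignore_case(value, "true")) {
        return true;
    }
    /* "no", "false", and every unrecognized string evaluate to false. */
    return false;
}
```

Under these rules an unrecognized value fails silently to false, so a misspelling such as "ture" for ompi_preload_binary would disable the option rather than report an error.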

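The root Argument

All arguments before the root argument are examined only on the process whose rank in comm is equal to root. The values of these arguments on other processes are ignored.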

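The array_of_errcodes Argument

array_of_errcodes is an array of length maxprocs in which MPI reports the status of the processes that it was requested to start. If all maxprocs processes were spawned, array_of_errcodes is filled in with the value MPI_SUCCESS. If any of the processes were not spawned, array_of_errcodes is filled in with the value MPI_ERR_SPAWN. In C or Fortran, an application may pass MPI_ERRCODES_IGNORE if it is not interested in the error codes.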

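17.2.77.5. Notes

Completion of MPI_Comm_spawn in the parent does not necessarily mean that MPI_Init has been called in the children (although the returned intercommunicator can be used immediately).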

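17.2.77.6. Errors

Almost all MPI routines return an error value: C routines as the return value of the function, and Fortran routines in the last argument.

Before the error value is returned, the current MPI error handler associated with the communication object (e.g., communicator, window, file) is called. If no communication object is associated with the MPI call, then the call is considered attached to MPI_COMM_SELF and will call the associated MPI error handler. When MPI_COMM_SELF is not initialized (i.e., before MPI_Init/MPI_Init_thread, after MPI_Finalize, or when using the Sessions Model exclusively), the error raises the initial error handler. The initial error handler can be changed by calling MPI_Comm_set_errhandler on MPI_COMM_SELF when using the World Model, through the mpi_initial_errhandler command-line argument to mpiexec, or through the info key of the same name passed to MPI_Comm_spawn/MPI_Comm_spawn_multiple. If no other appropriate error handler has been set, then the MPI_ERRORS_RETURN error handler is called for MPI I/O functions and the MPI_ERRORS_ABORT error handler is called for all other MPI functions.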

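Open MPI includes three predefined error handlers that can be used:

MPI_ERRORS_ARE_FATAL Causes the program to abort all connected MPI processes.
MPI_ERRORS_ABORT An error handler that can be invoked on a communicator, window, file, or session. When called on a communicator, it acts as if MPI_Abort was called on that communicator. If called on a window or file, it acts as if MPI_Abort was called on a communicator containing the group of processes in the corresponding window or file. If called on a session, it aborts only the local process.
MPI_ERRORS_RETURN Returns an error code to the application.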

MPI applications can also implement their own error handlers by calling:

MPI_Comm_create_errhandler then MPI_Comm_set_errhandler
MPI_File_create_errhandler then MPI_File_set_errhandler
MPI_Session_create_errhandler then MPI_Session_set_errhandler or at MPI_Session_init
MPI_Win_create_errhandler then MPI_Win_set_errhandler

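Note that MPI does not guarantee that an MPI program can continue past an error.

See the MPI man page for a full list of MPI error codes.

See the Error Handling section of the MPI-3.1 standard for more information.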

See also