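Incorrect Environment Variable Configuration
Symptom
The mpirun command fails to run in the cluster because the user environment on the cluster is not configured correctly.
An example of the failure is as follows: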
$ mpirun ~/AllReduce
--------------------------------------------------------------------------
Sorry!  You were supposed to get help about:
    opal_init:startup:internal-failure
But I couldn't open the help file:
    /usr1/workspace/Version_pipeline_ompi_aarch64_gcc10.3.1_CentOS7.6_MLX4.9/ompi/build/../share/openmpi/help-opal-runtime.txt: No such file or directory.  Sorry!
--------------------------------------------------------------------------
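Possible Cause
The "OPAL_PREFIX" environment variable is not configured in the ".bashrc" file.
Recovery Procedure
- Use PuTTY to log in to the job execution node as a Hyper MPI common user (for example, "hmpi_user").
- Modify the ".bashrc" file as described in "Configuring Environment Variables" to set the required environment variables.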
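As a rough illustration of the second step, the lines added to ".bashrc" might look like the sketch below. The path "/path/to/hmpi" is a placeholder rather than the real installation directory, and the PATH and LD_LIBRARY_PATH lines are assumptions based on a typical Open MPI layout; the authoritative settings are the ones in "Configuring Environment Variables".

```shell
# Sketch only: /path/to/hmpi is a hypothetical placeholder.
# Replace it with the actual Hyper MPI installation directory.
export OPAL_PREFIX=/path/to/hmpi

# Assumed typical layout: binaries under bin/, libraries under lib/.
export PATH="$OPAL_PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$OPAL_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

After saving the file, run "source ~/.bashrc" (or log in again) so the settings take effect. With OPAL_PREFIX pointing at the real installation directory, the file "$OPAL_PREFIX/share/openmpi/help-opal-runtime.txt" should exist, which is the help file that the error message above reports as missing.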