On Sat, May 04, 2019 at 05:11:20AM +0000, Chen Chen wrote:
I'm a new user of ohpc. Your project is very helpful and easy to use.
I'm utilizing it to deploy a cluster for my university department.
However, after I finished the deployment and ran several benchmarks,
mvapich hangs from time to time.
Linpack (HPL) ran successfully with 2, 4, and 64 nodes, but hung when
given 8 and 16 nodes.
mpbench (in llcbench from utk, University of Tennessee) ran successfully
with 16 and 32 nodes, but hung when given 64 nodes.
I grabbed the 2.3 and 2.3.1 sources from mvapich and compiled them
myself. The 2.3.1 version works smoothly for my failed runs, while 2.3
hangs, like the one provided in the ohpc release.
Therefore I request that the developers investigate it and upgrade the
mvapich in the ohpc release. I'm glad to provide any help and logs.
2.3.1 will be included in the next release (1.3.8):
https://github.com/openhpc/ohpc/issues/977