Reply To: Why a paralleled Rosetta perform like a normal one?
First thing: I don’t know anything about MPI in Rosetta++ (and I don’t know that anyone remembers it well).
Rosetta (both 2 and 3) does not actually parallelize anything as a general rule. Usually the MPI build just lets the job-distribution layer have each core work on a different trajectory; the only advantages MPI offers are that all the results land in one directory (instead of needing N directories for N processors) and that it satisfies inflexible cluster environments that insist on MPI. So if you are having trouble with MPI and your cluster's sysadmin doesn't require it, don't use it; it won't speed up your results. You'll get the same speed by just making 16 results directories and starting 16 jobs, one in each directory. (Remember to use -constant_seed and -jran to give each job a different random number generator seed.)
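For the "16 independent jobs" approach, a launch script might look something like this. This is a sketch, not anything from Rosetta itself: the executable name and the @flags options file are placeholders for whatever you actually run, while -constant_seed and -jran are the Rosetta options mentioned above.

```shell
#!/bin/sh
# Sketch: launch 16 independent trajectories, one per directory, each with a
# distinct RNG seed. "rosetta.exe" and "@flags" are placeholders.
for i in $(seq 1 16); do
  mkdir -p run_$i
  seed=$((1000 + i))
  # Record the command we would run (placeholder for the real launch).
  echo "rosetta.exe @flags -constant_seed -jran $seed" > run_$i/cmd.txt
  # In a real run, replace the echo above with something like:
  #   (cd run_$i && rosetta.exe @flags -constant_seed -jran $seed > log 2>&1) &
done
wait
```

Each directory gets its own seed, so the 16 trajectories sample differently; the total throughput is the same as a working MPI run on 16 cores.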
All cores working on the same job (all on myjob_0001, then all on myjob_0002) indicates that the MPI communication layer is not working for some reason. If you pass an executable to mpirun that is not actually built with MPI, this is the result you get (at least in 3.x).
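To see why a broken MPI layer produces exactly this symptom, here is a toy sketch of a rank-striped job distributor. The modulo scheme is an assumption for illustration (it is not Rosetta's actual code): trajectory i goes to rank (i mod nprocs). If MPI initialization silently fails, every process believes it is rank 0 of 1, so every process runs the full job list.

```shell
#!/bin/sh
# Toy rank-striped job distribution (assumed scheme, not Rosetta's code).
list_jobs() {  # args: rank nprocs njobs
  rank=$1; nprocs=$2; njobs=$3
  printf "rank %d of %d runs:" "$rank" "$nprocs"
  i=0
  while [ $i -lt $njobs ]; do
    [ $((i % nprocs)) -eq $rank ] && printf " myjob_%04d" $((i + 1))
    i=$((i + 1))
  done
  printf "\n"
}

# Healthy 4-process MPI run: the work is disjoint across ranks.
for r in 0 1 2 3; do list_jobs $r 4 8; done

# Broken MPI layer: every process falls back to "rank 0 of 1" and
# runs all of myjob_0001..myjob_0008 -- the symptom described above.
list_jobs 0 1 8
```

In the healthy case each rank prints a different quarter of the jobs; in the broken case every process prints the same complete list.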
In 3.x, if you try to run the MPI-built executable WITHOUT mpirun, it fails with an MPI-related error message:
PMGR_COLLECTIVE ERROR: unitialized MPI task: Missing required environment variable: MPIRUN_RANK
You could try running your build without mpirun to see whether you get this error; if you do not, it may mean that the MPI part of the MPI build didn't actually get built for some reason.
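A complementary check, sketched below under the assumption that your executable is not stripped: look for MPI symbols in the binary's symbol table with nm. "mybinary" is a placeholder for your Rosetta executable; for a dynamically linked binary you may need nm -D instead.

```shell
#!/bin/sh
# Sketch: check whether an executable was actually linked against MPI by
# looking for an MPI_Init symbol. "mybinary" is a placeholder.
if nm mybinary 2>/dev/null | grep -qi 'mpi_init'; then
  echo "MPI symbols present: this looks like a real MPI build"
else
  echo "no MPI symbols: MPI support probably did not get built"
fi
```

A serial build should take the second branch; a genuine MPI build should show MPI_Init (and many other MPI_* symbols).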