mpi + mp_mutate_relax


    • #3326
      Anonymous

        Hi all,

        I’m looking for a basic sanity check on how MPI should work with mp_mutate_relax. I’ve compiled rosetta-2019.47.61047 with extras=mpi, and it all seems to be working fine. I run mpiexec -np 4 $rosetta3/bin/mp_mutate_relax.mpi…. with the following flags:

        -in:file:s /home/dave/Projects/TMEM/Rosetta/mutation_memb/4wis.opm.clean.pdb

        -in:file:native /home/dave/Projects/TMEM/Rosetta/mutation_memb/4wis.opm.clean.pdb

        -jd2:mpi_work_partition_job_distributor true

        -mp:setup:spanfiles /home/dave/Projects/TMEM/Rosetta/mutation_memb/4wis.opm.clean.span

        -mp:mutate_relax:mutation A_V442A B_V442A

        -mp:mutate_relax:relax true

        -relax:range:angle_max 0.3

        -relax:range:nmoves nres

        -mp:transform:optimize_embedding false

        -mp:mutate_relax:nmodels 4

        -out:path:pdb /home/dave/Projects/TMEM/Rosetta/mutation_memb/V442A-2/

        I expected the first model to be job 1 running on core 1, model 2 to be job 2 on core 2, and so on. That doesn’t seem to be the behavior I’m seeing: the models still appear to be run in series instead of in parallel. A single model takes ~40 min to relax, and this run of 4 has been going for a bit under 2 hrs so far. Am I wrong about how this should work?

        Thanks in advance,

        dave

         

      • #15111
        Anonymous

          Hi Dave, 

          I apologize, but this protocol was neither developed nor tested in MPI mode, so you’re a bit on your own with this one. To clarify: options generally cannot be mixed and matched freely, meaning not every option works with every other option. That includes the parallelization options, such as those for MPI or the JobDistributor. I understand that this can be frustrating (it is for everyone), but there is no easy fix from the developers’ perspective. I’d suggest running the protocol in plain vanilla mode, i.e. serially, although you could try the option -multiple_processes_writing_to_one_directory 1 and see what happens.
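
          One way to get parallel throughput while still running the protocol serially is to start several independent serial processes, each producing one model in its own output directory. The following is only a rough sketch of that idea, not a tested recipe for this protocol. The function name run_parallel_models, the run_N directory layout, and the log.txt file name are made up for illustration. It assumes a serial (non-MPI) build of mp_mutate_relax and a flags file holding the shared options from the original post, with -mp:mutate_relax:nmodels and -out:path:pdb left out of that file because they are set per process here.

```shell
# Sketch: launch N independent serial runs in parallel, one model per run.
# All names here (run_parallel_models, run_N, log.txt) are hypothetical.
run_parallel_models() {
    mp_bin="$1"      # path to a serial mp_mutate_relax executable (assumed)
    flags_file="$2"  # flags file with the options shared by every run
    out_root="$3"    # parent directory for the per-run output directories
    n_runs="$4"      # number of independent processes to start

    pids=""
    i=1
    while [ "$i" -le "$n_runs" ]; do
        out_dir="$out_root/run_$i"
        mkdir -p "$out_dir"
        # Each process builds a single model and writes into its own
        # directory, so the runs never compete for the same output files.
        "$mp_bin" "@$flags_file" \
            -mp:mutate_relax:nmodels 1 \
            -out:path:pdb "$out_dir/" \
            > "$out_dir/log.txt" 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done

    # Wait for every background run and count how many failed.
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    return "$failed"
}
```

          A call would look like run_parallel_models /path/to/serial/mp_mutate_relax flags.txt V442A-2 4, where the executable path and flags.txt are placeholders. The function returns the number of runs that exited with an error, so a return value of 0 means all processes finished cleanly.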

          Good luck,

          Julia

           
