Problems running pmut in parallel with openmpi


    • #1415
      Anonymous

        Hi there,

        I am running into problems with pmut using Open MPI. We successfully compiled Rosetta 3.4 with MPI support, but when we run pmut with the Open MPI command ‘mpirun -np 8’, the job does not distribute ‘the work of creating and scoring all mutants evenly across all available CPUs’ (as the manual says). Instead it calculates every mutant 8 times: the log file lists every entry 8 times (see below). Am I doing something obviously wrong? Or did the compilation with Open MPI fail?

        Thanks a lot for your help in this matter.

        Rene

        ###########

        Here’s the command script I use:

        mpirun -np 8 pmut_scan_parallel.linuxgccrelease \
            -database /opt/rosetta3.4/rosetta_database \
            -s XXX.pdb \
            -ex1 \
            -ex2 \
            -extrachi_cutoff 1 \
            -use_input_sc \
            -ignore_unrecognized_res \
            -no_his_his_pairE \
            -multi_cool_annealer 10 \
            -mute basic core

        The output log file looks like this:

        protocols.pmut_scan.PointMutScanDriver: go(): master node
        protocols.pmut_scan.PointMutScanDriver: go(): master node
        protocols.pmut_scan.PointMutScanDriver: go(): master node
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number single mutants possible: 9044
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants possible: 0
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants excluded for distance: 0
        protocols.pmut_scan.PointMutScanDriver: go(): master node
        protocols.pmut_scan.PointMutScanDriver: go(): master node
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number single mutants possible: 9044
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants possible: 0
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants excluded for distance: 0
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number single mutants possible: 9044
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants possible: 0
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants excluded for distance: 0
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number single mutants possible: 9044
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants possible: 0
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants excluded for distance: 0
        protocols.pmut_scan.PointMutScanDriver: go(): master node
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number single mutants possible: 9044
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants possible: 0
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants excluded for distance: 0
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number single mutants possible: 9044
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants possible: 0
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants excluded for distance: 0
        protocols.pmut_scan.PointMutScanDriver: go(): master node
        protocols.pmut_scan.PointMutScanDriver: go(): master node
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number single mutants possible: 9044
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants possible: 0
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants excluded for distance: 0
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number single mutants possible: 9044
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants possible: 0
        protocols.pmut_scan.PointMutScanDriver: fill_mutations_list(): number double mutants excluded for distance: 0
        protocols.pmut_scan.PointMutScanDriver: mutation mutation_PDB_numbering average ddG average total energy
        protocols.pmut_scan.PointMutScanDriver: mutation mutation_PDB_numbering average ddG average total energy
        protocols.pmut_scan.PointMutScanDriver: mutation mutation_PDB_numbering average ddG average total energy
        protocols.pmut_scan.PointMutScanDriver: mutation mutation_PDB_numbering average ddG average total energy
        protocols.pmut_scan.PointMutScanDriver: mutation mutation_PDB_numbering average ddG average total energy
        protocols.pmut_scan.PointMutScanDriver: mutation mutation_PDB_numbering average ddG average total energy
        protocols.pmut_scan.PointMutScanDriver: mutation mutation_PDB_numbering average ddG average total energy
        protocols.pmut_scan.PointMutScanDriver: mutation mutation_PDB_numbering average ddG average total energy
        protocols.pmut_scan.PointMutScanDriver: A-L15I A-L15I -2.935 -521.94
        protocols.pmut_scan.PointMutScanDriver: A-L15I A-L15I -2.935 -521.94
        protocols.pmut_scan.PointMutScanDriver: A-L15I A-L15I -2.935 -521.94
        protocols.pmut_scan.PointMutScanDriver: A-L15I A-L15I -2.935 -521.94
        protocols.pmut_scan.PointMutScanDriver: A-L15I A-L15I -2.935 -521.94
        protocols.pmut_scan.PointMutScanDriver: A-L15I A-L15I -2.935 -521.94
        protocols.pmut_scan.PointMutScanDriver: A-L15I A-L15I -2.935 -521.94
        protocols.pmut_scan.PointMutScanDriver: A-L15I A-L15I -2.935 -521.94
        protocols.pmut_scan.PointMutScanDriver: A-L15T A-L15T -1.388 -520.39
        protocols.pmut_scan.PointMutScanDriver: A-L15T A-L15T -1.388 -520.39
        protocols.pmut_scan.PointMutScanDriver: A-L15V A-L15V -2.474 -521.47
        protocols.pmut_scan.PointMutScanDriver: A-L15T A-L15T -1.388 -520.39
        protocols.pmut_scan.PointMutScanDriver: A-L15T A-L15T -1.388 -520.39
        protocols.pmut_scan.PointMutScanDriver: A-L15V A-L15V -2.474 -521.47
        protocols.pmut_scan.PointMutScanDriver: A-L15T A-L15T -1.388 -520.39
        protocols.pmut_scan.PointMutScanDriver: A-L15V A-L15V -2.474 -521.47
        protocols.pmut_scan.PointMutScanDriver: A-L15T A-L15T -1.388 -520.39
        protocols.pmut_scan.PointMutScanDriver: A-L15T A-L15T -1.388 -520.39
        protocols.pmut_scan.PointMutScanDriver: A-L15V A-L15V -2.474 -521.47
        protocols.pmut_scan.PointMutScanDriver: A-L15T A-L15T -1.388 -520.39
        protocols.pmut_scan.PointMutScanDriver: A-L15V A-L15V -2.474 -521.47
        protocols.pmut_scan.PointMutScanDriver: A-L15V A-L15V -2.474 -521.47
        protocols.pmut_scan.PointMutScanDriver: A-L15V A-L15V -2.474 -521.47
        protocols.pmut_scan.PointMutScanDriver: A-L15V A-L15V -2.474 -521.47
        protocols.pmut_scan.PointMutScanDriver: A-T16V A-T16V -1.207 -519.48
        protocols.pmut_scan.PointMutScanDriver: A-T16V A-T16V -1.207 -519.48
        protocols.pmut_scan.PointMutScanDriver: A-T16V A-T16V -1.207 -519.48
        protocols.pmut_scan.PointMutScanDriver: A-T16V A-T16V -1.207 -519.48
        protocols.pmut_scan.PointMutScanDriver: A-T16V A-T16V -1.207 -519.48
        protocols.pmut_scan.PointMutScanDriver: A-T16V A-T16V -1.207 -519.48

      • #7834
        Anonymous

          Run pmut_scan_parallel.linuxgccrelease without mpirun and with no options. If it fails with an MPI-related error (mine is

          “PMGR_COLLECTIVE ERROR: unitialized MPI task: Missing required environment variable: MPIRUN_RANK”

          but yours may be different), then you are using the MPI executable correctly and the problem is elsewhere. If it instead fails with

          “basic.options.util: Use either -s or -l to designate one or more start_files
          ERROR:: Exit from: src/basic/options/util.cc line: 109”

          then you are not using the MPI executable.

          I notice that “mpi” is not in your executable name: pmut_scan_parallel.linuxgccrelease. If you compiled with MPI, there should be a pmut_scan_parallel.mpi.linuxgccrelease in bin/. If you compiled without MPI, there should be a pmut_scan_parallel.default.linuxgccrelease, and both should be present if you compiled both ways. The pmut_scan_parallel.linuxgccrelease with no insertion (neither mpi nor default) is a symlink to whatever was compiled most recently, so if you compiled the MPI build and then the non-MPI build, but used the unspecified symlink in bin/, you’ll get this behavior.
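
One way to check which build the unspecified name currently points to is to resolve the symlink and look at the target's file name. The sketch below is only an illustration: the helper name `which_build` is hypothetical (not part of Rosetta), and it assumes GNU `readlink -f` is available and that the `.mpi.` / `.default.` naming described above holds for your installation.

```shell
# Hypothetical helper (not part of Rosetta): report which build an
# executable name resolves to, based on the target's file name.
# Assumes GNU coreutils `readlink -f` and the .mpi. / .default. naming.
which_build() {
    target=$(readlink -f "$1") || return 1
    case "$(basename "$target")" in
        *.mpi.*)     echo "mpi" ;;      # MPI build
        *.default.*) echo "default" ;;  # non-MPI build
        *)           echo "unknown" ;;  # naming scheme not recognized
    esac
}
```

For example, `which_build bin/pmut_scan_parallel.linuxgccrelease` would print `default` if the symlink points at the non-MPI build. In that case, naming pmut_scan_parallel.mpi.linuxgccrelease explicitly on the mpirun command line avoids depending on the symlink.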

        • #7836
          Anonymous

            Thanks for your help. It works now. I was using the wrong executable :P
