Reply To: Rosetta3.2.1-LAM-MPI Run-Problem-More than 2 processor Jobs

#5629
Anonymous

    Hi Steven:

    I have applied the patch that you sent me.


    PATCH


    [ravi@torkv rosetta_source]$ ./scons.py bin mode=release extras=mpi
    scons: Reading SConscript files ...
    svn: '.' is not a working copy
    scons: done reading SConscript files.
    scons: Building targets ...
    mpiCC -o build/src/release/linux/2.6/64/x86/gcc/mpi/apps/public/AbInitio_MPI.o -c -std=c++98 -pipe -ffor-scope -W -Wall -pedantic -Wno-long-long -O3 -ffast-math -funroll-loops -finline-functions -finline-limit=20000 -s -Wno-unused-variable -DNDEBUG -DUSEMPI -Isrc -Iexternal/include -Isrc/platform/linux/64/gcc -Isrc/platform/linux/64 -Isrc/platform/linux -Iexternal/boost_1_38_0 -I/usr/local/include -I/usr/include src/apps/public/AbInitio_MPI.cc
    mpiCC -o build/src/release/linux/2.6/64/x86/gcc/mpi/AbInitio_MPI.linuxgccrelease -Wl,-rpath=/opt/nasapps/build/Rosetta/rosetta_source/build/src/release/linux/2.6/64/x86/gcc/mpi build/src/release/linux/2.6/64/x86/gcc/mpi/apps/public/AbInitio_MPI.o -Llib -Lexternal/lib -Lbuild/src/release/linux/2.6/64/x86/gcc/mpi -Lsrc -L/usr/local/lib -L/usr/lib -L/lib -L/lib64 -ldevel -lprotocols -lcore -lnumeric -lutility -lObjexxFCL -lz
    Install file: "build/src/release/linux/2.6/64/x86/gcc/mpi/AbInitio_MPI.linuxgccrelease" as "bin/AbInitio_MPI.linuxgccrelease"
    mpiCC -o build/src/release/linux/2.6/64/x86/gcc/mpi/AbInitio_MPI.mpi.linuxgccrelease -Wl,-rpath=/opt/nasapps/build/Rosetta/rosetta_source/build/src/release/linux/2.6/64/x86/gcc/mpi build/src/release/linux/2.6/64/x86/gcc/mpi/apps/public/AbInitio_MPI.o -Llib -Lexternal/lib -Lbuild/src/release/linux/2.6/64/x86/gcc/mpi -Lsrc -L/usr/local/lib -L/usr/lib -L/lib -L/lib64 -ldevel -lprotocols -lcore -lnumeric -lutility -lObjexxFCL -lz
    Install file: "build/src/release/linux/2.6/64/x86/gcc/mpi/AbInitio_MPI.mpi.linuxgccrelease" as "bin/AbInitio_MPI.mpi.linuxgccrelease"
    scons: done building targets.



    MPIRUN with NP 2 WORKS FINE


    mpirun -np 2 $rosetta_home/bin/AbinitioRelax.mpi.linuxgccrelease @flags

    ……
    ……
    ……
    Total weighted score: 24.862

    ===================================================================
    Finished Abinitio

    protocols.abinitio.AbrelaxApplication: (1) Finished _0001 in 7 seconds.
    protocols::checkpoint: (1) Deleting checkpoints of ClassicAbinitio
    protocols::checkpoint: (1) Deleting checkpoints of Abrelax
    protocols.jobdist.JobDistributors: (1) Node: 1 next_job()
    protocols.jobdist.JobDistributors: (1) Slave Node 1 — requesting job from master node; tag_ 1
    protocols.jobdist.JobDistributors: (0) Master Node –available job? 0
    protocols.jobdist.JobDistributors: (0) Master Node — Spinning down node 1
    protocols.jobdist.JobDistributors: (0) Node 0 — ready to call mpi finalize
    protocols.jobdist.JobDistributors: (1) Node 1 — ready to call mpi finalize
    protocols::checkpoint: (0) Deleting checkpoints of ClassicAbinitio
    protocols::checkpoint: (0) Deleting checkpoints of Abrelax
    protocols::checkpoint: (1) Deleting checkpoints of ClassicAbinitio
    protocols::checkpoint: (1) Deleting checkpoints of Abrelax



    MPIRUN with NP > 2 FAILS


    ===================================================================
    Stage 2
    Folding with score1 for 2000


    One of the processes started by mpirun has exited with a nonzero exit
    code. This typically indicates that the process finished in error.
    If your process did not finish in error, be sure to include a "return
    0" or "exit(0)" in your C code before exiting the application.

    PID 7798 failed on node n0 (129.43.63.50) due to signal 11.
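
    For reference, signal 11 in the message above is SIGSEGV (a segmentation fault) on Linux. Below is a minimal sketch, assuming Python 3, of pulling the PID, node, and signal out of that LAM line. The function `describe_lam_failure` and its regular expression are hypothetical helpers written for illustration; they are not part of LAM/MPI or Rosetta.

```python
import re
import signal

# Hypothetical pattern matching the mpirun failure line quoted in this post.
LAM_FAILURE = re.compile(
    r"PID (?P<pid>\d+) failed on node (?P<node>\S+) \((?P<host>[^)]+)\) "
    r"due to signal (?P<signo>\d+)"
)


def describe_lam_failure(line):
    """Return a dict describing the failed process, or None if no match."""
    match = LAM_FAILURE.search(line)
    if match is None:
        return None
    signo = int(match.group("signo"))
    try:
        # Map the number to its symbolic name on the current platform.
        name = signal.Signals(signo).name
    except ValueError:
        name = "unknown"
    return {
        "pid": int(match.group("pid")),
        "node": match.group("node"),
        "host": match.group("host"),
        "signal": signo,
        "name": name,
    }


report = describe_lam_failure(
    "PID 7798 failed on node n0 (129.43.63.50) due to signal 11."
)
print(report)
```

    A segmentation fault only says that one rank touched invalid memory; on its own it does not show whether the cause is in LAM or in the application.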


    Could LAM-7.1.4 be an issue?

    Thanks

    Ravi