Compilation and unit test failures
September 2, 2021 at 5:09 am #3842 Anonymous
Hi all,
I installed Rosetta 3.13 on KISTI supercomputer 5 (Nurion) (https://www.ksc.re.kr/eng/index/main).
I then ran the unit tests, which show an 85% success rate.
Below is what I did; I have attached the relevant documents.
$ cd rosetta/main/source
$ module purge
$ module load craype-network-opa python/2.7.15 gcc/8.3.0 mvapich2/2.3.1 craype-mic-knl
$ vi tools/build/site.settings
"overrides" : {
"cc" : "/apps/compiler/gcc/8.3.0/mvapich2/2.3.1/bin/mpicc",
"cxx" : "/apps/compiler/gcc/8.3.0/mvapich2/2.3.1/bin/mpicxx"
},
$ ./scons.py -j64 bin mode=release extras=mpi log=environment 2>&1 |tee make_1.log
$ ./scons.py -j64 mode=debug extras=mpi log=environment 2>&1 |tee make_2.log
$ ./scons.py -j64 mode=debug extras=mpi cat=test log=environment 2>&1 |tee make_3.log
$ cat test.sh
#!/bin/sh
#PBS -V
#PBS -N mpi_test_job
#PBS -q debug
#PBS -A etc
#PBS -l select=1:ncpus=64:mpiprocs=64:ompthreads=1
#PBS -l walltime=04:00:00
#PBS -W sandbox=PRIVATE
cd $PBS_O_WORKDIR
module purge
module load craype-network-opa python/2.7.15 gcc/8.3.0 mvapich2/2.3.1 craype-mic-knl
TOTAL_CPUS=$(wc -l $PBS_NODEFILE | awk '{print $1}')
python /scratch/a1376a01/rosetta/main/source/test/run.py --database=/scratch/a1376a01/rosetta/main/database --mode=debug --extras=mpi --compiler=gcc --jobs=64
$ qsub test.sh
How can I fix these problems?
Thank you in advance
-
September 2, 2021 at 5:14 am #16005 Anonymous
Here is the result of unit tests.
-
September 2, 2021 at 4:04 pm #16007 Anonymous
From the log, it looks like the majority of the failures are along the lines of
'./core.test RotamerSetsTests --database /scratch/a1376a01/rosetta/main/database -mute all -no_fconfig' exceeded the timeout and will be killed!
Some of the tests can take a fair amount of time to run, so it looks like they’re not completing in time on your computer. You can try passing `--timeout 0` to the test/run.py script to turn off the timeout completely. Also, check your cluster job settings, to see if you’re bumping up against cluster runtime limits. (You got the final summary, so it doesn’t look like you ran into the full job being canceled, but I don’t know if your cluster has process-level timeout control.)
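If it helps to see exactly which suites hit the timeout before rerunning anything, the log can be scanned for the message quoted above. This is only a rough sketch: it assumes every timeout line in the log has the same shape as that one quoted line, and `timed_out_suites` is a made-up helper name, not part of Rosetta.

```python
import re

# Assumption: timeout lines look like the one quoted in this thread, i.e.
# './<library>.test <SuiteName> ...' exceeded the timeout and will be killed!
TIMEOUT_LINE = re.compile(
    r"'\./(?P<library>\S+)\s+(?P<suite>\S+)\s.*exceeded the timeout"
)


def timed_out_suites(log_text):
    """Return (library, suite) pairs for every test that hit the timeout."""
    found = []
    for line in log_text.splitlines():
        match = TIMEOUT_LINE.search(line)
        if match:
            found.append((match.group("library"), match.group("suite")))
    return found


if __name__ == "__main__":
    sample = (
        "'./core.test RotamerSetsTests --database "
        "/scratch/a1376a01/rosetta/main/database -mute all -no_fconfig' "
        "exceeded the timeout and will be killed!"
    )
    for library, suite in timed_out_suites(sample):
        print(library, suite)
```

Feeding it the contents of the job's output file should give a short list of suites worth rerunning individually.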
Another thing to keep in mind is that even for an MPI compile, the tests here are all run serially (not through MPI). I don’t know your cluster, but if there are issues with running 64 separate single-CPU non-MPI jobs given your PBS setup, then that might be contributing to things.
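Putting the two suggestions together, the invocation from the job script could be rebuilt with the timeout switched off and fewer parallel jobs. The following is a sketch, not a tested recipe: `build_run_command` is a hypothetical helper, the flags are only the ones already used in this thread, and the job count of 16 is an arbitrary guess rather than a recommendation for your cluster.

```python
def build_run_command(source_dir, database, jobs, timeout=0):
    """Assemble the test/run.py command line as a list of arguments.

    Only flags that already appear in this thread are used. A timeout
    of 0 turns the per-test timeout off.
    """
    return [
        "python",
        source_dir + "/test/run.py",
        "--database=" + database,
        "--mode=debug",
        "--extras=mpi",
        "--compiler=gcc",
        "--jobs=" + str(jobs),
        "--timeout",
        str(timeout),
    ]


if __name__ == "__main__":
    cmd = build_run_command(
        "/scratch/a1376a01/rosetta/main/source",
        "/scratch/a1376a01/rosetta/main/database",
        jobs=16,  # assumption: fewer than the 64 cores requested in test.sh
    )
    # Print the command so it can be pasted into the PBS script.
    print(" ".join(cmd))
```

The printed line can replace the `python .../test/run.py ...` line in test.sh; whether a lower job count actually helps depends on how the node handles 64 independent serial processes.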
All that said, the tests are there primarily for developers during development. Running them isn’t a part of the typical installation process. You certainly can, but be prepared that there may be a few tests which fail not due to issues with the code, but just because of quirks of your machine/setup.
-
September 28, 2021 at 1:59 am #16022 Anonymous
Dear rmoretti,
Thank you very much for your comment.
I followed your advice and ran the tests again one by one.
Although some tests still failed due to cluster runtime limits, most of the previously failing tests now pass.
Also, the Rosetta build I compiled seems to be working well, as I can get calculation results with MPI.
Thanks again.