Intel Compiler – Inaccurate G! step errors


    • #2368
      Anonymous

        Hi,

        We are trying to get Rosetta set up for a user here at UT Southwestern, and have downloaded and built the 2015.39 release using the Intel Composer XE 2015 compiler suite, with MVAPICH2 2.1 as the MPI stack. This is the standard compiler/MPI stack we use for a large amount of other software on the cluster. Compilation of Rosetta followed the site settings file distributed for the TACC Stampede cluster.

        When the user runs a simple test of relax.linuxiccrelease:


        ${ROSETTA_BIN_DIR}/relax.linuxiccrelease -database ${ROSETTA_DATABASE_DIR} -in:file:s 1BE9_clean.pdb -in:file:fullatom -out:prefix relax_

        We are seeing Inaccurate G! step errors, such as:


        core.optimization.LineMinimizer: (0) Inaccurate G! step= 3.8147e-08 Deriv= -239.863 Finite Diff= 4.62715e+07

        … and the output is not as expected. These errors don’t occur using binaries included in the download, or a build here using gcc instead of the Intel compilers.

        I am wondering whether anyone else has seen such numerical issues after using the Intel compiler, or whether there are any pointers for investigating this further?

        Many Thanks,

        DT

         

      • #11403
        Anonymous

          Inaccurate G! is either nonconcerning or nondiagnostic (your pick).  In broad strokes, it means the minimizer is behaving badly.  Sometimes this is a false alarm (depending on minimizer settings), sometimes it’s because you’ve hit a patch of a particular scorefunction’s range where it behaves mathematically badly.  Usually you can ignore them – it means inefficiency not error in most cases.
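As a rough illustration of what the Deriv and Finite Diff fields in that log line appear to compare, here is a minimal Python sketch of a slope check on a toy one-dimensional function. It is not Rosetta's code: `slope_check`, `is_inaccurate`, the 10% tolerance, and both toy functions are hypothetical, chosen only to show how an analytic derivative and a finite-difference estimate can disagree wildly when a tiny step crosses a badly behaved patch of the function.

```python
def slope_check(f, dfdx, x, step):
    """Return (analytic slope, forward finite-difference slope) at x."""
    analytic = dfdx(x)
    finite_diff = (f(x + step) - f(x)) / step
    return analytic, finite_diff


def is_inaccurate(analytic, finite_diff, rel_tol=0.1):
    """Flag the pair when they differ by more than rel_tol (hypothetical threshold)."""
    scale = max(abs(analytic), abs(finite_diff), 1e-12)
    return abs(analytic - finite_diff) / scale > rel_tol


def smooth(x):
    # Well-behaved toy energy: slope and finite difference agree.
    return x * x


def smooth_slope(x):
    # Analytic derivative of x^2.
    return 2.0 * x


def with_jump(x):
    # Same toy energy plus a discontinuity at x = 1 that the
    # analytic derivative (smooth_slope) knows nothing about.
    return x * x + (100.0 if x > 1.0 else 0.0)


if __name__ == "__main__":
    cases = [("smooth", smooth, 1.0), ("with_jump", with_jump, 1.0 - 1e-7)]
    for name, func, x0 in cases:
        analytic, fd = slope_check(func, smooth_slope, x0, step=1e-6)
        print(name, "Deriv=", analytic, "Finite Diff=", fd,
              "inaccurate:", is_inaccurate(analytic, fd))
```

In the second case the step crosses the jump, so the finite-difference slope is many orders of magnitude larger than the analytic one, which is qualitatively the pattern in the reported line.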

          You also say “the output is not as expected”.  What does that mean?

        • #11404
          Anonymous

            Thanks for the info. Honestly, I don’t have an exact idea of what ‘not as expected’ means. I’ll have to ask the user concerned to take part on this forum. Their original query to our HPC support team is below.

            The concern is that the Inaccurate G! issue only occurs with the Intel-compiled version, and not with a gcc-compiled version. Is it possible that there are slight numerical differences from using Intel MKL or similar?

            I’ll try to get more detail / ask the user to post here directly.

            Many Thanks!


            I get some strange error during the run:

            core.optimization.LineMinimizer: (0) Inaccurate G! step= 3.8147e-08 Deriv= -239.863 Finite Diff= 4.62715e+07

            Otherwise the run finishes normally, but the scores of the optimization process are not accurate in the end. The collaborator does not manage to replicate this on his cluster with the same scripts that I am using. Could it be that something did not compile right? Do you perhaps have a log file that they could look at?

            The collaborators are using the QB3 cluster (at UCSF) and they don’t use the MPI version. They have:

             

             

             

          • #11405
            Anonymous

              From the user…

               

              “Output is not as expected” means that the total_score values at the end of the relax runs are not the same. Perhaps the difference is not significant. When I run it with the gcc-compiled version, relax gives a final total_score below -200, while running relax with the Intel-compiled version gave me total_scores between -190 and -167. I could do several runs to generate statistics on the scores if that would help.

               

              The score files look like this: 

              ==> gcc_TestRun_Relax.sc <==

              SEQUENCE:

              SCORE: total_score dslf_fa13    fa_atr    fa_dun   fa_elec fa_intra_rep       fa_rep       fa_sol hbond_bb_sc hbond_lr_bb    hbond_sc hbond_sr_bb       omega     p_aa_pp pro_close      rama       ref description

              SCORE:    -205.193     0.000  -413.155   104.034   -55.167        0.940       40.477      236.847     -17.384     -35.506     -11.994     -20.385       6.055     -25.877     0.234   -10.497    -3.814 relax_1BE9_clean_0001

               

              ==> intel_TestRun_Relax.sc <==

              SEQUENCE:

              SCORE: total_score dslf_fa13    fa_atr    fa_dun   fa_elec fa_intra_rep       fa_rep       fa_sol hbond_bb_sc hbond_lr_bb    hbond_sc hbond_sr_bb       omega     p_aa_pp pro_close      rama       ref description

              SCORE:    -167.911     0.000  -413.977    79.443   -47.998        0.960       77.783      237.462     -11.119     -32.917      -8.990     -19.018       6.097     -22.813     0.512    -9.521    -3.814 relax_1BE9_clean_0001

               

            • #11408
              Anonymous

                A score difference of 40 units is surprisingly large, but within the boundaries of what the random sampling of Monte Carlo will do.  I would 100% not expect you to get identical scores from this test (even with the same RNG seeds, I’d be maybe 75/25 on a different score due to compiler and processor differences).  Run maybe 100 models (-nstruct 100) and see what the averages look like.
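One way to get those averages from the resulting score file is sketched below in Python. It assumes the layout shown in the score files above: data lines start with `SCORE:`, the first such line holds the column names, and the last column is a text `description`. The function names `read_score_file` and `summarize` are made up for this example, and the file name in the usage comment is only a placeholder.

```python
import statistics


def _is_number(text):
    """True if text parses as a float (used to tell header lines from data lines)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def read_score_file(path):
    """Return {column_name: [values]} for the numeric columns of a .sc file."""
    columns = None
    data = {}
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line.startswith("SCORE:"):
                continue  # skips SEQUENCE: and any other lines
            fields = line.split()[1:]
            if not fields:
                continue
            if not _is_number(fields[0]):
                # Header line: remember column names, keep data already read.
                columns = fields
                for name in columns:
                    if name != "description":
                        data.setdefault(name, [])
                continue
            if columns is None:
                continue  # data before any header; nothing to label it with
            for name, value in zip(columns, fields):
                if name != "description":
                    data[name].append(float(value))
    return data


def summarize(data):
    """Return {column_name: (mean, sample stdev)} for columns with 2+ values."""
    return {
        name: (statistics.mean(values), statistics.stdev(values))
        for name, values in data.items()
        if len(values) > 1
    }


# Usage (placeholder file name):
# for name, (mean, sd) in summarize(read_score_file("relax_score.sc")).items():
#     print(f"{name:<14}{mean:>10.3f}{sd:>9.3f}")
```

Running this on the score files from both builds gives per-term means and standard deviations that can be compared side by side.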

              • #11409
                Anonymous

                  Many thanks for the input. I have collected averages over 100 models for both our compiled Intel/MVAPICH2 version and the gcc version in the download from this site.

                  The Intel-compiled version gives higher (less favorable) and more variable ‘total_score’ values. I would appreciate any info to pass on to the user regarding whether this is as expected / within reason.

                  Many Thanks,

                                    INTEL                        GCC          DIFFERENCE (GCC - INTEL)
                                     MEAN   STDEV       MEAN   STDEV       MEAN
                  total_score    -187.961  11.255   -200.810   3.913    -12.849
                  dslf_fa13         0.000   0.000      0.000   0.000      0.000
                  fa_atr         -404.379   7.544   -407.013   4.233     -2.634
                  fa_dun           79.907   1.738    103.700   1.742     23.793
                  fa_elec         -49.324   2.153    -55.068   2.176     -5.744
                  fa_intra_rep      0.981   0.020      0.959   0.017     -0.022
                  fa_rep           53.182  13.008     40.024   0.999    -13.158
                  fa_sol          233.381   4.160    232.609   3.195     -0.771
                  hbond_bb_sc     -11.173   1.367    -14.666   1.547     -3.493
                  hbond_lr_bb     -32.314   0.898    -34.760   0.796     -2.446
                  hbond_sc        -10.442   1.097    -12.090   1.257     -1.648
                  hbond_sr_bb     -19.517   0.793    -20.403   0.427     -0.887
                  omega             7.391   1.565      5.558   0.452     -1.833
                  p_aa_pp         -22.739   0.687    -25.176   0.356     -2.437
                  pro_close         0.303   0.248      0.171   0.031     -0.132
                  rama             -9.403   0.690    -10.839   0.569     -1.436
                  ref              -3.814             -3.814              0.000
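To put the total_score gap in context, one option is to compare it with the standard error of the difference between the two means. The Python sketch below does that from the summary numbers in the table, using a Welch-style statistic. It assumes 100 independent models per build (as stated above) and that the reported STDEV values are per-model standard deviations; `welch_t` is a name invented for this example. The result only says whether the gap exceeds run-to-run noise, not which build is behaving correctly.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t statistic, standard error) for the difference mean_a - mean_b."""
    std_err = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / std_err, std_err


if __name__ == "__main__":
    # total_score row of the table: Intel build vs GCC build, 100 models each.
    t_stat, std_err = welch_t(-187.961, 11.255, 100, -200.810, 3.913, 100)
    print(f"standard error of the difference: {std_err:.3f}")
    print(f"t statistic: {t_stat:.2f}")
```

With these inputs the standard error of the difference is roughly 1.2 score units, so the 12.8-unit gap in mean total_score is on the order of ten standard errors, which is hard to attribute to Monte Carlo sampling noise alone under the stated assumptions.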
