Intel Compiler – Inaccurate G! step errors

This topic has 5 replies, 2 voices, and was last updated 8 years, 11 months ago by Anonymous.

January 19, 2016 at 3:33 pm #2368
Anonymous
Hi,

We are trying to get Rosetta set up for a user here at UT Southwestern, and have downloaded and built the 2015.39 release using the Intel Composer XE 2015 compiler suite, with MVAPICH2 2.1 as the MPI stack. This is the standard compiler/MPI stack we use for a large amount of other software on the cluster. Compilation of Rosetta followed the site settings file distributed for the TACC Stampede cluster.

When the user runs a simple test of relax.linuxiccrelease:
```
${ROSETTA_BIN_DIR}/relax.linuxiccrelease -database ${ROSETTA_DATABASE_DIR} -in:file:s 1BE9_clean.pdb -in:file:fullatom -out:prefix relax_
```
We are seeing Inaccurate G! step errors, such as:
```
core.optimization.LineMinimizer: (0) Inaccurate G! step= 3.8147e-08 Deriv= -239.863 Finite Diff= 4.62715e+07
```
… and the output is not as expected. These errors don’t occur using binaries included in the download, or a build here using gcc instead of the Intel compilers.

I am wondering if anyone else has seen such numerical issues after using the Intel compiler, or if there are any pointers for investigating this further?

Many Thanks,

DT
January 19, 2016 at 3:45 pm #11403
Anonymous
Inaccurate G! is either nonconcerning or nondiagnostic (your pick). In broad strokes, it means the minimizer is behaving badly. Sometimes this is a false alarm (depending on minimizer settings), sometimes it’s because you’ve hit a patch of a particular scorefunction’s range where it behaves mathematically badly. Usually you can ignore them – it means inefficiency not error in most cases.

You also say “the output is not as expected”. What does that mean?
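To make the message more concrete, here is a rough sketch of the kind of check it reflects: the slope predicted by the analytic gradient along the search direction ("Deriv") is compared with a finite-difference estimate over the step actually taken ("Finite Diff"). This is not Rosetta's LineMinimizer code; the function names, the tolerance, and the two toy energy functions are assumptions made up for illustration.

```python
def directional_derivative_check(f, grad, x, direction, step, tol=0.1):
    """Compare the analytic slope along `direction` with a finite-difference estimate."""
    # Analytic slope: dot product of the gradient with the search direction.
    analytic = sum(g * d for g, d in zip(grad(x), direction))
    # Finite-difference slope over the step that was actually taken.
    x_step = [xi + step * di for xi, di in zip(x, direction)]
    finite_diff = (f(x_step) - f(x)) / step
    # Relative disagreement between the two estimates.
    denom = max(abs(analytic), abs(finite_diff), 1e-12)
    consistent = abs(analytic - finite_diff) / denom <= tol
    return analytic, finite_diff, consistent


# Toy case 1: a smooth function whose gradient is exact.
def smooth(x):
    return x[0] ** 2 + x[1] ** 2


def smooth_grad(x):
    return [2.0 * x[0], 2.0 * x[1]]


# Toy case 2: a function with a sudden jump that the gradient does not see,
# loosely mimicking a badly behaved patch of an energy function.
def bumpy(x):
    return x[0] ** 2 + (1000.0 if x[0] > 1.0 else 0.0)


def bumpy_grad(x):
    return [2.0 * x[0]]


smooth_result = directional_derivative_check(
    smooth, smooth_grad, [1.0, 2.0], [-1.0, -2.0], 1e-6
)
bumpy_result = directional_derivative_check(
    bumpy, bumpy_grad, [1.0], [1.0], 1e-3
)

for name, (deriv, fd, ok) in (("smooth", smooth_result), ("bumpy", bumpy_result)):
    status = "ok" if ok else "Inaccurate G!"
    print(f"{name}: Deriv= {deriv:g} Finite Diff= {fd:g} -> {status}")
```

In the second case the two slopes disagree by orders of magnitude, which is the same pattern as the Deriv= -239.863 versus Finite Diff= 4.62715e+07 line in the log.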

January 19, 2016 at 10:57 pm #11404

Anonymous

Thanks for the info. Honestly, I don’t have an exact idea of what ‘not as expected’ means. I’ll have to ask the user concerned to take part in this forum. Their original query to our HPC support team is below.

The concern is that the Inaccurate G! issue only occurs with the Intel-compiled version, and not with a gcc-compiled version. Is it possible that there are slight numerical differences from using Intel MKL or similar?

I’ll try to get more detail / ask the user to post here directly.

Many Thanks!



I get some strange error during the run:



core.optimization.LineMinimizer: (0) Inaccurate G! step= 3.8147e-08 Deriv= -239.863 Finite Diff= 4.62715e+07



Otherwise the run finishes normally, but the scores of the optimization process are not accurate in the end. The collaborator does not manage to replicate this on his cluster with the same scripts that I am using. Could it be that something did not compile right? Do you perhaps have a log file that they could look at?



The collaborators are using the QB3 cluster (at UCSF) and they don’t use the MPI version. They have:

January 20, 2016 at 3:29 pm #11405
Anonymous
From the user…

“Output is not as expected” means that the total_score values at the end of the relax runs are not the same. Perhaps the difference is not significant. When I run it with the gcc-compiled version, relax gives a final total_score below -200, while running relax with the Intel-compiled version gave me total_scores between -190 and -167. I could do several runs to generate statistics on the scores if that would help.

The score files look like this:

==> gcc_TestRun_Relax.sc <==

SEQUENCE:

SCORE: total_score dslf_fa13 fa_atr fa_dun fa_elec fa_intra_rep fa_rep fa_sol hbond_bb_sc hbond_lr_bb hbond_sc hbond_sr_bb omega p_aa_pp pro_close rama ref description

SCORE: -205.193 0.000 -413.155 104.034 -55.167 0.940 40.477 236.847 -17.384 -35.506 -11.994 -20.385 6.055 -25.877 0.234 -10.497 -3.814 relax_1BE9_clean_0001

==> intel_TestRun_Relax.sc <==

SEQUENCE:

SCORE: total_score dslf_fa13 fa_atr fa_dun fa_elec fa_intra_rep fa_rep fa_sol hbond_bb_sc hbond_lr_bb hbond_sc hbond_sr_bb omega p_aa_pp pro_close rama ref description

SCORE: -167.911 0.000 -413.977 79.443 -47.998 0.960 77.783 237.462 -11.119 -32.917 -8.990 -19.018 6.097 -22.813 0.512 -9.521 -3.814 relax_1BE9_clean_0001
January 21, 2016 at 4:06 pm #11408
Anonymous
A score difference of 40 units is surprisingly large, but within the boundaries of what the random sampling of Monte Carlo will do. I would 100% not expect you to get identical scores from this test (even with the same RNG seeds, I’d be maybe 75/25 on a different score due to compiler and processor differences). Run maybe 100 models (-nstruct 100) and see what the averages look like.
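For collecting those averages, a small script along these lines might help. It is only a sketch: it assumes the score file layout quoted earlier in this thread (a `SCORE:` header line followed by one `SCORE:` line per model, with `description` as the last column), and the function names are made up. The two sample rows are simply the gcc and Intel lines quoted above, reused here to exercise the parser; in practice you would pass the lines of one score file per build.

```python
import statistics

SAMPLE = """\
SEQUENCE:
SCORE: total_score dslf_fa13 fa_atr fa_dun fa_elec fa_intra_rep fa_rep fa_sol hbond_bb_sc hbond_lr_bb hbond_sc hbond_sr_bb omega p_aa_pp pro_close rama ref description
SCORE: -205.193 0.000 -413.155 104.034 -55.167 0.940 40.477 236.847 -17.384 -35.506 -11.994 -20.385 6.055 -25.877 0.234 -10.497 -3.814 relax_1BE9_clean_0001
SCORE: -167.911 0.000 -413.977 79.443 -47.998 0.960 77.783 237.462 -11.119 -32.917 -8.990 -19.018 6.097 -22.813 0.512 -9.521 -3.814 relax_1BE9_clean_0001
"""


def parse_score_lines(lines):
    """Return (column names, list of row dicts) from score file lines."""
    header = None
    rows = []
    for line in lines:
        stripped = line.strip()
        if not stripped.startswith("SCORE:"):
            continue  # skips SEQUENCE: and blank lines
        fields = stripped.split()[1:]
        if header is None:
            header = fields  # first SCORE: line holds the column names
        elif fields != header:  # ignore repeated headers in concatenated files
            rows.append(dict(zip(header, fields)))
    return header, rows


def summarize(lines):
    """Return {term: (mean, sample stdev)} for every numeric column."""
    header, rows = parse_score_lines(lines)
    summary = {}
    if header is None or not rows:
        return summary
    for term in header:
        if term == "description":
            continue
        values = [float(row[term]) for row in rows]
        stdev = statistics.stdev(values) if len(values) > 1 else 0.0
        summary[term] = (statistics.mean(values), stdev)
    return summary


summary = summarize(SAMPLE.splitlines())
for term, (mean, stdev) in summary.items():
    print(f"{term:<14}{mean:>10.3f}{stdev:>10.3f}")
```

To use it on a real run, replace `SAMPLE.splitlines()` with the lines read from the score file of each build and compare the two summaries term by term.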

January 21, 2016 at 9:07 pm #11409

Anonymous

Many thanks for the input. I have collected averages over 100 models for both our Intel/MVAPICH2-compiled version and the gcc version in the download from this site.

The Intel-compiled version gives higher (less favorable) and more variable ‘total_score’ values. I would appreciate any info to pass on to the user regarding whether this is as expected / within reason.

Many Thanks,

Term	Intel mean	Intel stdev	GCC mean	GCC stdev	Difference of means (GCC - Intel)
total_score	-187.961	11.255	-200.810	3.913	-12.849
dslf_fa13	0.000	0.000	0.000	0.000	0.000
fa_atr	-404.379	7.544	-407.013	4.233	-2.634
fa_dun	79.907	1.738	103.700	1.742	23.793
fa_elec	-49.324	2.153	-55.068	2.176	-5.744
fa_intra_rep	0.981	0.020	0.959	0.017	-0.022
fa_rep	53.182	13.008	40.024	0.999	-13.158
fa_sol	233.381	4.160	232.609	3.195	-0.771
hbond_bb_sc	-11.173	1.367	-14.666	1.547	-3.493
hbond_lr_bb	-32.314	0.898	-34.760	0.796	-2.446
hbond_sc	-10.442	1.097	-12.090	1.257	-1.648
hbond_sr_bb	-19.517	0.793	-20.403	0.427	-0.887
omega	7.391	1.565	5.558	0.452	-1.833
p_aa_pp	-22.739	0.687	-25.176	0.356	-2.437
pro_close	0.303	0.248	0.171	0.031	-0.132
rama	-9.403	0.690	-10.839	0.569	-1.436
ref	-3.814		-3.814		0.000
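
One way to judge whether these differences are larger than the run-to-run scatter is a Welch t statistic computed directly from the summary numbers in the table. The sketch below assumes 100 independent models per build (as stated above) and that the STDEV columns are sample standard deviations; the function name is made up, and only three rows of the table are used.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic and approximate degrees of freedom from summary stats."""
    var_a = sd_a ** 2 / n_a  # squared standard error of mean A
    var_b = sd_b ** 2 / n_b  # squared standard error of mean B
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df


N = 100  # models per build, as reported in this thread

# (Intel mean, Intel stdev, GCC mean, GCC stdev), copied from the table above.
TERMS = {
    "total_score": (-187.961, 11.255, -200.810, 3.913),
    "fa_dun": (79.907, 1.738, 103.700, 1.742),
    "fa_rep": (53.182, 13.008, 40.024, 0.999),
}

results = {}
for term, (m_intel, s_intel, m_gcc, s_gcc) in TERMS.items():
    results[term] = welch_t(m_intel, s_intel, N, m_gcc, s_gcc, N)
    t, df = results[term]
    print(f"{term:<12} t = {t:8.2f}   df ~ {df:6.1f}")
```

A large |t| only says that the two means differ by more than sampling noise would explain under these assumptions; it does not say which build is correct.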
