Reply To: gcc 4.4 (4.4.2) compile error with mode=release related to -finline-limit=20000 compile option [SOLVED]
Here are my benchmark results on using a lower value for -finline-limit in order to compile with gcc 4.4.2.
System: 90-residue protein
Protocols tested:
- “AbinitioRelax -abinitio -out:nstruct 100”
- “relax -relax” on above abinitio structures
- “relax -relax:fast” on above abinitio structures
Benchmarks were all run on a single node (dual Intel Nehalem-EP at 2.8 GHz, 24 GB RAM), with a single process per node (i.e. yes, I am wasting 7 cores… until I get the pseudo-mpi version of Rosetta compiled).
Linux r107-n37 2.6.18-194.32.1.el5 #1 SMP Wed Jan 5 17:52:25 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
All builds were static and included the following CCFLAGS:
CCFLAGS = -pipe -ffor-scope -W -Wall -pedantic -Wno-long-long -O3 -ffast-math -funroll-loops -finline-functions -s -Wno-unused-variable
The following were tested (GCC version and additional CCFLAGS):
A) GCC 4.4.2 -finline-limit=1133 --param inline-unit-growth=1000 --param large-function-growth=50000
B) GCC 4.4.2 -finline-limit=1133
C) GCC 4.1.2 -finline-limit=20000 --param inline-unit-growth=1000 --param large-function-growth=50000
D) GCC 4.1.2 -finline-limit=20000
E) GCC 4.1.2 -finline-limit=1133 --param inline-unit-growth=1000 --param large-function-growth=50000
F) GCC 4.4.2
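For anyone who wants to script the six builds, here is a rough sketch of how the flag sets could be assembled. The helper names (`extra_flags`, `ccflags`, `CONFIGS`) are made up for illustration and are not part of Rosetta's build system; the flag values are the ones listed above, and the unlabeled second entry is assumed to be configuration B.

```python
# Hypothetical helper for assembling the CCFLAGS of each tested configuration.
# Names are illustrative only; nothing here is part of Rosetta's build scripts.

BASE_CCFLAGS = [
    "-pipe", "-ffor-scope", "-W", "-Wall", "-pedantic", "-Wno-long-long",
    "-O3", "-ffast-math", "-funroll-loops", "-finline-functions", "-s",
    "-Wno-unused-variable",
]

# The two --param options that were switched on in configurations A, C and E.
GROWTH_PARAMS = [
    "--param", "inline-unit-growth=1000",
    "--param", "large-function-growth=50000",
]


def extra_flags(inline_limit=None, growth_params=False):
    """Return the additional flags for one configuration."""
    flags = []
    if inline_limit is not None:
        flags.append(f"-finline-limit={inline_limit}")
    if growth_params:
        flags.extend(GROWTH_PARAMS)
    return flags


# label -> (GCC version, additional flags)
CONFIGS = {
    "A": ("4.4.2", extra_flags(1133, growth_params=True)),
    "B": ("4.4.2", extra_flags(1133)),
    "C": ("4.1.2", extra_flags(20000, growth_params=True)),
    "D": ("4.1.2", extra_flags(20000)),
    "E": ("4.1.2", extra_flags(1133, growth_params=True)),
    "F": ("4.4.2", extra_flags()),
}


def ccflags(label):
    """Full CCFLAGS string (common flags plus extras) for a configuration."""
    _, extra = CONFIGS[label]
    return " ".join(BASE_CCFLAGS + extra)


if __name__ == "__main__":
    for label, (gcc_version, _) in CONFIGS.items():
        print(f"{label}) GCC {gcc_version}: {ccflags(label)}")
```

How the resulting string is handed to the build (settings file, environment, command line) depends on your setup and is left out here.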
The following computation times are average user CPU time in seconds/structure for the above three protocols (abinitio – relax – relax:fast):
A) 26.2 – 84.0 – 30.2
B) 26.0 – 99.1 – 35.5
C) 28.4 – 110.3 – 33.6
D) 28.4 – 102.5 – 38.3
E) 28.4 – 110.1 – 33.2
F) 35.9 – 101.2 – 35.9
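If you want to collect comparable numbers on your own hardware, user CPU time per structure can be measured along these lines. This is only a sketch and not necessarily how the figures above were obtained: the function name is hypothetical, the `resource` module is Unix-only, and the demo command is a placeholder to be replaced with your own Rosetta command line.

```python
import resource
import subprocess
import sys


def user_cpu_per_structure(cmd, nstruct):
    """Run cmd to completion and return its user CPU seconds divided by nstruct.

    RUSAGE_CHILDREN accumulates the CPU time of child processes that have
    been waited for, so the difference before/after covers this run only.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN).ru_utime
    subprocess.run(cmd, check=True)
    after = resource.getrusage(resource.RUSAGE_CHILDREN).ru_utime
    return (after - before) / nstruct


if __name__ == "__main__":
    # Placeholder workload; substitute the actual benchmark command and
    # the number of structures it produces.
    demo_cmd = [sys.executable, "-c", "sum(range(10**6))"]
    seconds = user_cpu_per_structure(demo_cmd, nstruct=1)
    print(f"user CPU time: {seconds:.3f} s/structure")
```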
In summary:
- there is no negative effect on performance associated with -finline-limit=1133 and GCC 4.4.2 (A) compared with the default 4.1.2 compilation (C). In fact, (A) is 8%, 24%, and 10% faster than (C) for “-abinitio”, “-relax” and “-relax:fast”, respectively.
- the “--param inline-unit-growth=1000 --param large-function-growth=50000” options appear to improve performance in most cases, with one exception (“-relax” and 4.1.2)
- even with gcc 4.1.2, setting -finline-limit to 20000 (C) or 1133 (E) makes no difference to performance. Judging from the comment above, things were different with older compilers (3.x)
- there is a significant performance cost if -finline-limit is not set (F)
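The percentages quoted in the summary can be rechecked from the timing table with a few lines of Python. The function and variable names are just for illustration, and the second row of the table is assumed to be configuration B.

```python
# Average user CPU time in seconds/structure, copied from the results above:
# (abinitio, relax, relax:fast)
TIMES = {
    "A": (26.2, 84.0, 30.2),
    "B": (26.0, 99.1, 35.5),
    "C": (28.4, 110.3, 33.6),
    "D": (28.4, 102.5, 38.3),
    "E": (28.4, 110.1, 33.2),
    "F": (35.9, 101.2, 35.9),
}

PROTOCOLS = ("abinitio", "relax", "relax:fast")


def percent_faster(fast, slow):
    """Time saved by `fast` relative to `slow`, in whole percent per protocol."""
    return tuple(
        round(100.0 * (s - f) / s)
        for f, s in zip(TIMES[fast], TIMES[slow])
    )


if __name__ == "__main__":
    for protocol, pct in zip(PROTOCOLS, percent_faster("A", "C")):
        print(f"A vs C, {protocol}: {pct}% faster")
    # prints 8%, 24% and 10% for abinitio, relax and relax:fast
```

A negative value would mean the first configuration is slower than the second for that protocol.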
To conclude, there is no apparent performance cost associated with compiling using gcc 4.4.2 and -finline-limit=1133.
Please remember that the above benchmarks only cover the AbinitioRelax and relax protocols, gcc 4.4.2, and my hardware. Other settings may give different results.
Hopefully this will be useful to other Rosetta users who may have to compile with GCC 4.4, or maybe to developers who may be interested in revisiting inline optimization with newer compilers.
Cheers,
Stéphane