This document was edited by Colin Smith on 12/4/2008. Yi Liu created the initial page. Thanks Oliver and Firas for providing information. Last edited by Steven Lewis on 30 Aug 2016.
Rosetta constraints are additions to the scorefunction. (They correspond to "restraints" in other programs.) They are used to score geometric and other features of the structure that may not be evaluated by the standard scoreterms, for example to add a scoring bias based on experimental knowledge.
Each constraint consists of two parts: (A) what is being measured, and (B) how that measured value is transformed into a scoring bonus/penalty. These two parts can be mixed and matched to obtain the desired behavior.
In order for constraints to be correctly recognized by Rosetta, two things must occur. First, the constraints themselves must be applied to the pose (structure). How this is done is somewhat protocol dependent, but it most often takes the form of an option or parameter that specifies which file contains the constraint specification. (The format of this file is described below.) For example, in an XML script the constraints can be added with the ConstraintSetMover.
The second requirement is that the scorefunction being used needs to have a non-zero weight for the appropriate constraint scoreterm. The particular scoreterm depends on the type of constraint being used. The value of the penalty/bonus consists of the sum of the raw constraint scores (from the measured value and the specified transforming function of all the constraints) multiplied by the weight of the appropriate score term in the score function. Many protocols which use constraints will turn the constraint weights on for you, but others will require you to specify a scorefunction weights file which has non-zero constraint terms.
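The arithmetic described above can be sketched in a few lines of Python. This is an illustration only, not Rosetta code: the function name `weighted_constraint_score` and the example numbers are hypothetical.

```python
def weighted_constraint_score(raw_scores, weight):
    """Sum the raw scores of all constraints that share one score term,
    then multiply by that score term's weight in the score function."""
    return weight * sum(raw_scores)


# Three hypothetical raw atom_pair_constraint scores with a weight of 0.5.
print(weighted_constraint_score([1.0, 0.25, 2.75], 0.5))

# With a zero weight the constraints contribute nothing to the total score.
print(weighted_constraint_score([1.0, 0.25, 2.75], 0.0))
```

The second call shows why a non-zero weight is required: the constraints are still attached to the pose, but they have no effect on the total score.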
Constraints can be specified in a line-based constraint file formatted like so:
Constraint_Type1 Constraint_Def1
Constraint_Type2 Constraint_Def2
...
Generally speaking, the Constraint_Type will contain a type, defining what sort of value is to be constrained (distance, angle, dihedral, etc.), and a series of atom and/or residue labels defining the specific quantity to be constrained. Residue numbers are assumed to be in Rosetta numbering (from 1, no gaps), not PDB numbering. If you want PDB numbering, pass the chain letter immediately after the residue number (no spaces): residue 30 of chain A would be "30A". You cannot pass insertion codes through this mechanism; you'll need the renumbered pose. Not all constraint types can take PDB numbering.
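As a rough illustration of the residue label convention, the following Python sketch splits a label into a residue number and an optional chain letter. The helper `parse_residue_label` is hypothetical and is not how Rosetta itself parses the file; accepting a leading minus sign for PDB numbers is an assumption.

```python
import re


def parse_residue_label(label):
    """Split a residue label into (number, chain).

    chain is None when no chain letter is given (Rosetta numbering).
    Labels with spaces or insertion codes are rejected.
    """
    match = re.fullmatch(r"(-?\d+)([A-Za-z]?)", label)
    if match is None:
        raise ValueError(f"unrecognized residue label: {label!r}")
    number, chain = match.groups()
    return int(number), (chain or None)


print(parse_residue_label("30A"))  # PDB numbering: residue 30 of chain A
print(parse_residue_label("30"))   # Rosetta numbering: residue 30 of the pose
```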
The Constraint_Def defines the function by which the measured value is scored. It answers the question: what should the score of the constraint be when the constrained value has a deviation of X units?
Constraint types are all implemented as subclasses of the core::scoring::constraints::Constraint class.
Single constraints restrain the value of a single metric.
AtomPair:
AtomPair Atom1_Name Atom1_ResNum Atom2_Name Atom2_ResNum Func_Type Func_Def
score term: atom_pair_constraint
NamedAtomPair:
NamedAtomPair Atom1_Name Atom1_ResNum Atom2_Name Atom2_ResNum Func_Type Func_Def
score term: atom_pair_constraint
Angle:
Angle Atom1_Name Atom1_ResNum Atom2_Name Atom2_ResNum Atom3_Name Atom3_ResNum Func_Type Func_Def
score term: angle_constraint
NamedAngle:
NamedAngle Atom1_Name Atom1_ResNum Atom2_Name Atom2_ResNum Atom3_Name Atom3_ResNum Func_Type Func_Def
score term: angle_constraint
Dihedral:
Dihedral Atom1_Name Atom1_ResNum Atom2_Name Atom2_ResNum Atom3_Name Atom3_ResNum Atom4_Name Atom4_ResNum Func_Type Func_Def
score term: dihedral_constraint
DihedralPair:
DihedralPair Atom1_Name Atom1_ResNum Atom2_Name Atom2_ResNum Atom3_Name Atom3_ResNum Atom4_Name Atom4_ResNum Atom5_Name Atom5_ResNum Atom6_Name Atom6_ResNum Atom7_Name Atom7_ResNum Atom8_Name Atom8_ResNum Func_Type Func_Def
score term: dihedral_constraint
CoordinateConstraint:
CoordinateConstraint Atom1_Name Atom1_ResNum[Atom1_ChainID] Atom2_Name Atom2_ResNum[Atom2_ChainID] Atom1_target_X_coordinate Atom1_target_Y_coordinate Atom1_target_Z_coordinate Func_Type Func_Def
score term: coordinate_constraint
LocalCoordinateConstraint:
LocalCoordinateConstraint Atom1_Name Atom1_ResNum Atom2_Name Atom3_Name Atom4_Name Atom234_ResNum Atom1_target_X_coordinate Atom1_target_Y_coordinate Atom1_target_Z_coordinate Func_Type Func_Def
score term: coordinate_constraint
AmbiguousNMRDistance:
AmbiguousNMRDistance Atom1_Name Atom1_ResNum Atom2_Name Atom2_ResNum Func_Type Func_Def
score term: atom_pair_constraint
SiteConstraint:
SiteConstraint Atom1_Name Atom1_ResNum Opposing_chain Func_Type Func_Def
score term: atom_pair_constraint
SiteConstraintResidues:
SiteConstraintResidues Atom1_ResNum Atom1_Name Res2 Res3 Func_Type Func_Def
score term: atom_pair_constraint
BigBin:
BigBin res_number bin_char sdev
score term: dihedral_constraint
Nested constraints take as their parameters one or more other constraints, and allow optimization across multiple constraints. Typically in constraint files these are listed across multiple lines, with the name of the constraint opening the block of sub-constraints, and a line starting with "END" or "End" ending the block. In general, the scoretypes used by a nested constraint depend on which sub-constraints are used (these can normally be mixed).
MultiConstraint:
MultiConstraint
Constraint_Type1 Constraint_Def1
[Constraint_Type2 Constraint_Def2
[...]]
END
AmbiguousConstraint:
AmbiguousConstraint
Constraint_Type1 Constraint_Def1
[Constraint_Type2 Constraint_Def2
[...]]
END
KofNConstraint:
KofNConstraint k
Constraint_Type1 Constraint_Def1
[Constraint_Type2 Constraint_Def2
[...]]
END
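The block structure of nested constraints can be illustrated with a small Python sketch that groups the lines of a constraint file into a tree. It is a simplified reader for illustration, not Rosetta's parser: the function name `parse_constraint_lines` and the tuple layout are hypothetical, and skipping lines that begin with "#" is an assumption.

```python
NESTED_TYPES = {"MultiConstraint", "AmbiguousConstraint", "KofNConstraint"}


def parse_constraint_lines(lines):
    """Return a list of (type, arguments, children) tuples.

    A nested constraint opens a block that runs until a line
    starting with END or End; its sub-constraints become children.
    """
    root = []
    stack = [root]  # stack[-1] is the list currently being filled
    for raw in lines:
        fields = raw.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if fields[0].startswith(("END", "End")):
            if len(stack) == 1:
                raise ValueError("END without an open nested constraint")
            stack.pop()
        elif fields[0] in NESTED_TYPES:
            children = []
            stack[-1].append((fields[0], fields[1:], children))
            stack.append(children)
        else:
            stack[-1].append((fields[0], fields[1:], []))
    if len(stack) != 1:
        raise ValueError("nested constraint block was not closed with END")
    return root


example = [
    "KofNConstraint 1",
    "AtomPair CA 1 CA 10 HARMONIC 5.0 1.0",
    "AtomPair CA 1 CA 20 HARMONIC 5.0 1.0",
    "END",
    "Angle CB 5 SG 5 ZN 32 HARMONIC 1.95 0.35",
]
for entry in parse_constraint_lines(example):
    print(entry[0], entry[1], len(entry[2]))
```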
Functions are listed as "Func_Type Func_Def".
Specialized for angles:
CIRCULARHARMONIC x0 sd
PERIODICBOUNDED period lb ub sd rswitch tag
Note: Setting rswitch to anything other than 0.5 will create a discontinuity in the derivative. rswitch and tag should not be treated as optional.
A BOUNDED constraint after mapping the measured value to the range -period/2 to +period/2. Useful for angle measures centered on zero.
OFFSETPERIODICBOUNDED offset period lb ub sd rswitch tag
A BOUNDED constraint, where the measured value is (x - offset) mapped to the range -period/2 to +period/2. Useful for angle measures. (Note that lb and ub are interpreted after subtraction, so the true range of zero constraint is from lb+offset to ub+offset.)
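The mapping step used by PERIODICBOUNDED and OFFSETPERIODICBOUNDED can be sketched in Python as follows. Only the wrapping and the zero-penalty test are shown, not the BOUNDED penalty itself. The function names are hypothetical, and treating the range as half-open at +period/2 is an assumption; the example values use degrees purely for readability.

```python
def wrap_to_period(x, period):
    """Map x into the range [-period/2, +period/2)."""
    return (x + period / 2.0) % period - period / 2.0


def in_zero_penalty_range(x, offset, period, lb, ub):
    """True if (x - offset), after wrapping, lies between lb and ub."""
    shifted = wrap_to_period(x - offset, period)
    return lb <= shifted <= ub


# With offset 180 and bounds of -10 to 10, the zero-penalty range is 170 to 190.
print(in_zero_penalty_range(175.0, 180.0, 360.0, -10.0, 10.0))
# -175 is equivalent to 185 under a period of 360, so it is also inside the range.
print(in_zero_penalty_range(-175.0, 180.0, 360.0, -10.0, 10.0))
```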
AMBERPERIODIC x0 n_period k
An AMBERPERIODIC function is a cosine function of the angle x, with a maximum at x0 and a periodicity of n_period. The amplitude is k, and the minimum value is 0.
CHARMMPERIODIC x0 n_period k
A CHARMMPERIODIC function penalizes deviations from angle x0 by values from 0 to k, with n_period periods.
CIRCULARSIGMOIDAL xC m o1 o2
A CIRCULARSIGMOIDAL function penalizes deviations x0 from angles o1 and/or o2 by values from 0 to 1, with n_period periods.
CIRCULARSPLINE weight [36 energy values]
A CIRCULARSPLINE function sets up a periodic cubic spline trained on the provided energy values, which represent the centers of thirty-six 10-degree bins.
HARMONIC x0 sd
FLAT_HARMONIC x0 sd tol
Zero in the range of x0 - tol to x0 + tol. Harmonic with width parameter sd outside that range. Basically, a HARMONIC potential (see above) split at x0 with a 2*tol length region of zero inserted.
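The shape of FLAT_HARMONIC can be sketched in Python as follows. The sketch assumes that HARMONIC has the form ((x - x0)/sd)^2, which is not stated in this section, so treat both functions as an illustration of the description rather than as the Rosetta implementation. The function names are hypothetical.

```python
def harmonic(x, x0, sd):
    """Assumed HARMONIC form: squared deviation from x0 in units of sd."""
    return ((x - x0) / sd) ** 2


def flat_harmonic(x, x0, sd, tol):
    """Zero within tol of x0; harmonic in the distance beyond that region."""
    deviation = abs(x - x0)
    if deviation <= tol:
        return 0.0
    return ((deviation - tol) / sd) ** 2


print(flat_harmonic(5.5, 5.0, 0.5, 1.0))  # inside the flat region
print(flat_harmonic(7.0, 5.0, 0.5, 1.0))  # 1.0 beyond the flat region
```

With tol set to 0 the flat region vanishes and the sketch reduces to the assumed HARMONIC form.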
BOUNDED lb ub sd rswitch tag
Note: Setting rswitch to anything other than 0.5 will create a discontinuity in the derivative. (If tag is not numeric, then rswitch may be omitted and will default to 0.5.) tag is NOT optional.
GAUSSIANFUNC mean sd tag WEIGHT weight
tag is NOT optional, as for BoundFunc/BOUNDED. If tag = NOLOG, it triggers some undocumented behavior involving a logarithm of some sort.
SOGFUNC n_funcs [mean1 sdev1 weight1 [mean2 sdev2 weight2 [...]]]
MIXTUREFUNC anchor gaussian_param exp_param mixture_param bg_mean bg_sd
CONSTANTFUNC return_val
IDENTITY
SCALARWEIGHTEDFUNC weight Func_Type Func_Def
SUMFUNC n_funcs Func_Type1 Func_Def1 [Func_Type2 Func_Def2 [...]]
SPLINE description histogram_file_path experimental_value weight bin_size
or, if the option -constraints:epr_distance is set:
SPLINE description experimental_value weight bin_size
or, if one wishes to provide a spline definition in a single line in a constraints file:
SPLINE description NONE experimental_value weight bin_size x_axis <val1> ... <valN> y_axis <val1> ... <valN>
In the first form, this function reads in a histogram file and creates a cubic spline over it using the Rosetta SplineGenerator. The full path to the file must be specified. The basic form of the histogram file is a TAB SEPARATED file of the following format:
x_axis -1.750 -1.250 -0.750 -0.250 0.250 0.750 1.250 1.750
y_axis 0.000 -0.500 -1.000 -2.000 -1.500 -0.500 -0.250 0.000
The values in the x_axis line must be in ascending order, and it is assumed that the values are for the center of the histogram bin and that all bin widths are the same as that specified in the constraint file. (In practice, the bin_size setting is only used for the bins on the end.) It's assumed that the values return to the baseline at the edge of the given x_axis range, and that all y_axis values outside the range are zero.
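A minimal Python sketch of reading a histogram in this format is given here for illustration. It only parses and validates the two lines and does not build the spline. The function name `parse_histogram` is hypothetical, and requiring the same number of x and y values is an assumption.

```python
def parse_histogram(text):
    """Return (x_values, y_values) from tab-separated x_axis / y_axis lines."""
    axes = {}
    for line in text.splitlines():
        fields = line.strip().split("\t")
        if fields[0] in ("x_axis", "y_axis"):
            axes[fields[0]] = [float(value) for value in fields[1:] if value]
    x_values, y_values = axes["x_axis"], axes["y_axis"]
    if len(x_values) != len(y_values):
        raise ValueError("x_axis and y_axis must have the same number of values")
    # The x_axis values must be in ascending order.
    if any(b <= a for a, b in zip(x_values, x_values[1:])):
        raise ValueError("x_axis values must be in ascending order")
    return x_values, y_values


histogram = (
    "x_axis\t-1.750\t-1.250\t-0.750\t-0.250\t0.250\t0.750\t1.250\t1.750\n"
    "y_axis\t0.000\t-0.500\t-1.000\t-2.000\t-1.500\t-0.500\t-0.250\t0.000\n"
)
x_values, y_values = parse_histogram(histogram)
print(len(x_values), min(y_values))
```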
If the description parameter is EPR_DISTANCE, then the functional transformation of the measurement x is weight * S(experimental_value - x); otherwise the functional transformation is weight * S(x), and experimental_value is ignored.
In the second form, if the -constraints:epr_distance option is given on the command line, then the histogram_file_path should be omitted, and the RosettaEPR knowledge-based potential will be read from a file in the Rosetta database. (See Hirst et al. (2011) J. Struct Biol. 173:506 and Alexander et al. (2013) PLoS One e72851 for details on this potential.) See example below for using with EPR knowledge-based potential.
In the third form, if the -constraints:epr_distance option is not used but NONE is substituted for the histogram_file_path, then the spline definition must be provided on the same line. It must include x_axis followed by a series of values, and y_axis followed by a series of values. Optionally, it may include lb_function <cutoff> <slope> <intercept> to define a linear function for the curve past the lower bound, and ub_function <cutoff> <slope> <intercept> to define a linear function past the upper bound.
FADE lb ub d wd [ wo ]
A smoothed square-well function with the value wd between the boundaries lb and ub. An optional offset wo (default 0) can be added to the whole function; this is useful if you want the function to be zero in the 'golden range' and to give a penalty elsewhere (e.g., specify wd of -20 and wo of +20). To make sure the function and its derivative are continuous, the function is connected by cubic splines in the boundary regions in slivers of width d, between lb and lb+d and between ub-d and ub.
SIGMOID x0 m
SQUARE_WELL x0 depth
SQUARE_WELL2 x0 width depth [DEGREES]
For angle measures. Parameters are presumed to be in radians unless the optional DEGREES tag is present.
LINEAR_PENALTY x0 depth width slope
(Currently has a bug in minimization calculations.)
KARPLUS A B C D x0 sd
SOEDINGFUNC w1 mean1 sd1 w2 mean2 sd2
Here, Gauss(x,mean,sd) is the value of the normal distribution at x, given the mean and standard deviation.
TOPOUT weight x0 limit
Harmonic near x0, flat past limit.
ETABLE min max [many numbers]
Defines a function via values ranging from min to max inclusive, spaced out by 0.1. Linear interpolation for intermediate values.
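The table lookup described for ETABLE can be sketched in Python as follows. The function name `etable_value` is hypothetical, and the behavior outside the range from min to max (clamping to the end values) is an assumption made for the sketch, since this section does not specify it.

```python
def etable_value(x, x_min, x_max, values, spacing=0.1):
    """Linearly interpolate tabulated values spaced 0.1 apart from x_min to x_max."""
    expected = round((x_max - x_min) / spacing) + 1
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    if x <= x_min:
        return values[0]   # assumption: clamp below the table
    if x >= x_max:
        return values[-1]  # assumption: clamp above the table
    position = (x - x_min) / spacing
    lower = min(int(position), len(values) - 2)
    fraction = position - lower
    return values[lower] + fraction * (values[lower + 1] - values[lower])


# Three tabulated values at x = 0.0, 0.1 and 0.2.
table = [1.0, 2.0, 4.0]
print(etable_value(0.05, 0.0, 0.2, table))  # halfway between the first two entries
```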
USOG num_gaussians mean1 sd1 mean2 sd2...
Defines an unweighted sum of a given number of Gaussians.
SOG num_gaussians mean1 sd1 weight1 mean2 sd2 weight2...
Defines a weighted sum of a given number of Gaussians.
Function types are all implemented as subclasses of the core::scoring::constraints::Func class.
AtomPair CZ 20 CA 6 GAUSSIANFUNC 5.54 2.0 TAG
AtomPair CZ 20 CA 54 GAUSSIANFUNC 5.27 2.0 TAG
AtomPair CZ 20 CA 50 GAUSSIANFUNC 5.26 2.0 TAG
AtomPair CZ 20 CA 10 GAUSSIANFUNC 4.81 2.0 STILL_TAG
AtomPair CZ 20 CA 41 GAUSSIANFUNC 9.90 2.0 I_AM_ANNOYED_BY_THIS_TAG_FIELD
AtomPair SG 5 V1 32 HARMONIC 0.0 0.2
Angle CB 5 SG 5 ZN 32 HARMONIC 1.95 0.35
AtomPair SG 8 V2 32 HARMONIC 0.0 0.2
Angle CB 8 SG 8 ZN 32 HARMONIC 1.95 0.35
AtomPair NE2 13 V3 32 HARMONIC 0.0 0.2
Angle CD2 13 NE2 13 ZN 32 HARMONIC 2.09 0.35
Dihedral CG 13 CD2 13 NE2 13 ZN 32 CIRCULARHARMONIC 3.14 0.35
AtomPair SG 18 V4 32 HARMONIC 0.0 0.2
Angle CB 18 SG 18 ZN 32 HARMONIC 1.95 0.35
#discourages residue 16, chain B, from interacting with chain A; encourages residue 47
SiteConstraint CA 16B A SIGMOID 5.0 -2.0
SiteConstraint CA 47B A SIGMOID 5.0 2.0
AtomPair H 6 HA 134 BOUNDED 1.80 5.25 0.50 NOE ;dist 5.000 1.800
AtomPair HA 7 HA 132 HARMONIC 1.2 0.2
AtomPair CB 31 CB 43 SPLINE EPR_DISTANCE 6.0 4.0 0.5
AtomPair CB 29 CB 62 SPLINE EPR_DISTANCE 15.0 4.0 0.5
There are different sets of functions for reading full-atom and non-full-atom constraints, as indicated by the respective functions listed below. The only current difference between the functions is which command line arguments are read. The values of the arguments are processed identically.
To use constraints, both the scoring function and pose objects should be updated. The functions for adding constraints to the scoring function are:
Currently these functions only set the weights of the atom_pair_constraint, angle_constraint, and dihedral_constraint score function terms to the value of either the -constraints:cst_fa_weight or -constraints:cst_weight command line argument.
The functions for adding constraints to the pose object are:
These functions read a random constraint file from the list defined by either the -constraints:cst_fa_file or -constraints:cst_file argument.
There are also convenience functions for doing both at once:
These constraint types cannot currently be specified in a file. They need to have read_def methods implemented and be added to the ConstraintFactory constructor.
These function types cannot currently be specified in a file. They need to have read_data methods implemented and be added to the FuncFactory constructor.