Metadata

Author: Sinisa Bjelic and TJ Brunette

This document was last updated August 11, 2010 by TJ Brunette & Sinisa Bjelic. The corresponding PI for this application is David Baker (dbaker@u.washington.edu).

An introductory loop modeling tutorial can be found here.

Code and Demo

The application for this method is rosetta/main/source/bin/loopmodel.* . The main CCD loop movers are perturb_ccd, which lives in rosetta/main/source/src/protocols/loops/LoopMover_CCD.cc, and quick_ccd, which lives in LoopMover_QuickCCD.cc. Both protocols build loops by assembling them from fragments with Monte Carlo sampling and use CCD to close the loops. The most commonly used method for fragment-based loop modeling is currently quick_ccd. The option -loops:remodel quick_ccd_moves is no longer in widespread use, but exists for backward compatibility.

A usage example that remodels 10 and 5-residue loops is in the loop_modeling integration test, found here:

rosetta/main/tests/integration/tests/loop_modeling

References

For CCD loop modeling (flags containing 'ccd') please cite

Wang C, Bradley P, Baker D (2007).
Protein-protein docking with backbone flexibility. J. Mol. Biol. 373, 503.

The original CCD algorithm is described in

Canutescu A, Dunbrack R Jr (2003).
Cyclic coordinate descent: A robotics algorithm for protein loop closure. Protein Sci. 12, 963.

Purpose

In protein structure prediction, it is often the case that a protein segment must be adjusted to connect two fixed segments. For example, loop modeling is used to close chain breaks during the Rosetta jumping protocol (Bradley, 2005), and to remodel loops during the homology modeling protocol (Qian, 2007).

Algorithm

Loop modeling is performed by two different algorithms: CCD (cyclic coordinate descent) and KIC (kinematic closure). Only CCD is described here; KIC is explained in the corresponding KIC documentation. The goal of both algorithms is to explore the conformational space of the loop, first using a centroid representation of the side chains with an explicit backbone, followed by a higher-resolution search using an explicit representation of all atoms, including hydrogens.
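To give a feel for what a CCD closure step does, here is a minimal sketch in Python. It is a toy, not Rosetta's implementation: it works on a planar chain of points, rotates rigidly about one joint at a time, and drives a single end point toward a target. The function name ccd_close and all of its parameters are hypothetical and chosen only for illustration.

```python
import math


def ccd_close(joints, target, max_sweeps=200, tol=1e-6):
    """Toy 2D cyclic coordinate descent (illustrative assumption, not Rosetta code).

    joints: list of (x, y) points; joints[0] is the fixed anchor and
            joints[-1] is the free end that should reach `target`.
    Returns (new_joints, final_distance_to_target).
    """
    pts = [tuple(p) for p in joints]
    dist = math.dist(pts[-1], target)
    for _ in range(max_sweeps):
        # Visit the pivots from the one nearest the free end back to the anchor.
        for i in range(len(pts) - 2, -1, -1):
            px, py = pts[i]
            ex, ey = pts[-1]
            # Angle that turns the pivot->end vector onto the pivot->target vector.
            a_end = math.atan2(ey - py, ex - px)
            a_tgt = math.atan2(target[1] - py, target[0] - px)
            theta = a_tgt - a_end
            c, s = math.cos(theta), math.sin(theta)
            # Rotate everything downstream of the pivot as a rigid body,
            # so the segment lengths are preserved.
            for j in range(i + 1, len(pts)):
                dx, dy = pts[j][0] - px, pts[j][1] - py
                pts[j] = (px + c * dx - s * dy, py + s * dx + c * dy)
        dist = math.dist(pts[-1], target)
        if dist < tol:
            break
    return pts, dist
```

Each single-joint rotation can only decrease (or keep) the distance between the free end and the target, which is why cycling over the joints closes the gap when the target is reachable.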

The centroid stage of loop modeling generates loops by performing fragment insertions with Monte Carlo sampling, scored with a term that rewards closed chains; CCD is then used to close the loop at the end of the simulation. Because the sampling requires fragments, these have to be generated with the fragment picker (see the fragment picker documentation) or downloaded from the Robetta web server. Fragments are loaded with the following options:

-loops:frag_sizes (defines the number of residues in each fragment file)
-loops:frag_files    (defines the name of each fragment file)

An alternative option, -loops:vall_file, lets the user pick fragments on the fly using a sequence-identity-based scoring method. In most cases, this option will lead to suboptimal results for a given input sequence, as the fragments picked will not take advantage of sequence profile and secondary structure information.

Input Files

The loop definition file (description adapted from the kinematic loop modeling documentation) is specified by

-loops:loop_file

and shared across all loop modeling protocols. For each loop to be modeled, include the following on one line:

column1  "LOOP":     The loop file identifier tag
column2  "integer":  Loop start residue number
column3  "integer":  Loop end residue number
column4  "integer":  Cut point residue number, >=startRes, <=endRes. Default: let the loop modeling code choose the cutpoint
column5  "float":    Skip rate. Default: never skip
column6  "boolean":  Extend loop. Default: false

An example loop definition file can be found at rosetta/main/tests/integration/tests/kinematic_looprelax/input/4fxn.loop, which looks like this:

LOOP 88 95 92 0 1
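As an illustration of the column layout above, the following Python sketch parses one loop definition line and checks the cutpoint range. The LoopDef class and parse_loop_line function are hypothetical helpers, not part of Rosetta. Two assumptions are made here: a cutpoint of 0 is treated the same as a missing column (let the code choose), and the boolean column is written as 0 or 1 as in the example line.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoopDef:
    start: int
    end: int
    cut: Optional[int] = None  # None: let the loop modeling code choose
    skip_rate: float = 0.0     # 0.0: never skip
    extend: bool = False


def parse_loop_line(line: str) -> LoopDef:
    """Parse one 'LOOP start end [cut [skip_rate [extend]]]' line."""
    fields = line.split()
    if len(fields) < 3 or fields[0] != "LOOP":
        raise ValueError(f"not a loop definition line: {line!r}")
    start, end = int(fields[1]), int(fields[2])
    if start > end:
        raise ValueError(f"loop start {start} is after loop end {end}")
    cut = int(fields[3]) if len(fields) > 3 else None
    if cut == 0:
        cut = None  # assumption: 0 means "no cutpoint given"
    if cut is not None and not (start <= cut <= end):
        raise ValueError(f"cutpoint {cut} is outside {start}..{end}")
    skip_rate = float(fields[4]) if len(fields) > 4 else 0.0
    extend = bool(int(fields[5])) if len(fields) > 5 else False
    return LoopDef(start, end, cut, skip_rate, extend)
```

For the example line, parse_loop_line("LOOP 88 95 92 0 1") gives a loop from residue 88 to 95 with cutpoint 92, a skip rate of 0, and extension switched on.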

Options

Options used in Loop Modeling

Loop modeling control: A series of string options controls which kind of loop modeling is performed. The executable contains many different loop modeling modes, and string selections tell it which paths to take.

Necessary options to invoke CCD loop modeling

-in:file:s
input pdb file that loop modelling is done on (any "jd2" input options can be used) 

-loops:remodel
legal=['no', 'perturb_ccd', 'perturb_kic', 'quick_ccd',
       'quick_ccd_moves', 'old_loop_relax', 'sdwindow']
Centroid version of loopmodeling.
    perturb_kic: for the kic loopclosure
    perturb_ccd:  original fragment/ccd loop closure method
    quick_ccd: faster fragment/ccd loop closure method

older unused methods: quick_ccd_moves, old_loop_relax, sdwindow

The preferred CCD method for looprelax is quick_ccd

-loops:refine
legal=['no','refine_ccd','refine_kic']
Method for performing full-atom refinement on loops.
The preferred method for full-atom refinement is refine_kic

-loops:loop_file
Loop definition file(s). When multiple files are given, a *random* one is picked
each time this parameter is requested.

Fragment setting options

-loops:vall_file
vall database file for picking crude fragments on the fly
without inputting pregenerated fragments
default='vall_file'

-loops:frag_sizes
lengths of fragments to be used in loop modeling
default=['9','3','1']

-loops:frag_files
fragment libraries files
default=['frag9', 'frag3', 'frag1']

Additional options for in-detail control

Additional flags that set structure and loop sampling.

-loops:relax
legal=['no','fastrelax','shortrelax','fullrelax','seqrelax','minirelax']
does a relax on the structure - minimization in the torsion space
default = 'no'

-loops:extended
force extended on loops (phi-psi angles set to 180 degrees),
independent of loop input file
legal=['true','false']
default='false'

General options for a ROSETTA run

-nstruct
number of outputs. [Integer]

-database
Path to Rosetta databases. [PathVector]

Post Processing

Expected Outputs

For production runs, the following flags are recommended:

-loops::remodel quick_ccd
-loops::refine refine_kic
-loops::relax fastrelax
-relax::fastrelax_repeats 8
-loops::extended

Generate at least 1000 models using -nstruct 1000.
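A production command line can be assembled programmatically, for example when submitting many jobs. The Python sketch below only builds the argument list from flags named in this document; the function build_loopmodel_args is hypothetical, and the executable path, database path, and input file names are placeholders the user must supply. The check that fragment sizes and fragment files come in matching pairs is an assumption, not a documented requirement.

```python
def build_loopmodel_args(executable, database, pdb, loop_file,
                         frag_sizes, frag_files, nstruct=1000):
    """Return an argument list for a quick_ccd production run (illustrative sketch)."""
    if len(frag_sizes) != len(frag_files):
        # Assumption: one fragment file is expected per fragment size.
        raise ValueError("frag_sizes and frag_files must have the same length")
    return [
        executable,
        "-database", database,
        "-in:file:s", pdb,
        "-loops:loop_file", loop_file,
        "-loops:frag_sizes", *[str(n) for n in frag_sizes],
        "-loops:frag_files", *frag_files,
        # Recommended production flags, as listed in this section.
        "-loops::remodel", "quick_ccd",
        "-loops::refine", "refine_kic",
        "-loops::relax", "fastrelax",
        "-relax::fastrelax_repeats", "8",
        "-loops::extended",
        "-nstruct", str(nstruct),
    ]
```

The resulting list can be passed to subprocess.run(args, check=True) from the Python standard library once the placeholder paths point at a real build and real input files.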

quick_ccd can also remodel termini. To do this, set the cutpoint in the loops file to be equal to the last residue in the chain. For example, for an 80-residue protein, if you want to remodel the first 10 residues, the loop file would have 1 10 10 0 0.

For modeling a C-terminus, set the first residue as the residue BEFORE the loop to avoid chainbreaks. For example, if you want to model the C-terminus from residues 259 to 267, your loops file should look like this: 258 267 267 0 1.
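The two terminus examples can be written down as small helper functions. This Python sketch simply reproduces the numbers used in the examples; the function names are hypothetical, the leading LOOP tag is added to match the column layout described under Input Files, and the default for the extend column mirrors each example rather than expressing a recommendation.

```python
def n_terminal_loop(last_loop_res: int, extend: bool = False) -> str:
    """Loop line for remodeling residues 1..last_loop_res (cutpoint = loop end)."""
    if last_loop_res < 1:
        raise ValueError("last_loop_res must be a positive residue number")
    return f"LOOP 1 {last_loop_res} {last_loop_res} 0 {int(extend)}"


def c_terminal_loop(first_loop_res: int, last_res: int, extend: bool = True) -> str:
    """Loop line for remodeling first_loop_res..last_res at the C-terminus.

    The loop start is moved to the residue BEFORE the loop to avoid chain breaks.
    """
    if first_loop_res < 2 or first_loop_res > last_res:
        raise ValueError("need 2 <= first_loop_res <= last_res")
    start = first_loop_res - 1
    return f"LOOP {start} {last_res} {last_res} 0 {int(extend)}"
```

With these helpers, n_terminal_loop(10) and c_terminal_loop(259, 267) produce the two loop lines discussed above.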

quick_ccd does not require constraints, but using constraints from homologs or experimental data can produce more accurate results. Output consists of a PDB file and a score file. The job concludes with the following log output:

protocols.looprelax: ===
protocols::checkpoint: Deleting checkpoints of Loopbuild

Analyzing data

For benchmarking purposes, creating a score vs rmsd plot across decoys and looking for near-native 'energy funnels' is a good way to test the performance of the protocols on a system, and can help to determine whether errors are due to scoring or sampling. For blind prediction and refinement, such plots can still be useful to look for convergence or multiple minima in the energy landscape. Decoys may also be pairwise-clustered to search for well-populated regions of conformational space that may represent alternative low-energy conformations. (from KIC loopclosure)
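A quick numerical check can complement the score vs rmsd plot. The Python sketch below assumes the scores and rmsd values have already been extracted from the score file into (score, rmsd) pairs; how to read the score file is not covered here. The function name funnel_summary, the default of 10 top decoys, and the 2.0 rmsd cutoff are all hypothetical choices for illustration.

```python
def funnel_summary(decoys, top_n=10, rmsd_cutoff=2.0):
    """Summarize whether the lowest-scoring decoys are near-native.

    decoys: iterable of (score, rmsd) pairs, lower score = better.
    Returns a dict with the best score, its rmsd, and the fraction of the
    top_n lowest-scoring decoys whose rmsd is within rmsd_cutoff.
    """
    ranked = sorted(decoys, key=lambda d: d[0])
    if not ranked:
        raise ValueError("no decoys given")
    top = ranked[:top_n]
    near_native = sum(1 for _, rmsd in top if rmsd <= rmsd_cutoff)
    return {
        "best_score": ranked[0][0],
        "best_score_rmsd": ranked[0][1],
        "top_near_native_fraction": near_native / len(top),
    }
```

A fraction close to 1 suggests a funnel toward the native structure, while a low fraction together with low-rmsd decoys elsewhere in the set points to a scoring problem rather than a sampling problem.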

New things since last release

No improvements have been made since the last release.

Fragment-based CCD loop modeling