PyRosetta AbInitio Folding protocol

This topic has 7 replies, 5 voices, and was last updated 9 years, 5 months ago by Anonymous.

Viewing 6 reply threads

Author

Posts
- December 15, 2009 at 7:09 pm #400
  Anonymous
  The first thing I’m interested in within the PyRosetta package is ab initio prediction of protein structure. Starting with standard Rosetta, I was able to get predictions of reasonable accuracy for three test cases: the villin headpiece, ubiquitin, and barstar. With the AbinitioRelax application of Rosetta I got models with moderate RMSD from the native structure, and I observed some correlation between RMSD and Rosetta energy. Moreover, when I put “refined native” models on the same plot, I see an “energy gap”.
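A rough way to put numbers on the two observations in this paragraph (score versus RMSD correlation, and the gap to refined natives) is sketched here in plain Python. The function names, the definition of the gap as "lowest decoy score minus lowest refined-native score", and the sample values are all assumptions made for illustration; none of them come from the post or from Rosetta.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = float(len(xs))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def energy_gap(decoy_scores, native_scores):
    """Assumed definition: lowest decoy score minus lowest refined-native score.
    A positive value means the refined natives score lower than every decoy."""
    return min(decoy_scores) - min(native_scores)

# Hypothetical values (RMSD in angstroms, scores in Rosetta energy units).
decoy_rmsd = [1.0, 2.0, 3.0, 4.0]
decoy_score = [-40.0, -35.0, -30.0, -25.0]
native_score = [-55.0, -52.0]

r = pearson(decoy_rmsd, decoy_score)
gap = energy_gap(decoy_score, native_score)
```

A correlation close to 1 means that lower-scoring decoys tend to be closer to the native structure, which is the "funnel" one hopes to see in a score versus RMSD plot.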
  
  After that I worked through the PyRosetta tutorial, trying to get similar results by writing an ab initio folding protocol myself. The tutorial has no program listings, but I tried to implement all of its recommendations accurately. The most recent protocol that I developed (listed below) consists of successive low-resolution 9-mer and 3-mer fragment insertions under Metropolis control, followed by high-resolution small and shear moves with periodic side-chain repacking and energy minimization. I also applied a simulated annealing schedule and sometimes “ramping” of the VdW energy.
  
  Unfortunately, with this protocol I get some low-RMSD models only for villin (36 aa), and even there RMSD shows no clear correlation with the score. For bigger proteins, I don’t get close-to-native models at all, which means that my protocol is inadequate.
  
  Could anyone please point out the errors that make my program inefficient, or share your own example of an ab initio folding algorithm that performs at least as well as the “native” Rosetta AbinitioRelax program?
  
  —
  import math
  from rosetta import *

  init()

  # Input files and annealing temperatures
  pdb_to_fold = "1UBI_ideal.pdb"
  pdb_native = "1UBI.pdb"
  frag3_file = "aa1ubi_03_05.200_v1_3"
  frag9_file = "aa1ubi_09_05.200_v1_3"
  initTemp = 2.0
  finalTemp = 0.8

  # Initial setup
  p = Pose(); start = Pose(); start_c = Pose(); native = Pose(); native_c = Pose()
  pose_from_pdb(p, pdb_to_fold); pose_from_pdb(native, pdb_native)
  native_c.assign(native); start.assign(p)

  # Scoring functions (centroid and full-atom)
  sc_c = create_score_function('cen_std')
  sc_f = create_score_function('standard')

  # Packer mover (repacking only, no design)
  task_pack = TaskFactory.create_packer_task(start)
  task_pack.restrict_to_repacking()
  pack = PackRotamersMover(sc_f, task_pack)

  # Centroid/full-atom conversion
  switch_c = SwitchResidueTypeSetMover('centroid')
  switch_f = SwitchResidueTypeSetMover('fa_standard')
  switch_c.apply(p); switch_c.apply(native_c); start_c.assign(p)

  # Fragment movers
  movemap = MoveMap()
  movemap.set_bb(True)
  fragset9 = ConstantLengthFragSet(9, frag9_file)
  fragset3 = ConstantLengthFragSet(3, frag3_file)
  mover_9mer = ClassicFragmentMover(fragset9, movemap)
  mover_3mer = ClassicFragmentMover(fragset3, movemap)

  # Small and shear movers, chosen at random on each step
  smallmover = SmallMover(movemap, initTemp, 3)
  shearmover = ShearMover(movemap, initTemp, 3)
  small_random = RandomMover()
  small_random.add_mover(smallmover)
  small_random.add_mover(shearmover)

  # Minimizer mover (renamed from "min" so the Python builtin is not shadowed)
  min_mover = MinMover(movemap, sc_f, 'dfpmin', 0.5, True)

  def frag_insert(pose, mc, scoreFunction, frag_mover):
      """Annealed fragment insertion: N1 rounds of N2 Metropolis trials."""
      N1 = 20
      N2 = 300
      mc.reset(pose)
      kT = initTemp
      # Geometric cooling from initTemp to finalTemp over N1 * N2 steps
      gamma = math.pow(finalTemp / initTemp, 1.0 / (N1 * N2))
      for i in range(1, N1 + 1):
          mc.recover_low(pose)
          print("Low-resolution energy %s" % scoreFunction(pose))
          for j in range(1, N2 + 1):
              kT = kT * gamma
              mc.set_temperature(kT)
              frag_mover.apply(pose)
              mc.boltzmann(pose)
      mc.recover_low(pose)
      return pose

  def small_moves_centroid(pose, mc, scoreFunction):
      """Small/shear moves at constant temperature (defined but not called below)."""
      mc.reset(pose)
      mc.set_temperature(finalTemp)
      for i in range(1, 10001):
          small_random.apply(pose)
          mc.boltzmann(pose)
          if i % 1000 == 0:
              mc.recover_low(pose)
      mc.recover_low(pose)
      return pose

  # Job distribution and main algorithm
  jd = PyJobDistributor("ubi", 5000, sc_f)
  jd.native_pose = native

  while not jd.job_complete:
      # Low-resolution modeling
      p.assign(start_c)
      mc_c = MonteCarlo(p, sc_c, initTemp)
      frag_insert(p, mc_c, sc_c, mover_9mer)
      frag_insert(p, mc_c, sc_c, mover_3mer)

      # High-resolution modeling
      switch_f.apply(p)
      pack.apply(p)
      min_mover.apply(p)
      kT = initTemp
      gamma = math.pow(finalTemp / initTemp, 1.0 / 10000)
      mc_f = MonteCarlo(p, sc_f, kT)
      # 10000 steps, matching the exponent used for gamma above
      for i in range(1, 10001):
          kT = kT * gamma
          mc_f.set_temperature(kT)
          small_random.apply(p)
          mc_f.boltzmann(p)
          if i % 1000 == 0:
              mc_f.recover_low(p)
          if i % 100 == 0:
              # Periodic side-chain repacking and minimization
              pack.apply(p)
              min_mover.apply(p)
      mc_f.recover_low(p)
      jd.output_decoy(p)
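The annealing logic in the listing above can be checked in isolation, without PyRosetta. The following is a minimal sketch of a geometric cooling schedule and of the standard Metropolis acceptance test; the function names are made up for illustration, and the acceptance function is a toy stand-in for a Monte Carlo object, not its actual implementation.

```python
import math
import random

def geometric_schedule(init_temp, final_temp, n_steps):
    """Yield n_steps temperatures that decay geometrically and end at final_temp."""
    gamma = math.pow(final_temp / init_temp, 1.0 / n_steps)
    kT = init_temp
    for _ in range(n_steps):
        kT *= gamma
        yield kT

def metropolis_accept(delta_e, kT, rng=random):
    """Accept downhill moves always; accept uphill moves with probability exp(-dE/kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kT)

# Same numbers as one frag_insert call: 20 * 300 steps from kT = 2.0 to kT = 0.8.
temps = list(geometric_schedule(2.0, 0.8, 20 * 300))
```

The schedule reaches the final temperature only if the loop runs exactly as many steps as the exponent used to compute `gamma`, so it is worth keeping those two numbers tied to a single variable.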
- January 4, 2010 at 8:38 pm #4289
  Anonymous
  What is your starting structure? 5000 decoys seems small for folding a 36 aa protein. Did you try increasing the number of decoys significantly, to 20,000 or 50,000?
- January 11, 2010 at 9:13 am #4301
  Anonymous
  I use an extended conformation to avoid any bias. Of course 5000 may not be enough, but this is certainly not the main fault: the classic “AbinitioRelax” program from the Rosetta 3.1 package uses only 1000 decoys in my case, and the results are apparently better.
  So it seems that the protocol listed above is for some reason not efficient. Do you have any clues?
- February 24, 2010 at 2:46 am #4358
  Anonymous
  Hey Batch. I know this is a little late, but I went my own way for a while trying to develop an ab initio folder, and had some of the same luck. While I did some interesting things, I was told to stick with what works, so I am trying to reimplement the abinitio program found in Rosetta within PyRosetta.
  
  Sounds easier than it is, but I have made some strides. This is the paper that describes the low-res abinitio program (as used in all CASP experiments from 04 to present):
  
  Rohl et al.; ‘Protein Structure Prediction Using Rosetta’, Methods in Enzymology 2004, vol 383
  
  I have been able to implement every step correctly except the Gunn approach to fragment insertion (last step). I may be able to program it, but I am doubtful. As soon as I am done, I will post it here.
  
  The other paper that further defines how Rosetta is used within the CASP experiments is as follows:
  
  Bradley, et al; ‘Toward High-Resolution de Novo Structure Prediction for Small Proteins’, Science 2005, vol 309 pgs 1868 – 1871
  
  This is a little stranger, as it uses homologies in a unique way, but I am fairly sure this was done in CASP8.
  
  Wish you well.
  
  -Jared
- February 24, 2010 at 2:33 pm #4359
  Anonymous
  Thanks Jared, I’d like to see your algorithm once it starts working. Actually, I read the Bradley paper and my program was similar to what I found there, except that it did not give realistic results, so I might have been doing something wrong. It’ll be interesting to see what you’ve got.
- March 21, 2010 at 7:15 am #4417
  Anonymous
  The topic is very useful.
  
  Jared, excellent! I think many PyRosetta users will be interested in your programs.
  
  Sid, here is my suggestion: since PyRosetta provides a more interactive interface for performing most Rosetta tasks/protocols, would you write some scripts that re-implement the protocols from Baker’s famous papers listed above? I think many users would love working in PyRosetta if they knew exactly how to make the transition from the traditional Rosetta command-line options.
  
  cheers.
  
  -Jarod
- February 5, 2015 at 4:58 pm #10810
  Anonymous
  Hello everyone,
  
  I’m a newbie in the field of PyRosetta and I’ve already started using the script written by batch2k.
  
  However, even after increasing the number of cycles, the script does not reach the target conformation for an experimentally known protein structure.
  
  I was wondering whether the script with the newer implementation that Jared mentioned previously was ever finished. Is it accessible anywhere?
  
  Thanks in advance!
- February 5, 2015 at 6:32 pm #10811
  Anonymous
  Hi Jseco,
  
  I never ended up finishing that script. I basically decided to use vanilla Rosetta abinitio, which worked pretty well in a few predictions after that.
  
  However, it looks as though a lot of the code that didn’t seem to be accessible in PyRosetta is now. You’ll have to figure it out, but try this in ipython to look around:
  
  from rosetta import *
  rosetta.init()
  import rosetta.protocols.abinitio as ab
  
  It looks like the full C++-level application simply calls the AbRelaxApplication class, which reads everything from the command line. So that’s awesome. If it works, you would be able to run the full protocol in Python, but you will need to control pretty much everything through the options system (rosetta.init("-my string -of options")). In addition, it’s unclear to me how to get a pose out after the fold through the run() function, which is what is called at the app level.
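Since everything would have to go through the options system, it may help to assemble the option string in one place. This is only a sketch: `build_option_string` is a hypothetical helper, the FASTA file name is made up, and the flags shown are the usual AbinitioRelax command-line flags, which I am assuming are read the same way when passed to `init`. Whether AbRelaxApplication can actually be driven like this from PyRosetta is not confirmed anywhere in this thread.

```python
def build_option_string(options):
    """Join (flag, value) pairs into one command-line style string.
    A value of None marks a flag that takes no argument."""
    parts = []
    for flag, value in options:
        parts.append(flag if value is None else "%s %s" % (flag, value))
    return " ".join(parts)

abinitio_options = [
    ("-in:file:fasta", "1ubi.fasta"),             # hypothetical file name
    ("-in:file:frag3", "aa1ubi_03_05.200_v1_3"),  # fragment files from the first post
    ("-in:file:frag9", "aa1ubi_09_05.200_v1_3"),
    ("-abinitio:relax", None),
    ("-nstruct", 1000),
]

option_string = build_option_string(abinitio_options)
# With PyRosetta imported, the string would then be passed as rosetta.init(option_string).
```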
  
  The run function calls setup(), then creates a new empty pose and empty protocol class, then calls setup_fold(pose, protocol) and fold(pose, protocol). Really not sure if it’s worth it to get this to work in PyRosetta.
  
  What you probably want to try to use is the ClassicAbinitio class. This accepts fragsets, movemaps, and a pose (I assume it’s an extended pose, but I really don’t know). There is also currently a lot of work going into the Environment/Topology Broker framework. I believe that the paper for this should come out soon: https://www.rosettacommons.org/docs/latest/EnvironmentFramework.html
  
  Here is a nice description from the ClassicAbinitio code:
  
  //@ brief The Classic Abinitio protocol from rosetta++
  /*!
  @ detail
  general usage:
  ClassicAbinitio abinitio;
  abinitio.init( pose );
  …
  while(nstruct) {
  abinitio.apply( pose );
  }
  
  call ClassicAbinitio::register_options() before core::init::init to add relevant options to the applications help
  
  , with the following
  stages, all of which uses a different ScoreFunction based on the cen_std.wts in minirosetta_database:
  
  – Stage 1: large (usually 9mer) randomly selected fragment insertions, only VDW term turned on.
  Uses score0.wts_patch and runs for either a maximum of 2000 cycles or until all moveable phi/psi values
  have been changed.
  
  – Stage 2: large randomly selected fragment insertions, more score terms turned on. Uses score1.wts_patch
  and runs for 2000 cycles.
  
  – Stage 3: uses large randomly selected fragment insertions, although the size of the fragment insertions
  is tunable via the set_apply_large_frags( bool ) method. Alternates between score2.wts_patch and score5.wts_patch,
  running tunable numbers of 2000-cycle iterations between the two scoring functions.
  
  – Stage 4: uses small (usually 3mer) fragment insertions with the fragment selection based on the Gunn cost for
  finding local fragment moves. Runs for 4000-cycles and uses score3.wts_patch.
  
  The class implements the basic abinito approach as known from rosetta++. We tried to set this up, such that
  behaviour of the protocol can be changed in many different ways ( see, e.g., FoldConstraints ). To be able to change the
  behaviour of the protocol easily the class-apply function and methods called therein (e.g., prepare_XXX() / do_XXX_cycles() ) should
  not directly change moves or trials. A reference to the currently used score-function should be obtained by
  mc().score_function() …
  
  Behaviour can be changed in the following ways:
  
  use non-classic FragmentMover –> eg. not uniformly sampled fragments, but using some weighting
  –> large and small moves doesn’t have to be 3mers and 9mers… use other movers…
  —> or other fragets for the “convenience constructor”
  use custom trial classes –> overload update_moves()
  
  change sampling behaviour:
  overload prepare_XXX() methods: these are called before the cycling for a certain stage begins
  overload do_stageX_cycles() : the actual loops over trial-moves …
  
  change scoring functions:
  overload set_default_scores()
  weight-changes effective for all stages: set_score_weight()
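To compare cycle counts with a hand-written protocol, the four stages described in the comment above can be written down as a small table. This is plain Python transcribed from that description, not code from Rosetta; `total_cycles` and its `stage3_iterations` argument are hypothetical names, and because the comment calls the number of stage 3 iterations tunable, no default is assumed here.

```python
# Stage table transcribed from the ClassicAbinitio comment; cycles are per iteration.
STAGES = [
    {"stage": 1, "fragments": "large", "patches": ["score0"], "cycles": 2000},
    {"stage": 2, "fragments": "large", "patches": ["score1"], "cycles": 2000},
    {"stage": 3, "fragments": "large", "patches": ["score2", "score5"], "cycles": 2000},
    {"stage": 4, "fragments": "small", "patches": ["score3"], "cycles": 4000},
]

def total_cycles(stage3_iterations):
    """Upper bound on fragment-insertion trials for one decoy.
    Stage 1 may stop early once all movable phi/psi values have changed."""
    total = 0
    for stage in STAGES:
        repeats = stage3_iterations if stage["stage"] == 3 else 1
        total += stage["cycles"] * repeats
    return total

# Hypothetical choice of 10 stage 3 iterations.
cycles_with_10_iterations = total_cycles(10)
```

For comparison, the script in the first post runs 20 * 300 = 6000 trials with 9-mers and another 6000 with 3-mers, all under the single cen_std score function rather than a staged series of score functions.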
  
  Good luck!!!