Member Site › Forums › PyRosetta › PyRosetta – Applications › Problems when working with PTMs – cannot make sugar poses
- This topic has 6 replies, 2 voices, and was last updated 5 years, 11 months ago by Anonymous.
-
AuthorPosts
-
-
January 10, 2019 at 3:44 pm #3069Anonymous
Hello everyone,
I have a protein which has a few PTMs in the form of sugars, and I have therefore started experimenting with sugars in PyRosetta. I have found a very nice guide:
https://graylab.jhu.edu/pyrosetta/downloads/documentation/RosettaCarbohydrate_Tutorial-Demo.pdf
which I have used as a starting point together with the documentation e.g.
I start out by initializing PyRosetta with the appropriate options, and import the specific methods I need:
from pyrosetta import *
pyrosetta.init('-include_sugars','-write_pdb_link_records')
from pyrosetta.rosetta.core.pose.carbohydrates import glycosylate_pose
from pyrosetta.rosetta.core.pose import make_pose_from_saccharide_sequence, pose_from_saccharide_sequenceHowever, whenever I try to use the methods I get runtime errors. If I e.g. enter:
mannose = pose_from_saccharide_sequence('->3)-a-D-Manp')
I get:
ERROR: Unrecognized residue ->3)-alpha-D-Manp when building saccharide sequence ->3)-a-D-Manp
ERROR:: Exit from: /Volumes/MacintoshHD3/benchmark/W.fujii/rosetta.Fujii/_commits_/main/source/src/core/pose/annotated_sequence.cc line: 478
BACKTRACE:
(lot of lines ala 1 ‘rosetta.so 0x000000012135ff78 rosetta.so + 71184248’)I got the input for ‘pose_from_saccharide_sequence’ from the guide I linked above, so it should be correct. The documentation states that it:
“Return a Pose from an annotated, linear, IUPAC polysaccharide sequence <sequence> with residue type set name”. I am not an expert on sugars, but as far as I can tell is the input used above ok. I have also tried to input simply ‘galp’, which also does not work.
Can anyone see what I am doing wrong?
My setup is
MacOS 10.14, python 3.6.6, Pyrosetta: PyRosetta4.Release.python36.mac r206
Kind regards,
Martin
-
January 10, 2019 at 4:20 pm #14555Anonymous
Hello!
Those functions will work for your purposes, but you should also checkout the SimpleGlycosylateMover. If you need to introduce a sequon, the CreateGlycanSequonMover can be used to do this. You can also checkout the CreateSequenceMotifMover in protocols.calc_taskop_movers
Use this to import most of the new glycan classes and utils:
from rosetta.protocols.carbohdyrates import *
There are also a few glycan residue selectors in rosetta.protocols.select.residue_selectors
Once you glycosylate the pose at the place you want, you can access the new GlycanTreeSet via pose.glycan_tree_set(). Lots of functions that can be useful to you.
Finally, for modeling, you should probably use the GlycanTreeModelor. Its also in protocols.carbohydrates. It is not published yet, but has been used to do de-novo modeling of glycans with decent results. Jason Labonte and I have spent the past few years developing it (and associated methods/classes) and it should be published sometime this year (with full documentation on all of these). However, they are all accessible in PyRosetta, all bug tested, and all benchmarked on high-resolution crystal structures, so go ahead and use them if you want.
As for the IUPAC naming, here is the README we use when defining new common glycans such as man5 or man9 (which can simply be passed to the SimpleGlycosylateMover. It should help you define the correct nomenclature used. You can find it in PyRosetta in database/chemical/carbohydrates/common_glycans I have included it here.
Cheers,
Jared Adolf-Bryfogle
-
January 16, 2019 at 12:47 pm #14558Anonymous
Dear jared,
Thank you so much for the very detailed reply – very much appreciated!
For some reason, pose_from_saccharide_sequence worked after restarting my IDE, so perhaps I had initialited it wrong the first time.
I have played a bit around with the different methods and movers, and I am almost able to do everything I want to do. I have however run into into a problem: I cannot get sialic acids to work, even though two are listed in ‘codes_to_roots.map’:
# Ketononoses (e.g., Sialic Acids)
Neu neuramin * 2 # D inherent
Kdn keto-deoxy-nonulon * 2 # D inherentSialic acids have nine carbons but only have five carbons in the ring ( https://en.wikipedia.org/wiki/Sialic_acid ). I would therefore expect ”a-Neuf5Ac” to work but is does not. The Error I get is:
ERROR: Unrecognized residue ->4)-alpha-Neuf:5-Ac when building saccharide sequence a-Neuf5Ac-(2->3)
Can anyone help?
Kind regards,
Martin
EDIT:
it should be ‘p’ when there are five carbons in the ring, but if I try ”a-Neup5Ac” I also get an error
EDIT 2:
I got it working using ‘->-a-Neup5Ac-(2->3)’ so it was the default value of ->4)… which caused it not to work. Like I said, I am still a rookie with glycans, so not always super on top of the correct settings for the different glycans
-
-
January 17, 2019 at 9:55 am #14559Anonymous
Hello again,
I have tried to include the option ‘-auto_detect_glycan_connections’ described here: https://www.rosettacommons.org/docs/latest/application_documentation/carbohydrates/WorkingWithGlycans
It, however, gives me the import error:
from pyrosetta import *
pyrosetta.init('-include_sugars','-write_pdb_link_records','-auto_detect_glycan_connections')
Traceback (most recent call last):
File "<ipython-input-21-fcfdbbb80342>", line 8, in <module>
pyrosetta.init('-include_sugars','-write_pdb_link_records','-auto_detect_glycan_connections')
File "/Users/martinpedersen/anaconda3/envs/py36/lib/python3.6/site-packages/pyrosetta-2019.1+release.dbc838b6ae6-py3.6-macosx-10.7-x86_64.egg/pyrosetta/__init__.py", line 160, in init
assert set_logging_handler in (None, True, False, "interactive", "logging")
AssertionErrorThought it would be nice for you to know.
Kind regards,
Martin
-
January 17, 2019 at 4:46 pm #14560Anonymous
If you are within a notebook, make sure to restart the kernal any time you re-init.
-
January 21, 2019 at 1:56 pm #14562Anonymous
I usually run my scripts on our cluster with the command: python <script.py> > pyros.log
I am therefore not sure it has to do with the kernel
Kind regards,
Martin
-
-
January 21, 2019 at 1:54 pm #14561Anonymous
Hello again,
I have been able to modifiy our MX structure, albeit with some errors, failures and warnings from GlycanTreeModeler. I would also like to use standard movers like SmallMover and ShearMover to model two hinge regions, which has also proved difficult. I hope it is OK I keep asking the questions in this thread, otherwise I am happy to make a new one.
Regading GlycanTreeModeler:
After adding the glycosylations using SimpleGlycosylateMover, I call GlycanTreeModeler which succesfully minimise the structure, although I have to call it many times (> 20) to get reasonable minimisation and even after 100 iterations are there problematic tryptophan geomtries/bond lengths (Trp can have mannose added).
The warnings I tipycally get are of the sort:
protocols.carbohydrates.GlycanTreeMinMover: Minimizing from carbohydrate root: 53
protocols.carbohydrates.GlycanTreeMinMover: Minimizing 3 residues
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 13!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 1!
core.conformation.Conformation: [ WARNING ] Branch 1does not have enough heavy atoms about its connection to define a torsion angle!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 14!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 1!
core.conformation.Conformation: [ WARNING ] Branch 2does not have enough heavy atoms about its connection to define a torsion angle!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 13!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 1!
core.conformation.Conformation: [ WARNING ] Branch 1does not have enough heavy atoms about its connection to define a torsion angle!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 14!
core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 1!
core.conformation.Conformation: [ WARNING ] Branch 2does not have enough heavy atoms about its connection to define a torsion angle!
protocols.carbohydrates.LinkageConformerMover: Sampling alpha-L-Fucp-(?->6)-beta-D-GlcpNAc- linkage
protocols.carbohydrates.LinkageConformerMover: Sampling conformer 2 which has a population of 0.1813
protocols.carbohydrates.LinkageConformerMover: Complete
...
core.pack.task: Packer task: initialize from command line()
core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 4 ASN:N-glycosylated
core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 23 SER:O-conjugated
core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 26 SER:O-conjugated
core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 42 TRP:C-glycosylated
core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 45 TRP:C-glycosylated
core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 48 TRP:C-glycosylatedand the failures/errors I get:
protocols.carbohydrates.LinkageConformerMover: Sampling beta-D-Glcp-(?->3)-alpha-L-Fucp- linkage
protocols.carbohydrates.LinkageConformerMover: no conformer found. Using sugar BB sampler if possible.
protocols.carbohydrates.GlycanTreeSampler: Last mover failed. Continueing!
protocols.carbohydrates.LinkageConformerMover: Sampling alpha-D-Manp-(?TRP linkage
protocols.carbohydrates.LinkageConformerMover: no conformer found. Using sugar BB sampler if possible.
protocols.carbohydrates.LinkageConformerMover: Cannot run sugar BB Sampler on linkage as linkage as a protein-glycan linkage. Skipping
protocols.simple_moves.bb_sampler.SugarBBSampler: 81 81 A torsion: 2
protocols.simple_moves.bb_sampler.SugarBBSampler: No data for linkage. Either this is psi and previous residue is not carbohydrate or we do not have a pyranose ring in the previous residue.
protocols.simple_moves.BBDihedralSamplerMover: [ ERROR ] Could not set torsion for resnum 81
protocols.carbohydrates.GlycanTreeSampler: Last mover failed. Continueing!To investigate further, I made a simply dummy pose (a pose_from_sequence) which was a straight line and tried to add the glycosilation to that – perhaps some of the problems stem from the glycosylations being added in regions without so much space around it. Yet, I still ran into problems. I have attached a screenshot of wierd Trp geomtry/bond lengths, with the Trp sitting next to Ala so there should be plenty of space.
being unfamiliar with GlycanTreeModeler, I thought that maybe by using standard relaxtion movers like MinMover I could at least fix the tryptophan geometries (with a movemap with set_chi(True)), which in turn would allow GlycanTreeModeler to fix the errors with the glycans (if any is present, SimpleGlycosylateMover does not produce any warnings). That is, perhaps if I iteratively called GlycanTreeModeler and MinMover then problems would might go away. This proved not to be the case, at least not with 20 iterations – perhaps I could improve them by iterating longer.
Regarding other movers (e.g. SmallMover and ShearMover)
The main problem I have had is that I get zero accepted moves regardless of how high I put kT. This prompted me to log in the log and I can see that I get the error of the type:
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 10 ; atomno= 6 rsd= 10 ; atomno= 8 rsd= 10 ; atomno= 1 rsd= 56
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 10 ; atomno= 6 rsd= 10 ; atomno= 8 rsd= 10 ; atomno= 1 rsd= 56
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 42 ; atomno= 6 rsd= 42 ; atomno= 7 rsd= 42 ; atomno= 1 rsd= 59
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 45 ; atomno= 6 rsd= 45 ; atomno= 7 rsd= 45 ; atomno= 1 rsd= 60
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 48 ; atomno= 6 rsd= 48 ; atomno= 7 rsd= 48 ; atomno= 1 rsd= 61Interestingly, it is core.conformation.carbohydrates.util which outputs the error, but the errors are printed after GlycanTreeModeler has printed the summary of its run e.g.
protocols.carbohydrates.GlycanTreeModeler: Start- : 13198.6
protocols.carbohydrates.GlycanTreeModeler: Pre - : 790.491
protocols.carbohydrates.GlycanTreeModeler: Post - : 829.677
protocols.carbohydrates.GlycanTreeModeler: Final- : 790.491
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 4 ; atomno= 6 rsd= 4 ; atomno= 8 rsd= 4 ; atomno= 1 rsd= 53
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 10 ; atomno= 6 rsd= 10 ; atomno= 8 rsd= 10 ; atomno= 1 rsd= 56
core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 42 ; atomno= 6 rsd= 42 ; atomno= 7 rsd= 42 ; atomno= 1 rsd= 59
...One thing I learned is that if I output a pose (pose.dump_pdb) and reload it again, it get the following warnings which are likely connected to the AtomID errors:
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc53 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Gal54 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Neu55 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc56 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
...
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 53. Returning BOGUS ID instead.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 56. Returning BOGUS ID instead.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-alpha-D-Manp:non-reducing_end 59. Returning BOGUS ID instead.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-alpha-D-Manp:non-reducing_end 60. Returning BOGUS ID instead.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-alpha-D-Manp:non-reducing_end 61. Returning BOGUS ID instead.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->3)-alpha-L-Fucp 62. Returning BOGUS ID instead.
core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->3)-alpha-L-Fucp 64. Returning BOGUS ID instead.In conclusion:
I get both warning, error and failure flags whenever I try to add glycosylations. If I furthermore try to use other movers to perturb the backbone in hinge regions, no moves are accepted. Here is a dummy version of the script where I add the glycosylation to a AA sequence I just cooked up, in case it has something to do with the way I do it:
from pyrosetta import *
pyrosetta.init('-include_sugars','-write_pdb_link_records')#,'-auto_detect_glycan_connections')
#%%
from pyrosetta.rosetta.protocols import relax
from pyrosetta.rosetta.protocols.simple_moves import SmallMover, ShearMover
from pyrosetta.rosetta.protocols.minimization_packing import MinMover, PackRotamersMover
from pyrosetta.rosetta.core.pack.task import TaskFactory
from pyrosetta.rosetta.core.pose.carbohydrates import glycosylate_pose
from pyrosetta.rosetta.core.pose import make_pose_from_saccharide_sequence,
pose_from_saccharide_sequence
from pyrosetta.rosetta.protocols.carbohydrates import *
from pyrosetta.rosetta.core.pose.carbohydrates import glycosylate_pose, glycosylate_pose_by_file
from pyrosetta.rosetta.utility import vector1_std_string as vstr
sugar_bb = pyrosetta.rosetta.core.scoring.ScoreType.sugar_bb
#%%
import os, glob
import numpy as np
from timeit import default_timer as timer
import datetime
#%%
path = os.path.abspath('.') + '/'
os.chdir(path)
sequence = 'NWS'*4+'PPPP'*2+'NWS'*4+'PPPP'*2+'AWA'*4
pose = pose_from_sequence(sequence)
#pose.dump_pdb('NoPTM.pdb')
#%%
string1 = '->-a-Neup5Ac-(2->3)-b-D-Galp-(1->4)-b-D-GlcpNAc-(1->2)-a-D-Manp-(1->6)-'+
'[->-a-Neup5Ac-(2->3)-b-D-Galp-(1->4)-b-D-GlcpNAc-(1->2)-a-D-Manp-(1->3)]-' +
'[b-D-GlcpNAc-(1->4)]-b-D-Manp-(1->4)-b-D-GlcpNAc-(1->4)-'+
'[a-L-Fucp-(1->6)]-b-D-GlcpNAc'
string2 = 'b-D-Glcp-(1->3)-a-L-Fucp'
string3 = '->4)-a-D-Manp'
#%%
string1 = '->-a-Neup5Ac-(2->3)-b-D-Galp-(1->4)-b-D-GlcpNAc-(1->2)-a-D-Manp-(1->6)-'+
'[->-a-Neup5Ac-(2->3)-b-D-Galp-(1->4)-b-D-GlcpNAc-(1->2)-a-D-Manp-(1->3)]-' +
'[b-D-GlcpNAc-(1->4)]-b-D-Manp-(1->4)-b-D-GlcpNAc-(1->4)-'+
'[a-L-Fucp-(1->6)]-b-D-GlcpNAc'
string2 = 'b-D-Glcp-(1->3)-a-L-Fucp'
string3 = '->4)-a-D-Manp'
#%%
mov = SimpleGlycosylateMover()
mov.set_glycosylation(string1)
pos = [4,10]
for ps in pos:
mov.set_position(ps)
mov.apply(pose)
mov.set_glycosylation(string3)
pos = [42,45,48]
for ps in pos:
mov.set_position(ps)
mov.apply(pose)
mov.set_glycosylation(string2) # cannot finish with mannose addition, throws error
pos = [23,26]
for ps in pos:
mov.set_position(ps)
mov.apply(pose)
#%%
scorefxn = get_fa_scorefxn()
kT = 2
n_moves = 1
mc = MonteCarlo(pose, scorefxn, kT)
# Glycosylation mins
mins_glyc = GlycanTreeModeler()
# prepare for hinge region
hinge_reg = [1,52]
hinge = MoveMap()
hinge.set_bb_true_range(hinge_reg[0], hinge_reg[1])
# Movers for hinge region
smallmover = SmallMover(hinge, kT, n_moves)
smallmover.angle_max(10)
shearmover = ShearMover(hinge, kT, n_moves)
# optimese rotamers
to_pack = TaskFactory.create_packer_task(pose)
to_pack.restrict_to_repacking() # prevents design, packing only
to_pack.or_include_current(True)
packmover = PackRotamersMover(scorefxn, to_pack)
# prepare the sequence mover
seq_mover = SequenceMover()
seq_mover.add_mover(mins_glyc)
num_reps = 10 #
for num in range(num_reps):
seq_mover.add_mover(smallmover)
seq_mover.add_mover(shearmover)
seq_mover.add_mover(packmover)
trial_mover = TrialMover(seq_mover, mc)
file = 'Dummy.log'
header = '# kT = {:0.0f}, n moves/mover = {:d}n'.format(kT, n_moves) +
'# Timestamp, elapsed, Cycle, Score [REU], Acc. rate [Acc/trials]n'
with open(file,'w+') as f:
f.writelines(header)
time = timer()
for num in range(200):
trial_mover.apply(pose)
if np.mod(num+1,1) == 0: # conditional used if I want to output less
E_temp = scorefxn(pose)
elapsed = timer()-time
date = datetime.datetime.now().strftime("%H:%M:%S")
acc_rate = trial_mover.acceptance_rate()
string = date + ', {:0.0f}, {:d}'.format(elapsed, num+1) +
', {:0.2f}, {:0.2e}n'.format(E_temp, acc_rate)
f.writelines(string)
if np.mod(num+1,1) == 0: # conditional used if I want to output less
filename = 'dum_{:03d}.pdb'.format(num+1)
pose.dump_pdb(filename)
void = mc.recover_low(pose)
filename = 'xDum_lowest.pdb'
pose.dump_pdb(filename)
Note that in this script, the movers are able to change BB angles
-
-
AuthorPosts
- You must be logged in to reply to this topic.