Problems when working with PTMs – cannot make sugar poses

Member Site Forums PyRosetta PyRosetta – Applications Problems when working with PTMs – cannot make sugar poses

Viewing 3 reply threads
  • Author
    Posts
    • #3069
      Anonymous

        Hello everyone,

        I have a protein which has a few PTMs in the form of sugars, and I have therefore started experimenting with sugars in PyRosetta. I have found a very nice guide:

        https://graylab.jhu.edu/pyrosetta/downloads/documentation/RosettaCarbohydrate_Tutorial-Demo.pdf

        which I have used as a starting point together with the documentation e.g.

        https://graylab.jhu.edu/PyRosetta.documentation/pyrosetta.rosetta.core.pose.html?highlight=make_pose_from_saccharide_sequence#pyrosetta.rosetta.core.pose.make_pose_from_saccharide_sequence

        I start out by initializing PyRosetta with the appropriate options, and import the specific methods I need:


        from pyrosetta import *
        pyrosetta.init('-include_sugars','-write_pdb_link_records')

        from pyrosetta.rosetta.core.pose.carbohydrates import glycosylate_pose
        from pyrosetta.rosetta.core.pose import make_pose_from_saccharide_sequence, pose_from_saccharide_sequence

         

        However, whenever I try to use the methods I get runtime errors. If I e.g. enter:


        mannose = pose_from_saccharide_sequence('->3)-a-D-Manp')

        I get:


        ERROR: Unrecognized residue ->3)-alpha-D-Manp when building saccharide sequence ->3)-a-D-Manp

        ERROR:: Exit from: /Volumes/MacintoshHD3/benchmark/W.fujii/rosetta.Fujii/_commits_/main/source/src/core/pose/annotated_sequence.cc line: 478

        BACKTRACE:
        (lot of lines ala 1 ‘rosetta.so 0x000000012135ff78 rosetta.so + 71184248’)

        I got the input for ‘pose_from_saccharide_sequence’ from the guide I linked above, so it should be correct. The documentation states that it:

        “Return a Pose from an annotated, linear, IUPAC polysaccharide sequence <sequence> with residue type set name”. I am not an expert on sugars, but as far as I can tell is the input used above ok. I have also tried to input simply ‘galp’, which also does not work.

        Can anyone see what I am doing wrong?

        My setup is

        MacOS 10.14, python 3.6.6, Pyrosetta: PyRosetta4.Release.python36.mac r206

        Kind regards,

        Martin  

      • #14555
        Anonymous

          Hello!

          Those functions will work for your purposes, but you should also checkout the SimpleGlycosylateMover. If you need to introduce a sequon, the CreateGlycanSequonMover can be used to do this. You can also checkout the CreateSequenceMotifMover in protocols.calc_taskop_movers

          Use this to import most of the new glycan classes and utils:

            from rosetta.protocols.carbohdyrates import *

            There are also a few glycan residue selectors in rosetta.protocols.select.residue_selectors

          Once you glycosylate the pose at the place you want, you can access the new GlycanTreeSet via pose.glycan_tree_set().  Lots of functions that can be useful to you.  

          Finally, for modeling, you should probably use the GlycanTreeModelor.  Its also in protocols.carbohydrates.  It is not published yet, but has been used to do de-novo modeling of glycans with decent results.  Jason Labonte and I have spent the past few years developing it (and associated methods/classes) and it should be published sometime this year (with full documentation on all of these).  However, they are all accessible in PyRosetta, all bug tested, and all benchmarked on high-resolution crystal structures, so go ahead and use them if you want. 

           

          As for the IUPAC naming, here is the README we use when defining new common glycans such as man5 or man9 (which can simply be passed to the SimpleGlycosylateMover.  It should help you define the correct nomenclature used. You can find it in PyRosetta in database/chemical/carbohydrates/common_glycans I have included it here. 

          Cheers,

          Jared Adolf-Bryfogle

          • #14558
            Anonymous

              Dear jared,

              Thank you so much for the very detailed reply – very much appreciated!

              For some reason, pose_from_saccharide_sequence worked after restarting my IDE, so perhaps I had initialited it wrong the first time.

              I have played a bit around with the different methods and movers, and I am almost able to do everything I want to do. I have however run into into a problem: I cannot get sialic acids to work, even though two are listed in ‘codes_to_roots.map’:


              # Ketononoses (e.g., Sialic Acids)
              Neu neuramin * 2 # D inherent
              Kdn keto-deoxy-nonulon * 2 # D inherent

              Sialic acids have nine carbons but only have five carbons in the ring ( https://en.wikipedia.org/wiki/Sialic_acid ). I would therefore expect ”a-Neuf5Ac” to work but is does not. The Error I get is:


              ERROR: Unrecognized residue ->4)-alpha-Neuf:5-Ac when building saccharide sequence a-Neuf5Ac-(2->3)

              Can anyone help?

              Kind regards,

              Martin

              EDIT:

              it should be ‘p’ when there are five carbons in the ring, but if I try ”a-Neup5Ac” I also get an error

              EDIT 2:

              I got it working using ‘->8)-a-Neup5Ac-(2->3)’ so it was the default value of ->4)… which caused it not to work. Like I said, I am still a rookie with glycans, so not always super on top of the correct settings for the different glycans :)

               

          • #14559
            Anonymous

              Hello again,

              I have tried to include the option ‘-auto_detect_glycan_connections’ described here: https://www.rosettacommons.org/docs/latest/application_documentation/carbohydrates/WorkingWithGlycans

              It, however, gives me the import error:


              from pyrosetta import *
              pyrosetta.init('-include_sugars','-write_pdb_link_records','-auto_detect_glycan_connections')
              Traceback (most recent call last):

              File "<ipython-input-21-fcfdbbb80342>", line 8, in <module>
              pyrosetta.init('-include_sugars','-write_pdb_link_records','-auto_detect_glycan_connections')

              File "/Users/martinpedersen/anaconda3/envs/py36/lib/python3.6/site-packages/pyrosetta-2019.1+release.dbc838b6ae6-py3.6-macosx-10.7-x86_64.egg/pyrosetta/__init__.py", line 160, in init
              assert set_logging_handler in (None, True, False, "interactive", "logging")

              AssertionError

               

              Thought it would be nice for you to know.

              Kind regards,

              Martin

               

               

              • #14560
                Anonymous

                  If you are within a notebook, make sure to restart the kernal any time you re-init. 

                • #14562
                  Anonymous

                    I usually run my scripts on our cluster with the command: python <script.py> > pyros.log

                    I am therefore not sure it has to do with the kernel :)

                     

                    Kind regards,

                    Martin

                • #14561
                  Anonymous

                    Hello again,

                    I have been able to modifiy our MX structure, albeit with some errors, failures and warnings from GlycanTreeModeler. I would also like to use standard movers like SmallMover and ShearMover to model two hinge regions, which has also proved difficult. I hope it is OK I keep asking the questions in this thread, otherwise I am  happy to make a new one.

                    Regading GlycanTreeModeler:

                    After adding the glycosylations using SimpleGlycosylateMover, I call GlycanTreeModeler which succesfully minimise the structure, although I have to call it many times (> 20) to get reasonable minimisation and even after 100 iterations are there problematic tryptophan geomtries/bond lengths (Trp can have mannose added).

                    The warnings I tipycally get are of the sort:


                    protocols.carbohydrates.GlycanTreeMinMover: Minimizing from carbohydrate root: 53
                    protocols.carbohydrates.GlycanTreeMinMover: Minimizing 3 residues
                    core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 13!
                    core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 1!
                    core.conformation.Conformation: [ WARNING ] Branch 1does not have enough heavy atoms about its connection to define a torsion angle!
                    core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 14!
                    core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 1!
                    core.conformation.Conformation: [ WARNING ] Branch 2does not have enough heavy atoms about its connection to define a torsion angle!
                    core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 13!
                    core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 1!
                    core.conformation.Conformation: [ WARNING ] Branch 1does not have enough heavy atoms about its connection to define a torsion angle!
                    core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 14!
                    core.conformation.Residue: [ WARNING ] There are no adjacent heavy atoms to atom index 1!
                    core.conformation.Conformation: [ WARNING ] Branch 2does not have enough heavy atoms about its connection to define a torsion angle!
                    protocols.carbohydrates.LinkageConformerMover: Sampling alpha-L-Fucp-(?->6)-beta-D-GlcpNAc- linkage
                    protocols.carbohydrates.LinkageConformerMover: Sampling conformer 2 which has a population of 0.1813
                    protocols.carbohydrates.LinkageConformerMover: Complete
                    ...
                    core.pack.task: Packer task: initialize from command line()
                    core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 4 ASN:N-glycosylated
                    core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 23 SER:O-conjugated
                    core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 26 SER:O-conjugated
                    core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 42 TRP:C-glycosylated
                    core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 45 TRP:C-glycosylated
                    core.pack.rotamer_set.RotamerSet_: [ WARNING ] including current in order to get at least 1 rotamer !!!!!! 48 TRP:C-glycosylated

                    and the failures/errors I get:


                    protocols.carbohydrates.LinkageConformerMover: Sampling beta-D-Glcp-(?->3)-alpha-L-Fucp- linkage
                    protocols.carbohydrates.LinkageConformerMover: no conformer found. Using sugar BB sampler if possible.
                    protocols.carbohydrates.GlycanTreeSampler: Last mover failed. Continueing!
                    protocols.carbohydrates.LinkageConformerMover: Sampling alpha-D-Manp-(?TRP linkage
                    protocols.carbohydrates.LinkageConformerMover: no conformer found. Using sugar BB sampler if possible.
                    protocols.carbohydrates.LinkageConformerMover: Cannot run sugar BB Sampler on linkage as linkage as a protein-glycan linkage. Skipping
                    protocols.simple_moves.bb_sampler.SugarBBSampler: 81 81 A torsion: 2
                    protocols.simple_moves.bb_sampler.SugarBBSampler: No data for linkage. Either this is psi and previous residue is not carbohydrate or we do not have a pyranose ring in the previous residue.
                    protocols.simple_moves.BBDihedralSamplerMover: [ ERROR ] Could not set torsion for resnum 81
                    protocols.carbohydrates.GlycanTreeSampler: Last mover failed. Continueing!

                    To investigate further, I made a simply dummy pose (a pose_from_sequence) which was a straight line and tried to add the glycosilation to that – perhaps some of the problems stem from the glycosylations being added in regions without so much space around it. Yet, I still ran into problems. I have attached a screenshot of wierd Trp geomtry/bond lengths, with the Trp sitting next to Ala so there should be plenty of space.

                    being unfamiliar with GlycanTreeModeler, I thought that maybe by using standard relaxtion movers like MinMover I could at least fix the tryptophan geometries (with a movemap with set_chi(True)), which in turn would allow  GlycanTreeModeler to fix the errors with the glycans (if any is present, SimpleGlycosylateMover does not produce any warnings). That is, perhaps if I iteratively called GlycanTreeModeler and MinMover then problems would might go away. This proved not to be the case, at least not with 20 iterations – perhaps I could improve them by iterating longer.

                    Regarding other movers (e.g. SmallMover and ShearMover)

                    The main problem I have had is that I get zero accepted moves regardless of how high I put kT. This prompted me to log in the log and I can see that I get the error of the type:


                    core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 10 ; atomno= 6 rsd= 10 ; atomno= 8 rsd= 10 ; atomno= 1 rsd= 56
                    core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 10 ; atomno= 6 rsd= 10 ; atomno= 8 rsd= 10 ; atomno= 1 rsd= 56
                    core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 42 ; atomno= 6 rsd= 42 ; atomno= 7 rsd= 42 ; atomno= 1 rsd= 59
                    core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 45 ; atomno= 6 rsd= 45 ; atomno= 7 rsd= 45 ; atomno= 1 rsd= 60
                    core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 48 ; atomno= 6 rsd= 48 ; atomno= 7 rsd= 48 ; atomno= 1 rsd= 61

                     

                    Interestingly, it is core.conformation.carbohydrates.util which outputs the error, but the errors are printed after GlycanTreeModeler has printed the summary of its run e.g.


                    protocols.carbohydrates.GlycanTreeModeler: Start- : 13198.6
                    protocols.carbohydrates.GlycanTreeModeler: Pre - : 790.491
                    protocols.carbohydrates.GlycanTreeModeler: Post - : 829.677
                    protocols.carbohydrates.GlycanTreeModeler: Final- : 790.491
                    core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 4 ; atomno= 6 rsd= 4 ; atomno= 8 rsd= 4 ; atomno= 1 rsd= 53
                    core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 10 ; atomno= 6 rsd= 10 ; atomno= 8 rsd= 10 ; atomno= 1 rsd= 56
                    core.conformation.carbohydrates.util: [ ERROR ] get_non_NU_TorsionID_from_AtomIDs() could not determine the BB id for theTorsionID defined by these atoms: atomno= 5 rsd= 42 ; atomno= 6 rsd= 42 ; atomno= 7 rsd= 42 ; atomno= 1 rsd= 59
                    ...

                    One thing I learned is that if I output a pose (pose.dump_pdb) and reload it again, it get the following warnings which are likely connected to the AtomID errors:


                    core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc53 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
                    core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Gal54 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
                    core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Neu55 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
                    core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] Glc56 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.
                    ...
                    core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 53. Returning BOGUS ID instead.
                    core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 56. Returning BOGUS ID instead.
                    core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-alpha-D-Manp:non-reducing_end 59. Returning BOGUS ID instead.
                    core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-alpha-D-Manp:non-reducing_end 60. Returning BOGUS ID instead.
                    core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-alpha-D-Manp:non-reducing_end 61. Returning BOGUS ID instead.
                    core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->3)-alpha-L-Fucp 62. Returning BOGUS ID instead.
                    core.chemical.AtomICoor: [ WARNING ] IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->3)-alpha-L-Fucp 64. Returning BOGUS ID instead.

                     

                    In conclusion:

                    I get both warning, error and failure flags whenever I try to add glycosylations. If I furthermore try to use other movers to perturb the backbone in hinge regions, no moves are accepted. Here is a dummy version of the script where I add the glycosylation to a AA sequence I just cooked up, in case it has something to do with the way I do it:


                    from pyrosetta import *
                    pyrosetta.init('-include_sugars','-write_pdb_link_records')#,'-auto_detect_glycan_connections')

                    #%%
                    from pyrosetta.rosetta.protocols import relax
                    from pyrosetta.rosetta.protocols.simple_moves import SmallMover, ShearMover
                    from pyrosetta.rosetta.protocols.minimization_packing import MinMover, PackRotamersMover

                    from pyrosetta.rosetta.core.pack.task import TaskFactory

                    from pyrosetta.rosetta.core.pose.carbohydrates import glycosylate_pose
                    from pyrosetta.rosetta.core.pose import make_pose_from_saccharide_sequence,
                    pose_from_saccharide_sequence

                    from pyrosetta.rosetta.protocols.carbohydrates import *
                    from pyrosetta.rosetta.core.pose.carbohydrates import glycosylate_pose, glycosylate_pose_by_file

                    from pyrosetta.rosetta.utility import vector1_std_string as vstr
                    sugar_bb = pyrosetta.rosetta.core.scoring.ScoreType.sugar_bb

                    #%%

                    import os, glob
                    import numpy as np
                    from timeit import default_timer as timer
                    import datetime

                    #%%

                    path = os.path.abspath('.') + '/'
                    os.chdir(path)


                    sequence = 'NWS'*4+'PPPP'*2+'NWS'*4+'PPPP'*2+'AWA'*4

                    pose = pose_from_sequence(sequence)
                    #pose.dump_pdb('NoPTM.pdb')

                    #%%

                    string1 = '->8)-a-Neup5Ac-(2->3)-b-D-Galp-(1->4)-b-D-GlcpNAc-(1->2)-a-D-Manp-(1->6)-'+
                    '[->8)-a-Neup5Ac-(2->3)-b-D-Galp-(1->4)-b-D-GlcpNAc-(1->2)-a-D-Manp-(1->3)]-' +
                    '[b-D-GlcpNAc-(1->4)]-b-D-Manp-(1->4)-b-D-GlcpNAc-(1->4)-'+
                    '[a-L-Fucp-(1->6)]-b-D-GlcpNAc'

                    string2 = 'b-D-Glcp-(1->3)-a-L-Fucp'

                    string3 = '->4)-a-D-Manp'

                    #%%

                    string1 = '->8)-a-Neup5Ac-(2->3)-b-D-Galp-(1->4)-b-D-GlcpNAc-(1->2)-a-D-Manp-(1->6)-'+
                    '[->8)-a-Neup5Ac-(2->3)-b-D-Galp-(1->4)-b-D-GlcpNAc-(1->2)-a-D-Manp-(1->3)]-' +
                    '[b-D-GlcpNAc-(1->4)]-b-D-Manp-(1->4)-b-D-GlcpNAc-(1->4)-'+
                    '[a-L-Fucp-(1->6)]-b-D-GlcpNAc'

                    string2 = 'b-D-Glcp-(1->3)-a-L-Fucp'

                    string3 = '->4)-a-D-Manp'

                    #%%
                    mov = SimpleGlycosylateMover()
                    mov.set_glycosylation(string1)
                    pos = [4,10]
                    for ps in pos:
                    mov.set_position(ps)
                    mov.apply(pose)

                    mov.set_glycosylation(string3)
                    pos = [42,45,48]
                    for ps in pos:
                    mov.set_position(ps)
                    mov.apply(pose)

                    mov.set_glycosylation(string2) # cannot finish with mannose addition, throws error
                    pos = [23,26]
                    for ps in pos:
                    mov.set_position(ps)
                    mov.apply(pose)


                    #%%
                    scorefxn = get_fa_scorefxn()
                    kT = 2
                    n_moves = 1
                    mc = MonteCarlo(pose, scorefxn, kT)


                    # Glycosylation mins
                    mins_glyc = GlycanTreeModeler()


                    # prepare for hinge region
                    hinge_reg = [1,52]
                    hinge = MoveMap()
                    hinge.set_bb_true_range(hinge_reg[0], hinge_reg[1])

                    # Movers for hinge region
                    smallmover = SmallMover(hinge, kT, n_moves)
                    smallmover.angle_max(10)
                    shearmover = ShearMover(hinge, kT, n_moves)

                    # optimese rotamers
                    to_pack = TaskFactory.create_packer_task(pose)
                    to_pack.restrict_to_repacking() # prevents design, packing only
                    to_pack.or_include_current(True)
                    packmover = PackRotamersMover(scorefxn, to_pack)

                    # prepare the sequence mover
                    seq_mover = SequenceMover()

                    seq_mover.add_mover(mins_glyc)
                    num_reps = 10 #
                    for num in range(num_reps):
                    seq_mover.add_mover(smallmover)
                    seq_mover.add_mover(shearmover)

                    seq_mover.add_mover(packmover)
                    trial_mover = TrialMover(seq_mover, mc)

                    file = 'Dummy.log'
                    header = '# kT = {:0.0f}, n moves/mover = {:d}n'.format(kT, n_moves) +
                    '# Timestamp, elapsed , Cycle, Score [REU], Acc. rate [Acc/trials]n'


                    with open(file,'w+') as f:
                    f.writelines(header)
                    time = timer()

                    for num in range(200):
                    trial_mover.apply(pose)

                    if np.mod(num+1,1) == 0: # conditional used if I want to output less
                    E_temp = scorefxn(pose)
                    elapsed = timer()-time
                    date = datetime.datetime.now().strftime("%H:%M:%S")
                    acc_rate = trial_mover.acceptance_rate()

                    string = date + ', {:0.0f}, {:d}'.format(elapsed, num+1) +
                    ', {:0.2f}, {:0.2e}n'.format(E_temp, acc_rate)
                    f.writelines(string)

                    if np.mod(num+1,1) == 0: # conditional used if I want to output less
                    filename = 'dum_{:03d}.pdb'.format(num+1)
                    pose.dump_pdb(filename)

                    void = mc.recover_low(pose)
                    filename = 'xDum_lowest.pdb'
                    pose.dump_pdb(filename)

                    Note that in this script, the movers are able to change BB angles

                Viewing 3 reply threads
                • You must be logged in to reply to this topic.