Disulfide prediction from primary sequence


    • #2080

        Hi all,
        Forgive what may be a naive question (novice Rosetta user here), but I was wondering whether there is a way, using existing Rosetta scripts, to take a primary amino acid sequence and predict the disulfide bonding pattern. To explain a bit better, I am interested in cysteine knot peptides. Many of these have known NMR structures, and if I want to use these in the standard motif grafting -> interface design -> forward fold validation pipeline, I can extract the disulfide bond pattern from those .pdb files for use in forward folding of variants (just instruct the script to keep the same disulfide pattern but relax / fold the rest of it). However, the list of such peptides with known NMR structures is very small when compared to the number of such proteins found in protein sequence databases (i.e. homologs).

        Would it be best to rely on homology to peptides with known structures, assigning disulfide patterns that way? Or is there a way to take the primary sequence, and directly predict the disulfide pattern absent any other known structural information, for incorporation in design / relaxation / folding down the road?


        -Zach Crook

      • #10575

          The presence/absence of disulfide bonds is primarily due to (tertiary) structural effects. If the two cysteines are spatially close and in the correct geometric orientation (and the surrounding environment is conducive to disulfide bonds), they’ll form a disulfide. If they are too far apart, or if the disulfide would be strained (or the protein is in a reducing environment), they won’t. I’m unaware of any set of “disulfide propensities” that would be able to generally predict disulfide bonds from primary sequence in the same fashion that secondary structure prediction can use amino acid propensities to predict secondary structure.

          I think the best you’re going to do for prediction is to look at homologs, and the sequence alignments to your protein of interest. This is especially true if the disulfides are crucial to the stability of the protein. If the cysteines of a crucial disulfide in a homolog align with cysteines of your protein, there’s likely to be a disulfide between the residues. You can extend this to other residues as well, assuming you have distance information for them. If two cysteines are close enough, there’s a possibility that they form a disulfide. (Internally, in certain situations, Rosetta treats a Cbeta-Cbeta distance greater than 4.72 Ang as indicating that two residues are too far apart to make a disulfide bond.)
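
The distance check described above can be sketched in a few lines of plain Python. This is only an illustration, not Rosetta code: the function names are made up for this example, it reads fixed-column PDB `ATOM` records directly, and it borrows the 4.72 Ang Cbeta-Cbeta figure quoted above as a default cutoff. You may well want a looser cutoff when screening a homology model or a low-resolution folding result.

```python
import itertools
import math


def cys_cb_coords(pdb_lines):
    """Collect Cbeta coordinates of cysteines from PDB ATOM records.

    Returns {(chain, residue_number): (x, y, z)}.
    """
    coords = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        # Fixed PDB columns: atom name 13-16, residue name 18-20.
        if line[17:20].strip() != "CYS" or line[12:16].strip() != "CB":
            continue
        key = (line[21], int(line[22:26]))
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        # Keep only the first record per residue (e.g. first alternate location).
        coords.setdefault(key, xyz)
    return coords


def candidate_disulfides(pdb_lines, max_cb_dist=4.72):
    """Return (res_a, res_b, distance) for cysteine pairs within the cutoff.

    max_cb_dist is an assumed screening threshold in Angstroms, taken from
    the figure quoted in the post; it is not a definitive criterion.
    """
    coords = cys_cb_coords(pdb_lines)
    pairs = []
    for (a, xa), (b, xb) in itertools.combinations(sorted(coords.items()), 2):
        d = math.dist(xa, xb)
        if d <= max_cb_dist:
            pairs.append((a, b, d))
    return pairs
```

Typical use would be `candidate_disulfides(open("model.pdb"))`, where `model.pdb` is a placeholder for a homolog structure or a decoy from a folding run. A distance screen like this ignores the orientation and strain criteria mentioned earlier, so treat its output as candidates to test rather than as predictions.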

          For a regular abinitio folding run, one way to approach things is to compile a list of disulfides you’re sure of and ones you’re less sure about. You can then do short folding runs with combinations of the disulfides you’re unsure of. Take a look at the results, and see if one or more of them gives better results than the others. You can then focus your computation on the disulfide sets which give the best results, while discarding those disulfide combinations which result in misfolded proteins. Alternatively, you can try folding runs with only those disulfides which you’re sure of. You can then examine the results to see which cysteines are close enough to possibly make disulfides.
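
To make the "combinations of the disulfides you’re unsure of" step concrete, here is a minimal sketch that enumerates every complete pairing of the uncertain cysteines and combines each one with the disulfides you are confident about. The function names and the residue numbers in the usage note are hypothetical, and the sketch assumes every uncertain cysteine ends up in exactly one disulfide. How you pass each resulting set to a folding run depends on the protocol you use.

```python
def perfect_pairings(cysteines):
    """Yield every way to pair up a list of residue numbers.

    Each yielded value is a list of (i, j) tuples using every residue once.
    """
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in perfect_pairings(remaining):
            yield [(first, partner)] + tail


def disulfide_sets(confident_pairs, uncertain_cysteines):
    """List every candidate disulfide pattern to try in short folding runs.

    confident_pairs: disulfides kept fixed in every pattern.
    uncertain_cysteines: residue numbers whose partners are unknown.
    """
    if len(uncertain_cysteines) % 2 != 0:
        raise ValueError("need an even number of uncertain cysteines")
    return [
        sorted(list(confident_pairs) + pairing)
        for pairing in perfect_pairings(sorted(uncertain_cysteines))
    ]
```

For example, `disulfide_sets([(3, 20)], [8, 15, 27, 32])` gives three patterns to compare, while six cysteines with no confident pairs give fifteen. The count grows quickly with the number of uncertain cysteines, which is a good reason to pin down as many pairs as possible from homologs first.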

          As you’re doing design, I’d suggest modifying the approach in the previous paragraph. Only enforce the disulfides you think are critical to maintaining a well-folded structure. Then do a forward folding run to see if you can recover the other disulfides you want. If you want to test whether those critical disulfides really will stabilize the protein, you can try folding with alternate disulfide patterns (or with no disulfides), and see if the folds you get from those are more stable than the ones with the “correct” disulfide pattern. With design it’s a bit easier, as you can discard/ignore those sequences which don’t give you a clear good answer.

        • #10642

            Thank you, that is very useful. I will play around and see if I hit upon something that can be generalizable. Cheers!
