Member Site › Forums › PyRosetta › PyRosetta – General › Creating rotamers, computing pairwise and total energies
- This topic has 18 replies, 2 voices, and was last updated 7 years, 8 months ago by Anonymous.
-
AuthorPosts
-
-
April 12, 2017 at 9:57 am #2638Anonymous
I just started working with PyRosetta, and i stumbled upon a lot of questions.
I would like to introduce rotamers, and calculate their pairwise energies, and the total energy of the system with given rotamers.
To test, i created a very simple PDB that consist only 2 amino acid.
I have the following code:
from pyrosetta import *
from pyrosetta.teaching import *
import numpy as np
from rosetta.core.pack.rotamer_set import *
from rosetta.core.pack import *
from rosetta.core.pack.interaction_graph import *
from rosetta.core.pack.task import * # for using resfiles
import sys
init()
# scorefxn = get_fa_scorefxn() # create_score_function()
scorefxn = ScoreFunction()
scorefxn.set_weight(fa_atr, 1.0)
scorefxn.set_weight(fa_rep, 1.0)
scorefxn.set_weight(fa_sol, 1.0)
pose = pose_from_pdb(sys.argv[1])
task_pack = TaskFactory.create_packer_task(pose)
# print('Task pack is ',task_pack)
rotsets = RotamerSets()
pose.update_residue_neighbors()
scorefxn.setup_for_packing(pose, task_pack.designing_residues(), task_pack.designing_residues())
packer_neighbor_graph = create_packer_graph(pose, scorefxn, task_pack)
rotsets.set_task(task_pack)
rotsets.build_rotamers(pose, scorefxn, packer_neighbor_graph)
rotsets.prepare_sets_for_packing(pose, scorefxn)
ig = InteractionGraphFactory.create_interaction_graph(task_pack, rotsets, pose, scorefxn)
print("built", rotsets.nrotamers(), "rotamers at", rotsets.nmoltenres(), "positions.")
rotsets.compute_energies(pose, scorefxn, packer_neighbor_graph, ig)
n_rots_I = rotsets.nrotamers_for_moltenres(1)
n_rots_J = rotsets.nrotamers_for_moltenres(2)
# print(ig.get_two_body_energy_for_edge(1, 2, 299, 288))
E = np.zeros((n_rots_I, n_rots_J))
# print(rotsets.rotamer_set_for_residue(2))
for s_i in range(1, n_rots_I + 1):
for s_j in range(1, n_rots_J + 1):
E[s_i - 1, s_j - 1] = ig.get_two_body_energy_for_edge(1, 2, s_i, s_j)
print(E[0, 1])Some questions regarding the code.
1: What does a task_packer do? Why is update_residue_neightbors() needed? What is a packer_graph? What is an InteractionGraphFactory? What are sotred in these objects? I was searching through the API, however my lack of programming knowlegde keeps me from understanding.
2: The above code creats ALL rotamer for my two amino acid, however i would only like to use amino acids with certain type. For example, i would only like to compute ALA, LYS and ARG at position 1. How can i do this?
3: I (think) compute the pairwise energies of rotamers, and store it at variable E. Are these the pairwise energies of the two rotamers (s_i and s_j)?
4: How can i comupte the total energy of a state? For example i would like to calculate the total enery o the system with rotamer(1) at position (1) and rotamer(2) at position 2.
Thanks a lot!
-
April 14, 2017 at 3:24 pm #12274Anonymous
First off, I might encourage you to step back and think about the main goal you want to accomplish. Most people who use Rosetta don’t ever need to muck around with the low-levels of rotamer selection and scoring. If all you want to do is find the lowest energy structure in various contexts, or even just get the score of a particular structure, there’s other higher level interfaces you could be using. Depending on your goal, you might be going about this the hard way.
That said, I’ll try my best to answer your questions on the off chance you really do need to be working at a lower level.
1: A PackerTask is an object which controls how the packer (the main rotamer optimization machinery in Rosetta) works. It specifies how extensively to sample rotamers at each position, as well as specifying if you should be designing at that position, and if so, what residue identities should be used. The Packer is really the core rotamer optimization protocol in Rosetta, and anything that deals with rotamers has been built in context of the packer – that’s why there’s all those additional objects – they keep track of various bits of data for the Packer. For example, the NeighborGraph keeps track of which residues are close to each other, to speed up two-body energy calculations. The InteractionGraph keeps track of interaction energies, so in the repeated energy evaluation of the Packer you don’t have to recompute interaction energies you’ve already calculated.
2: The default PackerTask is a design-all-positions one. The analogy is with sculpture carving. You start with a big block that can be anything, and then remove those portions which you don’t want. So if you want to restrict your design to ALA, LYS and ARG, you have to turn off design to other amino acids at that position. You can do that directly on the PackerTask object by getting ther ResidueLevelTask for the position with the nonconst_residue_task() method, and then using the restrict_absent_canonical_aas() method of the ResideLevelTask (takes a vector of bools, indexed by core.chemical.AA values). Alternatively, you can find a TaskOperation that does what you want, add that to the TaskFactory, and then when you create_packer_task() with that TaskFactory, it will be automatically applied to the PackerTask – this is probably easier for most use cases.
3: Yes, it does look like you calculate the (pairwise decomposable) two-body energy for the residue pairs with that code. The one caveat I can mention offhand is that there’s a translation between the actual residue number and the “molten residue” number, which in general aren’t going to match up. You need do the conversion for that, otherwise you may get the energies for the wrong residue pair.
4: Unless you’re trying to optimize a slow bit of code, I’d probably just suggest applying the rotamer to the Pose (e.g. with pose.replace_residue() ), and then calling the scorefunction on the entire pose. It’s much simpler and less likely to go wrong. — If you do need the optimization, then the total energy of the structure will be the sum of all the two body energies for all the residue pairs, plus the sum of all the one body energies for each residue, plus all the whole structure energies.
-
April 14, 2017 at 3:24 pm #12795Anonymous
First off, I might encourage you to step back and think about the main goal you want to accomplish. Most people who use Rosetta don’t ever need to muck around with the low-levels of rotamer selection and scoring. If all you want to do is find the lowest energy structure in various contexts, or even just get the score of a particular structure, there’s other higher level interfaces you could be using. Depending on your goal, you might be going about this the hard way.
That said, I’ll try my best to answer your questions on the off chance you really do need to be working at a lower level.
1: A PackerTask is an object which controls how the packer (the main rotamer optimization machinery in Rosetta) works. It specifies how extensively to sample rotamers at each position, as well as specifying if you should be designing at that position, and if so, what residue identities should be used. The Packer is really the core rotamer optimization protocol in Rosetta, and anything that deals with rotamers has been built in context of the packer – that’s why there’s all those additional objects – they keep track of various bits of data for the Packer. For example, the NeighborGraph keeps track of which residues are close to each other, to speed up two-body energy calculations. The InteractionGraph keeps track of interaction energies, so in the repeated energy evaluation of the Packer you don’t have to recompute interaction energies you’ve already calculated.
2: The default PackerTask is a design-all-positions one. The analogy is with sculpture carving. You start with a big block that can be anything, and then remove those portions which you don’t want. So if you want to restrict your design to ALA, LYS and ARG, you have to turn off design to other amino acids at that position. You can do that directly on the PackerTask object by getting ther ResidueLevelTask for the position with the nonconst_residue_task() method, and then using the restrict_absent_canonical_aas() method of the ResideLevelTask (takes a vector of bools, indexed by core.chemical.AA values). Alternatively, you can find a TaskOperation that does what you want, add that to the TaskFactory, and then when you create_packer_task() with that TaskFactory, it will be automatically applied to the PackerTask – this is probably easier for most use cases.
3: Yes, it does look like you calculate the (pairwise decomposable) two-body energy for the residue pairs with that code. The one caveat I can mention offhand is that there’s a translation between the actual residue number and the “molten residue” number, which in general aren’t going to match up. You need do the conversion for that, otherwise you may get the energies for the wrong residue pair.
4: Unless you’re trying to optimize a slow bit of code, I’d probably just suggest applying the rotamer to the Pose (e.g. with pose.replace_residue() ), and then calling the scorefunction on the entire pose. It’s much simpler and less likely to go wrong. — If you do need the optimization, then the total energy of the structure will be the sum of all the two body energies for all the residue pairs, plus the sum of all the one body energies for each residue, plus all the whole structure energies.
-
April 14, 2017 at 3:24 pm #13316Anonymous
First off, I might encourage you to step back and think about the main goal you want to accomplish. Most people who use Rosetta don’t ever need to muck around with the low-levels of rotamer selection and scoring. If all you want to do is find the lowest energy structure in various contexts, or even just get the score of a particular structure, there’s other higher level interfaces you could be using. Depending on your goal, you might be going about this the hard way.
That said, I’ll try my best to answer your questions on the off chance you really do need to be working at a lower level.
1: A PackerTask is an object which controls how the packer (the main rotamer optimization machinery in Rosetta) works. It specifies how extensively to sample rotamers at each position, as well as specifying if you should be designing at that position, and if so, what residue identities should be used. The Packer is really the core rotamer optimization protocol in Rosetta, and anything that deals with rotamers has been built in context of the packer – that’s why there’s all those additional objects – they keep track of various bits of data for the Packer. For example, the NeighborGraph keeps track of which residues are close to each other, to speed up two-body energy calculations. The InteractionGraph keeps track of interaction energies, so in the repeated energy evaluation of the Packer you don’t have to recompute interaction energies you’ve already calculated.
2: The default PackerTask is a design-all-positions one. The analogy is with sculpture carving. You start with a big block that can be anything, and then remove those portions which you don’t want. So if you want to restrict your design to ALA, LYS and ARG, you have to turn off design to other amino acids at that position. You can do that directly on the PackerTask object by getting ther ResidueLevelTask for the position with the nonconst_residue_task() method, and then using the restrict_absent_canonical_aas() method of the ResideLevelTask (takes a vector of bools, indexed by core.chemical.AA values). Alternatively, you can find a TaskOperation that does what you want, add that to the TaskFactory, and then when you create_packer_task() with that TaskFactory, it will be automatically applied to the PackerTask – this is probably easier for most use cases.
3: Yes, it does look like you calculate the (pairwise decomposable) two-body energy for the residue pairs with that code. The one caveat I can mention offhand is that there’s a translation between the actual residue number and the “molten residue” number, which in general aren’t going to match up. You need do the conversion for that, otherwise you may get the energies for the wrong residue pair.
4: Unless you’re trying to optimize a slow bit of code, I’d probably just suggest applying the rotamer to the Pose (e.g. with pose.replace_residue() ), and then calling the scorefunction on the entire pose. It’s much simpler and less likely to go wrong. — If you do need the optimization, then the total energy of the structure will be the sum of all the two body energies for all the residue pairs, plus the sum of all the one body energies for each residue, plus all the whole structure energies.
-
April 15, 2017 at 1:34 pm #12281Anonymous
Thanks a lot for your answer, it was really helpful! The task I wish to do is complex, it is basically my PhD project. I was testing it with ROSETTAs FlexPepDock with great success, however, I would like to skip some functions in it, and I would like to test functions that are missing from it as well. This is the reason I switched to PyRosetta. Currently I (think) I was able to code a Dead-end elimination based GMEC algorithm, however I need to test is. While doing this I stumbled upon further questions.
1: I was able to create a Pecker Task that only creates rotamers for the original AA in a certain position using task_pack.restrict_to_repacking(). However, as I see PyRosetta uses the Dunbrack 2010 library, but i see less rotamers than the PHI/PSI angles suggest. Is there a way to get all of the possible romtamers? Or do i get all of them this way?
2: I was experimenting with the code i put in my original question, and i found that the line:
scorefxn.setup_for_packing(POSE, task_pack.designing_residues(), task_pack.designing_residues())
Is not mandatory, i get the same results if i skip it. What does this do? Is this needed?
3: Currently at the Dead-end elimination I calculate the energies as follow:
Sum the one-body energies of all residues using: eval_cd_1b + eval_ci_1b
Plus the linear combination of all two body energies for all residues using: eval_cd_2b + eval_ci_2b
However this is not equal to the energy i get from the full POSE object with POSE.energy(). You mention a structural energy as well. How can I obtain it?
Thanks a lot for your help!
-
April 15, 2017 at 1:34 pm #12802Anonymous
Thanks a lot for your answer, it was really helpful! The task I wish to do is complex, it is basically my PhD project. I was testing it with ROSETTAs FlexPepDock with great success, however, I would like to skip some functions in it, and I would like to test functions that are missing from it as well. This is the reason I switched to PyRosetta. Currently I (think) I was able to code a Dead-end elimination based GMEC algorithm, however I need to test is. While doing this I stumbled upon further questions.
1: I was able to create a Pecker Task that only creates rotamers for the original AA in a certain position using task_pack.restrict_to_repacking(). However, as I see PyRosetta uses the Dunbrack 2010 library, but i see less rotamers than the PHI/PSI angles suggest. Is there a way to get all of the possible romtamers? Or do i get all of them this way?
2: I was experimenting with the code i put in my original question, and i found that the line:
scorefxn.setup_for_packing(POSE, task_pack.designing_residues(), task_pack.designing_residues())
Is not mandatory, i get the same results if i skip it. What does this do? Is this needed?
3: Currently at the Dead-end elimination I calculate the energies as follow:
Sum the one-body energies of all residues using: eval_cd_1b + eval_ci_1b
Plus the linear combination of all two body energies for all residues using: eval_cd_2b + eval_ci_2b
However this is not equal to the energy i get from the full POSE object with POSE.energy(). You mention a structural energy as well. How can I obtain it?
Thanks a lot for your help!
-
April 15, 2017 at 1:34 pm #13323Anonymous
Thanks a lot for your answer, it was really helpful! The task I wish to do is complex, it is basically my PhD project. I was testing it with ROSETTAs FlexPepDock with great success, however, I would like to skip some functions in it, and I would like to test functions that are missing from it as well. This is the reason I switched to PyRosetta. Currently I (think) I was able to code a Dead-end elimination based GMEC algorithm, however I need to test is. While doing this I stumbled upon further questions.
1: I was able to create a Pecker Task that only creates rotamers for the original AA in a certain position using task_pack.restrict_to_repacking(). However, as I see PyRosetta uses the Dunbrack 2010 library, but i see less rotamers than the PHI/PSI angles suggest. Is there a way to get all of the possible romtamers? Or do i get all of them this way?
2: I was experimenting with the code i put in my original question, and i found that the line:
scorefxn.setup_for_packing(POSE, task_pack.designing_residues(), task_pack.designing_residues())
Is not mandatory, i get the same results if i skip it. What does this do? Is this needed?
3: Currently at the Dead-end elimination I calculate the energies as follow:
Sum the one-body energies of all residues using: eval_cd_1b + eval_ci_1b
Plus the linear combination of all two body energies for all residues using: eval_cd_2b + eval_ci_2b
However this is not equal to the energy i get from the full POSE object with POSE.energy(). You mention a structural energy as well. How can I obtain it?
Thanks a lot for your help!
-
April 15, 2017 at 8:26 pm #12282Anonymous
1: Rosetta has a probablility based cutoff for which residues it uses – for speed purposes it normally skips the low probabilty rotamers. The two options you want are -dunbrack_prob_buried and -dunbrack_prob_nonburied Set both of these to 1.0 to get all the rotamers and not to filter based on probability. (Unfortunately there isn’t any Python-level control of this, only option-level.)
2: This is entirely dependent on the particulars of the score function terms you’re using. Some require a setup for packing opportunity (to set up data structures, etc.), whereas others do not. Depending on what score terms you’re using, you may or may not have issues with this if it doesn’t get called. I’d recommend keeping it to be on the safe side. (Also, it should be called as scorefxn.setup_for_packing(POSE, task_pack.repacking_residues(), task_pack.designing_residues()) — the first is repacking, not designing.)
3: Getting the whole structure energies isn’t necessarily straightforward and even then it won’t necessairily be useful in a DEE context, as you need to evaulate it for the complete set of rotamers, rather than pairwise. — Also, it’s unlikely to matter, as the default scorefunction doesn’t contain whole structure terms. (If you do want them, you’d probably have to extract the whole structure terms with ws_methods_begin()/ws_methods_end() and then call finalize_total_energy() for each method.)
Instead, your discrepancy is likely to be due to the backbone hydrogen bonds not being included in the packing energy. For speed purposes, these are ignored in the packing pair calculations. Try setting `decompose_bb_hb_into_pair_energies( True )` on the EnergyMethod option for your scorefunction, and see if that helps.
The other thing you might be missing is the long-range two body energies, which are things like the constraint and disulfide terms. To do this, I think you have to use something like the ci_lr_2b_methods_begin () and cd_lr_2b_methods_begin() methods, and then call the residue_pair_energy() on each type.
As a final note, I would highly recommend paying attention to the values for each subterm that you’re calculating. You can compare that to the pose.energies().total_energies() EnergyMap, to see which terms are giving you trouble. Knowing which terms aren’t summing appropriately can help to track down why.
-
April 15, 2017 at 8:26 pm #12803Anonymous
1: Rosetta has a probablility based cutoff for which residues it uses – for speed purposes it normally skips the low probabilty rotamers. The two options you want are -dunbrack_prob_buried and -dunbrack_prob_nonburied Set both of these to 1.0 to get all the rotamers and not to filter based on probability. (Unfortunately there isn’t any Python-level control of this, only option-level.)
2: This is entirely dependent on the particulars of the score function terms you’re using. Some require a setup for packing opportunity (to set up data structures, etc.), whereas others do not. Depending on what score terms you’re using, you may or may not have issues with this if it doesn’t get called. I’d recommend keeping it to be on the safe side. (Also, it should be called as scorefxn.setup_for_packing(POSE, task_pack.repacking_residues(), task_pack.designing_residues()) — the first is repacking, not designing.)
3: Getting the whole structure energies isn’t necessarily straightforward and even then it won’t necessairily be useful in a DEE context, as you need to evaulate it for the complete set of rotamers, rather than pairwise. — Also, it’s unlikely to matter, as the default scorefunction doesn’t contain whole structure terms. (If you do want them, you’d probably have to extract the whole structure terms with ws_methods_begin()/ws_methods_end() and then call finalize_total_energy() for each method.)
Instead, your discrepancy is likely to be due to the backbone hydrogen bonds not being included in the packing energy. For speed purposes, these are ignored in the packing pair calculations. Try setting `decompose_bb_hb_into_pair_energies( True )` on the EnergyMethod option for your scorefunction, and see if that helps.
The other thing you might be missing is the long-range two body energies, which are things like the constraint and disulfide terms. To do this, I think you have to use something like the ci_lr_2b_methods_begin () and cd_lr_2b_methods_begin() methods, and then call the residue_pair_energy() on each type.
As a final note, I would highly recommend paying attention to the values for each subterm that you’re calculating. You can compare that to the pose.energies().total_energies() EnergyMap, to see which terms are giving you trouble. Knowing which terms aren’t summing appropriately can help to track down why.
-
April 15, 2017 at 8:26 pm #13324Anonymous
1: Rosetta has a probablility based cutoff for which residues it uses – for speed purposes it normally skips the low probabilty rotamers. The two options you want are -dunbrack_prob_buried and -dunbrack_prob_nonburied Set both of these to 1.0 to get all the rotamers and not to filter based on probability. (Unfortunately there isn’t any Python-level control of this, only option-level.)
2: This is entirely dependent on the particulars of the score function terms you’re using. Some require a setup for packing opportunity (to set up data structures, etc.), whereas others do not. Depending on what score terms you’re using, you may or may not have issues with this if it doesn’t get called. I’d recommend keeping it to be on the safe side. (Also, it should be called as scorefxn.setup_for_packing(POSE, task_pack.repacking_residues(), task_pack.designing_residues()) — the first is repacking, not designing.)
3: Getting the whole structure energies isn’t necessarily straightforward and even then it won’t necessairily be useful in a DEE context, as you need to evaulate it for the complete set of rotamers, rather than pairwise. — Also, it’s unlikely to matter, as the default scorefunction doesn’t contain whole structure terms. (If you do want them, you’d probably have to extract the whole structure terms with ws_methods_begin()/ws_methods_end() and then call finalize_total_energy() for each method.)
Instead, your discrepancy is likely to be due to the backbone hydrogen bonds not being included in the packing energy. For speed purposes, these are ignored in the packing pair calculations. Try setting `decompose_bb_hb_into_pair_energies( True )` on the EnergyMethod option for your scorefunction, and see if that helps.
The other thing you might be missing is the long-range two body energies, which are things like the constraint and disulfide terms. To do this, I think you have to use something like the ci_lr_2b_methods_begin () and cd_lr_2b_methods_begin() methods, and then call the residue_pair_energy() on each type.
As a final note, I would highly recommend paying attention to the values for each subterm that you’re calculating. You can compare that to the pose.energies().total_energies() EnergyMap, to see which terms are giving you trouble. Knowing which terms aren’t summing appropriately can help to track down why.
-
April 16, 2017 at 8:09 am #12284Anonymous
Thanks again for the response!
How and where can i apply these option? (-dunbrack_prob_buried and -dunbrack_prob_nonburied)
-
April 16, 2017 at 8:09 am #12805Anonymous
Thanks again for the response!
How and where can i apply these option? (-dunbrack_prob_buried and -dunbrack_prob_nonburied)
-
April 16, 2017 at 8:09 am #13326Anonymous
Thanks again for the response!
How and where can i apply these option? (-dunbrack_prob_buried and -dunbrack_prob_nonburied)
-
April 18, 2017 at 3:53 pm #12289Anonymous
You’d pass them to rosetta.init() – FAQ question about it.
-
April 18, 2017 at 3:53 pm #12810Anonymous
You’d pass them to rosetta.init() – FAQ question about it.
-
April 18, 2017 at 3:53 pm #13331Anonymous
You’d pass them to rosetta.init() – FAQ question about it.
-
April 18, 2017 at 5:14 pm #12290Anonymous
Thanks a lot! I was able to sort out the energy problem of mine, and i started a new question about them here, as i thought they dont belong in this question, I ll mark this as solved, thankls a lot!
-
April 18, 2017 at 5:14 pm #12811Anonymous
Thanks a lot! I was able to sort out the energy problem of mine, and i started a new question about them here, as i thought they dont belong in this question, I ll mark this as solved, thankls a lot!
-
April 18, 2017 at 5:14 pm #13332Anonymous
Thanks a lot! I was able to sort out the energy problem of mine, and i started a new question about them here, as i thought they dont belong in this question, I ll mark this as solved, thankls a lot!
-
-
AuthorPosts
- You must be logged in to reply to this topic.