Warning when running cluster program
- This topic has 14 replies, 2 voices, and was last updated 12 years, 4 months ago by attesor.
-
June 15, 2012 at 9:39 am #1310 Anonymous
I have a protein of 106 residues (numbered from 1 to 106). I ran relax and then clustered the resulting 200 poses:
cluster.linuxgccrelease -in:file:silent mysilentfile
But I got tons of warnings like this:
core.scoring.rms_util: WARNING: In CA_rmsd, residue range 1 to 111 requested but only 106 protein CA atoms found.
Why does cluster think I have 111 residues? Can I still trust the cluster result despite all the warnings? Does this have anything to do with the “jump” concept?
-
June 15, 2012 at 2:05 pm #7260 Anonymous
Do you have any ligands present?
-
June 15, 2012 at 4:06 pm #7261 attesor
No, no ligands.
The warnings disappeared after I set the native structure:
cluster.linuxgccrelease -in:file:silent mysilentfile -in:file:fullatom -in:file:native mynative.pdb
-
June 26, 2012 at 1:36 pm #7323 attesor
Also, it seems the energies calculated by cluster are different from those calculated by score_jd2.
I have docking decoys in silent files. I calculated the energy of each decoy using score_jd2. Then I clustered the decoys using cluster. cluster outputs a list of scores like:
protocols.cluster: Adding struc: -103.753
But they are totally different from the score_jd2 values: not their order, but the values themselves. (I am using Rosetta v3.4.)
-
July 2, 2012 at 1:41 pm #7345 attesor
Yes, you are totally right. I noticed centroid atoms in the .pdb files after clustering. I specified -in:file:fullatom when doing the clustering. This did not help, though.
The missing residues that the cluster program complains about are the virtual residues used in the FoldTree. Using -cluster:exclude_res to exclude them makes the cluster program stop complaining. (Is it normal that cluster does not recognize the virtual residues generated by Rosetta itself?)
-
June 26, 2012 at 1:25 am #7312 Anonymous
Does “mynative” have the same number of residues as the poses in the silent file? The error is that you have 111 CAs in one (c-alpha, not calcium), and 106 in the other.
-
June 26, 2012 at 11:41 am #7317 attesor
Yes, both the native decoy and the Rosetta output decoys have 106 CA atoms in them. But they are dimers (chains A & B). I have no idea where the number 111 comes from. 111-106=5, which is not even an even number (since it is a homodimer, I would expect the difference to be a multiple of 2).
I also worked on another protein, a trimer. When I cluster it, I get the same warning, with 67 instead of 54. I checked the input and output. Both have only 54 residues and 54 CA atoms.
-
June 26, 2012 at 1:41 pm #7325 Anonymous
I can’t find an answer for the mismatch in the number of CA atoms.
If the score difference is small (a few units), it’s due to imprecision of structure storage on disk, especially if you are using PDBs. A PDB stores coordinates to three decimal places, but the Rosetta scorefunction is sensitive to many more decimal places. This means a PDB will never rescore the same as the in-code pose from which the PDB was produced. The problem is ameliorated, if not eliminated, for binary-style silent files.
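To get a feel for this effect, here is a toy sketch in plain Python (not Rosetta code). It evaluates a hypothetical Lennard-Jones-like pair energy on full-precision coordinates and again after rounding every coordinate to three decimal places, as a PDB file would store them. The function names, the energy form, and the random test coordinates are all assumptions made for illustration; the size of the difference in a real Rosetta score will not match this toy.

```python
import math
import random

def pair_energy(a, b, sigma=3.5, eps=0.1):
    """Toy 12-6 Lennard-Jones energy between two points (not a Rosetta score term)."""
    r = math.dist(a, b)
    x = (sigma / r) ** 6
    return 4.0 * eps * (x * x - x)

def total_energy(coords):
    """Sum the toy pair energy over all unique pairs of points."""
    return sum(
        pair_energy(coords[i], coords[j])
        for i in range(len(coords))
        for j in range(i + 1, len(coords))
    )

def round_like_pdb(coords):
    """Round every coordinate to three decimals, as a PDB ATOM record stores it."""
    return [tuple(round(c, 3) for c in xyz) for xyz in coords]

random.seed(0)
# 27 points on a jittered 4 A grid, so that no two points sit on top of each other
coords = [
    tuple(4.0 * g + random.uniform(-0.5, 0.5) for g in (i, j, k))
    for i in range(3) for j in range(3) for k in range(3)
]

full = total_energy(coords)
rounded = total_energy(round_like_pdb(coords))
print(f"full precision: {full:.6f}")
print(f"after rounding: {rounded:.6f}")
print(f"difference:     {rounded - full:.2e}")
```

The two totals differ slightly even though no coordinate moved by more than 0.0005, which is the kind of small rescoring drift described above, as opposed to a difference of hundreds of units.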
-
June 26, 2012 at 1:54 pm #7326 attesor
Yes, I am aware of this precision problem from previous threads. But the difference is huge:
score_jd2 says:
SCORE: 436.487 -380.814 669.460 182.095 0.763 0.000 0.000 -66.162 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -2.372 2.081 24.410 3.725 3.300 0.000 0.000 -13.400 0.000 mystruc_0091_0001
while cluster says:
protocols.cluster: mystruc_0091 -51.781
-
June 26, 2012 at 2:12 pm #7328 attesor
Can you give some hint on the concept of “jump residue”?
-
June 26, 2012 at 6:24 pm #7331 Anonymous
You’re right, that’s not a precision thing. Could it be a fullatom/centroid thing? Is the previous code emitting centroid or fullatom PDBs?
-
June 26, 2012 at 6:28 pm #7332 Anonymous
This paper (http://www.ncbi.nlm.nih.gov/pubmed/17825317) describes the “fold tree” in Rosetta. Basically, Rosetta uses internal coordinates wherever possible, and only converts to XYZ coordinates for some score functions and to print results. Atom positions are calculated by iterating along the network of internal coordinates. The AtomTree and FoldTree specify which atoms are connected to which other atoms, and by which degrees of freedom. This lets Rosetta know things like how to move the end of a lysine when the base chi angles rotate. The tree MUST be a directed acyclic graph that contains all atoms, so that all are accounted for and singly connected.
Most connections in these Trees represent chemical bonds. Jumps come in where connections are needed but can’t be represented by chemistry. For example, if you have two chains, there is a Jump between the first and second chains representing how the second chain is positioned with respect to the first. Jumps do other things too, especially in loop modeling, but non-chemical internal-coordinate connections between independent molecules in the pose is the common case.
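As a rough illustration of the description above, here is a toy sketch in plain Python (not the Rosetta or PyRosetta API). It writes a residue-level fold tree for a hypothetical two-chain pose as a list of (start, stop, label) edges, with -1 marking a chemically bonded stretch and a positive number marking a jump, and then checks that the edges reach every residue with no cycles. The chain boundaries (1-53 and 54-106) and the helper name `is_valid_tree` are assumptions made for the example.

```python
PEPTIDE = -1  # label for a chemically connected stretch of residues

def is_valid_tree(edges, n_residues):
    """Return True if the edges connect residues 1..n_residues with no cycles."""
    parent = list(range(n_residues + 1))  # union-find; index 0 is unused

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False  # a and b were already connected: this edge closes a cycle
        parent[ra] = rb
        return True

    for start, stop, label in edges:
        if label == PEPTIDE:
            # a peptide edge links each residue to its neighbour along the chain
            step = 1 if stop >= start else -1
            for i in range(start, stop, step):
                if not union(i, i + step):
                    return False
        elif not union(start, stop):
            # a jump links two residues that share no chemical bond
            return False

    roots = {find(i) for i in range(1, n_residues + 1)}
    return len(roots) == 1  # everything must hang off a single tree

# Hypothetical homodimer: chain A = residues 1-53, chain B = residues 54-106
dimer = [
    (1, 53, PEPTIDE),    # chain A backbone
    (1, 54, 1),          # jump 1: positions chain B relative to chain A
    (54, 106, PEPTIDE),  # chain B backbone
]

print(is_valid_tree(dimer, 106))                     # valid tree
print(is_valid_tree([dimer[0], dimer[2]], 106))      # no jump: chain B is disconnected
print(is_valid_tree(dimer + [(53, 106, 2)], 106))    # second jump closes a cycle
```

Without the jump, the second chain has nothing to define its position relative to the first; with one jump too many, the connections form a loop and the positions are no longer defined by a single path from the root.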
-
June 27, 2012 at 8:52 am #7339 attesor
Yes, you are totally right. I noticed centroid atoms in the .pdb files after clustering. I specified -in:file:fullatom when doing the clustering. This did not help, though.
-
July 2, 2012 at 2:27 pm #7369 Anonymous
(Is it normal that cluster does not recognize the virtual residues generated by Rosetta itself?)
Probably – this is exactly the sort of thing developers ignore, because they know what the warning means and know it’s irrelevant, so they ignore it. I’ve never seen it, but I’ve never used cluster…
-
July 2, 2012 at 3:25 pm #7370 attesor
I read the source code and played with my poses using PyRosetta. I realized that the cluster program uses CA_rmsd(), which checks whether the number of residues in the requested range equals the number of CA atoms in the protein. If not, it prints the WARNING. Virtual residues do not have CA atoms.
It seems CA_rmsd carries on and calculates the RMSD using all the CA atoms it finds. For different poses of the same protein, the WARNING can be safely ignored (unofficial judgement!).
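The check described here can be mimicked with a small toy model in plain Python. This is not the Rosetta source: the `Residue` class, the function name `count_ca_with_warning`, and the assumption of five trailing virtual residues are made up so that the numbers match the 106 versus 111 seen in this thread.

```python
from dataclasses import dataclass

@dataclass
class Residue:
    name: str
    has_ca: bool  # virtual residues carry no CA atom

def count_ca_with_warning(pose, start, end):
    """Count CA atoms in residues start..end (1-based, inclusive).

    Returns (n_ca, warning), where warning is None if every residue in the
    requested range contributed a CA atom.
    """
    n_ca = sum(1 for res in pose[start - 1:end] if res.has_ca)
    warning = None
    if n_ca != end - start + 1:
        warning = (
            f"WARNING: In CA_rmsd, residue range {start} to {end} requested "
            f"but only {n_ca} protein CA atoms found."
        )
    return n_ca, warning

# Hypothetical pose: 106 protein residues followed by 5 virtual residues
pose = [Residue("ALA", True)] * 106 + [Residue("VRT", False)] * 5

n_ca, warning = count_ca_with_warning(pose, 1, len(pose))
print(n_ca)
print(warning)

# Restricting the range to the real protein residues gives no warning
n_ca_protein, no_warning = count_ca_with_warning(pose, 1, 106)
print(n_ca_protein, no_warning)
```

In this toy model the CA count is the same in both calls, so an RMSD computed over the CA atoms alone would not change; only the warning does.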