Warning when running cluster program

This topic has 14 replies, 2 voices, and was last updated 12 years ago by attesor.

- June 15, 2012 at 9:39 am #1310
  Anonymous
  I have a protein of 106 residues (numbered from 1 to 106). I ran relax and clustered the resulting 200 poses:
  
  cluster.linuxgccrelease -in:file:silent mysilentfile
  
  But I got a large number of warnings like this:
  
  core.scoring.rms_util: WARNING: In CA_rmsd, residue range 1 to 111 requested but only 106 protein CA atoms found.
  
  Why does cluster think I have 111 residues? Can I still trust the clustering result despite all the warnings? Does this have anything to do with the “jump” concept?
- June 15, 2012 at 2:05 pm #7260
  Anonymous
  Do you have any ligands present?
- June 15, 2012 at 4:06 pm #7261
  attesor
  No, no ligands.
  The warnings disappeared after I set the native structure:
  
  cluster.linuxgccrelease -in:file:silent mysilentfile -in:file:fullatom -in:file:native mynative.pdb
- June 26, 2012 at 1:36 pm #7323
  attesor
  Also, the energies calculated by cluster seem to differ from those calculated by score_jd2.
  I have docking decoys in silent files. I calculated the energy of each decoy using score_jd2, then clustered the decoys using cluster. cluster outputs a list of scores like:
  
  protocols.cluster: Adding struc: -103.753
  
  But they are totally different from the score_jd2 values. It is not their order that differs, but the values themselves. (I am using Rosetta v3.4.)
- July 2, 2012 at 1:41 pm #7345
  attesor
  Yes, you are totally right. I noticed centroid atoms in the .pdb files after clustering. I specified -in:file:fullatom when clustering, but this did not help.
  
  The missing residues that the cluster program complains about are the virtual residues used in the FoldTree. Excluding them with -cluster:exclude_res makes the cluster program stop complaining. (Is it normal that cluster does not recognize the virtual residues generated by Rosetta itself?)
- June 26, 2012 at 1:25 am #7312
  Anonymous
  Does “mynative” have the same number of residues as the poses in the silent file? The error is that you have 111 CAs in one (c-alpha, not calcium), and 106 in the other.
- June 26, 2012 at 11:41 am #7317
  attesor
  Yes, both the native decoy and the Rosetta output decoys have 106 CA atoms. But they are dimers (chains A and B). I have no idea where the number 111 comes from. 111-106=5, which is not even an even number (since it is a homodimer, I would expect the difference to be a multiple of 2).
  
  I also worked on another protein, a trimer. When I cluster it, I get the same warning, with 67 instead of 54. I checked the input and output: both have only 54 residues and 54 CA atoms.
- June 26, 2012 at 1:41 pm #7325
  Anonymous
  I can’t find an explanation for the mismatch in the number of CA atoms.
  
  If the score difference is small (a few units), it’s due to imprecision of structure storage on disk, especially if you are using PDBs. A PDB file stores coordinates to three decimal places, but the Rosetta scorefunction is sensitive to many more decimal places. This means a PDB will never rescore exactly the same as the in-code pose from which it was produced. The problem is ameliorated, if not eliminated, for binary-style silent files.
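
A minimal sketch of the precision point above. It assumes a toy Lennard-Jones-style pair energy (the function `lj_energy`, its parameters, and the coordinates are all made up for illustration and are not Rosetta's scorefunction). Rounding coordinates to three decimals, as a PDB round trip would, shifts the energy slightly; how large the shift is in practice depends on the scorefunction and the structure.

```python
import math

def lj_energy(coords, sigma=3.5, epsilon=0.2):
    """Toy 12-6 Lennard-Jones energy over all atom pairs (hypothetical, not Rosetta)."""
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            sr6 = (sigma / r) ** 6
            total += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return total

def pdb_round_trip(coords):
    """Mimic writing to PDB and reading back: each coordinate keeps three decimals."""
    return [tuple(float(f"{x:.3f}") for x in atom) for atom in coords]

# Made-up coordinates with more precision than a PDB file can hold.
coords = [
    (0.123456789, 0.0, 0.0),
    (3.401234567, 0.5, 0.25),
    (1.700049999, 3.000987654, 0.1),
]

in_memory = lj_energy(coords)
from_disk = lj_energy(pdb_round_trip(coords))
difference = from_disk - in_memory

print(f"in-memory : {in_memory:.9f}")
print(f"from PDB  : {from_disk:.9f}")
print(f"difference: {difference:.2e}")
```

The difference here is tiny because the toy system has only three atoms; the point is only that it is nonzero.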
- June 26, 2012 at 1:54 pm #7326
  attesor
  Yes, I am aware of this precision problem from previous threads. But the difference is huge:
  
  score_jd2 says:
  
  SCORE: 436.487 -380.814 669.460 182.095 0.763 0.000 0.000 -66.162 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -2.372 2.081 24.410 3.725 3.300 0.000 0.000 -13.400 0.000 mystruc_0091_0001
  
  while cluster says:
  protocols.cluster: mystruc_0091 -51.781
- June 26, 2012 at 2:12 pm #7328
  attesor
  Can you give me some hints on the concept of a “jump residue”?
- June 26, 2012 at 6:24 pm #7331
  Anonymous
  You’re right, that’s not a precision thing. Could it be a fullatom/centroid thing? Is the previous code emitting centroid or fullatom PDBs?
- June 26, 2012 at 6:28 pm #7332
  Anonymous
  This paper
  http://www.ncbi.nlm.nih.gov/pubmed/17825317
  
  describes the “fold tree” in Rosetta. Basically, Rosetta uses internal coordinates wherever possible, and only converts to XYZ coordinates for some score functions and to print results. Atom positions are calculated by iterating along the network of internal coordinates. The AtomTree and FoldTree specify which atoms are connected to which other atoms, and by which degrees of freedom. This lets Rosetta know things like how to move the end of a lysine when the base chi angles rotate. The tree MUST be a directed acyclic graph that contains all atoms, so that all are accounted for and singly connected. Most connections in these Trees represent chemical bonds. Jumps come in where connections are needed but can’t be represented by chemistry. For example, if you have two chains, there is a Jump between the first and second chains representing how the second chain is positioned with respect to the first. Jumps do other things too, especially in loop modeling, but non-chemical internal-coordinate connections between independent molecules in the pose are the common case.
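
A rough sketch of the two-chain case described above, at residue level rather than atom level. It is a toy model, not the Rosetta FoldTree API: the edge tuples `(start, stop, label)`, the helper names, and the choice of a 2 x 53-residue homodimer are assumptions for illustration. A label of -1 marks a peptide (chemical) edge and a positive label marks a jump.

```python
from collections import defaultdict

PEPTIDE = -1  # label for a chemical (peptide) edge; positive labels are jump numbers

def expand(edges):
    """Turn (start, stop, label) edges into residue-to-residue parent/child links."""
    links = []
    for start, stop, label in edges:
        if label == PEPTIDE:
            step = 1 if stop > start else -1
            links += [(i, i + step) for i in range(start, stop, step)]
        else:
            links.append((start, stop))  # a jump connects two residues directly
    return links

def is_valid_tree(edges, n_residues):
    """True if every residue 1..n_residues is reached exactly once from the root."""
    links = expand(edges)
    if len(links) != n_residues - 1:  # a tree on n nodes has n - 1 edges
        return False
    children = defaultdict(list)
    for parent, child in links:
        children[parent].append(child)
    root = edges[0][0]
    seen, stack = {root}, [root]
    while stack:
        for child in children[stack.pop()]:
            if child in seen:  # reached twice: not singly connected
                return False
            seen.add(child)
            stack.append(child)
    return seen == set(range(1, n_residues + 1))

# Hypothetical homodimer: chain A = residues 1-53, chain B = residues 54-106.
dimer = [(1, 53, PEPTIDE), (1, 54, 1), (54, 106, PEPTIDE)]
no_jump = [(1, 53, PEPTIDE), (54, 106, PEPTIDE)]
extra_jump = dimer + [(53, 106, 2)]

print(is_valid_tree(dimer, 106))       # jump 1 positions chain B relative to chain A
print(is_valid_tree(no_jump, 106))     # chain B is disconnected without the jump
print(is_valid_tree(extra_jump, 106))  # a second connection would create a cycle
```

Without the jump, chain B has no parent and cannot be placed; with two connections between the chains, the graph is no longer a tree.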
- June 27, 2012 at 8:52 am #7339
  attesor
  Yes, you are totally right. I noticed centroid atoms in the .pdb files after clustering. I specified -in:file:fullatom when clustering, but this did not help.
- July 2, 2012 at 2:27 pm #7369
  Anonymous
  (Is it normal that cluster does not recognize the virtual residues generated by Rosetta itself?)
  
  Probably. This is exactly the sort of thing developers ignore, because they know what the warning means and know it’s irrelevant. I’ve never seen it, but I’ve never used cluster…
- July 2, 2012 at 3:25 pm #7370
  attesor
  I read the source code and played with my poses using PyRosetta. I realized that the cluster program uses CA_rmsd(), which checks whether the number of residues equals the number of CA atoms in the protein. If not, it emits the WARNING. Virtual residues do not have CA atoms.
  
  It seems CA_rmsd carries on with the calculation of RMSD using all CA atoms it gets. For different poses from the same protein, the WARNING can be safely ignored (unofficial judgement!).
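
The behavior described in the last post can be sketched as follows. This is a toy reimplementation in plain Python, not the Rosetta source: `toy_ca_rmsd`, `make_pose`, the dictionary-based pose, and the `"ORIG"` atom name are hypothetical, and the five virtual residues are an assumption inferred from 111 - 106 = 5. Unlike a real CA RMSD, it also skips the superposition step.

```python
import math

def toy_ca_rmsd(pose_a, pose_b):
    """CA RMSD over residues that have a CA atom; warns if some residues lack one."""
    n_res = len(pose_a)
    ca_pairs = [
        (res_a["CA"], res_b["CA"])
        for res_a, res_b in zip(pose_a, pose_b)
        if "CA" in res_a and "CA" in res_b
    ]
    if len(ca_pairs) != n_res:
        print(
            f"WARNING: residue range 1 to {n_res} requested "
            f"but only {len(ca_pairs)} protein CA atoms found."
        )
    # Carry on with whatever CA atoms were found.
    squared = sum(math.dist(p, q) ** 2 for p, q in ca_pairs)
    return math.sqrt(squared / len(ca_pairs))

def make_pose(y_shift):
    """106 protein residues with a CA atom, plus 5 virtual residues without one."""
    protein = [{"CA": (3.8 * i, y_shift, 0.0)} for i in range(106)]
    virtual = [{"ORIG": (0.0, 0.0, 0.0)} for _ in range(5)]
    return protein + virtual

rmsd = toy_ca_rmsd(make_pose(0.0), make_pose(1.0))
print(f"CA RMSD: {rmsd:.3f}")
```

Because every pose built from the same protein carries the same virtual residues, the same 106 CA atoms are compared each time, which is consistent with the view that the warning does not affect the clustering.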