Cluster gdtmm vs. rmsd – Rosetta Commons

This topic has 2 replies, 3 voices, and was last updated 13 years, 7 months ago by Anonymous.

- April 19, 2012 at 3:19 pm #1242
  Anonymous
  Hi there,
  
  I’ve been trying to find the definition of the cluster application’s -gdtmm flag, and why we might use it over clustering by rmsd (no flag required).
  All the examples I’ve run across use the -gdtmm flag, but again, I’m just not sure how exactly this is clustering.
  
  Then, following that, when the flag -cluster:sort_groups_by_energy is used, how does it sort them? Does it label the first cluster as the one with the lowest average energy for all the structures in the cluster? It almost seems at odds with clustering by RMSD.
  
  Any ideas?
  
  Cheers,
  Brett
- June 24, 2012 at 9:24 pm #7303
  Anonymous
  GDTMM seems to be dark magic used during CASP but nobody seems to know what it is, how it works, or who uses it…
- June 25, 2012 at 7:17 pm #7309
  Anonymous
  After asking some people in the know, here’s my understanding of GDTMM.
  
  It’s a variant of the GDT metric (global distance test) used by the CASP structure prediction contest (see Zemla A. “LGA: A method for finding 3D similarities in protein structures.” (2003) NAR 31(13):3370-4. http://nar.oxfordjournals.org/content/31/13/3370.long ). The basic algorithm is to iteratively align two structures under a set of different cutoffs, and then tally the fraction of residues which fall within the cutoffs. So you end up with scores in the range of 0.0 to 1.0, with 1.0 being a perfect match.
  
  The benefit of this metric over RMSD is that RMSD is very sensitive to outliers. For example, if you have an almost perfect match, but have an unstructured tail that’s 20 Ang different between the two models, that 20^2 difference is going to dominate the RMSD, even though the differences may be limited to a small fraction of the structure.
  
  For CASP, there are typically two variants used: GDT-TS (“total score”), which is the standard metric (I think the cutoffs are 1, 2, 4 and 8 Ang), and GDT-HA (“high accuracy” – cutoffs are something like 0.5, 1, 2, 4).
  
  GDTMM is a Baker-lab specific metric, where the MAMMOTH alignment algorithm (MM = MAMMOTH) is used for the superposition (it should match GDT-TS in all other respects, though). You can get *slight* differences in GDT metrics based on which alignment algorithm you use.
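
To make the RMSD vs. GDT comparison above concrete, here is a minimal Python sketch. It is not Rosetta's or LGA's implementation: it assumes the two models are already superimposed and that you have one distance per residue (in Ang), whereas the real GDT procedure searches for the best superposition at each cutoff. The function names `rmsd` and `gdt` are made up for illustration, and the cutoffs are the ones quoted in the post above.

```python
import math

# Cutoffs in Angstroms, as quoted in the thread (assumed, not checked against CASP).
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)
GDT_HA_CUTOFFS = (0.5, 1.0, 2.0, 4.0)


def rmsd(distances):
    """Root-mean-square deviation over per-residue distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def gdt(distances, cutoffs=GDT_TS_CUTOFFS):
    """Simplified GDT: average, over the cutoffs, of the fraction of
    residues whose distance is within that cutoff.

    Assumes a single fixed superposition; real GDT re-aligns per cutoff.
    """
    n = len(distances)
    fractions = [sum(1 for d in distances if d <= c) / n for c in cutoffs]
    return sum(fractions) / len(cutoffs)


# Hypothetical model pair: 90 well-matched residues (0.5 Ang apart)
# plus a 10-residue floppy tail that is 20 Ang off.
distances = [0.5] * 90 + [20.0] * 10

print("RMSD  :", round(rmsd(distances), 2))
print("GDT-TS:", round(gdt(distances, GDT_TS_CUTOFFS), 2))
print("GDT-HA:", round(gdt(distances, GDT_HA_CUTOFFS), 2))
```

With these made-up numbers the RMSD comes out above 6 Ang, even though 90% of the residues agree to within 0.5 Ang, while both GDT variants stay at 0.9. That is the outlier sensitivity described above: the tail dominates the RMSD but only costs GDT the fraction of residues it covers.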