RMSD values in docking2 and symmetric_docking runs

    • #2251

        I am curious how ROSIE docking2 and symmetric_docking runs determine RMSD values.

        For docking2 runs, are RMSD values determined between
        the outputtrigger_00001.docknative.pdb file
        and each outputtrigger_00001.dockproteins_*.pdb file?
        If so, how is the native.pdb file made?
        Is it a weighted average of all the proteins_*.pdb files
        with more weight given to files with better scores or I_sc values?

        For symmetric_docking runs, are RMSD values determined between
        the outputtrigger-00000.make_rms_refrms_ref.pdb file
        and each outputtrigger-00008.extract_pdbs*protein_*.pdb file?
        If so, how is the rms_ref.pdb file made?
        Is it a weighted average of all the *protein_*.pdb files
        with more weight given to files with better scores or I_sc values?

        Where can I find more details about these things?


      • #11140

          For symmetric_docking runs, http://rosie.rosettacommons.org/symmetric_docking/documentation says the following:

          “The server also returns a plot of the energies of the 400 lowest energy models. Each point on this plot represents a structure created by the server. The y-axis is the energy of the structure. The x-axis is a distance measure to a reference complex structure (Ca rmsd in Angstrom). The reference complex is the lowest energy model predicted by Rosetta. This is only true if the lowest energy model is found among the models making up the top 5 clusters. If not the reference model is selected to be the cluster center of the largest cluster. The cluster centers for the 5 largest clusters are shown in the plot as well as red points.”
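
The reference-selection rule in the quoted documentation can be sketched roughly as follows. This only illustrates the rule as worded; it is not ROSIE's actual code. The `Model` class, the `pick_reference` function, and the cluster bookkeeping are all hypothetical, and the sketch assumes that "top 5 clusters" means the five largest clusters.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Model:
    name: str      # hypothetical model identifier
    energy: float  # total score; lower is better
    cluster: int   # hypothetical id of the cluster this model belongs to


def pick_reference(models, cluster_centers, top_n=5):
    """Return the name of the model used as the RMSD reference.

    cluster_centers maps a cluster id to the name of its center model.
    """
    sizes = Counter(m.cluster for m in models)
    top_clusters = [cid for cid, _ in sizes.most_common(top_n)]
    lowest = min(models, key=lambda m: m.energy)
    # Use the lowest-energy model if it sits in one of the top clusters...
    if lowest.cluster in top_clusters:
        return lowest.name
    # ...otherwise fall back to the center of the largest cluster.
    return cluster_centers[top_clusters[0]]


# Made-up models: cluster 0 has three members, cluster 1 two, cluster 2 one.
models = [
    Model("m1", -5.0, 0),
    Model("m2", -4.0, 0),
    Model("m3", -4.5, 0),
    Model("m4", -6.0, 1),
    Model("m5", -5.5, 1),
    Model("m6", -9.0, 2),
]
centers = {0: "m1", 1: "m4", 2: "m6"}
```

With the default `top_n=5`, all three clusters count as "top" clusters, so the lowest-energy model `m6` is the reference. With `top_n=2`, `m6` falls outside the top clusters and the center of the largest cluster, `m1`, is used instead. Either way the reference is one of the output models, which is why its own point lands at RMSD = 0.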

        • #11146

            From what I can tell, it looks like the RMSD in the Docking2 application is calculated to the input structure. So the “native.pdb” should be identical to the input pdb.

          • #11162

              Thanks for your response, rmoretti, but it leaves me puzzled.

              Every ROSIE symmetric_docking run I have done so far has RMSD=0.0 clearly labeled on its Score/RMSD plot.
              This makes sense because each run chooses one of its models or decoys to be the run’s reference for RMSD calculations.

              Meanwhile, none of the ROSIE docking2 runs I have done so far have RMSD=0 labeled on their I_sc/RMSD or Score/RMSD plots.
              I would think that if the proteins.pdb file used as input to a docking2 run also served as the run’s reference for RMSD calculations,
              at least one of the run’s models or decoys would give RMSD=0. I have even done runs that used the best-scoring or best-I_sc
              model/decoy from a previous run as their proteins.pdb file. You would think that at least one of these runs would have given RMSD=0.

              Almost all of my docking2 runs so far have used the local_docking protocol, but I have just started doing docking2 runs
              with the docking_local_refine protocol. The first of these gave a smallest RMSD value of 0.036. I will let you know if any
              of these docking_local_refine runs gives RMSD=0. I think they should have a better chance than local_docking runs of
              giving RMSD=0.

              In my mind, the question remains, what determines the RMSD values in docking2 runs?

            • #11189

                No, in general you wouldn’t expect an output at exactly 0 Å, as any sort of movement would cause a non-zero RMSD. The only practical way you’d get a zero RMSD is if there is absolutely no movement from the reference structure. In the symmetric_docking case, you get a zero RMSD value because one of the output structures is picked as the reference, so when you compare it to itself, you get a zero RMSD. (Note that all other structures have values which are distinctly not zero.) In Docking2, the reference is not an output structure but the input. Since the reference structure isn’t in the output set, you don’t get a zero RMSD value in the output.
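
To make this concrete, here is a minimal sketch of a plain coordinate RMSD. It is an assumption-laden illustration: the `rmsd` function is hypothetical, the coordinates are made up, no superposition is performed, and the way Rosetta itself selects atoms and aligns structures for its docking RMSD may differ.

```python
import math


def rmsd(coords_a, coords_b):
    """Plain coordinate RMSD between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up C-alpha coordinates in Angstrom, for illustration only.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]

# A "model" in which only the last atom moved, by 0.6 Angstrom along y.
moved = reference[:-1] + [(7.6, 0.6, 0.0)]

self_rmsd = rmsd(reference, reference)  # reference compared with itself
moved_rmsd = rmsd(reference, moved)     # reference compared with a model
```

Comparing the reference with itself gives exactly zero, which is the symmetric_docking situation. Any model that differs from the reference, even in a single atom, gives a value above zero, which is the Docking2 situation when the reference is the input rather than one of the outputs.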

                It makes sense that docking_local_refine would give smaller RMSDs, because it applies much smaller perturbations. Smaller perturbations mean that the structure stays closer to the starting structure, and the local space is more densely sampled. The more perturbation you use, the further afield you’re likely to go.

                One thing to keep in mind is that “structural space” is vast, and the Rosetta energy function is rugged. You shouldn’t expect to perfectly recapitulate a structure, even if you feed it back into the same protocol. Imagine a marble in a bowl. You jiggle the bowl lightly to “settle” the marble, and it ends up at the low part of the bowl. But if you repeat the process, you wouldn’t expect the marble to come to rest at *exactly* the same spot, even if it started at the bottom of the bowl. There’s just too much variation in the process. The same goes for Rosetta runs. There’s always some random perturbation involved, so you’re not necessarily going to get exactly the starting structure back, even if you use an output structure from a previous run.
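
The marble analogy can be tried out with a toy Metropolis Monte Carlo search on a one-dimensional "bowl" with small ripples. This is only a sketch: the `energy` function, the `settle` function, and all parameter values are invented for illustration and have nothing to do with Rosetta's real score function or docking protocol.

```python
import math
import random


def energy(x):
    """Toy rugged bowl: a parabola with small ripples (not a Rosetta score)."""
    return x * x + 0.1 * math.sin(20.0 * x)


def settle(start, seed, steps=500, step_size=0.1, kT=0.1):
    """Run Metropolis Monte Carlo from `start` and return the final position."""
    rng = random.Random(seed)
    x = start
    e = energy(x)
    for _ in range(steps):
        trial = x + rng.gauss(0.0, step_size)
        e_trial = energy(trial)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-dE / kT).
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / kT):
            x, e = trial, e_trial
    return x


# Five "runs" that all start from the same point near the bottom of the bowl.
finals = [settle(0.0, seed) for seed in range(5)]
```

Each run stays in the bowl but comes to rest at a slightly different position, because each one follows a different random trajectory. The distance of each final position from the start plays the role of a small, non-zero RMSD.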

                You shouldn’t be expecting a “perfect minimum” from Rosetta runs. That’s just not how Rosetta approaches the modeling problem. Instead, Rosetta aims to cover the search space and give you a number of “good enough” structures. It’s important to remember that even high-resolution X-ray crystallography structures have some error in atom location. On top of that, protein structures aren’t static, and atoms are always shaking around.
