- This topic has 23 replies, 4 voices, and was last updated 10 years, 9 months ago by Anonymous.
April 4, 2012 at 12:23 pm #1224Anonymous
I’ve generated 20,000 structures with the ab initio protocol in Rosetta. How can I benchmark them by plotting score vs. RMSD? I would like to use the lowest-energy model as the reference model.
thank you very much
April 4, 2012 at 4:54 pm #6901Anonymous
The scorefile output of most Rosetta runs is a tabular, whitespace-separated format that can be read into your favorite plotting program. If a protocol doesn’t give you a separate scorefile, but instead gives you just a silent file, you can get the equivalent scorefile by grepping out the SCORE: lines of the silent file. You then just need to use your favorite plotting program to generate your plots.
For example, I use R, so I’d probably do something like:
bash$ grep "SCORE:" silentfile.out > scorefile.sc
R$ t = read.table("scorefile.sc",header=T)
R$ summary(t) # Shows whether the table was read in correctly and which columns are available, as they vary a bit with protocol
R$ plot(t$rmsd, t$total_score) # Change names based on columns you want to plot
You don’t need to use R, of course; any plotting program that reads whitespace-separated tabular data will work.
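For anyone who prefers Python, the same idea can be sketched as below. This is only an illustration, not part of Rosetta: the function names are made up here, the default column names ("rmsd", "total_score") simply mirror the R example and vary with protocol, and the plotting step assumes matplotlib is installed.

```python
def parse_score_lines(lines):
    """Turn 'SCORE:' lines into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in lines:
        if not line.startswith("SCORE:"):
            continue  # skip everything that is not a score line
        fields = line.split()[1:]  # drop the leading "SCORE:" token
        if header is None:
            header = fields  # the first SCORE: line carries the column names
            continue
        if fields == header:
            continue  # concatenated files may repeat the header line
        row = {}
        for name, value in zip(header, fields):
            try:
                row[name] = float(value)
            except ValueError:
                row[name] = value  # non-numeric column, e.g. the model tag
        rows.append(row)
    return rows


def plot_score_vs_rmsd(rows, x_col="rmsd", y_col="total_score",
                       out_png="score_vs_rmsd.png"):
    """Scatter plot of y_col against x_col (column names are assumptions)."""
    import matplotlib
    matplotlib.use("Agg")  # write to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter([r[x_col] for r in rows], [r[y_col] for r in rows], s=4)
    ax.set_xlabel(x_col)
    ax.set_ylabel(y_col)
    fig.savefig(out_png, dpi=150)
    plt.close(fig)


if __name__ == "__main__":
    # "silentfile.out" is the file name used in the grep example above.
    with open("silentfile.out") as handle:
        plot_score_vs_rmsd(parse_score_lines(handle))
```

Reading the silent file directly makes the separate grep step unnecessary, since non-SCORE: lines are skipped while parsing.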
April 5, 2012 at 7:33 am #6920Anonymous
Hi guys. Thanks a lot for the kind comments.
Could anyone say more about how to use “-score_app:superimpose_to_native” and “score_jd2”? I can’t find any documentation for either of them.
thank you very much
February 19, 2013 at 11:24 am #8426Anonymous
Category: Structure prediction
I want to model 1000 structures with Rosetta, then plot energy vs. RMSD for each structure to find the low-energy ones.
Would you please guide me on how to do it? My commands:
thanks in advance,
February 20, 2013 at 6:05 pm #8438Anonymous
Not the whole structure, only the Ca RMSD of the loop.
But when I use -in:file:native I get this error:
****core.pose.util: Cannot open psipred_ss2 file tt
protocols.loops.loops_main: can not open DSSP file tt
No, I do not have any RMSD column in my output.
February 20, 2013 at 8:16 pm #8442Anonymous
This is my input, but I did not get the RMSD as one column and total_score as the first or second column.
February 21, 2013 at 9:54 am #8444Anonymous
Yes, I have a scorefile name, but Rosetta does not give me a scorefile. I use this command to run: loopmodel.default.linuxgccrelease @flag
March 9, 2013 at 2:58 pm #8469Anonymous
I have 10000 PDB files whose loops I remodeled.
Now I want to make a graph with:
x-axis: RMSD based on backbone structure
so that I can select low-RMSD, low-energy structures for further analysis.
How can I make it?
March 10, 2013 at 4:31 pm #8471Anonymous
I have a question,
When I add hydrogen atoms to my protein I get a lot of warnings like **discarding 2 atoms at position 243 in file Best match rsd_type: ALA ** …
When I remodel my system (loops), should I add hydrogen atoms to it or not?
April 4, 2012 at 4:59 pm #6903Anonymous
I just realized that you probably don’t have the rmsd value you want in the current scorefile.
To get that, you can use the scoring applications ( http://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/d2/da0/score_commands.html ). If you pass the lowest energy structure (as a PDB) to the command line flag -in:file:native, your output scorefile should contain a column representing the C-alpha rmsd to the “native” (in your case the lowest energy structure).
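Picking out the lowest-energy model from the SCORE: lines might look roughly like the Python sketch below. It is an illustration only: the function name is made up, and the column names "total_score" and "description" are assumptions that should be checked against the header of your own file. Extracting that model as a PDB is a separate step not shown here.

```python
def lowest_energy_tag(score_lines, score_column="total_score",
                      tag_column="description"):
    """Return (tag, score) of the lowest-scoring model among SCORE: lines."""
    header = None
    best_tag, best_score = None, None
    for line in score_lines:
        if not line.startswith("SCORE:"):
            continue
        fields = line.split()[1:]  # drop the "SCORE:" token
        if header is None:
            header = fields  # first SCORE: line holds the column names
            score_idx = header.index(score_column)
            tag_idx = header.index(tag_column)
            continue
        if fields == header or len(fields) != len(header):
            continue  # repeated header or truncated line
        try:
            score = float(fields[score_idx])
        except ValueError:
            continue  # skip rows whose score is not numeric
        if best_score is None or score < best_score:
            best_tag, best_score = fields[tag_idx], score
    return best_tag, best_score
```

The returned tag identifies which model to use as the reference passed to -in:file:native.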
April 4, 2012 at 5:02 pm #6904Anonymous
Does that trigger superposition or just RMSD calculation? (I’m thinking the latter; I’ve always written my own RMSD apps as needed…)
April 4, 2012 at 5:53 pm #6911Anonymous
Right, by default it looks like it does a raw rmsd calculation using current coordinates. However, there is an (apparently undocumented) -score_app:superimpose_to_native command line option which should trigger the superposition.
That’s for the regular/old scoring application. The rmsd calculation for score_jd2 is triggered in some deep internals, and I’m not quite sure how one controls that.
April 4, 2012 at 8:57 pm #6916Anonymous
April 5, 2012 at 12:43 pm #6921Anonymous
As far as I can tell, neither exists.
The latter option appears to work only with the older score app, not score_jd2.
April 5, 2012 at 4:09 pm #6926Anonymous
As Steven says, there isn’t currently any documentation for either of them.
score_jd2 is intended as a “jd2” (the unified scheme for file input, output, and job control which we’re trying to transition Rosetta applications to) version of the older “score” application. They are unfortunately not flag/flag or even feature/feature compatible. The documentation for the scoring application was written for the older, non-jd2 score.
-score_app:superimpose_to_native is a command line flag for the older score application which causes it to do superposition of structure with the “native” prior to computing the rmsd. I unfortunately don’t know if there is a score_jd2 equivalent for this.
February 19, 2013 at 5:12 pm #8429Anonymous
Which RMSD do you want? Whole-structure or loop RMSD? Did you run Rosetta already? Is there an RMSD column in the output you already have?
If you haven’t run loop modeling yet, try adding -in:file:native to your command line; it will then generate RMSDs against that native. (You can re-use the -s argument as the -native argument.) It generates backbone heavy-atom loop RMSDs if you are using KIC.
Assuming you have run your 1000 structures and do not have RMSDs handy, the simplest way to generate RMSDs is to re-score the 1000 PDBs via score_jd2:
score_jd2.linuxgccrelease -l pdblist -native native.pdb -out:file:score_only scorefile.sc -database your_database
where “pdblist” was generated via “ls *pdb > pdblist” (in other words, an endline-delimited list of all PDBs to run on).
This will probably give you an output scorefile that includes an RMSD as one column and total_score as the first or second column, which you can use as input to your plotting software of choice.
If that doesn’t work, let me know, and I’ll fiddle with flags some more until I find something that does.
If you need loop RMSDs, the easiest way to get them is probably going to be to write a script to extract only the loop residues, and use the older score executable with -native and -score_app:superimpose_to_native as Rocco suggests above.
February 20, 2013 at 6:20 pm #8439Anonymous
Can you give me your complete command line here? None of the other stuff you’ve mentioned involves psipred files.
February 20, 2013 at 6:47 pm #8440Anonymous
Getting the RMSD specifically of the loop might be a little tricky, as you would have to specify the residues over which you wish to calculate the RMSD. If you can get the loop remodeling protocol to calculate it for you as it makes the structures, that would probably be ideal (as it knows which residues it needs to operate over). Most RMSD calculations in Rosetta would be over the entire structure (because they don’t have facilities to input the loop subset designations).
Trying to calculate it after the fact may be tricky, but I think RosettaScripts (https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/RosettaScripts_Documentation.html) can permit you to do it. In RosettaScripts there’s an Rmsd filter (https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/Filters_%28RosettaScripts%29#Rmsd) which allows you to specify a subset of residues over which to calculate the RMSD. You can set up a script that defines that filter, and then add the filter to the PROTOCOLS section. If you then run RosettaScripts with that script, passing the structures whose RMSD you want as the input and the reference structure as -in:file:native, the output scorefile should contain a column which represents the RMSD value.
One complication would be superposition. I believe the filter superimposes on the specified residues, so you probably want to set superimpose=0 and make sure the input poses and the reference structures are pre-aligned how you want them to be.
February 20, 2013 at 6:52 pm #8441Anonymous
It is also possible to grep out the loop residues and run regular RMSD calculations on the truncated PDBs.
I agree with you that it’s not appropriate to superimpose loops for RMSD calculations – their endpoints are already superimposed in the Rosetta output, and it doesn’t make sense to transform the isolated loop residues without the rigid core protein context.
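As a rough illustration of the "extract the loop residues and compute RMSD without superposition" idea, here is a minimal Python sketch. It is not a Rosetta tool: the function names are made up, it reads standard fixed-column PDB ATOM records, and it assumes the model and reference are already in the same frame and share residue numbering.

```python
import math


def ca_coords(pdb_lines, first_res, last_res, chain=None):
    """Map (chain, residue number) -> CA coordinates for a residue range."""
    coords = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":
            continue  # keep only alpha carbons
        if chain is not None and line[21] != chain:
            continue
        resnum = int(line[22:26])
        if first_res <= resnum <= last_res:
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            coords[(line[21], resnum)] = xyz
    return coords


def loop_ca_rmsd(model_lines, ref_lines, first_res, last_res, chain=None):
    """CA RMSD over the loop residues, using coordinates as given
    (no superposition is performed)."""
    model = ca_coords(model_lines, first_res, last_res, chain)
    ref = ca_coords(ref_lines, first_res, last_res, chain)
    shared = sorted(set(model) & set(ref))
    if not shared:
        raise ValueError("no shared CA atoms in the requested residue range")
    total = 0.0
    for key in shared:
        total += sum((a - b) ** 2 for a, b in zip(model[key], ref[key]))
    return math.sqrt(total / len(shared))
```

A call such as loop_ca_rmsd(open("model.pdb"), open("native.pdb"), 10, 20) would give the CA RMSD for residues 10 to 20; the file names and residue range here are placeholders.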
February 20, 2013 at 10:20 pm #8443Anonymous
What does your scorefile look like? Post the first few lines (header and contents).
Also, you are using -out:file:scorefile without passing a scorefile name – this may be causing Rosetta to write to a nameless file (essentially, not writing a scorefile at all). Give an argument to that flag, like “-out:file:scorefile score.sc”. score.sc should not exist beforehand.
February 21, 2013 at 3:28 pm #8445Anonymous
What output does it give you? Do you get your output PDB? Does the end of the log output (1.log) suggest that Rosetta completed successfully? I’m having trouble duplicating the problem.
March 9, 2013 at 5:26 pm #8470Anonymous
Assuming you have the RMSDs and scores in a simple file format, either A) the Rosetta scorefile, B) grepped from the PDBs, or C) grepped from the log files, then just take your pick of plotting software: Excel, OpenOffice Calc, gnuplot, R…. It’s just a scatterplot of X versus Y.
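Once the RMSDs and scores are in a table, picking out the low-RMSD, low-energy structures is a simple filter. A minimal Python sketch, assuming the table has already been read into a list of dicts; the function name, the column names ("rmsd", "total_score") and the cutoff values are all hypothetical and should be chosen from your own scatterplot:

```python
def select_low_rmsd_low_energy(rows, max_rmsd, max_score,
                               rmsd_key="rmsd", score_key="total_score"):
    """Keep rows at or below both cutoffs, sorted from lowest score up."""
    picked = [
        row for row in rows
        if row[rmsd_key] <= max_rmsd and row[score_key] <= max_score
    ]
    return sorted(picked, key=lambda row: row[score_key])


if __name__ == "__main__":
    # Made-up values purely to show the call; not real Rosetta output.
    demo = [
        {"description": "model_a", "rmsd": 1.2, "total_score": -310.0},
        {"description": "model_b", "rmsd": 6.5, "total_score": -325.0},
        {"description": "model_c", "rmsd": 0.9, "total_score": -280.0},
    ]
    for row in select_low_rmsd_low_energy(demo, max_rmsd=2.0, max_score=-300.0):
        print(row["description"], row["rmsd"], row["total_score"])
```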
March 10, 2013 at 8:58 pm #8473Anonymous
Rosetta is…idiosyncratic when it comes to hydrogen atoms. Short version: do not worry about adding hydrogens, and you never need to add hydrogens to get Rosetta to work well.
Rosetta prefers to add hydrogens itself when it loads in a structure. It will leave existing hydrogens in place if it successfully reads them in, which is dependent on using Rosetta’s preferred hydrogen naming scheme (which may not match the current PDB scheme, although it matched at some point). The discarding atom messages are due to Rosetta ignoring the hydrogens whose names it does not like. If you want really specific hydrogen placements, I can work with you on getting the naming consistent with what Rosetta expects…but unless you have a really good reason to like your hydrogen placements, just leave them out and let Rosetta autoplace them and don’t worry about it.
March 10, 2013 at 9:18 pm #8474Anonymous
Rule of thumb – the names that Rosetta uses in its output PDBs are the names it expects in input PDBs.
That said, most protocols will likely change the hydrogen (and some heavy atom) placements anyway, so I agree with Steven that in most cases worrying about hydrogens isn’t necessary.
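If you do want to leave the hydrogens out of an input PDB and let Rosetta place them, stripping them beforehand could be sketched in Python as below. This is an illustration, not a Rosetta utility: the function names are made up, and the fallback on atom names (used when the element column is empty) is a heuristic that would also drop a heteroatom named like "HG" if its element is missing.

```python
def is_hydrogen(atom_line):
    """Guess whether a PDB ATOM/HETATM record is a hydrogen."""
    element = atom_line[76:78].strip().upper()  # element symbol, columns 77-78
    if element:
        return element in ("H", "D")
    # No element column: fall back on the atom name, ignoring leading digits
    # (older naming schemes write names such as "1HB").
    name = atom_line[12:16].strip().lstrip("0123456789")
    return name.startswith("H")


def strip_hydrogens(pdb_lines):
    """Return the lines of a PDB file with hydrogen atom records removed."""
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and is_hydrogen(line):
            continue
        kept.append(line)
    return kept
```

Atom serial numbers are left as they are, so the output will have gaps in numbering.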