how to make benchmark? – Rosetta Commons

This topic has 23 replies, 4 voices, and was last updated 11 years, 7 months ago by Anonymous.

Viewing 8 reply threads

Author

Posts
- April 4, 2012 at 12:23 pm #1224
  Anonymous
  Dear:
  
  I’ve generated 20,000 structures with the ab initio protocol in Rosetta. How can I benchmark them by plotting SCORE vs. RMSD? I would like to use the lowest-energy model as the reference model.
  
  thank you very much
- April 4, 2012 at 4:54 pm #6901
  Anonymous
  The scorefile output of most Rosetta runs is a tabular, whitespace-separated format that can be read into your favorite plotting program. If a protocol doesn’t give you a separate scorefile, but instead gives you just a silent file, you can get the equivalent scorefile by grepping out the SCORE: lines of the silent file. You then just need to use your favorite plotting program to generate your plots.
  
  For example, I use R, so I’d probably do something like:
  
  bash$ grep "SCORE:" silentfile.out > scorefile.sc
  bash$ R
  R$ t = read.table("scorefile.sc", header=TRUE)
  R$ summary(t) # Will show if the table was read in correctly, and what columns are available, as it varies a bit with protocol
  R$ plot(t$rmsd, t$total_score) # Change names based on columns you want to plot
  
  You don’t need to use R, of course, any plotting program that reads whitespace separated tabular data will work.
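For anyone who would rather stay in Python than R, the same steps can be sketched roughly as below. This is only a sketch under stated assumptions: the column names `rmsd` and `total_score` are copied from the R example and vary with protocol, the sample rows are made up, `read_scorefile` and `plot_score_vs_rmsd` are hypothetical helper names, and plotting assumes matplotlib is installed.

```python
import io


def read_scorefile(handle):
    """Parse whitespace-separated SCORE: lines into a list of dicts."""
    header = None
    rows = []
    for line in handle:
        if not line.startswith("SCORE:"):
            continue  # silent files also hold coordinate lines; skip them
        fields = line.split()[1:]  # drop the leading "SCORE:" token
        if header is None:
            header = fields  # the first SCORE: line names the columns
            continue
        if fields == header or len(fields) != len(header):
            continue  # repeated header or malformed row
        row = {}
        for name, value in zip(header, fields):
            try:
                row[name] = float(value)
            except ValueError:
                row[name] = value  # non-numeric, e.g. the decoy tag
        rows.append(row)
    return rows


def plot_score_vs_rmsd(rows, x="rmsd", y="total_score", out="score_vs_rmsd.png"):
    """Scatter plot of score against RMSD (imports matplotlib lazily)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    plt.scatter([r[x] for r in rows], [r[y] for r in rows], s=4)
    plt.xlabel(x)
    plt.ylabel(y)
    plt.savefig(out, dpi=150)


# Made-up sample data standing in for a real scorefile.
sample = io.StringIO(
    "SCORE: total_score    rmsd description\n"
    "SCORE:     -45.200   3.100 S_0001\n"
    "SCORE:     -52.700   1.800 S_0002\n"
)
rows = read_scorefile(sample)
print(len(rows), rows[1]["total_score"], rows[1]["rmsd"])
```

With a real file, passing an open handle for `scorefile.sc` to `read_scorefile` and then calling `plot_score_vs_rmsd(rows)` would write the scatter plot to `score_vs_rmsd.png`.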
- April 5, 2012 at 7:33 am #6920
  Anonymous
  Hi guys. Thanks a lot for the kind comments.
  
  Could anyone explain how to use “-score_app:superimpose_to_native” and “score_jd2”? I can’t find documentation for either of them.
  
  thank you very much
- February 19, 2013 at 11:24 am #8426
  Anonymous
  Category: Structure prediction
  Dears,
  
  I want to model 1000 structures with Rosetta, then plot energy vs. RMSD for the structures to find the low-energy ones.
  Would you please guide me on how to do it? My commands:
  -loops:input_pdb Model_4gbr.pdb
  -loops:loop_file 4gbr.loop_file
  -loops:remodel perturb_kic
  -loops:refine refine_kic
  -in:file:fullatom
  -out:prefix myloop
  -loops:extended true
  -nstruct 1000
  -ex1
  -ex2
  
  thanks in advance,
- February 20, 2013 at 6:05 pm #8438
  Anonymous
  Not the whole structure, only the Cα RMSD of the loop.
  But when I use -in:file:native I get this error:
  ****core.pose.util: Cannot open psipred_ss2 file tt
  protocols.loops.loops_main: can not open DSSP file tt
  
  ERROR: !pdb.empty()****
  No, I do not have an RMSD column in my output.
  
  thanks,
- February 20, 2013 at 8:16 pm #8442
  Anonymous
  This is my input, but my output does not have the RMSD as one column and total_score as the first or second column.
  
  -loops:input_pdb Model_3pbl.pdb
  -loops:loop_file 3pbl.loop_file
  -loops:remodel perturb_kic
  -loops:refine refine_kic
  -in:file:fullatom
  -in:file:native Model_3pbl.pdb
  -out:prefix myloop
  -loops:extended true
  -nstruct 1
  -ex1
  -ex2
  -out:file:scorefile
  -overwrite
  > 1.log
- February 21, 2013 at 9:54 am #8444
  Anonymous
  Yes, I have a scorefile name, but it still does not give me a scorefile. I use this command to run it: ***loopmodel.default.linuxgccrelease @flag***
  
  -loops:input_pdb rec2_3pwh.pdb
  -loops:loop_file 3pwh.loop_file
  -loops:remodel perturb_kic
  -loops:refine refine_kic
  -in:file:fullatom
  -in:file:native rec2_3pwh.pdb
  -out:prefix myloop
  -loops:extended true
  -nstruct 1
  -ex1
  -ex2
  -out:file:scorefile output.sc
  > 1.log
- March 9, 2013 at 2:58 pm #8469
  Anonymous
  Hi,
  
  I have 10,000 PDB files whose loops I remodeled.
  Now I want to make a graph of:
  x-axis: RMSD based on the backbone structure
  y-axis: Score
  so that I can select low-RMSD, low-energy structures for further analysis.
  How can I make it?
  
  thanks
- March 10, 2013 at 4:31 pm #8471
  Anonymous
  I have a question,
  When I add hydrogen atoms to my protein, I get a lot of warnings like **discarding 2 atoms at position 243 in file Best match rsd_type: ALA ** …
  When I remodel the loops of my system, should I add hydrogen atoms or not?
  
  thanks
- April 4, 2012 at 4:59 pm #6903
  Anonymous
  I just realized that you probably don’t have the rmsd value you want in the current scorefile.
  
  To get that, you can use the scoring applications ( http://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/d2/da0/score_commands.html ). If you pass the lowest energy structure (as a PDB) to the command line flag -in:file:native, your output scorefile should contain a column representing the C-alpha rmsd to the “native” (in your case the lowest energy structure).
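Picking out the lowest-energy decoy to use as the "native" can be sketched in Python as below. This is a hedged sketch: the column names `score` and `description` are assumptions about the scorefile header (check your own header line, since names vary with protocol), the sample lines are made up, and `lowest_energy_tag` is a hypothetical helper name.

```python
def lowest_energy_tag(lines, score_column="score", tag_column="description"):
    """Return (score, tag) of the lowest-scoring SCORE: row, or None if empty."""
    header = None
    best = None
    for line in lines:
        if not line.startswith("SCORE:"):
            continue
        fields = line.split()[1:]  # drop the leading "SCORE:" token
        if header is None:
            header = fields
            score_idx = header.index(score_column)  # ValueError if absent
            tag_idx = header.index(tag_column)
            continue
        if len(fields) != len(header):
            continue  # malformed row
        try:
            score = float(fields[score_idx])
        except ValueError:
            continue  # repeated header line in a concatenated file
        if best is None or score < best[0]:
            best = (score, fields[tag_idx])
    return best


# Made-up sample lines standing in for a real scorefile.
sample_lines = [
    "SCORE:     score    rms description",
    "SCORE:   -45.200  3.100 S_0001",
    "SCORE:   -52.700  1.800 S_0002",
    "SCORE:   -49.900  2.400 S_0003",
]
best = lowest_energy_tag(sample_lines)
print(best)
```

The returned tag identifies which decoy to extract as a PDB and pass to -in:file:native.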
- April 4, 2012 at 5:02 pm #6904
  Anonymous
  Does that trigger superposition or just RMSD calculation? (I’m thinking the latter; I’ve always written my own RMSD apps as needed…)
- April 4, 2012 at 5:53 pm #6911
  Anonymous
  Right, by default it looks like it does a raw rmsd calculation using current coordinates. However, there is an (apparently undocumented) -score_app:superimpose_to_native command line option which should trigger the superposition.
  
  That’s for the regular/old scoring application. The rmsd calculation for score_jd2 is triggered in some deep internals, and I’m not quite sure how one controls that.
- April 4, 2012 at 8:57 pm #6916
  Anonymous
  Wheeeeeee! Undocumentation!
- April 5, 2012 at 12:43 pm #6921
  Anonymous
  As far as I can tell, no documentation exists for either.
  
  The latter option appears to work only with the older score app, not score_jd2.
- April 5, 2012 at 4:09 pm #6926
  Anonymous
  As Steven says, there isn’t currently any documentation for either of them.
  
  score_jd2 is intended as a “jd2” (the unified scheme for file input, output, and job control which we’re trying to transition Rosetta applications to) version of the older “score” application. They are unfortunately not flag/flag or even feature/feature compatible. The documentation for the scoring application was written for the older, non-jd2 score.
  
  -score_app:superimpose_to_native is a command line flag for the older score application which causes it to do superposition of structure with the “native” prior to computing the rmsd. I unfortunately don’t know if there is a score_jd2 equivalent for this.
- February 19, 2013 at 5:12 pm #8429
  Anonymous
  Which RMSD do you want? Whole-structure or loop RMSD? Did you run Rosetta already? Is there an RMSD column in the output you already have?
  
  If you haven’t run loop modeling yet, try adding -in:file:native to your command line; it will then generate RMSDs against that native. (You can re-use the -s argument as the -native argument.) It generates backbone heavyatom loop RMSDs if you are using KIC.
  
  Assuming you have run your 1000 structures and do not have RMSDs handy, the simplest way to generate RMSDs is to re-score the 1000 PDBs via score_jd2:
  
  score_jd2.linuxgccrelease -l pdblist -native native.pdb -out:file:score_only scorefile.sc -database your_database
  
  where “pdblist” was generated via “ls *pdb > pdblist” (in other words, an endline-delimited list of all PDBs to run on).
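If you would rather build the list from Python, an equivalent sketch is below. It is only an illustration: `write_pdblist` is a hypothetical helper, the file names are made up, and the `exclude` argument is an added convenience for leaving the native structure out of the list when it sits in the same directory.

```python
import glob
import os
import tempfile


def write_pdblist(directory, list_path, exclude=()):
    """Write one PDB path per line to list_path and return the paths."""
    paths = sorted(glob.glob(os.path.join(directory, "*.pdb")))
    paths = [p for p in paths if os.path.basename(p) not in exclude]
    with open(list_path, "w") as out:
        for path in paths:
            out.write(path + "\n")
    return paths


# Demonstration in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("myloop_0002.pdb", "myloop_0001.pdb", "native.pdb", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    listed = write_pdblist(tmp, os.path.join(tmp, "pdblist"), exclude=("native.pdb",))
    names = [os.path.basename(p) for p in listed]
    with open(os.path.join(tmp, "pdblist")) as handle:
        n_lines = len(handle.read().splitlines())
    print(names)
```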
  
  This will probably give you an output scorefile that includes an RMSD as one column and total_score as the first or second column, which you can use as input to your plotting software of choice.
  
  If that doesn’t work, let me know, and I’ll fiddle with flags some more until I find something that does.
  
  If you need loop RMSDs, the easiest way to get them is probably going to be to write a script to extract only the loop residues, and use the older score executable with -native and -score_app:superimpose_to_native as Rocco suggests above.
- February 20, 2013 at 6:20 pm #8439
  Anonymous
  Can you give me your complete command line here? None of the other stuff you’ve mentioned involves psipred files.
- February 20, 2013 at 6:47 pm #8440
  Anonymous
  Getting the RMSD specifically of the loop might be a little tricky, as you would have to specify the residues over which you wish to calculate the RMSD. If you can get the loop remodeling protocol to calculate it for you as it makes the structures, that would probably be ideal (as it knows which residues it needs to operate over). Most RMSD calculations in Rosetta would be over the entire structure (because they don’t have facilities to input the loop subset designations).
  
  Trying to calculate it after the fact may be tricky, but I think RosettaScripts (https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/RosettaScripts_Documentation.html) can permit you to do it. In RosettaScripts there’s an Rmsd filter (https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/Filters_%28RosettaScripts%29#Rmsd) which allows you to specify a subset of residues over which to calculate the RMSD. You can set a script that defines that filter, and then add the filter to the PROTOCOLS section. If you then run RosettaScripts with that script, the structures with which to calculate the RMSD as the input, and the reference structure as -in:file:native, the output scorefile should contain a column which represents the RMSD value.
  
  One complication would be superposition. I believe the filter superimposes on the specified residues, so you probably want to set superimpose=0 and make sure the input poses and the reference structures are pre-aligned how you want them to be.
- February 20, 2013 at 6:52 pm #8441
  Anonymous
  It is also possible to grep out the loop residues and run regular RMSD calculations on the truncated PDBs.
  
  I agree with you that it’s not appropriate to superimpose loops for RMSD calculations – their endpoints are already superimposed in the Rosetta output, and it doesn’t make sense to transform the isolated loop residues without the rigid core protein context.
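A loop-only RMSD without superposition can also be computed directly from the PDB files, skipping the truncation step. The sketch below is a minimal illustration, assuming standard fixed-column PDB ATOM records, models that already share the reference frame (as argued above), and C-alpha atoms only; `ca_coords`, `loop_rmsd`, and the coordinates are hypothetical.

```python
import math


def ca_coords(pdb_lines, start, end, chain=None):
    """Map (chain, residue number) to CA coordinates for start <= resnum <= end."""
    coords = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":
            continue
        if line[16] not in (" ", "A"):
            continue  # keep only the first alternate location
        ch = line[21]
        if chain is not None and ch != chain:
            continue
        resnum = int(line[22:26])
        if start <= resnum <= end:
            coords[(ch, resnum)] = (
                float(line[30:38]),
                float(line[38:46]),
                float(line[46:54]),
            )
    return coords


def loop_rmsd(model_lines, ref_lines, start, end, chain=None):
    """CA RMSD over the loop residues, using the coordinates as given (no fitting)."""
    model = ca_coords(model_lines, start, end, chain)
    ref = ca_coords(ref_lines, start, end, chain)
    shared = sorted(set(model) & set(ref))
    if not shared:
        raise ValueError("no shared CA atoms in the requested range")
    total = 0.0
    for key in shared:
        total += sum((a - b) ** 2 for a, b in zip(model[key], ref[key]))
    return math.sqrt(total / len(shared))


def _atom(serial, resnum, x, y, z):
    # Builds a made-up CA record with PDB fixed-column layout.
    return "ATOM  %5d  CA  ALA A%4d    %8.3f%8.3f%8.3f" % (serial, resnum, x, y, z)


reference = [_atom(1, 5, 0, 0, 0), _atom(2, 10, 0, 0, 0),
             _atom(3, 11, 3.8, 0, 0), _atom(4, 12, 7.6, 0, 0)]
model = [_atom(1, 5, 9, 9, 9), _atom(2, 10, 0, 0, 0),
         _atom(3, 11, 3.8, 3, 0), _atom(4, 12, 7.6, 0, 4)]
rmsd = loop_rmsd(model, reference, start=10, end=12, chain="A")
print(round(rmsd, 3))
```

Residue 5 lies outside the requested range, so its large displacement does not contribute; only residues 10 to 12 enter the sum.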
- February 20, 2013 at 10:20 pm #8443
  Anonymous
  What does your scorefile look like? Post the first few lines (header and contents).
  
  Also, you are using -out:file:scorefile without passing a scorefile name – this may be causing Rosetta to write to a nameless file (essentially, not writing a scorefile at all). Give an argument to that flag, like “-out:file:scorefile score.sc”. score.sc should not exist beforehand.
- February 21, 2013 at 3:28 pm #8445
  Anonymous
  What output does it give you? Do you get your output PDB? Does the end of the log output (1.log) suggest that Rosetta completed successfully? I’m having trouble duplicating the problem.
- March 9, 2013 at 5:26 pm #8470
  Anonymous
  Assuming you have the RMSDs and scores in a simple file format, either A) the Rosetta scorefile, B) grepped from the PDBs, or C) grepped from the log files, then just take your pick of plotting software: Excel, OpenOffice Calc, gnuplot, R…. It’s just a scatterplot of X versus Y.
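Once the values are in a table, the selection of low-RMSD, low-energy structures can be sketched as a simple filter. This is an illustrative sketch only: the cutoffs (1.5 Å and the best-scoring 75%) are arbitrary placeholders, the rows are made up, and `select_decoys` and the key names `total_score` and `rmsd` are assumptions to be matched to your own columns.

```python
def select_decoys(rows, max_rmsd, top_fraction,
                  score_key="total_score", rmsd_key="rmsd"):
    """Keep decoys in the best-scoring top_fraction that are also within max_rmsd."""
    ranked = sorted(rows, key=lambda r: r[score_key])  # lowest energy first
    n_keep = max(1, int(len(ranked) * top_fraction))
    return [r for r in ranked[:n_keep] if r[rmsd_key] <= max_rmsd]


# Made-up rows standing in for parsed scorefile lines.
rows = [
    {"description": "myloop_0001", "total_score": -310.2, "rmsd": 0.9},
    {"description": "myloop_0002", "total_score": -295.0, "rmsd": 0.7},
    {"description": "myloop_0003", "total_score": -312.5, "rmsd": 3.4},
    {"description": "myloop_0004", "total_score": -308.8, "rmsd": 1.2},
]
selected = select_decoys(rows, max_rmsd=1.5, top_fraction=0.75)
tags = [r["description"] for r in selected]
print(tags)
```

Looking at the scatterplot first is still worthwhile, since sensible cutoffs depend on how the points are distributed.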
- March 10, 2013 at 8:58 pm #8473
  Anonymous
  Rosetta is…idiosyncratic when it comes to hydrogen atoms. Short version: do not worry about adding hydrogens, and you never need to add hydrogens to get Rosetta to work well.
  
  Long version:
  Rosetta prefers to add hydrogens itself when it loads in a structure. It will leave existing hydrogens in place if it successfully reads them in, which is dependent on using Rosetta’s preferred hydrogen naming scheme (which may not match the current PDB scheme, although it matched at some point). The discarding atom messages are due to Rosetta ignoring the hydrogens whose names it does not like. If you want really specific hydrogen placements, I can work with you on getting the naming consistent with what Rosetta expects…but unless you have a really good reason to like your hydrogen placements, just leave them out and let Rosetta autoplace them and don’t worry about it.
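If you do decide to leave hydrogens out, stripping them from an input PDB beforehand can be sketched as below. This is a hedged sketch, not a Rosetta tool: `is_hydrogen` and `strip_hydrogens` are hypothetical helpers, the atoms are made up, and the atom-name fallback (used only for ATOM records with a blank element column) is a heuristic.

```python
def is_hydrogen(line):
    """Decide whether an ATOM/HETATM record describes a hydrogen atom."""
    element = line[76:78].strip().upper()  # element symbol, columns 77-78
    if element:
        return element == "H"
    if line.startswith("ATOM"):
        # Fallback heuristic: names such as " H  ", " HA " or "1HB ".
        return line[12:16].strip().lstrip("0123456789").startswith("H")
    return False


def strip_hydrogens(pdb_lines):
    """Return the lines with hydrogen ATOM/HETATM records removed."""
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and is_hydrogen(line):
            continue
        kept.append(line)
    return kept


def _atom(serial, name, element):
    # Builds a made-up record with PDB fixed-column layout.
    return ("ATOM  %5d %s %s A%4d    %8.3f%8.3f%8.3f%6.2f%6.2f          %2s"
            % (serial, name, "ALA", 243, 0.0, 0.0, 0.0, 1.0, 0.0, element))


lines = [
    _atom(1, " N  ", "N"),
    _atom(2, " CA ", "C"),
    _atom(3, " H  ", "H"),
    _atom(4, "1HB ", ""),  # blank element column: falls back to the atom name
    "TER",
]
kept = strip_hydrogens(lines)
kept_names = [l[12:16].strip() for l in kept if l.startswith("ATOM")]
print(kept_names)
```

Non-atom records such as TER pass through unchanged, and Rosetta then places its own hydrogens when it loads the structure.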
- March 10, 2013 at 9:18 pm #8474
  Anonymous
  Rule of thumb – the names that Rosetta uses in its output PDBs are the names it expects in input PDBs.
  
  That said, most protocols will likely change the hydrogen (and some heavy atom) placements anyway, so I agree with Steven that in most cases worrying about hydrogens isn’t necessary.