cluster pdb structures questions

This topic has 5 replies, 3 voices, and was last updated 12 years, 10 months ago by Anonymous.

Author

Posts
- March 29, 2013 at 2:20 am #1545
  Anonymous
  Hi,
  
  I’m new to Rosetta and am trying to cluster 100 PDB structures with the cluster application to find the most common one. What output files should I expect?
  I read in the cluster command user guide that the script needs the first 400 structures as its starting point. Could the problem be that my 100 structures are too few?
  (https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/d7/d6f/cluster_commands.html)
  
  I tried it twice:
  A) my flags
  -database /Applications/rosetta3.4/rosetta_database/
  -in:file:l 5S-100pdb-list (where I put the name list of all the 100 pdb structures)
  
  my error
  
  ERROR: Illegal attempt to score with non-identical atom set between pose and etable
  ERROR:: Exit from: src/core/scoring/etable/EtableEnergy.cc line: 75
  
  B) my flags
  -database /Applications/rosetta3.4/rosetta_database/
  -in:file:l 5S-100pdb-list
  -score:weights /Applications/rosetta3.4/rosetta_database/scoring/weights/cen_std
  -score:patch /Applications/rosetta3.4/rosetta_database/scoring/weights/score12
  
  My output is another 100 PDB files named c.XXX.0.pdb, but I expected one or two structures that represent the most common ones.
  
  It would be great if you could give me some suggestions.
  
  Thanks!
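The file passed to -in:file:l is a plain-text list with one PDB path per line. A minimal sketch of building such a list is below; the helper name `write_pdb_list` and the directory layout are assumptions for illustration, not part of Rosetta.

```python
from pathlib import Path


def write_pdb_list(pdb_dir, list_path):
    """Write one PDB path per line (sorted) and return how many were written.

    Hypothetical helper: the name and the choice to glob "*.pdb" are
    assumptions, not anything Rosetta provides.
    """
    paths = sorted(str(p) for p in Path(pdb_dir).glob("*.pdb"))
    Path(list_path).write_text("\n".join(paths) + "\n")
    return len(paths)
```

For example, `write_pdb_list(".", "5S-100pdb-list")` would list every .pdb file in the current directory, which makes it easy to confirm that the count matches the 100 structures expected.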
- March 29, 2013 at 1:46 pm #8574
  Anonymous
  Your error and your flag set both stem from not being clear about whether you are using centroid or fullatom scorefunctions. Are your PDBs centroid or fullatom? Centroid means they have CB atoms only (and those atoms do not act like real CBs); fullatom means all sidechain atoms and hydrogens.
  
  This:
  
  -score:weights /Applications/rosetta3.4/rosetta_database/scoring/weights/cen_std
  -score:patch /Applications/rosetta3.4/rosetta_database/scoring/weights/score12
  
  can never ever be done – cen_std is the *centroid* standard scorefunction, score12 is the *fullatom* standard scorefunction.
  
  The solution to your problem is probably to use either -in:file:centroid or -in:file:fullatom (depending on your PDBs).
  
  I have heard good things about using calibur http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881085/ instead of rosetta for clustering.
- April 2, 2013 at 9:16 pm #8594
  Anonymous
  Thank you guys!
  I tried Calibur as you mentioned and it just works!
  It listed the largest cluster with no problem.
- April 2, 2013 at 3:59 am #8583
  Anonymous
  Hi,
  
  Thank you, Smlewis, for your reply, but it seems I still get errors.
  
  My PDB has more than CB atoms (see attachment), so I chose the fullatom option. (Is there a third option?) But I got lots of warnings in the logfile (~15M!).
  After the run, I still got another 100 PDB files named “C.XX.0.pdb”. Besides these PDBs, I don’t get the clustered structures, which I assumed the run would produce. Is that right?
  
  My input flags: (Do I need to add anything else?)
  -database /Applications/rosetta3.4/rosetta_database/
  -in:file:l 5S-100pdb-list
  -in:file:fullatom
  -score:weights /Applications/rosetta3.4/rosetta_database/scoring/weights/score12
  -score:patch /Applications/rosetta3.4/rosetta_database/scoring/weights/score12
  
  My warnings look like this (and they repeat for all 100 structures, from S_00000001.pdb to S_00000100.pdb):
  core.io.pdb.file_data: [ WARNING ] discarding 1 atoms at position 1 in file S_00000001.pdb. Best match rsd_type: MET_p:NtermProteinFull
  core.io.pdb.file_data: [ WARNING ] discarding 1 atoms at position 2 in file S_00000001.pdb. Best match rsd_type: GLY
  core.io.pdb.file_data: [ WARNING ] discarding 1 atoms at position 3 in file S_00000001.pdb. Best match rsd_type: PRO
  core.io.pdb.file_data: [ WARNING ] discarding 1 atoms at position 4 in file S_00000001.pdb. Best match rsd_type: LEU
  …
  core.conformation.Conformation: [ WARNING ] missing heavyatom: CG on residue MET_p:NtermProteinFull 1
  core.conformation.Conformation: [ WARNING ] missing heavyatom: SD on residue MET_p:NtermProteinFull 1
  core.conformation.Conformation: [ WARNING ] missing heavyatom: CE on residue MET_p:NtermProteinFull 1
  core.conformation.Conformation: [ WARNING ] missing heavyatom: CG on residue PRO 3
  core.conformation.Conformation: [ WARNING ] missing heavyatom: CD on residue PRO 3
  …
  core.io.pdb.file_data: [ WARNING ] can’t find atom for res 1 atom CEN (trying to set temp)
  core.io.pdb.file_data: [ WARNING ] can’t find atom for res 2 atom CEN (trying to set temp)
  
  I’d appreciate any suggestions!
- April 2, 2013 at 5:19 am #8584
  Anonymous
  A few quick late night suggestions before Steven’s comments tomorrow morning:
  
  1) Use standard for weights and score12 for patch with fullatom structures. You don’t need to give the full path here. It will search in the rosetta_database. I usually give the extension as well, but perhaps you don’t need to.
  
  2) CEN is the centroid pseudo-atom that stands in for the sidechain. Your PDB has backbone atoms and then CEN for the sidechain, so don’t pass -in:file:fullatom. What you would really want to do here, though, is run the relax step after abinitio; it really helps. If you are using some other application, use the flag -out:file:fullatom to output fullatom structures. There is a way to convert your centroid PDBs to fullatom outside of the protocol – I think it’s through the JD2 application but I’ve never used it. Steven?
  
  3) Use calibur as Steven suggested. It is easy to use, and will save you this headache.
- April 2, 2013 at 5:33 pm #8587
  Anonymous
  These are definitely centroid structures. My statement of “only CB atoms” was unclear: I meant that the only real sidechain atoms are CB, and all atoms beyond CB are missing and replaced with a centroid (CEN). My apologies. Don’t use -in:file:fullatom, and don’t use those weight sets; use cen_std as before.
  
  If you want to upconvert your centroids to fullatom, do this:
  
  fixbb -nstruct 1 -ndruns 10 -packing:repack_only -ex1 -ex2 -in:file:fullatom -s (your PDBs)
  
  This will repack real sidechains in place of the centroids; then your existing cluster command line should work. Don’t pass score12 as both the weights and the patch; pass it as the weights only, with no patch. (You don’t need to apply the patch if it is already score12 – the patch converts “standard”, which isn’t standard, to score12.)
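The centroid-versus-fullatom question in this thread came down to whether the PDBs contain CEN atoms. A minimal sketch of checking that before choosing flags is below. It assumes standard fixed-column PDB ATOM records (atom name in columns 13-16); the function names are hypothetical, and the check is only a heuristic, since a file without CEN atoms is not thereby guaranteed to be a complete fullatom structure.

```python
def atom_names(pdb_text):
    """Collect the atom names found in ATOM records (PDB columns 13-16)."""
    names = set()
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            names.add(line[12:16].strip())
    return names


def looks_centroid(pdb_text):
    """Heuristic (an assumption, not a Rosetta API): any CEN atom
    suggests the file was written in centroid mode."""
    return "CEN" in atom_names(pdb_text)
```

Reading one of the inputs, for example with `open("S_00000001.pdb").read()`, and passing the text to `looks_centroid` gives a quick indication of which of -in:file:centroid or -in:file:fullatom is likely to apply.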