Simple comparative modeling (threading) example is failing with “unknown atom name: CA CB”

This topic has 9 replies, 2 voices, and was last updated 7 years, 9 months ago by Anonymous.

March 24, 2017 at 5:52 pm #2622

Anonymous

I am trying to run a very simple threading scenario (comparative modeling – but as basic as it gets).

I am executing the command:

partial_thread.static.linuxgccrelease -database $ROSETTA_DATABASE -in:file:fasta /inputs/2yhd_model.fasta -in:file:alignment /inputs/2yhd_model.grishin.txt -in:file:template_pdb /inputs/2yhd.pdb

But I am getting the error:

ERROR: unknown atom_name: CA CB

I’ve been stuck on this for a few days now and need to ask for pointers. I saw another question on this forum with the same issue, but it was left unresolved, so I figured I’d give as much detail here as I can.

My template PDB (/inputs/2yhd.pdb) has been run through clean_pdb.py, with no apparent issues. The PDB is attached.

My grishin formatted alignment was manually created, but I’m pretty sure the formatting is correct since I followed all the documentation. The alignment file is attached.

My target FASTA is also attached in the zip.

Does anyone know why I am getting this error? You’ll see in the alignment file, there is only one difference – a single mutation from AALLSSL to AALHSSL.
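For anyone who wants to double-check a claim like "only one difference" before running the threading, here is a minimal Python sketch that lists the mismatched positions between two aligned, gap-free sequences. The function name is made up for illustration and is not part of Rosetta; it assumes both strings have the same length.

```python
def sequence_differences(template, target):
    """Return (position, template_residue, target_residue) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned and have
    the same length (no gap handling is attempted here).
    """
    if len(template) != len(target):
        raise ValueError(
            f"length mismatch: template={len(template)} target={len(target)}"
        )
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(template, target), start=1)
        if a != b
    ]


# The short fragments quoted above: template AALLSSL, target AALHSSL.
print(sequence_differences("AALLSSL", "AALHSSL"))  # [(4, 'L', 'H')]
```

The length check is deliberate: a mismatch in length between the sequences is worth catching before looking at individual residues.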

Full output:



core.init: Rosetta version unknown:cbe8723f7038f0b9e5d24fca9c3728b2fc952a37 2016-08-02 10:58:29 -0400 from /scratch/local-benchmark/release/rosetta/git/release/rosetta.binary.linux.release.git

core.init: command: /rosetta/bin/partial_thread.static.linuxgccrelease -database /databases/rosetta/ -in:file:fasta /inputs/2yhd_model.fasta -in:file:alignment /inputs/2yhd_model.grishin.aln -in:file:template_pdb /inputs/2yhd.pdb -ignore_unrecognized_res

core.init: 'RNG device' seed mode, using '/dev/urandom', seed=-519025839 seed_offset=0 real_seed=-519025839

core.init.random: RandomGenerator:init: Normal mode, seed=-519025839 RG_type=mt19937

core.chemical.ResidueTypeSet: Finished initializing fa_standard residue type set.  Created 414 residue types

core.chemical.ResidueTypeSet: Total time to initialize 0.34 seconds.

core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] skipping pdb residue b/c it's missing too many mainchain atoms: 1920 A TES TES

core.io.pose_from_sfr.PoseFromSFRBuilder: missing:  N

core.io.pose_from_sfr.PoseFromSFRBuilder: missing:  CA

core.io.pose_from_sfr.PoseFromSFRBuilder: missing:  C

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CG  on residue GLN 23

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CD  on residue GLN 23

core.conformation.Conformation: [ WARNING ] missing heavyatom:  OE1 on residue GLN 23

core.conformation.Conformation: [ WARNING ] missing heavyatom:  NE2 on residue GLN 23

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CG  on residue ARG 90

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CD  on residue ARG 90

core.conformation.Conformation: [ WARNING ] missing heavyatom:  NE  on residue ARG 90

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CZ  on residue ARG 90

core.conformation.Conformation: [ WARNING ] missing heavyatom:  NH1 on residue ARG 90

core.conformation.Conformation: [ WARNING ] missing heavyatom:  NH2 on residue ARG 90

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CG  on residue LYS 155

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CD  on residue LYS 155

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CE  on residue LYS 155

core.conformation.Conformation: [ WARNING ] missing heavyatom:  NZ  on residue LYS 155

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CG  on residue LYS 166

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CD  on residue LYS 166

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CE  on residue LYS 166

core.conformation.Conformation: [ WARNING ] missing heavyatom:  NZ  on residue LYS 166

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CG  on residue LYS 177

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CD  on residue LYS 177

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CE  on residue LYS 177

core.conformation.Conformation: [ WARNING ] missing heavyatom:  NZ  on residue LYS 177

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CG  on residue ASN 178

core.conformation.Conformation: [ WARNING ] missing heavyatom:  OD1 on residue ASN 178

core.conformation.Conformation: [ WARNING ] missing heavyatom:  ND2 on residue ASN 178

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CG  on residue ARG 184

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CD  on residue ARG 184

core.conformation.Conformation: [ WARNING ] missing heavyatom:  NE  on residue ARG 184

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CZ  on residue ARG 184

core.conformation.Conformation: [ WARNING ] missing heavyatom:  NH1 on residue ARG 184

core.conformation.Conformation: [ WARNING ] missing heavyatom:  NH2 on residue ARG 184

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CG  on residue GLU 223

core.conformation.Conformation: [ WARNING ] missing heavyatom:  CD  on residue GLU 223

core.conformation.Conformation: [ WARNING ] missing heavyatom:  OE1 on residue GLU 223

core.conformation.Conformation: [ WARNING ] missing heavyatom:  OE2 on residue GLU 223

core.pack.pack_missing_sidechains: packing residue number 23 because of missing atom number 6 atom name  CG

core.pack.pack_missing_sidechains: packing residue number 90 because of missing atom number 6 atom name  CG

core.pack.pack_missing_sidechains: packing residue number 155 because of missing atom number 6 atom name  CG

core.pack.pack_missing_sidechains: packing residue number 166 because of missing atom number 6 atom name  CG

core.pack.pack_missing_sidechains: packing residue number 177 because of missing atom number 6 atom name  CG

core.pack.pack_missing_sidechains: packing residue number 178 because of missing atom number 6 atom name  CG

core.pack.pack_missing_sidechains: packing residue number 184 because of missing atom number 6 atom name  CG

core.pack.pack_missing_sidechains: packing residue number 223 because of missing atom number 6 atom name  CG

core.pack.task: Packer task: initialize from command line()

core.scoring.ScoreFunctionFactory: SCOREFUNCTION: talaris2014

core.scoring.etable: Starting energy table calculation

core.scoring.etable: smooth_etable: changing atr/rep split to bottom of energy well

core.scoring.etable: smooth_etable: spline smoothing lj etables (maxdis = 6)

core.scoring.etable: smooth_etable: spline smoothing solvation etables (max_dis = 6)

core.scoring.etable: Finished calculating energy tables.

basic.io.database: Database file opened: scoring/score_functions/hbonds/sp2_elec_params/HBPoly1D.csv

basic.io.database: Database file opened: scoring/score_functions/hbonds/sp2_elec_params/HBFadeIntervals.csv

basic.io.database: Database file opened: scoring/score_functions/hbonds/sp2_elec_params/HBEval.csv

basic.io.database: Database file opened: scoring/score_functions/rama/Rama_smooth_dyn.dat_ss_6.4

basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/P_AA

basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/P_AA_n

basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/P_AA_pp

core.pack.dunbrack.RotamerLibrary: Using Dunbrack library binary file '/databases/rosetta/rotamer/ExtendedOpt1-5/Dunbrack10.lib.bin'.

core.pack.dunbrack.RotamerLibrary: Dunbrack 2010 library took 0.16 seconds to load from binary

core.pack.pack_rotamers: built 155 rotamers at 8 positions.

core.pack.interaction_graph.interaction_graph_factory: Instantiating DensePDInteractionGraph

core.pack.interaction_graph.interaction_graph_factory: IG: 5396 bytes

partial_thread: score: 0 identities: 247/248 gaps: 0/248

partial_thread:           2yhd_model       1 PIFLNVLEAIEPGVVCAGHDNNQPDSFAALHSSLNELGERQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFTNVNSRMLYFAPDLVFNEYRMHKSRMYSQCVRMRHLSQEFGWLQITPQEFLCMKALLLFSIIPVDGLKNQKFFDELRMNYIKELDRIIACKRKNPTSCSRRFYQLTKLLDSVQPIARELHQFTFDLLIKSHMVSVDFPEMMAEIISVQVPKILSGKVKPIYFHT

partial_thread:             2yhd.pdb       1 PIFLNVLEAIEPGVVCAGHDNNQPDSFAALLSSLNELGERQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFTNVNSRMLYFAPDLVFNEYRMHKSRMYSQCVRMRHLSQEFGWLQITPQEFLCMKALLLFSIIPVDGLKNQKFFDELRMNYIKELDRIIACKRKNPTSCSRRFYQLTKLLDSVQPIARELHQFTFDLLIKSHMVSVDFPEMMAEIISVQVPKILSGKVKPIYFHT

partial_thread:

partial_thread: id 2yhd.pdb => 2yhd.

core.chemical.ResidueType: atom name : CB not available in residue  CA

core.chemical.ResidueType: 'CA  ' 0x7ecb4a8

core.chemical.ResidueType: ' V1 ' 0x7ecb718

core.chemical.ResidueType: ' V2 ' 0x7ecc498

core.chemical.ResidueType: ' V3 ' 0x7ecc668

core.chemical.ResidueType: ' V4 ' 0x7eb5ba8

core.chemical.ResidueType:



ERROR: unknown atom_name: CA  CB

ERROR:: Exit from: src/core/chemical/ResidueType.cc line: 3540

[0x3130f71]

[0x492061f]

[0x43c7d57]

[0x18560e4]

[0x184cd74]

[0x412f7c]

[0x4c2e6f4]

[0x953d6d]

caught exception



[ERROR] EXCN_utility_exit has been thrown from: src/core/chemical/ResidueType.cc line: 3540

ERROR: unknown atom_name: CA  CB

March 24, 2017 at 6:14 pm #12237
Anonymous
clean_pdb.py does not overwrite the original PDB. Instead, it writes a new PDB file as the cleaned version.

Right now you’re using the pre-cleaned PDB as your template input, which will fail in threading for two reasons: (1) it still contains non-protein residues like calcium (hence the failure to find the CB atom on residue CA), and (2) the sequence of the pre-cleaning template PDB probably does not match the post-cleaning sequence you’re using for your alignment.

Try using the cleaned PDB output (something like inputs/2yhd_A.pdb, but the exact name will depend on how you ran the clean_pdb.py script) as the input to the threading run.
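A quick way to see whether a given template file still contains non-protein residues is to scan its ATOM/HETATM records. The following is only a sketch: the function name is hypothetical, it relies on the fixed-column PDB layout (residue name in columns 18-20, chain in column 22, residue number in columns 23-26), and it is not part of Rosetta or clean_pdb.py.

```python
# Three-letter codes for the 20 canonical amino acids.
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def find_nonprotein_residues(pdb_lines):
    """Return sorted (resname, chain, resseq) tuples for every residue in the
    ATOM/HETATM records whose name is not a canonical amino acid."""
    found = set()
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        resname = line[17:20].strip()  # columns 18-20
        chain = line[21:22]            # column 22
        resseq = line[22:26].strip()   # columns 23-26
        if resname not in STANDARD_RESIDUES:
            found.add((resname, chain, resseq))
    return sorted(found)


# Example use (the path is whatever template you pass to -in:file:template_pdb):
# with open("/inputs/2yhd.pdb") as handle:
#     for resname, chain, resseq in find_nonprotein_residues(handle):
#         print(resname, chain, resseq)
```

If this prints anything (for example a CA calcium ion or a ligand), the file is probably not the cleaned output. Note that modified amino acids would also be reported, since the check only knows the 20 canonical names.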
March 24, 2017 at 6:34 pm #12238
Anonymous
Thanks for your reply!

Two things:

(1) I had uploaded the uncleaned PDB instead of the cleaned one, but the cleaned one was indeed being used in the partial_thread command. HOWEVER, because I am only making a single AA change, I was extracting the FASTA sequence from the uncleaned template PDB. This was a no-no. I should have cleaned it as the very first step. Thank you for clearing that up.

(2) Fixing (1) didn’t fully fix things. After fixing (1), I was getting “length” errors. It turns out my target FASTA was originally formatted with line breaks:
```
>/inputs/2yhd_model
PIFLNVLEAIEPGVVCAGHDNNQPDSFAALHSSLNELGERQLVHVVKWAKALPGFRNLHV
DDQMAVIQYSWMGLMVFAMGWRSFTNVNSRMLYFAPDLVFNEYRMHKSRMYSQCVRMRHL
SQEFGWLQITPQEFLCMKALLLFSIIPVDGLKNQKFFDELRMNYIKELDRIIACKRKNPT
SCSRRFYQLTKLLDSVQPIARELHQFTFDLLIKSHMVSVDFPEMMAEIISVQVPKILSGK
VKPIYFHTQ
```
and it needs to be on a single line, without line breaks:
```
>/inputs/2yhd_model
PIFLNVLEAIEPGVVCAGHDNNQPDSFAALHSSLNELGERQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFTNVNSRMLYFAPDLVFNEYRMHKSRMYSQCVRMRHLSQEFGWLQITPQEFLCMKALLLFSIIPVDGLKNQKFFDELRMNYIKELDRIIACKRKNPTSCSRRFYQLTKLLDSVQPIARELHQFTFDLLIKSHMVSVDFPEMMAEIISVQVPKILSGKVKPIYFHTQ
```
So, I’m all good now. But this FASTA line-break weirdness really threw me for a loop. I hope this post can help somebody else.
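In case it saves someone the manual editing, here is a rough Python sketch that joins wrapped sequence lines so each FASTA record becomes a header line followed by one sequence line. The function name is invented for illustration; it is plain text processing, not a Rosetta utility, and whether other Rosetta applications or versions need single-line FASTA is not something this thread establishes.

```python
def unwrap_fasta(text):
    """Return FASTA text where every record is a '>' header line followed by
    exactly one sequence line. Blank lines are dropped."""
    records = []  # list of [header, sequence] pairs, in input order
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line, ""])
        elif records:
            records[-1][1] += line
        else:
            raise ValueError("sequence data found before any '>' header")
    return "".join(f"{header}\n{seq}\n" for header, seq in records)


# Tiny made-up example (not the real target sequence):
wrapped = ">example\nPIFLNV\nLEAIEP\n"
print(unwrap_fasta(wrapped), end="")
# >example
# PIFLNVLEAIEP
```

Writing the result to a new file rather than overwriting the original keeps the wrapped version around for comparison.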
March 24, 2017 at 6:35 pm #12239
Anonymous
Is there a way to mark this as “solved”? Or do we just let the thread dangle? For anyone reading this: it’s SOLVED.