Simple comparative modeling (threading) example is failing with “unknown atom name: CA CB”

    • #2622
      Anonymous

        I am trying to run a very simple threading scenario (comparative modeling – but as basic as it gets).

        I am executing the command:

        partial_thread.static.linuxgccrelease -database $ROSETTA_DATABASE -in:file:fasta /inputs/2yhd_model.fasta -in:file:alignment /inputs/2yhd_model.grishin.txt -in:file:template_pdb /inputs/2yhd.pdb 

        But I am getting the error:

        ERROR: unknown atom_name: CA  CB

        I’ve been stuck on this for a few days now and need to ask for pointers. I saw another question on this forum with the same issue, but it was left unresolved, so I figured I’d give as much detail here as I can.

        My template PDB (/inputs/2yhd.pdb) has been run through clean_pdb.py, with no apparent issues. The PDB is attached.

        My grishin formatted alignment was manually created, but I’m pretty sure the formatting is correct since I followed all the documentation. The alignment file is attached.

        My target FASTA is also attached in the zip.

        Does anyone know why I am getting this error? You’ll see in the alignment file, there is only one difference – a single mutation from AALLSSL to AALHSSL.

         

        Full output:


        core.init: Rosetta version unknown:cbe8723f7038f0b9e5d24fca9c3728b2fc952a37 2016-08-02 10:58:29 -0400 from /scratch/local-benchmark/release/rosetta/git/release/rosetta.binary.linux.release.git
        core.init: command: /rosetta/bin/partial_thread.static.linuxgccrelease -database /databases/rosetta/ -in:file:fasta /inputs/2yhd_model.fasta -in:file:alignment /inputs/2yhd_model.grishin.aln -in:file:template_pdb /inputs/2yhd.pdb -ignore_unrecognized_res
        core.init: 'RNG device' seed mode, using '/dev/urandom', seed=-519025839 seed_offset=0 real_seed=-519025839
        core.init.random: RandomGenerator:init: Normal mode, seed=-519025839 RG_type=mt19937
        core.chemical.ResidueTypeSet: Finished initializing fa_standard residue type set. Created 414 residue types
        core.chemical.ResidueTypeSet: Total time to initialize 0.34 seconds.
        core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] skipping pdb residue b/c it's missing too many mainchain atoms: 1920 A TES TES
        core.io.pose_from_sfr.PoseFromSFRBuilder: missing: N
        core.io.pose_from_sfr.PoseFromSFRBuilder: missing: CA
        core.io.pose_from_sfr.PoseFromSFRBuilder: missing: C
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CG on residue GLN 23
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CD on residue GLN 23
        core.conformation.Conformation: [ WARNING ] missing heavyatom: OE1 on residue GLN 23
        core.conformation.Conformation: [ WARNING ] missing heavyatom: NE2 on residue GLN 23
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CG on residue ARG 90
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CD on residue ARG 90
        core.conformation.Conformation: [ WARNING ] missing heavyatom: NE on residue ARG 90
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CZ on residue ARG 90
        core.conformation.Conformation: [ WARNING ] missing heavyatom: NH1 on residue ARG 90
        core.conformation.Conformation: [ WARNING ] missing heavyatom: NH2 on residue ARG 90
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CG on residue LYS 155
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CD on residue LYS 155
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CE on residue LYS 155
        core.conformation.Conformation: [ WARNING ] missing heavyatom: NZ on residue LYS 155
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CG on residue LYS 166
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CD on residue LYS 166
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CE on residue LYS 166
        core.conformation.Conformation: [ WARNING ] missing heavyatom: NZ on residue LYS 166
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CG on residue LYS 177
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CD on residue LYS 177
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CE on residue LYS 177
        core.conformation.Conformation: [ WARNING ] missing heavyatom: NZ on residue LYS 177
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CG on residue ASN 178
        core.conformation.Conformation: [ WARNING ] missing heavyatom: OD1 on residue ASN 178
        core.conformation.Conformation: [ WARNING ] missing heavyatom: ND2 on residue ASN 178
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CG on residue ARG 184
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CD on residue ARG 184
        core.conformation.Conformation: [ WARNING ] missing heavyatom: NE on residue ARG 184
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CZ on residue ARG 184
        core.conformation.Conformation: [ WARNING ] missing heavyatom: NH1 on residue ARG 184
        core.conformation.Conformation: [ WARNING ] missing heavyatom: NH2 on residue ARG 184
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CG on residue GLU 223
        core.conformation.Conformation: [ WARNING ] missing heavyatom: CD on residue GLU 223
        core.conformation.Conformation: [ WARNING ] missing heavyatom: OE1 on residue GLU 223
        core.conformation.Conformation: [ WARNING ] missing heavyatom: OE2 on residue GLU 223
        core.pack.pack_missing_sidechains: packing residue number 23 because of missing atom number 6 atom name CG
        core.pack.pack_missing_sidechains: packing residue number 90 because of missing atom number 6 atom name CG
        core.pack.pack_missing_sidechains: packing residue number 155 because of missing atom number 6 atom name CG
        core.pack.pack_missing_sidechains: packing residue number 166 because of missing atom number 6 atom name CG
        core.pack.pack_missing_sidechains: packing residue number 177 because of missing atom number 6 atom name CG
        core.pack.pack_missing_sidechains: packing residue number 178 because of missing atom number 6 atom name CG
        core.pack.pack_missing_sidechains: packing residue number 184 because of missing atom number 6 atom name CG
        core.pack.pack_missing_sidechains: packing residue number 223 because of missing atom number 6 atom name CG
        core.pack.task: Packer task: initialize from command line()
        core.scoring.ScoreFunctionFactory: SCOREFUNCTION: talaris2014
        core.scoring.etable: Starting energy table calculation
        core.scoring.etable: smooth_etable: changing atr/rep split to bottom of energy well
        core.scoring.etable: smooth_etable: spline smoothing lj etables (maxdis = 6)
        core.scoring.etable: smooth_etable: spline smoothing solvation etables (max_dis = 6)
        core.scoring.etable: Finished calculating energy tables.
        basic.io.database: Database file opened: scoring/score_functions/hbonds/sp2_elec_params/HBPoly1D.csv
        basic.io.database: Database file opened: scoring/score_functions/hbonds/sp2_elec_params/HBFadeIntervals.csv
        basic.io.database: Database file opened: scoring/score_functions/hbonds/sp2_elec_params/HBEval.csv
        basic.io.database: Database file opened: scoring/score_functions/rama/Rama_smooth_dyn.dat_ss_6.4
        basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/P_AA
        basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/P_AA_n
        basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/P_AA_pp
        core.pack.dunbrack.RotamerLibrary: Using Dunbrack library binary file '/databases/rosetta/rotamer/ExtendedOpt1-5/Dunbrack10.lib.bin'.
        core.pack.dunbrack.RotamerLibrary: Dunbrack 2010 library took 0.16 seconds to load from binary
        core.pack.pack_rotamers: built 155 rotamers at 8 positions.
        core.pack.interaction_graph.interaction_graph_factory: Instantiating DensePDInteractionGraph
        core.pack.interaction_graph.interaction_graph_factory: IG: 5396 bytes
        partial_thread: score: 0 identities: 247/248 gaps: 0/248
        partial_thread: 2yhd_model 1 PIFLNVLEAIEPGVVCAGHDNNQPDSFAALHSSLNELGERQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFTNVNSRMLYFAPDLVFNEYRMHKSRMYSQCVRMRHLSQEFGWLQITPQEFLCMKALLLFSIIPVDGLKNQKFFDELRMNYIKELDRIIACKRKNPTSCSRRFYQLTKLLDSVQPIARELHQFTFDLLIKSHMVSVDFPEMMAEIISVQVPKILSGKVKPIYFHT
        partial_thread: 2yhd.pdb 1 PIFLNVLEAIEPGVVCAGHDNNQPDSFAALLSSLNELGERQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFTNVNSRMLYFAPDLVFNEYRMHKSRMYSQCVRMRHLSQEFGWLQITPQEFLCMKALLLFSIIPVDGLKNQKFFDELRMNYIKELDRIIACKRKNPTSCSRRFYQLTKLLDSVQPIARELHQFTFDLLIKSHMVSVDFPEMMAEIISVQVPKILSGKVKPIYFHT
        partial_thread:
        partial_thread: id 2yhd.pdb => 2yhd.
        core.chemical.ResidueType: atom name : CB not available in residue CA
        core.chemical.ResidueType: 'CA ' 0x7ecb4a8
        core.chemical.ResidueType: ' V1 ' 0x7ecb718
        core.chemical.ResidueType: ' V2 ' 0x7ecc498
        core.chemical.ResidueType: ' V3 ' 0x7ecc668
        core.chemical.ResidueType: ' V4 ' 0x7eb5ba8
        core.chemical.ResidueType:

        ERROR: unknown atom_name: CA CB
        ERROR:: Exit from: src/core/chemical/ResidueType.cc line: 3540
        [0x3130f71]
        [0x492061f]
        [0x43c7d57]
        [0x18560e4]
        [0x184cd74]
        [0x412f7c]
        [0x4c2e6f4]
        [0x953d6d]
        caught exception

        [ERROR] EXCN_utility_exit has been thrown from: src/core/chemical/ResidueType.cc line: 3540
        ERROR: unknown atom_name: CA CB

         

      • #12237
        Anonymous

          clean_pdb.py does not overwrite the original PDB. Instead, it writes a new PDB file as the cleaned version.

          Right now you’re using the pre-cleaned PDB as your template input, which will fail in threading for two reasons: (1) it still contains non-protein residues like calcium (hence the failure to find the Cbeta atom on residue CA), and (2) the sequence of the pre-cleaning template PDB probably does not match the post-cleaning sequence you’re using for your alignment.

          Try using the cleaned PDB output (something like inputs/2yhd_A.pdb, but the exact name will depend on how you ran the clean_pdb.py script) as the input to the threading run.
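
One quick way to check whether a template still carries non-protein residues is to scan its ATOM/HETATM records for residue names outside the 20 standard amino acids. The following is a minimal sketch, not part of Rosetta or clean_pdb.py; the function name `non_protein_residues` is made up for illustration, and it assumes the file follows the fixed-column PDB layout (residue name in columns 18-20, chain in column 22, residue number in columns 23-27).

```python
from collections import Counter

# The 20 standard amino acid residue names as they appear in PDB files.
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def non_protein_residues(pdb_lines):
    """Count residues (not atoms) whose name is not a standard amino acid.

    Hypothetical helper for illustration; assumes fixed-column PDB records.
    """
    seen = set()
    counts = Counter()
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        res_name = line[17:20].strip()  # columns 18-20: residue name
        chain = line[21:22]             # column 22: chain identifier
        res_seq = line[22:27]           # columns 23-27: number + insertion code
        key = (res_name, chain, res_seq)
        if res_name not in STANDARD_RESIDUES and key not in seen:
            seen.add(key)
            counts[res_name] += 1
    return counts
```

Calling it as `non_protein_residues(open("/inputs/2yhd.pdb"))` should return an empty counter for a properly cleaned template; any entry such as `CA` or `TES` suggests the uncleaned file is still being used.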

            • #12238
              Anonymous

                Thanks for your reply! 

                Two things:

                (1) I had uploaded the uncleaned PDB instead of the cleaned one, but the cleaned one was indeed being used in the partial_thread command. HOWEVER, because I am only making a single AA change, I was extracting the FASTA sequence from the uncleaned template PDB. This was a no-no. I should have cleaned it as a very first step. Thank you for clearing that up.

                (2) Fixing (1) didn’t fully solve the problem. After fixing (1), I was getting “length” errors. It turns out my target FASTA was originally formatted with line breaks:


                >/inputs/2yhd_model
                PIFLNVLEAIEPGVVCAGHDNNQPDSFAALHSSLNELGERQLVHVVKWAKALPGFRNLHV
                DDQMAVIQYSWMGLMVFAMGWRSFTNVNSRMLYFAPDLVFNEYRMHKSRMYSQCVRMRHL
                SQEFGWLQITPQEFLCMKALLLFSIIPVDGLKNQKFFDELRMNYIKELDRIIACKRKNPT
                SCSRRFYQLTKLLDSVQPIARELHQFTFDLLIKSHMVSVDFPEMMAEIISVQVPKILSGK
                VKPIYFHTQ

                and it needs to be on a single line (without line breaks):


                >/inputs/2yhd_model
                PIFLNVLEAIEPGVVCAGHDNNQPDSFAALHSSLNELGERQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFTNVNSRMLYFAPDLVFNEYRMHKSRMYSQCVRMRHLSQEFGWLQITPQEFLCMKALLLFSIIPVDGLKNQKFFDELRMNYIKELDRIIACKRKNPTSCSRRFYQLTKLLDSVQPIARELHQFTFDLLIKSHMVSVDFPEMMAEIISVQVPKILSGKVKPIYFHTQ

                 

                So, I’m all good now. But this FASTA line-break weirdness really threw me for a loop. I hope this post can help somebody else.
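
For anyone hitting the same “length” errors, the unwrapping step described above can be scripted. This is a minimal sketch based only on the behavior reported in this post (a sequence wrapped across lines failed, the same sequence on one line worked); `unwrap_fasta` is a hypothetical helper, not a Rosetta tool.

```python
def unwrap_fasta(text):
    """Return FASTA text with each record's sequence joined onto one line.

    Header lines (starting with '>') are kept as-is; blank lines are dropped.
    """
    out = []
    seq_parts = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Flush the sequence collected for the previous record, if any.
            if seq_parts:
                out.append("".join(seq_parts))
                seq_parts = []
            out.append(line)
        else:
            seq_parts.append(line)
    # Flush the last record's sequence.
    if seq_parts:
        out.append("".join(seq_parts))
    return "\n".join(out) + "\n" if out else ""
```

A typical use would be to read the wrapped file, pass its contents through `unwrap_fasta`, and write the result to the path given to -in:file:fasta.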

                  • #12239
                    Anonymous

                      Is there a way to mark this as “solved”? Or do we just let the thread dangle? For anyone reading this: it’s SOLVED.
