Can’t make NCAA’s and D-aminoacids work


    • #1445
      Anonymous

        Hi,

        I’m trying to incorporate D-amino acids in some designs.
        I followed the steps as in the recent thread on D-amino acids here:
        http://www.rosettacommons.org/content/steps-use-d-amino-acids

        1) I uncommented the amino acids I need in residue_types.txt
        2) I downloaded and unpacked the params files
        3) I use the -score:weights mm_std in my docking protocol
        4) I found the 3 letter codes in fa_standard/residue_types/d-caa and modified the input file accordingly.

        When I run low resolution docking now, what I get is the following error message:

        can not find a residue type that matches the residue DSERat position 2
        ERROR: core:util:switch_to_residue_type_set fails
        ERROR: Exit from: src/core/util/SwitchResidueTypeSet.cc line: 143

        Is there some step that is not mentioned in the above thread that I skipped,
        to have the D-amino acids activated?

        Thanks for help

        Jarek

      • #7991
        Anonymous

          It’s failing because you’re trying to do the centroid docking phase on D-amino acids for which there are no centroid parameters. This is what is discussed in the latter part of that thread (post #12 on). Did you see that part and put in D-SER in the centroid parameter set files?

        • #8002
          Anonymous

            Thanks for the quick reply. I have now created all the centroid params files in chemical/residue_type_sets/centroid/residue_types.
            I have also modified the chemical/residue_type_sets/centroid/residue_types.txt file to tell Rosetta where they are.
            All files are read without problems except for DPRO (attached).
            For some reason, Rosetta complains that atom name “H” is not available in DPR, but it does not complain about this
            for the L-proline centroid file PRO.params, on which DPR is based.
            Here is the exact error message:

            core.chemical.ResidueType: atom name : H not available in residue DPR
            core.chemical.ResidueType: N
            core.chemical.ResidueType: CA
            core.chemical.ResidueType: C
            core.chemical.ResidueType: O
            core.chemical.ResidueType: CB
            core.chemical.ResidueType: CEN
            core.chemical.ResidueType:

            ERROR: unknown atom_name: DPR H
            ERROR:: Exit from: src/core/chemical/ResidueType.cc line: 1692

            Any idea on how to solve this issue and why I get the message for DPRO but not for PRO?

            Thanks for help

            Jarek

          • #8003
            Anonymous

              In parallel to the low res docking I have also tested the full atom docking.
              The rotamer libraries are read in the order in which the amino acids appear in the PDB file.
              11 libraries are read, ending on histidine. Before reading methionine (next in sequence)
              I get a segmentation fault:

              Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dser.rotlib…done!
              Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dcys.rotlib…done!
              Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dile.rotlib…done!
              Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dasp.rotlib…done!
              Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dthr.rotlib…done!
              Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dpro.rotlib…done!
              Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dlys.rotlib…done!
              Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/darg.rotlib…done!
              Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dphe.rotlib…done!
              Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dgln.rotlib…done!
              Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dhis.rotlib…done!
              ./run: line 1: 10567 Segmentation fault /disk1/programs/rosetta-3.4-D/rosetta-3.4-bundles/rosetta_source/bin/docking_protocol.default.linuxgccrelease
              @flags -database /disk1/programs/rosetta-3.4-D/rosetta_database -run:constant_seed -nodelay

              Does 10567 stand for something? I have no idea where to look for the reason.

              Jarek

            • #8005
              Anonymous

                I will recompile in debug, thanks! Any thoughts on the issue with the DPRO? (previous post)

              • #8009
                Anonymous

                  I have recompiled Rosetta 3.4 in debug mode and now this is what I get:

                  core.pack.dunbrack: Dunbrack library took 1.23 seconds to load from binary
                  Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dser.rotlib…done!
                  Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dcys.rotlib…done!
                  Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dile.rotlib…done!
                  Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dasp.rotlib…done!
                  Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dthr.rotlib…done!
                  Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dpro.rotlib…done!
                  Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/dlys.rotlib…done!
                  Reading in rot lib /disk1/programs/rosetta-3.4-D/rosetta_database//rotamer/ncaa_rotlibs/darg.rotlib…done!
                  docking_protocol.default.linuxgccdebug: src/utility/vectorL.hh:352: typename std::vector::const_reference utility::vectorL<, T, A>::operator[](typename utility::vectorL_IndexSelector<(L >= 0)>::index_type) const [with long int L = 1l, T = long unsigned int, A = std::allocator]: Assertion `static_cast< size_type >( i - l_ ) < super::size()' failed.
                  Got some signal… It is:6
                  Process was aborted!

                  Interestingly, when I run the release executable it goes through Arginine and crashes after his.

                  EDIT


                  Running the same with Rosetta 3.3 gives the same result, except for the “Got some signal” line:
                  Reading in rot lib /disk1/programs/rosetta-3.3/rosetta_database//ncaa_rotlibs/darg.rotlib…done!
                  docking_protocol.default.linuxgccdebug: src/utility/vectorL.hh:323: typename std::vector::const_reference utility::vectorL<, T, A>::operator[](typename utility::vectorL_IndexSelector<(L >= 0)>::index_type) const [with long int L = 1l, T = long unsigned int, A = std::allocator]: Assertion `static_cast< size_type >( i - l_ ) < super::size()' failed.
                  ./run: line 3: 12289 Aborted

                • #8019
                  Anonymous

                    This is the output of gdb:



                    Reading in rot lib /disk1/programs/rosetta-3.3/rosetta_database//ncaa_rotlibs/darg.rotlib…done!
                    docking_protocol.default.linuxgccdebug: src/utility/vectorL.hh:323: typename std::vector::const_reference utility::vectorL<, T, A>::operator[](typename utility::vectorL_IndexSelector<(L >= 0)>::index_type) const [with long int L = 1l, T = long unsigned int, A = std::allocator]: Assertion `static_cast< size_type >( i - l_ ) < super::size()' failed.
                    Program received signal SIGABRT, Aborted.
                    0x00000032f18305b5 in raise () from /lib64/libc.so.6
                    (gdb) backtrace
                    #0 0x00000032f18305b5 in raise () from /lib64/libc.so.6
                    #1 0x00000032f1832060 in abort () from /lib64/libc.so.6
                    #2 0x00000032f18299ff in __assert_fail () from /lib64/libc.so.6
                    #3 0x0000000000406e94 in utility::vectorL<1l, unsigned long, std::allocator >::operator[] (this=0x16987f88, i=18446744073709551604) at src/utility/vectorL.hh:323
                    #4 0x00002b503d4a3969 in core::pack::dunbrack::SingleResidueDunbrackLibrary::rotno_2_packed_rotno (this=0x16987ed0, rotno=18446744073709551604) at src/core/pack/dunbrack/SingleResidueDunbrackLibrary.cc:367
                    #5 0x00002b503d4a3a80 in core::pack::dunbrack::SingleResidueDunbrackLibrary::rotwell_2_packed_rotno (this=0x16987ed0, rotwell=@0x7fff711dfa68) at src/core/pack/dunbrack/SingleResidueDunbrackLibrary.cc:386
                    #6 0x00002b503d4872c2 in core::pack::dunbrack::RotamericSingleResidueDunbrackLibrary<4ul>::eval_rotameric_energy_deriv (this=0x16987ed0, rsd=@0x16244ef0, scratch=@0x7fff711dfa50, eval_deriv=false) at src/core/pack/dunbrack/RotamericSingleResidueDunbrackLibrary.tmpl.hh:382
                    #7 0x00002b503d487c08 in core::pack::dunbrack::RotamericSingleResidueDunbrackLibrary<4ul>::rotamer_energy (this=0x16987ed0, rsd=@0x16244ef0, scratch=@0x7fff711dfa50) at src/core/pack/dunbrack/RotamericSingleResidueDunbrackLibrary.tmpl.hh:276
                    #8 0x00002b503d42fe0a in core::pack::dunbrack::DunbrackEnergy::residue_energy (this=0x85b5140, rsd=@0x16244ef0, emap=@0x1630e910) at src/core/pack/dunbrack/DunbrackEnergy.cc:101
                    #9 0x00002b503e48c612 in core::scoring::ScoreFunction::eval_ci_1b (this=0x85bbde0, rsd=@0x16244ef0, pose=@0x7fff711e0500, emap=@0x1630e910) at src/core/scoring/ScoreFunction.cc:1223
                    #10 0x00002b503e4923bc in core::scoring::ScoreFunction::eval_onebody_energies (this=0x85bbde0, pose=@0x7fff711e0500) at src/core/scoring/ScoreFunction.cc:1103
                    #11 0x00002b503e493ac9 in core::scoring::ScoreFunction::operator() (this=0x85bbde0, pose=@0x7fff711e0500) at src/core/scoring/ScoreFunction.cc:567
                    #12 0x00002b503b542726 in protocols::docking::DockMCMCycle::init_mc (this=0x161df7c0, pose=@0x7fff711e0500) at src/protocols/docking/DockMCMCycle.cc:222
                    #13 0x00002b503b544b8b in protocols::docking::DockMCMProtocol::apply (this=0x866b2d0, pose=@0x7fff711e0500) at src/protocols/docking/DockMCMProtocol.cc:226
                    #14 0x00002b503b529048 in protocols::docking::DockingProtocol::apply (this=0x85b6d10, pose=@0x7fff711e0500) at src/protocols/docking/DockingProtocol.cc:781
                    #15 0x00002b503b25afa7 in protocols::jd2::JobDistributor::go_main (this=0x8615830, mover={p_ = 0x7fff711e08f0}) at src/protocols/jd2/JobDistributor.cc:375
                    #16 0x00002b503b25bb81 in protocols::jd2::JobDistributor::go (this=0x8615830, mover={p_ = 0x7fff711e09a0}) at src/protocols/jd2/JobDistributor.cc:200
                    #17 0x0000000000405a78 in main (argc=6, argv=0x7fff711e0aa8) at src/apps/public/docking/docking_protocol.cc:64

                  • #8036
                    Anonymous

                      It works great now! Thanks!
                      One more thing I’ve noticed is that the D-cysteine bridges are messed up during docking.
                      Is there some kind of patch I should apply to keep them fixed?

                    • #8004
                      Anonymous

                        10567 does not mean anything – it’s probably the name of the core dump; there may be a file core.10567.

                        Can you recompile in debug mode and run that again to see if/how it fails? Usually debug mode gives more informative error messages.

                        I’m going to guess that the error is that the D-rotamer libraries were made a long time ago and the underlying code drifted somehow, but I’m not sure.

                      • #8006
                        Anonymous

                          I think it’s because proline is missing the amide hydrogen in the backbone, and Rosetta is having a hissy fit about it for some reason. I’m trying to duplicate that one with my copy of 3.4. (I’ll get around to trying the other one too, but it’s more work to track the libraries down).

                          EDIT: this is fixed, but the fix is in reply to the original post.

                        • #8008
                          Anonymous

                            Ok, this one I can fix.

                            The problem is in the patching system. Rosetta has parameters for the residue types, and then “patches” to represent special cases. Instead of 20 residue types, then 20 N-terminal residue types (with extra H atoms), and 20 C-terminal residue types (with the extra OXT), etc, there’s just one “patch” file for the N-terminus, C-terminus, etc.

                            Many patches have to special-case proline because of the lack of the amide hydrogen – if you look in the patches you’ll see the cases like so:

                            BEGIN_CASE ### PROLINE

                            BEGIN_SELECTOR
                            AA PRO
                            END_SELECTOR

                            SET_POLYMER_CONNECT LOWER NONE

                            ADD_PROPERTY LOWER_TERMINUS ## implies terminus

                            END_CASE

                            BEGIN_CASE ### THE GENERAL CASE

                            SET_POLYMER_CONNECT LOWER NONE

                            ADD_PROPERTY LOWER_TERMINUS ## implies terminus

                            ## totally making this up:
                            SET_ICOOR H 120 60 1 N CA C

                            END_CASE

                            Anyway, you need to special-case your DPRO in centroid mode in the same way (you can see that it is already special-cased in fullatom mode). Here’s how, for centroid/patches/NtermProtein.txt (which is the one that is crashing):

                            BEGIN_CASE ### DPROLINE

                            BEGIN_SELECTOR
                            NAME3 DPR
                            END_SELECTOR

                            SET_POLYMER_CONNECT LOWER NONE

                            ADD_PROPERTY LOWER_TERMINUS ## implies terminus

                            END_CASE

                            Some of the other patches need to be special cased as well. You can do it by either copying the PRO case and replacing “AA PRO” with “NAME3 DPR”, or you can just comment them out of patches.txt. For docking, you can probably do this:

                            patches/CtermProtein.txt
                            patches/NtermProtein.txt
                            #patches/protein_cutpoint_upper.txt
                            #patches/protein_cutpoint_lower.txt
                            #patches/VirtualBB.txt
                            #patches/ShoveBB.txt
                            patches/protein_centroid_with_HA.txt
                            #patches/VirtualNterm.txt
                            #patches/N_acetylated.txt
                            #patches/C_methylamidated.txt
                            #patches/RepulsiveOnly_centroid.txt
                            patches/ser_phosphorylated.txt
                            patches/thr_phosphorylated.txt
                            patches/LowerDNA.txt
                            patches/UpperDNA.txt
                            patches/VirtualDNAPhosphate.txt

                            You would need the cutpoint ones for loop modeling, and you won’t need any of the other commented-out ones.
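
The copy-and-replace option described above can be scripted. The following is only a minimal sketch, assuming the patch file uses plain BEGIN_CASE ... END_CASE blocks with an "AA PRO" selector line as in the excerpts quoted in this post; the function name add_dpro_cases is made up for illustration, and you should work on a backup copy of the patch file.

```python
import re


def add_dpro_cases(patch_text):
    """Append a 'NAME3 DPR' copy right after every case block that selects 'AA PRO'.

    Assumes cases are delimited by BEGIN_CASE ... END_CASE (hypothetical helper,
    not part of Rosetta).
    """
    case_re = re.compile(r"BEGIN_CASE.*?END_CASE[^\n]*\n?", re.DOTALL)
    selector_re = re.compile(r"^([ \t]*)AA[ \t]+PRO[ \t]*$", re.MULTILINE)

    def duplicate(match):
        block = match.group(0)
        if not selector_re.search(block):
            return block  # not a proline case, leave untouched
        dpro_block = selector_re.sub(r"\1NAME3 DPR", block)
        separator = "" if block.endswith("\n") else "\n"
        return block + separator + dpro_block

    return case_re.sub(duplicate, patch_text)


if __name__ == "__main__":
    example = (
        "BEGIN_CASE ### PROLINE\n"
        "BEGIN_SELECTOR\n"
        "AA PRO\n"
        "END_SELECTOR\n"
        "SET_POLYMER_CONNECT LOWER NONE\n"
        "END_CASE\n"
    )
    print(add_dpro_cases(example))
```

The DPR copy is placed directly after the PRO case so that it still comes before the general case in the file.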

                          • #8010
                            Anonymous

                              Thanks! It works now!

                            • #8016
                              Anonymous

                                We got a bit unlucky and that message is not all that illuminating. It’s saying that Rosetta tried to access more items in a vector than there actually are in the vector – but it’s not saying where/why it’s trying to access past the end of the vector.

                                In these sorts of situations, it’s helpful to run the program under a debugger and look at a backtrace. Typically I run under gdb. (gdb docking_protocol.default.linuxgccdebug; run ; backtrace; q; ). The full backtrace should tell where in the code it’s erroring out, and looking at the surrounding code may give hints as to what’s going wrong.
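
If you would rather not type the gdb commands interactively, the same session can be run in batch mode. This is a rough sketch, assuming gdb is installed and that you pass the same arguments you normally give the executable; the helper name and the database path are placeholders, not anything Rosetta provides.

```python
import subprocess


def gdb_backtrace_command(executable, program_args):
    """Build a gdb command line that runs the program and prints a backtrace when it stops."""
    return [
        "gdb", "-batch",        # exit after the listed commands have run
        "-ex", "run",           # start the program
        "-ex", "backtrace",     # print the call stack once it has stopped
        "--args", executable, *program_args,
    ]


if __name__ == "__main__":
    cmd = gdb_backtrace_command(
        "docking_protocol.default.linuxgccdebug",
        ["@flags", "-database", "/path/to/rosetta_database"],  # placeholder path
    )
    print(" ".join(cmd))
    # To actually run it (needs gdb and the debug build on this machine):
    # subprocess.run(cmd, check=False)
```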

                              • #8021
                                Anonymous

                                  It looks like it’s failing in trying to figure out which rotamer bin a particular residue is in (see the rotwell_2_packed_rotno() and rotno_2_packed_rotno() functions which occur right before/above the vector[] operator). Most likely there’s some mismatch between the rotamer library you’re using and the residue type specification. It’s hard to say more without a small, self contained example which can be replicated and poked-around-with on a local machine.

                                • #8022
                                  Anonymous

                                    I have tried to make it work on a number of examples. Here’s one example from PDB.
                                    I am trying to locally redock the D-peptide from 2Q3I.pdb on its L-peptide target.
                                    Attached are the PDB and the flags file I’m using.
                                    I had to change some of the amino-acid names: DPN was substituted with DPH,
                                    CSY with DCD and DGL with DGU.

                                    The sequence of the peptide is GacGlGneewftlcaa (I use lower case for D-amino acids)
                                    If I just run it as it is, it fails while reading the dgln rotlib.
                                    I tried to truncate the peptide starting from the N-terminus:
                                    eewftlcaa fails after dtrp.
                                    ftlcaa fails after c but returns “error seqpos >= 1”
                                    aa works fine.

                                    It seems to me that the problem is not related to a particular rotamer library.
                                    For other peptides, dgln and dcys libraries work just fine, and the program fails on others.

                                  • #8025
                                    Anonymous

                                      I’m not quite sure why it’s happening, but it looks like the proximate cause is that the code is calling core::pack::dunbrack::rotamer_from_chi_02() for the non-canonical amino acids. This sort of works for residues with only one chi, but for multi-chi residues it fails because rotamer_from_chi_02() is specialized for canonical amino acids. (The end result is that the rotwell vector gets ‘0’s, which then turn negative in core::pack::dunbrack::SingleResidueDunbrackLibrary::rotwell_2_rotno(), resulting in an out-of-range index.)
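
To make the arithmetic concrete, here is a small illustration of how wells of 0 can end up as a huge unsigned index. The mixed-radix formula and the choice of three chi angles with three bins each are assumptions made purely for illustration and are not copied from the Rosetta source; the real bin counts may differ.

```python
def rotwell_to_rotno(rotwell, bins_per_chi):
    """Illustrative mixed-radix mapping from 1-based wells to a 1-based rotamer number."""
    rotno = 1
    stride = 1
    # The last chi varies fastest, so walk the wells from last to first.
    for well, nbins in zip(reversed(rotwell), reversed(bins_per_chi)):
        rotno += (well - 1) * stride
        stride *= nbins
    return rotno


def as_uint64(value):
    """Reinterpret a (possibly negative) integer as an unsigned 64-bit value."""
    return value % 2**64


if __name__ == "__main__":
    bins = [3, 3, 3]                        # assumed: three chi angles, three bins each
    print(rotwell_to_rotno([1, 1, 1], bins))  # valid wells -> 1
    bad = rotwell_to_rotno([0, 0, 0], bins)   # wells of 0 are out of range
    print(bad, as_uint64(bad))                # -12 18446744073709551604
```

Under these assumptions the result is -12, which stored in an unsigned 64-bit integer becomes 18446744073709551604, the same value that appears as rotno in the backtrace above.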

                                      I’m not sure where things are going off the track, but I’ll email our NCAA expert – but keep in mind that he’s in NYC, so he might be busy with other things for a while.

                                    • #8026
                                      Anonymous

                                        Thanks for the update.
                                        It’s good news that you were able to reproduce the bug.

                                      • #8032
                                        Anonymous

                                          Doug pointed out that the core of the issue is that you shouldn’t be using fa_dun with NCAA’s to begin with. While you have set “-score:weights mm_std” appropriately, it looks like the docking protocol isn’t obeying that completely, so it’s using fa_dun, even though it shouldn’t be.

                                          For a short term fix, it looks like it may be sufficient to change the call to DockingProtocol() in rosetta_source/src/apps/public/docking/docking_protocol.cc to

                                          DockingProtocolOP dp = new DockingProtocol(utility::tools::make_vector1(1), false, false, true, NULL, core::scoring::getScoreFunction ());

                                          You’ll also need to add two headers to the list of includes at the top of the file:

                                          #include <core/scoring/ScoreFunctionFactory.hh>
                                          #include <utility/tools/make_vector1.hh>

                                          If you then recompile, that solves the immediate problem, though additional ones likely will crop up (e.g. the test case now crashes on missing centroid params files)

                                        • #8033
                                          Anonymous

                                            Clarification: Changing the source and recompiling is only necessary if you’re running Rosetta3.3

                                            If you’re running Rosetta3.4, all you need to do is add “-score::pack_weights mm_std” in addition to “-score:weights mm_std”

                                          • #8038
                                            Anonymous

                                              D-disulfides are probably unhandled. Casual inspection of the disulfide code unsurprisingly finds insistence on residues being “aa_cys” in the aa enum. DCYD is set up for disulfides, but it’s aa_unk, not aa_cys, so I suspect the disulfide score never runs. Assuming you have no L-disulfides in your structure, do the disulfide terms all come out as 0 (indicating that they aren’t running)?

                                            • #8053
                                              Anonymous

                                                Hi, I’m sorry, but being a Rosetta newbie, I have no idea how to check whether the disulfide terms come out as 0. Where do I look?

                                              • #8056
                                                Anonymous

                                                  You should probably have a scorefile (probably score.sc) that is basically a whitespace-delimited spreadsheet of energy terms and total score for each result. The scores are also likely to be at the bottom of the file in the .pdb outputs. There will be four terms that start “dslf” that are the disulfide terms:
                                                  dslf_ca_dih
                                                  dslf_cs_ang
                                                  dslf_ss_dih
                                                  dslf_ss_dst

                                                  If those terms are not present, disulfides are not being scored at all (they are in mm_std, so they ought to be present). If those terms have all zero values in each of your pdb structures, then none of your D-cysteine disulfides are being recognized as such, which explains why they aren’t being maintained properly. If you can confirm that you have a D-cysteine disulfide that comes up with zero score (meaning it’s unrecognized), then you can file a bug report that D-cysteine disulfides don’t work, but it’s not a bug I expect to be fixed soon.
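
The check described above can be automated with a short script. This is a sketch only, assuming the scorefile has rows that start with "SCORE:", with the first such row holding the column names; the function names are made up here, and the layout should be checked against your own score.sc.

```python
def disulfide_terms(scorefile_text):
    """Return {term: [values]} for every column whose name starts with 'dslf'."""
    header = None
    rows = []
    for line in scorefile_text.splitlines():
        if not line.startswith("SCORE:"):
            continue
        fields = line.split()[1:]
        if header is None:
            header = fields          # first SCORE: row is assumed to be the header
        elif len(fields) == len(header):
            rows.append(dict(zip(header, fields)))
    names = [name for name in (header or []) if name.startswith("dslf")]
    return {name: [float(row[name]) for row in rows] for name in names}


def diagnose(terms):
    """Summarise the three situations discussed in the post above."""
    if not terms:
        return "absent"      # no dslf columns: disulfides are not scored at all
    if all(value == 0.0 for values in terms.values() for value in values):
        return "all zero"    # columns exist, but no disulfide was recognised
    return "scored"


if __name__ == "__main__":
    with open("score.sc") as handle:   # assumed file name, as mentioned in the post
        print(diagnose(disulfide_terms(handle.read())))
```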

                                                • #8060
                                                  Anonymous

                                                    I don’t have any of these terms in my output score file.
                                                    How do I prevent the application from repacking and breaking the bonds?

                                                  • #8061
                                                    Anonymous

                                                      A) Show me what terms you DO have in your output file

                                                      B) We probably can’t. Assuming the local backbone context is fixed, if you are using a resfile, you may be able to pass NATRO for those residues to prevent packing. I forget what executable you’re using so that might not work either.

                                                    • #8066
                                                      Anonymous

                                                        If you’re using a protocol that’s constraint file aware, you can pass in a manually-made constraint file which enforces the disulfide bond geometry. (You would also need to make sure your weights files have the appropriate constraint term turned on as well.)
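
As a starting point, a constraint file for the sulfur-sulfur distance could be generated like this. Treat everything here as an assumption to verify: the AtomPair/HARMONIC line layout follows the Rosetta constraint file format as I understand it, the 2.02 Å target and 0.1 standard deviation are only rough illustrative values, SG is assumed to be the sulfur atom name in the D-cysteine params, and the residue numbers are hypothetical and must match your pose numbering. This only restrains the S-S distance, not the angles or dihedrals.

```python
def disulfide_constraints(pairs, distance=2.02, sd=0.1, atom="SG"):
    """Write one AtomPair harmonic distance restraint per (residue_a, residue_b) pair."""
    lines = []
    for res_a, res_b in pairs:
        lines.append(
            f"AtomPair {atom} {res_a} {atom} {res_b} HARMONIC {distance:.2f} {sd:.2f}"
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical residue numbers for one D-cysteine pair.
    text = disulfide_constraints([(3, 14)])
    with open("dcys_disulfide.cst", "w") as handle:   # made-up file name
        handle.write(text)
    print(text, end="")
```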

                                                      • #8103
                                                        Anonymous

                                                          I started to get the same error message when running a Rosetta script
                                                          that uses the HBondsToResidue filter for complexes composed of NCAA’s.
                                                          (attached)

                                                          Since there is an energy_cutoff used in the filter, can I somehow make the
                                                          filter aware that it needs to use the mm_std weights, like I did for docking?
                                                          Passing -score:weights mm_std and -score:pack_weights mm_std in the flags file
                                                          doesn’t do the job here.

                                                          Jarek

                                                        • #8105
                                                          Anonymous

                                                            It looks like HbondsToResidueFilter hard-codes standard.wts/score12.wts_patch as the scorefunction to use, so there’s no command-line or tag option which will fix things.

                                                            There are several ways around it, though. The easiest one is to massage the standard.wts and score12.wts_patch files to remove the offending terms. You might even be able to avoid touching the database if you put the changed copies of standard.wts and score12.wts_patch in the current directory (the one you’re running the program in).
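
Making the edited local copies can be done by hand or with a few lines of script. The sketch below assumes each weights line starts with the term name, and it takes fa_dun as the offending term based on the earlier discussion in this thread; other terms may need removing as well, and the helper name is made up.

```python
def drop_terms(weights_text, terms):
    """Remove every line whose first whitespace-separated token is one of `terms`."""
    kept = []
    for line in weights_text.splitlines():
        tokens = line.split()
        if tokens and tokens[0] in terms:
            continue                 # skip the offending score term
        kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Run this on copies placed in the working directory, not on the database originals.
    for name in ("standard.wts", "score12.wts_patch"):
        with open(name) as handle:
            cleaned = drop_terms(handle.read(), {"fa_dun"})
        with open(name, "w") as handle:
            handle.write(cleaned)
```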
