How to obtain alignment file for comparative modeling under Rosetta 3.1?
May 25, 2010 at 8:29 am #600Anonymous
Although Rosetta is best known for its de novo prediction protocol, the Rosetta 3.1 bundle also includes a comparative modeling package, though without any documentation. Clearly one has to provide an alignment file to make Rosetta-CM work. Judging from the code, it seems to accept only one of two alignment formats: grishin or general.
So my question is: what are the “grishin” and “general” alignment formats, and how do I obtain such files? It is easy to guess that “grishin” refers to Dr. Grishin at SWMED, but his website offers so many software packages…
thanks,
len
-
May 28, 2010 at 1:39 am #4458Anonymous
I solved it myself, thanks to the code being open source.
The “general” alignment file format is unbelievably simple:
score nnn.nnn
queryID startPos seq
templID startPos seq
queryID startPos seq
templID startPos seq
--   # starts another alignment
score nnn.nnn
queryID startPos seq
templID startPos seq
queryID startPos seq
templID startPos seq
I do not know which aligner generates this “general” file format, so we need another utility to convert other aligners’ output to it. Rosetta-CM seems to calculate identities and gaps in its own code, although I think every aligner reports that information along with the alignment result.
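A minimal Python sketch of such a conversion utility, assuming only the layout described in this post: a `score` line, then query/template row pairs, with alignments separated by a line containing `--`. The function names and the equal-length check are illustrative assumptions, not part of Rosetta.

```python
def format_general_alignment(score, query_id, templ_id, blocks):
    """Format one alignment in the "general" layout described in the post.

    blocks is a list of (query_start, query_seq, templ_start, templ_seq)
    tuples, one per aligned segment. Hypothetical helper, not Rosetta code.
    """
    lines = [f"score {score:.3f}"]
    for q_start, q_seq, t_start, t_seq in blocks:
        # Assumption: both rows of a segment carry gaps as '-' and are
        # therefore the same length.
        if len(q_seq) != len(t_seq):
            raise ValueError("aligned rows must have equal length")
        lines.append(f"{query_id} {q_start} {q_seq}")
        lines.append(f"{templ_id} {t_start} {t_seq}")
    return "\n".join(lines)


def format_general_file(alignments):
    """Join several alignments, separated by a '--' line."""
    parts = [format_general_alignment(*aln) for aln in alignments]
    return "\n--\n".join(parts) + "\n"
```

Each entry passed to `format_general_file` is a `(score, query_id, templ_id, blocks)` tuple, so the segments reported by any pairwise aligner can be collected into that shape and written out in one call.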
Anyway, thanks to the Rosetta contributors.
len
-
March 20, 2011 at 5:46 pm #5202Anonymous
Hi len,
I was just reading your post. If I understand you correctly, I can use any available tool to align two sequences, and every stretch of those two sequences that aligns can be entered manually in this alignment file. For example, let’s say residues 2:43 of sequence A (seqA_) can be aligned with residues 12:53 of sequence B (seqB1), and sequence A 48:62 with sequence B 56:70. Then my file should look like this:
seqA_ 2 XXXXXXXXXXXX
seqB1 12 XXXXXXXXXXXXX
seqA_ 48 YYYYYYYYYYYYYYYY
seqB1 56 YYYYYYYYYYYYYYYY
And what the algorithm then basically does is use the known crystal structure of sequence B for the aligned parts, and do an ab initio prediction for the rest that has no alignment?
Thanks for your help.
Max
-
March 21, 2011 at 3:34 pm #5214Anonymous
Hi lennylv,
Take a look at the Meiler Lab documentation for homology modeling. It explains how to create alignments and thread using Rosetta, and it provides scripts to automate much of the work. Look under the homology modeling tutorial. I believe that most people use ClustalW for alignments.
http://www.meilerlab.org/index.php/jobs/resources
Steven C.
-
December 11, 2013 at 2:43 pm #9576Anonymous
Hi,
I have a ClustalW alignment file and I’m trying to reformat it to get the correct input for Rosetta3.4, but I can’t figure out what format it wants.
Could you please help? I’ll attach the ClustalW file. If you could please post a description of how to generate the desired input, including a sample input, that would be great.
Thanks,
Sabine
-
December 11, 2013 at 3:44 pm #9577Anonymous
The clustal format, more-or-less like you have, is typically the correct format to use, though the details depend on how exactly you’re intending to use it.
I’m not entirely sure how you’re planning to use it with Rosetta3.4, but if you’re intending to use the rosetta_tools/protein_tools/scripts/thread_pdb_from_alignment.py script to thread a PDB with the alignment, I think the file should work as-is. (Although you may want to edit the names of the sequences to match the names of the PDBs you’re using.)
An example commandline for that threading script (using the weekly releases), with a corresponding alignment file is as follows:
python2.7 ~/Rosetta/tools/protein_tools/scripts/thread_pdb_from_alignment.py --template=2rh1A --target=1u19A --chain=A --align_format=clustal 1u19A.2rh1A.aln 2rh1A.pdb 1u19A_on_2rh1A.pdb
If you have another application you wish to use the alignment file with, let me know and we can figure out the details for that one.
-
December 11, 2013 at 4:02 pm #9579Anonymous
I’m actually trying to build a homology model using the minirosetta application. I’ll attach my flags file and the template file as well. I have one alignment file that I got to work; it’s in a different format, which I created according to the example in the documentation. But when I use it, I get an error message saying that it can’t find the template PDB file (attached). Could you please take a look at that as well? (The names in the flags file are not all correct because I renamed the files to send them to you.) The working_aln file is the one I made according to the documentation. With it, Rosetta doesn’t complain about a sequence length mismatch in the alignment. However, when I start the alignment with the 3 missing residues in the query sequence (the template PDB actually has 3 residues at the beginning, which I deleted to make it work), it gives me a length mismatch error again.
Also, is there a particular naming convention between the tags in the alignment files and the file names?
Thanks!
-
December 11, 2013 at 4:34 pm #9580Anonymous
Thanks!
-
December 11, 2013 at 4:36 pm #9581Anonymous
I’m trying to use minirosetta, as described in my following post. Thanks for your help.
-
December 11, 2013 at 7:32 pm #9586Anonymous
I’m also attaching a file showing the error message about no template being provided, though I clearly have one in my directory and I specified it in the input flags file. This still refers to running minirosetta with the files from my previous post.
Thanks.
-
December 12, 2013 at 12:19 am #9589Anonymous
Right, sorry about pointing you to the Clustal format. I’d forgotten that Rosetta also likes the “grishin” file format (as mentioned in the documentation for the comparative modeling application: https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/d5/d4e/comparative_modeling.html). I think there are some scripts kicking around to convert between them, but I can’t seem to find them at the moment.
Regarding your length mismatch, it looks like a number of residues in your alignment are missing from your template PDB. There are five residues at the C-terminus and a significant chunk missing in the middle (DVPTLCDSACGHNEGSARENS). I’d recommend matching the sequence in the alignment with the sequence in the PDB. (You probably ran into the situation where the reported sequence of the PDB doesn’t actually match what’s structurally present. The FASTA files the PDB gives out are for the sequence of what was crystallized. If there’s missing density, those residues won’t show up structurally, but will show up in the FASTA.) Note that deleting the superfluous residues like you did doesn’t hurt anything, and can possibly help. If you’re still experiencing length problems, try removing the extra C-terminal residues from the template structure and the alignment.
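One way to check for this kind of mismatch is to read the sequence that is structurally present straight from the ATOM records and compare it with the template row of the alignment. The following is a rough Python sketch, not a Rosetta tool: the function names are made up for illustration, it relies only on the fixed-column PDB ATOM layout, and it assumes a single model with standard residues (anything else is reported as `X`; HETATM records such as selenomethionine are skipped).

```python
# Three-letter to one-letter codes for the 20 canonical amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_from_pdb_lines(lines, chain):
    """Return the one-letter sequence of residues that have a CA atom
    in the given chain (hypothetical helper, first model only)."""
    residues = []
    seen = set()
    for line in lines:
        if line.startswith("ENDMDL"):
            break  # stop after the first model
        if not line.startswith("ATOM") or len(line) < 27:
            continue
        # Columns 13-16: atom name; column 22: chain identifier.
        if line[12:16].strip() != "CA" or line[21] != chain:
            continue
        key = line[22:27]  # residue number plus insertion code
        if key in seen:
            continue  # alternate location of a residue already counted
        seen.add(key)
        residues.append(THREE_TO_ONE.get(line[17:20].strip(), "X"))
    return "".join(residues)


def ungapped(aligned_seq):
    """Strip gap characters from one row of an alignment."""
    return aligned_seq.replace("-", "")
```

Passing an open file handle for the template PDB as `lines` and comparing the result with `ungapped()` of the template row in the alignment shows exactly where residues are missing from the structure.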
-
December 12, 2013 at 1:22 pm #9593Anonymous
Thanks. I figured out the length mismatch, but I still don’t know how to create a grishin alignment file other than by hand. Could you please double-check whether you can find something? Also, I started a new thread, because after I solved the length mismatch and created an alignment file by hand, I got a segmentation fault without any additional info. I posted all my input files except the fragment files. Could you please take a look? I’ve been fighting with this for a while now and would like to get it going. Thanks for your help!
Sabine
-