Simple workflow using tmap to find transmembrane regions, using a single sequence as input.
Displays membrane spanning regions
png
http://www.ebi.ac.uk/soaplab/emboss4/services/protein_2d_structure.tmap
For an entry identifier, fetch the sequence; otherwise ensure the sequence is in fasta format.
Given a sequence or sequence entry identifier (e.g. uniprot:wap_rat), return the sequence in fasta format.
If a sequence identifier in database:identifier format is input, the EBI's WSDbfetch web service (see http://www.ebi.ac.uk/Tools/webservices/services/dbfetch) is used to retrieve the sequence in fasta format. Otherwise the input is assumed to be a sequence and is passed through the Soaplab EMBOSS seqret service to force it into fasta format.
Return true if the input is a sequence or false if the input is a sequence identifier (e.g. uniprot:wap_rat).
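// Only the first line is inspected: find where it ends.
lineLen = sequence.indexOf("\n");
if(lineLen < 1) {
lineLen = sequence.length();
}
// A first line that is not a fasta header (">") and contains a colon
// is treated as a database:identifier reference.
if(!sequence.startsWith(">") &&
sequence.indexOf(":") > 0 &&
sequence.indexOf(":") < lineLen) {
is_sequence = "false";
} else {
is_sequence = "true";
}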
sequence
is_sequence
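The check performed by the script above can also be tried outside Taverna. The following is a minimal sketch in plain Java, assuming the same rule (a first line that does not start with ">" and contains a colon is an identifier). The class name InputClassifier, the method name isSequence and the sample inputs other than uniprot:wap_rat are illustrative and not part of the workflow.

```java
// Standalone version of the is_sequence check (names are illustrative).
final class InputClassifier {
    private InputClassifier() {}

    /** Returns true for sequence data, false for a database:identifier reference. */
    static boolean isSequence(String input) {
        // Only the first line is inspected, as in the workflow script.
        int lineLen = input.indexOf('\n');
        if (lineLen < 1) {
            lineLen = input.length();
        }
        int colon = input.indexOf(':');
        // Not a fasta header, and a colon appears within the first line.
        boolean looksLikeIdentifier =
                !input.startsWith(">") && colon > 0 && colon < lineLen;
        return !looksLikeIdentifier;
    }

    public static void main(String[] args) {
        System.out.println(isSequence("uniprot:wap_rat"));          // false
        System.out.println(isSequence(">demo: header\nACDEFGHIK")); // true
        System.out.println(isSequence("ACDEFGHIK"));                // true
    }
}
```

Note that a fasta header may itself contain a colon; the startsWith(">") test is what keeps such input classified as a sequence.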
Fails if the workflow input is an identifier (i.e. is not an actual sequence).
org.embl.ebi.escience.scuflworkers.java.FailIfFalse
Fails if the workflow input is a sequence (i.e. is not an identifier).
org.embl.ebi.escience.scuflworkers.java.FailIfTrue
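The two gate processors make the fetch and reformat branches mutually exclusive: one of them fails for any given input, so only the other branch runs. As a rough sketch of that routing in plain Java, assuming the is_sequence flag has already been computed: the class FastaRouter and the two function parameters are hypothetical stand-ins for the WSDbfetch fetchData and seqret services, and no web service is called here.

```java
import java.util.function.UnaryOperator;

// Sketch of the branch selection only; the service calls are passed in as stand-ins.
final class FastaRouter {
    private FastaRouter() {}

    static String toFasta(String input, boolean isSequence,
                          UnaryOperator<String> fetchById,
                          UnaryOperator<String> reformat) {
        // Sequence input: the identifier branch is blocked, seqret-style reformatting runs.
        // Identifier input: the sequence branch is blocked, the fetch runs.
        return isSequence ? reformat.apply(input) : fetchById.apply(input);
    }
}
```

In the workflow itself the selection is expressed as coordination constraints rather than a conditional: each service processor is only scheduled once its gate processor has completed.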
Fetch the sequence in fasta format from the identifier using EBI's WSDbfetch service (see http://www.ebi.ac.uk/Tools/webservices/services/dbfetch).
fasta
raw
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSDbfetch.wsdl
fetchData
Format sequence into fasta format.
fasta
http://www.ebi.ac.uk/soaplab/emboss4/services/edit.seqret
Either an actual sequence or an entry identifier in database:identifier format (e.g. uniprot:wap_rat).
Sequence in fasta format.
Completed
Fail_if_sequence
fetchData
Scheduled
Running
Completed
Fail_if_identifer
seqret
Scheduled
Running
// Reformat a tmap report into GFF.
import java.util.StringTokenizer;
tmap_gff = ""; // Return GFF
seqId = "";
// Split into sections
StringTokenizer tok1 = new StringTokenizer(tmap_output, "=");
sectionNum = 0;
while(tok1.hasMoreElements()) {
sectionStr = tok1.nextElement();
sectionNum++;
if(sectionNum == 4) { // Details for input sequence
// Split into lines
StringTokenizer tok2 = new StringTokenizer(sectionStr, "\n");
while(tok2.hasMoreElements()) {
lineStr = tok2.nextElement();
if(lineStr.startsWith("# Sequence: ")) { // Sequence ID
StringTokenizer tok3 = new StringTokenizer(lineStr);
fieldCount = 0;
while(tok3.hasMoreElements()) {
fieldStr = tok3.nextElement();
fieldCount++;
if(fieldCount == 3) {
seqId += fieldStr;
}
}
}
}
}
if(sectionNum == 5) { // Details of features
// Split into lines
StringTokenizer tok4 = new StringTokenizer(sectionStr, "\n");
while(tok4.hasMoreElements()) {
lineStr = tok4.nextElement();
if(!(lineStr.length() == 0 || lineStr.startsWith("#") || lineStr.startsWith(" Start"))) {
tmap_gff += seqId + "\ttmap\tTRANSMEM";
// Split into fields
StringTokenizer tok5 = new StringTokenizer(lineStr);
fieldCount = 0;
while(tok5.hasMoreElements()) {
fieldStr = tok5.nextElement();
fieldCount++;
if(fieldCount > 0 && fieldCount < 3) { // Start and stop
tmap_gff += "\t" + fieldStr;
}
}
tmap_gff += "\t.\t.\t.\tEMBOSS tmap\n";
}
}
}
}
tmap_output
tmap_gff
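Each feature line emitted by the script above has nine tab-separated columns: sequence ID, source (tmap), feature type (TRANSMEM), start, end, three placeholder columns (".") and a final free-text column (EMBOSS tmap). A minimal sketch of that line layout in plain Java follows; the class name TmapGff, the method name gffLine and the coordinates used in the example are illustrative, not taken from a real tmap run.

```java
// Builds one GFF-style line with the same column layout as the workflow script.
final class TmapGff {
    private TmapGff() {}

    static String gffLine(String seqId, int start, int end) {
        return String.join("\t",
                seqId,                    // sequence ID from the "# Sequence:" line
                "tmap",                   // source
                "TRANSMEM",               // feature type
                Integer.toString(start),  // start position
                Integer.toString(end),    // end position
                ".", ".", ".",            // score, strand, phase left unset
                "EMBOSS tmap")            // free-text final column
                + "\n";
    }

    public static void main(String[] args) {
        // Made-up coordinates, for illustration only.
        System.out.print(gffLine("LPHN2_RAT", 10, 30));
    }
}
```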
Input sequence to analyse for transmembrane regions. Either the actual sequence (fasta format recommended) or an entry identifier in database:identifier format (e.g. uniprot:LPHN2_RAT).
Output of tmap describing the found transmembrane features.
image/png
Plot showing the tmap score and the predicted transmembrane regions.