From a nucleotide sequence, get the protein translations of the open reading frames (stop to stop) that are longer than a specified minimum length.
EMBOSS getorf is used to find the ORFs and perform the translations. The getorf tool is accessed via Soaplab (see http://www.ebi.ac.uk/Tools/webservices/soaplab/overview).
Finds and extracts open reading frames (ORFs)
0
fasta
http://www.ebi.ac.uk/soaplab/emboss4/services/nucleic_gene_finding.getorf
Ensure the sequence is in fasta format.
Given a sequence or sequence entry identifier (e.g. uniprot:wap_rat), return the sequence in fasta format.
If a sequence identifier in database:identifier format is input, the EBI's WSDbfetch web service (see http://www.ebi.ac.uk/Tools/webservices/services/dbfetch) is used to retrieve the sequence in fasta format. Otherwise the input is assumed to be a sequence and is passed through the Soaplab EMBOSS seqret service to force the sequence into fasta format.
Fails if the workflow input is an identifier (i.e. is not an actual sequence).
org.embl.ebi.escience.scuflworkers.java.FailIfFalse
Fails if the workflow input is a sequence (i.e. is not an identifier).
org.embl.ebi.escience.scuflworkers.java.FailIfTrue
Return true if the input is a sequence or false if the input is a sequence identifier (e.g. uniprot:wap_rat).
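// Length of the first line (or of the whole input if it has no line break).
lineLen = sequence.indexOf("\n");
if(lineLen < 1) {
  lineLen = sequence.length();
}
// An identifier has no fasta header and a ':' on its first line (database:identifier).
if(!sequence.startsWith(">") &&
   sequence.indexOf(":") > 0 &&
   sequence.indexOf(":") < lineLen) {
  is_sequence = "false";
} else {
  is_sequence = "true";
}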
sequence
is_sequence
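The check above can also be exercised outside the workflow. The following is a minimal Java sketch of the same logic, assuming a hypothetical class name InputKind and a boolean return value instead of the "true"/"false" strings the BeanShell script writes to its output port; it is an illustration, not part of the workflow itself.

```java
class InputKind {
    // Mirrors the BeanShell check: the input is treated as an identifier when it
    // does not start with '>' and has a ':' (not at index 0) on its first line.
    static boolean isSequence(String input) {
        int lineLen = input.indexOf('\n');
        if (lineLen < 1) {
            // No line break found (or an empty first line): use the whole input.
            lineLen = input.length();
        }
        int colon = input.indexOf(':');
        boolean looksLikeId = !input.startsWith(">") && colon > 0 && colon < lineLen;
        return !looksLikeId;
    }

    public static void main(String[] args) {
        System.out.println(isSequence("uniprot:wap_rat"));        // identifier
        System.out.println(isSequence(">seq1 note:x\nACGTACGT")); // fasta entry
        System.out.println(isSequence("ACGTACGT"));               // bare sequence
    }
}
```

Note that a fasta header may itself contain a colon, which is why the check tests for the leading '>' before looking at the colon position.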
Fetch the sequence in fasta format from the identifier using EBI's WSDbfetch service (see http://www.ebi.ac.uk/Tools/webservices/services/dbfetch).
fasta
raw
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSDbfetch.wsdl
fetchData
Format sequence into fasta format.
fasta
http://www.ebi.ac.uk/soaplab/emboss4/services/edit.seqret
Either an actual sequence or an entry identifier in database:identifier format (e.g. uniprot:wap_rat).
Sequence in fasta format.
Completed
Fail_if_sequence
fetchData
Scheduled
Running
Completed
Fail_if_identifer
seqret
Scheduled
Running
Input nucleotide sequence. Either the actual sequence (fasta format) or an entry identifier in database:identifier format (e.g. embl:x01153).
The ID of the codon translation table to be used (e.g. 1).
Minimum ORF length to report in base pairs (e.g. 240).
Translations of the ORFs found.
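To illustrate what "stop to stop" and the minimum length input mean, here is a rough Java sketch of the idea. It is not how getorf is implemented: it assumes the standard genetic code (translation table 1) with hard-coded stop codons, scans only the three forward frames, and returns nucleotide regions rather than protein translations. The class name OrfSketch and method findOrfs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class OrfSketch {
    // Stop codons of the standard genetic code (assumed: translation table 1).
    static final Set<String> STOPS = Set.of("TAA", "TAG", "TGA");

    // Returns every stop-to-stop region in the three forward frames that is
    // at least minSize bases long (and at least one codon long).
    static List<String> findOrfs(String dna, int minSize) {
        String seq = dna.toUpperCase();
        int threshold = Math.max(minSize, 3);
        List<String> orfs = new ArrayList<>();
        for (int frame = 0; frame < 3; frame++) {
            int start = frame; // first base after the previous stop codon
            int pos = frame;
            for (; pos + 3 <= seq.length(); pos += 3) {
                if (STOPS.contains(seq.substring(pos, pos + 3))) {
                    if (pos - start >= threshold) {
                        orfs.add(seq.substring(start, pos));
                    }
                    start = pos + 3;
                }
            }
            // Region between the last stop codon and the end of the frame.
            if (pos - start >= threshold) {
                orfs.add(seq.substring(start, pos));
            }
        }
        return orfs;
    }

    public static void main(String[] args) {
        for (String orf : findOrfs("ATGAAATAGCCCGGGTTTTAA", 6)) {
            System.out.println(orf);
        }
    }
}
```

In the workflow itself these steps are delegated to the getorf service, which also applies the chosen codon translation table to produce the protein translations.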