MsaPAD: Multiple Sequence Alignment - Input Submission and email notification

Created: 2014-07-30 15:02:56      Last updated: 2015-06-12 13:39:41

BioVeL – Biodiversity Virtual e-Laboratory

Workflow Documentation

Name: Perform a Multiple DNA sequence alignment coding for multiple/single protein domains

Capacities Programme of Framework 7: EC e-Infrastructure Programme – e-Science Environments - INFRA-2011-1.2.1

Grant Agreement No: 283359

Project Co-ordinator: Mr Alex Hardisty

Project Homepage: [http://www.biovel.eu][1]

[1]: http://www.biovel.eu

## 1 Description

This workflow submits the multiple alignment job. Once the computation has finished, the user receives an email notification giving the address of the output.

The workflow translates DNA sequences according to a user-defined genetic code and one or more open reading frames. It divides each translated sequence at every stop codon into separate sequences, then searches the translated amino acid sequences against a mirror of the PFAM-A conserved domains database using hmmsearch (HMMER 3.0 package). It performs an MSA of sequences coding for either single or multiple protein domains. Sequences that do not code for the same domain(s) as the majority of sequences, and (for multiple-domain coding sequences only) those that do not follow the same domain succession, are discarded. A custom Python 2.7 script then retrieves the position of each domain within a complete sequence, so that each sequence, or fragment of it, can be aligned against its corresponding PFAM domain profile. The resulting protein alignments are back-translated into DNA alignments, which are then merged to build the whole multiple-domain alignment. During merging, aligned DNA blocks are added from 5' to 3'; if sites overlap across domains, the 3' overlapping domain section is discarded and recorded in a separate file, together with the sites that do not match any domain.
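The translation, stop-codon splitting and back-translation steps described above can be illustrated with a minimal sketch. This is not the workflow's own script: the function names `translate`, `split_at_stops` and `back_translate` are hypothetical, only the standard genetic code (code 1) and forward frames 1 to 3 are handled, and the back-translation assumes the aligned protein is the gapped translation of the given DNA in the given frame.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_of(dna, frame=1):
    """Yield the complete codons of dna read in forward frame 1, 2 or 3."""
    seq = dna.upper()[frame - 1:]
    for i in range(0, len(seq) - 2, 3):
        yield seq[i:i + 3]


def translate(dna, frame=1):
    """Translate dna; stop codons become '*', unrecognised codons 'X'."""
    return "".join(CODON_TABLE.get(c, "X") for c in codons_of(dna, frame))


def split_at_stops(protein):
    """Divide a translated sequence at each stop codon, dropping empty pieces."""
    return [piece for piece in protein.split("*") if piece]


def back_translate(aligned_protein, dna, frame=1):
    """Thread the original codons onto an aligned protein; gaps become '---'."""
    codons = codons_of(dna, frame)
    return "".join(
        "---" if residue == "-" else next(codons)
        for residue in aligned_protein
    )


if __name__ == "__main__":
    protein = translate("ATGGCCTAAGGGTGA")
    print(protein)                           # MA*G*
    print(split_at_stops(protein))           # ['MA', 'G']
    print(back_translate("M-A", "ATGGCC"))   # ATG---GCC
```

Reusing the original codons for back-translation, rather than choosing a codon per amino acid, keeps the DNA alignment faithful to the input sequences.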

## 2 General

**2.1 Name of the workflow and myExperiment + BiodiversityCatalogue identifiers**

Name: Perform a Multiple DNA sequence alignment coding for multiple/single protein domains

Download info: [http://www.myexperiment.org/packs/371.html][2]

BiodiversityCatalogue entry:

**2.2 Date, version and licensing**

Last updated: 30/07/14 @ 16:00:00

Version: 4

Licensing: Creative Commons Attribution ShareAlike (CC-BY-SA)

**2.3 How to cite this workflow**

These results come from the processing of public data (public repositories) through BioVeL's services ([www.biovel.eu][3]). BioVeL is funded by the EU’s Seventh Framework Program, grant no. 283359. To cite the workflow, use the article [http://journal.embnet.org/index.php/embnetjournal/article/view/557][4]

[2]: http://www.myexperiment.org/packs/371.html
[3]: http://www.biovel.eu
[4]: http://journal.embnet.org/index.php/embnetjournal/article/view/557

## 3. Scientific Specifications

**3.1. Keywords:** MSA, coding sequences, back-translation, merger, MrBayes.

**3.2. Scientific workflow description:**

Arguments:

  1. DNA sequences coding for single or multiple protein domains, in FASTA format
  2. One of the following genetic codes: 1, 6, 10, 12, 15, 2, 3, 4, 5, 9, 13, 14, 16, 21, 22, 23, 11, 25
  3. Reading frame(s)

The different steps of a complete phylogenetic inference are divided in this pack as follows:

An email address and a filename must also be provided, so that the user can be notified when the computation is complete and the results are available for download.
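As an illustration of the arguments listed above, the following sketch checks a set of inputs before submission. It is not part of the workflow: the names `parse_fasta` and `check_arguments` are hypothetical, and the accepted reading-frame values (1, 2 and 3) are an assumption, since the documentation does not state how frames are encoded. The set of genetic codes is taken from the list above.

```python
# Genetic codes listed in the workflow documentation.
SUPPORTED_GENETIC_CODES = {
    1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 13, 14, 15, 16, 21, 22, 23, 25,
}
# Assumption: frames are given as 1, 2 or 3 (forward strand only).
ASSUMED_FRAMES = {1, 2, 3}


def parse_fasta(text):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before the first FASTA header")
        else:
            records[header] += line.upper()
    return records


def check_arguments(fasta_text, genetic_code, frames):
    """Validate the three arguments and return the parsed FASTA records."""
    if genetic_code not in SUPPORTED_GENETIC_CODES:
        raise ValueError("unsupported genetic code: %r" % (genetic_code,))
    bad_frames = [f for f in frames if f not in ASSUMED_FRAMES]
    if not frames or bad_frames:
        raise ValueError("invalid reading frame(s): %r" % (bad_frames or frames,))
    records = parse_fasta(fasta_text)
    if not records:
        raise ValueError("no FASTA records found")
    return records


if __name__ == "__main__":
    example = ">seq1\nATGGCC\nTAA\n>seq2\natgttt\n"
    print(check_arguments(example, genetic_code=11, frames=[1, 2]))
```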

## 4. Technical Specifications

**4.1. Execution environment and installation requirements**

The workflow has been tested in Taverna Workbench Biodiversity 2.5.0.

[Taverna workbench installation][7]

[7]: http://www.taverna.org.uk/download/workbench/2-5/biodiversity/

## 5. Support

For questions about using the workflow, please write to [support@biovel.eu][8]. For definitions of technical and biological terms, please visit the BioVeL glossary page: [https://wiki.biovel.eu/display/BioVeL/Glossary][9]

[8]: mailto:support@biovel.eu
[9]: https://wiki.biovel.eu/display/BioVeL/Glossary

## 7. Bibliography

  1. Donvito, G., et al. (2012) The BioVeL Project: Robust phylogenetic workflows running on the GRID. Proceedings of the EGI Community Forum 2012/EMI Second Technical Conference (EGICF12-EMITC2). 26-30 March, 2012. Munich, Germany. Published online at http://pos.sissa.it/cgi-bin/reader/conf.cgi?confid=162, id. 29. pp. 29.
  2. Eddy, S.R. (2011) Accelerated Profile HMM Searches, PLoS computational biology, 7, e1002195.

Workflow type: Taverna 2
