Workflow Entry: Mapping OligoNucleotides to an assembly

Created at: 13/02/09 @ 09:05:35      Last updated: 13/02/09 @ 09:08:20
Version 7 (latest) (of 7)

Version created on: 13/02/09 @ 09:05:35 by: Wassinki

Last edited on: 13/02/09 @ 09:08:23 by: Wassinki

Title: Mapping OligoNucleotides to an assembly

Type: Taverna 1



Description

Version info

The former version of the workflow expected BioMart to report a transcript only when the query (the probe, in our case) is entirely contained in an exon of that transcript. However, the BioMart service also returns transcripts when the query does not overlap, or only partially overlaps, an exon in the stretch of the assembly on which a transcript is defined. As a result, too many oligos were classified as having multiple transcripts or multiple genes.
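
The distinction between a hit that is entirely contained in an exon and one that only partially overlaps it can be illustrated with a small interval check. This is a minimal sketch, not the workflow's actual Beanshell filter: the function names, the example coordinates and the 1-based inclusive coordinate convention are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

# (start, end) on the assembly; 1-based inclusive is an assumed convention.
Interval = Tuple[int, int]


def contained_in(hit: Interval, exon: Interval) -> bool:
    """True if the hit lies entirely inside the exon."""
    return exon[0] <= hit[0] and hit[1] <= exon[1]


def overlaps(hit: Interval, exon: Interval) -> bool:
    """True if the hit shares at least one base with the exon."""
    return hit[0] <= exon[1] and exon[0] <= hit[1]


def maps_on_exon(hit: Interval, exons: Iterable[Interval]) -> bool:
    """Keep a hit only when some exon of the transcript fully contains it."""
    return any(contained_in(hit, exon) for exon in exons)


# Hypothetical exon coordinates of one transcript.
exons = [(100, 250), (400, 520)]

print(maps_on_exon((120, 179), exons))  # inside the first exon
print(maps_on_exon((230, 289), exons))  # runs past the end of the first exon
print(maps_on_exon((300, 359), exons))  # lies between the two exons
```

Under these assumptions only the first hit is kept. The second hit overlaps an exon without being contained in it, which is the kind of case the former version did not expect BioMart to report.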

Workflow description

We used RShell in the design process of a zebrafish microarray (supp. info Figure S1 and Figure S2). A microarray with 15k probes of 60-mer oligonucleotides was designed on gene sequences from Vega (http://vega.sanger.ac.uk/Danio_rerio) and Ensembl (http://www.ensembl.org/Danio_rerio/) that are also known in the Zebrafish Information Network (http://zfin.org). For zebrafish, the Vega set is not a subset of the Ensembl set. To judge the agreement that exists between the different annotations of the genome DNA-sequence assemblies, we mapped the Vega-designed probes onto the Ensembl assembly.

The workflow first performs an alignment using the BioMoby Blat and Blast service provided by WUR (www.bioinformatics.nl). Next, for each hit, it tries to find the corresponding transcripts and genes using a BioMart web service. The final task is an analysis task using RShell, which calculates the class to which each oligo belongs:

0 no hit
1 single hit, single transcript, single gene
2 multiple hits, single transcript, single gene, intron spanning
3 multiple hits, single transcript, single gene, possible intron spanning *
4 multiple hits, single transcript, single gene, no intron spanning
5 multiple hits, multiple transcripts, single gene, intron spanning
6 multiple hits, multiple transcripts, single gene, possible intron spanning *
7 multiple hits, multiple transcripts, single gene, no intron spanning
8 single hit, does not meet additional criteria **
9 multiple hits, single transcript, do not meet additional criteria **
10 multiple hits, multiple transcripts, do not meet additional criteria **
11 multiple hits, multiple genes
12 no transcript found but hit(s) meet additional criteria **
13 no transcript found and hit(s) do not meet additional criteria **
14 multiple hits, single transcript, single gene, plus hit without transcript found, and hits meet additional criteria **
* The oligo is below the e-value cut-off of 1e-12, and the intron-spanning criteria are also met.
** Additional criteria: either e-value below 1e-12 or intron spanning.
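
The class table above can be read as a decision procedure. The following is a minimal sketch of one such reading, not the RShell script used by the workflow: the function name, its parameters, the use of a single best e-value per oligo and the order in which the conditions are tested are all assumptions. Combinations that the table does not list return None.

```python
from typing import Optional

E_VALUE_CUTOFF = 1e-12  # cut-off quoted in the footnotes


def classify(n_hits: int, n_transcripts: int, n_genes: int,
             best_e_value: float, intron_spanning: bool,
             hit_without_transcript: bool = False) -> Optional[int]:
    """Return a class code 0-14, or None for a combination not in the table.

    The order of the tests is an assumption; the source does not state it.
    """
    below_cutoff = best_e_value < E_VALUE_CUTOFF
    meets_criteria = below_cutoff or intron_spanning  # footnote **

    if n_hits == 0:
        return 0
    if n_transcripts == 0:
        return 12 if meets_criteria else 13
    if n_genes > 1:
        return 11
    if not meets_criteria:
        if n_hits == 1:
            return 8
        return 9 if n_transcripts == 1 else 10
    if n_hits == 1:
        return 1 if n_transcripts == 1 else None
    if hit_without_transcript:
        return 14 if n_transcripts == 1 else None

    # Multiple hits on a single gene: classes 2-4 (one transcript)
    # or 5-7 (several transcripts), split by the intron-spanning status.
    base = 2 if n_transcripts == 1 else 5
    if intron_spanning and below_cutoff:
        return base + 1  # "possible intron spanning", footnote *
    if intron_spanning:
        return base      # "intron spanning"
    return base + 2      # "no intron spanning"


# Hypothetical oligo: two hits, one transcript, one gene,
# weak e-value but intron spanning.
print(classify(2, 1, 1, 1e-5, True))
```

In this reading, the three intron-spanning sub-classes follow directly from the two footnotes: intron spanning only, intron spanning together with an e-value below the cut-off, and an e-value below the cut-off only.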

To run this workflow, a certificate to access www.bioinformatics.nl needs to be installed (some services use an SSL connection). See the link below for how to install this certificate.

http://www.myexperiment.org/files/148

The myExperiment pack http://www.myexperiment.org/packs/45 contains the workflow, the full input set and a test input set. The full input set is large: it takes about 6 hours on a 3 GHz Linux PC with 24 GB of RAM. The test input set can be run on almost any computer with Taverna and R installed and takes approximately 10 minutes.


Run

Run this Workflow in the Taverna Workbench...

Option 1:

Note: you need to have both the WHIP Launcher and the Taverna myExperiment/WHIP plugin installed on your machine for this to work.

Option 2:

Copy and paste this link into File > 'Open workflow location...'
http://www.myexperiment.org/workflows/603/download?version=7


Workflow Components

Inputs (3)
Processors (20)
Beanshells (28)
Outputs (6)
Links (33)
Coordinations (11)

Version History

Earliest Version:
[1] - Mapping oligonucleotides to an assembly

Created on: Thursday 11 December 2008 @ 12:11:59 (GMT)

Created by: Wassinki

Last edited on: Thursday 11 December 2008 @ 12:25:55 (GMT)

Last edited by: Wassinki

Revision comments:

None

Previous Versions:
[2] - Mapping OligoNucleotides to an assembly

Created on: Thursday 11 December 2008 @ 12:26:50 (GMT)

Created by: Wassinki

Last edited on: Thursday 11 December 2008 @ 12:32:04 (GMT)

Last edited by: Wassinki

Revision comments:

None

[3] - Mapping OligoNucleotides to an assembly

Created on: Friday 19 December 2008 @ 09:41:13 (GMT)

Created by: Wassinki

Revision comments:


----------------------------------------

The newest version only takes into account the probes that have blast hits that map on exons. The BioMart sub workflow has been modified to do this by adding an extra BioMart processor and a beanshell processor to filter those blast hits that map on exons.

----------------------------------------

This workflow maps the input oligo set to an assembly.

It first performs an alignment using the BioMoby Blat and Blast service provided by WUR (www.bioinformatics.nl). Next, for each hit, it tries to find the corresponding transcripts and genes using a BioMart web service. The final task is an analysis task using RShell, which calculates the class to which each oligo belongs:

1 single hit
2-4 multiple hits, single transcript
5-7 multiple hits, multiple transcripts
8 single hit, discarded
9 multiple hits, single transcript, discarded
10 multiple transcripts, discarded*
11 multi gene, discarded
12 no transcript
13 no transcript, discarded
* classified on the criteria intron spanning only, possible intron spanning and no intron spanning.
* hit(s) do not meet high stringency threshold
* no transcript found but hit(s) meet high stringency threshold.

To run this workflow, a certificate to access www.bioinformatics.nl needs to be installed (some services use an SSL connection). See the link below for how to install this certificate.

http://www.myexperiment.org/files/148

The myExperiment pack http://www.myexperiment.org/packs/45 contains the workflow, the full input set and a test input set. The full input set is large: it takes about 6 hours on a 3 GHz Linux PC with 24 GB of RAM. The test input set can be run on almost any computer with Taverna and R installed and takes approximately 10 minutes.

[4] - Mapping OligoNucleotides to an assembly

Created on: Tuesday 03 February 2009 @ 15:27:18 (GMT)

Created by: Wassinki

Last edited on: Tuesday 03 February 2009 @ 15:33:01 (GMT)

Last edited by: Wassinki

Revision comments:

The former version of the workflow expected BioMart to report a transcript only when the query (the probe, in our case) is entirely contained in an exon of that transcript. However, the BioMart service also returns transcripts when the query does not overlap, or only partially overlaps, an exon in the stretch of the assembly on which a transcript is defined. As a result, too many oligos were classified as having multiple transcripts or multiple genes.
 

[5] - Mapping OligoNucleotides to an assembly

Created on: Wednesday 04 February 2009 @ 08:25:36 (GMT)

Created by: Wassinki

Last edited on: Wednesday 04 February 2009 @ 08:26:04 (GMT)

Last edited by: Wassinki

Revision comments:

None

[6] - Analysing workflows

Created on: Friday 13 February 2009 @ 09:03:37 (GMT)

Created by: Wassinki

Revision comments:

None

Latest Version:
[7] - Mapping OligoNucleotides to an assembly

Created on: Friday 13 February 2009 @ 09:05:35 (GMT)

Created by: Wassinki

Last edited on: Friday 13 February 2009 @ 09:08:23 (GMT)

Last edited by: Wassinki

Revision comments:

None



Linked Data

Non-Information Resource URI: http://www.myexperiment.org/workflows/603

