Workflow Entry: hierarchical microarray clustering

Created at: 05/12/08 @ 18:31:53      Last updated: 05/12/08 @ 20:33:37
Version 1 (of 1)

Version created on: 05/12/08 @ 18:31:53 by: Wei Tan

Last edited on: 05/12/08 @ 20:33:39 by: Wei Tan

Title: hierarchical microarray clustering

Type: Taverna 1


Description

To illustrate our caGrid plug-in’s application, we tested it with a microarray hierarchical clustering workflow that involves services hosted at multiple institutions.
Microarrays are a high-throughput technology used to measure the expression of tens of thousands of genes in different tissues or cells. Scientists represent the data from each microarray as a vector (profile) in which each element represents a gene’s expression level. They use clustering analysis to identify similar expression profiles across genes or samples [10]. In particular, hierarchical clustering is popular for grouping microarrays into a multilevel hierarchy in which, at each level, arrays in the same cluster are more similar to each other than to those in different clusters.
To cluster data, the user must identify and retrieve relevant microarrays, preprocess them, and then invoke the hierarchical clustering program. In the past, we might have programmed this sequence of steps using a scripting language such as Perl. Instead, we use Taverna and the caGrid plug-in to identify relevant services, compose those services with additional building blocks (for data transformation), and orchestrate their execution. Our workflow involves three major steps:
1.    Identify and retrieve the microarray data of interest. We used CQL, the query language of caGrid Data Services, to specify the data and retrieve it from a caArray data service hosted at Columbia University (http://cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/CaArrayScrub).
2.    Preprocess, or normalize, the microarray data before clustering. We used a GenePattern analytical service hosted at MIT’s Broad Institute (http://node255.broad.mit.edu:6060/wsrf/services/cagrid/PreprocessDatasetMAGEService), which provides normalization, floor and ceiling thresholding, variation filtering, and other preprocessing functions.
3.    Run hierarchical clustering on the preprocessed data. We invoked the geWorkbench analytical service hosted at Columbia University (http://cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/HierarchicalClusteringMage).
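
To make steps 2 and 3 concrete, here is a minimal local sketch in Python of what the preprocessing and clustering stages compute. It is not the code behind the GenePattern or geWorkbench services. The function names, the threshold values (`floor`, `ceiling`, `min_fold`), the fold-change form of the variation filter, and the choice of Euclidean distance with average linkage are all assumptions made for illustration.

```python
import math

def preprocess(arrays, floor=20.0, ceiling=16000.0, min_fold=3.0):
    """Threshold values to [floor, ceiling], then drop low-variation genes.

    The default thresholds are hypothetical, chosen only for this example.
    """
    clamped = [[min(max(v, floor), ceiling) for v in array] for array in arrays]
    kept_genes = []
    for g in range(len(clamped[0])):
        column = [array[g] for array in clamped]
        # Variation filter: keep a gene only if max/min across arrays is large enough.
        if max(column) / min(column) >= min_fold:
            kept_genes.append(g)
    filtered = [[array[g] for g in kept_genes] for array in clamped]
    return filtered, kept_genes

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hierarchical_cluster(vectors):
    """Average-linkage agglomerative clustering of arrays.

    Returns the hierarchy as nested tuples of array indices.
    """
    # Each cluster is (subtree, list of member indices).
    clusters = [(i, [i]) for i in range(len(vectors))]

    def linkage(c1, c2):
        total = sum(euclidean(vectors[i], vectors[j]) for i in c1[1] for j in c2[1])
        return total / (len(c1[1]) * len(c2[1]))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        _, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        merged = ((clusters[a][0], clusters[b][0]), clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0][0]

# Four toy arrays (rows) measured over three genes (columns).
arrays = [
    [10, 100, 500],
    [12, 110, 480],
    [10, 900, 50],
    [11, 950, 60],
]
filtered, kept_genes = preprocess(arrays)
tree = hierarchical_cluster(filtered)
print(kept_genes)  # [1, 2]
print(tree)        # ((0, 1), (2, 3))
```

In this toy data, gene 0 is clamped to the floor in every array, so the variation filter drops it. The remaining two genes separate arrays 0 and 1 from arrays 2 and 3, which is the two-level hierarchy the sketch returns.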

The Taverna workflow contains an input processor to store the CQL expression, an output processor to store the clustered microarray data (both input and output processors are blue), three caGrid processors (green) representing the three caGrid services just listed, and a few “shim” processors, such as XML splitters and Beanshell scripts, to handle data transformation between services.
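
The role of a shim can be sketched in a few lines of Python: one helper pulls selected elements out of an upstream service's XML response, much as an XML splitter exposes a single field, and another rewraps them in the envelope a downstream service expects. The element names (`QueryResponse`, `Array`, `PreprocessRequest`) and both function names are hypothetical and do not reflect the actual caGrid message schemas.

```python
import xml.etree.ElementTree as ET

def xml_splitter(xml_text, child_tag):
    """Return every element named child_tag, serialized as a string."""
    root = ET.fromstring(xml_text)
    return [ET.tostring(el, encoding="unicode") for el in root.iter(child_tag)]

def wrap_shim(fragments, wrapper_tag):
    """Rewrap XML fragments under the element a downstream service expects."""
    wrapper = ET.Element(wrapper_tag)
    for fragment in fragments:
        wrapper.append(ET.fromstring(fragment))
    return ET.tostring(wrapper, encoding="unicode")

# Hypothetical response from an upstream data service.
response = (
    "<QueryResponse>"
    "<Result><Array id='a1'/></Result>"
    "<Result><Array id='a2'/></Result>"
    "</QueryResponse>"
)

array_fragments = xml_splitter(response, "Array")
request = wrap_shim(array_fragments, "PreprocessRequest")
print(request)
```

In a workflow, each helper would sit on a link between two service processors, so that neither service needs to know the other's message format.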


Run

To run this workflow in the Taverna Workbench, copy and paste this link into File > 'Open workflow location...':
http://www.myexperiment.org/workflows/599/download?version=1


Workflow Components

Inputs (1)
Processors (15)
Beanshells (2)
Outputs (1)
Links (16)
Coordinations (0)


Version History

Earliest Version:
[1] - hierarchical microarray clustering

Created on: Friday 05 December 2008 @ 18:31:53 (GMT)

Created by: Wei Tan

Last edited on: Friday 05 December 2008 @ 20:33:39 (GMT)

Last edited by: Wei Tan

Revision comments:

None

This Workflow only has one version.



Linked Data

Non-Information Resource URI: http://www.myexperiment.org/workflows/599

