Rank Phenotype Terms

Created: 2009-08-10 15:43:48

This workflow counts the number of articles in the PubMed database in which each term occurs, and identifies the total number of articles in the entire PubMed database, so that a term enrichment score can be calculated.
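As a rough illustration of the counting step, the sketch below builds a query for the NCBI E-utilities eSearch service and reads the article count out of its XML reply. Whether the original Taverna workflow issued exactly this request is not shown on this page, so treat the helper names (`esearch_count_url`, `parse_count`) and the sample reply as assumptions made for the example.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Base address of the NCBI eSearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def esearch_count_url(term, db="pubmed"):
    """Build an eSearch URL that asks only for the number of matching records."""
    query = urllib.parse.urlencode({"db": db, "term": term, "rettype": "count"})
    return ESEARCH + "?" + query

def parse_count(xml_text):
    """Extract the integer in the <Count> element of an eSearch reply."""
    root = ET.fromstring(xml_text)
    return int(root.findtext("Count"))

# Hand-written reply used only to exercise the parser; not real PubMed data.
sample_reply = "<eSearchResult><Count>1234</Count></eSearchResult>"
url = esearch_count_url("cancer AND diabetes")
count = parse_count(sample_reply)
```

Fetching `url` (for example with `urllib.request.urlopen`) and passing the response text to `parse_count` would give the per-term article count; the total size of PubMed has to be obtained separately.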

The workflow also takes in a document containing abstracts that are related to a particular phenotype. Scientific terms are then extracted from this text and each is given a weighting according to how often it appears in the document relative to PubMed as a whole. The higher the value, the better the score. This is given as: X = log((a / b) / (c / d))

where:
  a = number of occurrences of individual terms in the phenotype corpus
  b = number of abstracts in the entire phenotype corpus
  c = number of occurrences of individual terms in the entire PubMed
  d = number of articles in the entire PubMed
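A minimal sketch of the phenotype weighting, X = log((a / b) / (c / d)), follows. The description does not state the base of the logarithm, so the natural log is an assumption here, as are the function name `enrichment_score` and the made-up counts.

```python
import math

def enrichment_score(term_hits, corpus_size, pubmed_hits, pubmed_size):
    """Return log((term_hits / corpus_size) / (pubmed_hits / pubmed_size)).

    Natural log is assumed; the workflow description does not give the base.
    """
    if min(term_hits, corpus_size, pubmed_hits, pubmed_size) <= 0:
        raise ValueError("all counts must be positive")
    return math.log((term_hits / corpus_size) / (pubmed_hits / pubmed_size))

# Illustrative counts only: a = 20, b = 100, c = 2000, d = 1000000.
x = enrichment_score(20, 100, 2000, 1000000)
```

A term that is proportionally more frequent in the phenotype corpus than in PubMed gets a positive score; one that is less frequent gets a negative score.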

Once this has been created, the pathways obtained from the QTL and microarray pathway analysis workflows are analysed. The documents from a PubMed search for each pathway are merged into a single document of pathway abstracts. The (unweighted) phenotype terms are then searched for in the pathway corpus. This determines whether the phenotype term is mentioned alongside the given pathway. Each term is then assigned a weight, where a higher value indicates a better score: Y = log((e / f) / (c / d))

where:
  e = number of occurrences of individual terms in the pathway corpus
  f = number of abstracts in the pathway corpus (per pathway)
  c = number of occurrences of individual terms in the entire PubMed
  d = number of articles in the entire PubMed
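The pathway weighting can be sketched in the same way. This example makes several assumptions that the description does not confirm: e is taken as the number of pathway abstracts that contain the term (case-insensitive substring match), the natural log is used, and terms with no hits in the pathway corpus or in PubMed are skipped because the log would be undefined. The function name and the sample data are hypothetical.

```python
import math

def pathway_term_weights(terms, pathway_abstracts, pubmed_counts, pubmed_size):
    """Compute Y = log((e / f) / (c / d)) for each term found in one pathway corpus."""
    f = len(pathway_abstracts)
    lowered = [text.lower() for text in pathway_abstracts]
    weights = {}
    for term in terms:
        # e: abstracts in this pathway corpus that mention the term (assumption).
        e = sum(1 for text in lowered if term.lower() in text)
        # c: PubMed-wide count for the term, supplied by the caller.
        c = pubmed_counts.get(term, 0)
        if e == 0 or c == 0:
            continue  # no weight can be assigned to this term
        weights[term] = math.log((e / f) / (c / pubmed_size))
    return weights

# Made-up abstracts and counts, for illustration only.
abstracts = [
    "Insulin signalling regulates glucose uptake.",
    "Glucose transport in muscle.",
]
counts = {"glucose": 1000, "insulin": 500, "obesity": 800}
y = pathway_term_weights(["glucose", "insulin", "obesity"], abstracts, counts, 100000)
```

Here `y` holds weights for "glucose" and "insulin" only, since "obesity" does not occur in the sample pathway corpus.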

The weighted terms are then given a link score, which is the sum X + Y. This gives the link between the pathway and the phenotype a score (significance value). The higher the score, the more "appropriate/interesting" the link between the pathway and the phenotype.

The terms are also ranked according to the pathways in which they have been given a weight. This is calculated as: W = Sum(X + Y), where the sum is taken over those pathways. The higher the value, the better the score.
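The link score and the final ranking can be sketched as follows, reading W as the sum of X + Y over every pathway in which a term received a weight. That reading, the function names `link_scores` and `rank_terms`, and the sample numbers are assumptions made for illustration.

```python
def link_scores(x_scores, y_scores_by_pathway):
    """Return {pathway: {term: X + Y}} for terms that have both weights."""
    return {
        pathway: {
            term: x_scores[term] + y
            for term, y in y_scores.items()
            if term in x_scores
        }
        for pathway, y_scores in y_scores_by_pathway.items()
    }

def rank_terms(links):
    """Sum the link scores per term across pathways (W) and sort, highest first."""
    totals = {}
    for scores in links.values():
        for term, score in scores.items():
            totals[term] = totals.get(term, 0.0) + score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Made-up weights, for illustration only.
x_scores = {"glucose": 2.0, "insulin": 1.0}
y_scores_by_pathway = {
    "pathway_1": {"glucose": 1.0, "insulin": 0.5},
    "pathway_2": {"glucose": 0.5},
}
links = link_scores(x_scores, y_scores_by_pathway)
ranking = rank_terms(links)
```

With these numbers "glucose" is weighted in two pathways and "insulin" in one, so "glucose" ranks first.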

Workflow type: Taverna 1

Version: 1 (of 1)
