Version 9 (latest) of 9
Version created on: 14/12/08 @ 21:42:40 by: Marco Roos
Last edited on: 14/12/08 @ 21:44:19 by: Marco Roos
Title: BioAID_ProteinDiscovery_filterOnHumanUniprot_perDoc_html
Type: Taverna 1
Description
This workflow finds proteins relevant to the query string via the following steps:

1. A user query: a single gene/protein name, e.g. (EZH2 OR "Enhancer of Zeste").
2. Retrieve documents: finds ‘maximumNumberOfHits’ relevant documents (abstract + title) based on the query (the AIDA service inside is based on Apache’s Lucene).
3. Discover proteins: extracts the proteins mentioned in the set of relevant abstracts with a ‘named entity recognizer’ trained on genomic terms using a Bayesian approach; the AIDA service inside is based on LingPipe. This subworkflow also filters false positives from the discovered proteins by requiring that each discovery has a valid UniProt ID. Martijn Schuemie’s service for this lookup contains only human UniProt IDs, which is why this workflow only works for human proteins.

Workflow by Marco Roos (AID = Adaptive Information Disclosure, University of Amsterdam; http://adaptivedisclosure.org).
Text mining services by Sophia Katrenko and Edgar Meij (AID), and Martijn Schuemie (BioSemantics, Erasmus University Rotterdam).

Changes to our original BioAID_DiseaseDiscovery workflow:

* Stops at protein discovery.
* Uses Martijn Schuemie’s synsets service to
  * add synonyms to the query
  * provide UniProt IDs for discovered proteins
  * filter false positive discoveries: only proteins with a UniProt ID go through; this introduces some false negatives (e.g. discovered proteins with a name shorter than 3 characters)
* Counting of results in various ways, but with no outputs defined in this simplified workflow.
* Output as a simple HTML table.
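The steps described above can be illustrated with a small, self-contained Python sketch. It is not the workflow itself and does not call the AIDA or BioSemantics services: the Lucene-based retrieval, the LingPipe-based recognizer, and the synsets lookup are replaced by toy in-memory stand-ins, and every function name, the sample documents, the lookup table, and the `MIN_NAME_LENGTH` rule are assumptions made for illustration only.

```python
import html
import re

# Toy stand-in for the document index (the real workflow queries a Lucene-based AIDA service).
DOCUMENTS = {
    "doc1": "EZH2 interacts with EED and SUZ12 in the PRC2 complex.",
    "doc2": "Enhancer of Zeste homolog binds HDAC1; AR signalling is unaffected.",
    "doc3": "Unrelated abstract about yeast metabolism.",
}

# Toy stand-in for the human name-to-UniProt lookup (hypothetical table, illustrative entries).
HUMAN_UNIPROT = {
    "EZH2": "Q15910", "EED": "O75530", "SUZ12": "Q15022",
    "HDAC1": "Q13547", "AR": "P10275",
}

# Assumption: models the reported false negatives for names shorter than 3 characters.
MIN_NAME_LENGTH = 3


def retrieve_documents(terms, maximum_number_of_hits):
    """Return IDs of at most maximum_number_of_hits documents mentioning any query term."""
    hits = [doc_id for doc_id, text in DOCUMENTS.items()
            if any(term.lower() in text.lower() for term in terms)]
    return hits[:maximum_number_of_hits]


def discover_proteins(text):
    """Toy recognizer: treat upper-case alphanumeric tokens as candidate protein names."""
    return sorted(set(re.findall(r"\b[A-Z][A-Z0-9]+\b", text)))


def filter_on_human_uniprot(names):
    """Keep only candidates that are long enough and have a human UniProt ID."""
    kept = {}
    for name in names:
        if len(name) < MIN_NAME_LENGTH:
            continue  # short names are dropped, even if they are real proteins
        uniprot_id = HUMAN_UNIPROT.get(name)
        if uniprot_id is not None:
            kept[name] = uniprot_id
    return kept


def run_workflow(terms, maximum_number_of_hits=10):
    """Map each retrieved document ID to its filtered {protein name: UniProt ID} dict."""
    return {doc_id: filter_on_human_uniprot(discover_proteins(DOCUMENTS[doc_id]))
            for doc_id in retrieve_documents(terms, maximum_number_of_hits)}


def to_html_table(per_doc):
    """Render the per-document results as a simple HTML table."""
    rows = ["<tr><th>Document</th><th>Protein</th><th>UniProt ID</th></tr>"]
    for doc_id, proteins in per_doc.items():
        for name, uniprot_id in proteins.items():
            rows.append("<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
                html.escape(doc_id), html.escape(name), html.escape(uniprot_id)))
    return "<table>\n" + "\n".join(rows) + "\n</table>"


if __name__ == "__main__":
    print(to_html_table(run_workflow(["EZH2", "Enhancer of Zeste"])))
```

In this toy run, "PRC2" is dropped because it has no entry in the lookup table (a filtered false positive), while "AR" is dropped only because of its length, which mirrors the kind of false negative mentioned in the list of changes.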
Download
Run
Option 1:
Note: you need to have both the WHIP Launcher and the Taverna myExperiment/WHIP plugin installed on your machine for this to work. See here for information.
Option 2:
Copy and paste this link into File > 'Open workflow location...'
http://www.myexperiment.org/workflows/154/download?version=9
All versions of this Workflow are licensed under the Creative Commons Attribution-Share Alike 3.0 License.
Shared with Groups (2)
Statistics
1358 viewings
1096 downloads
Earliest Version:
[1] - BioAID_ProteinDiscovery_filterOnHumanUniprot_perDoc_html
Previous Versions:
[2] - BioAID_ProteinDiscovery_filterOnHumanUniprot_perDoc_html
Latest Version:
[9] - BioAID_ProteinDiscovery_filterOnHumanUniprot_perDoc_html
Copyright (c) 2007 - 2008 The University of Manchester and University of Southampton
I am not sure I understand what this workflow does.
Can you please add some use case/example of how to use it?
What exactly do you mean by 'proteins relevant to the query string'? Proteins that interact with the query gene? Or proteins that are involved in the same metabolism?
With which data have you tested this workflow? Which queries have you tried?