Created: 2009-05-16 01:06:26      Last updated: 2009-05-16 01:13:18

This workflow is for demonstration purposes only. Please contact the authors if you wish to try it. We will gladly collaborate with you.


This workflow extracts proteins and protein relations from Medline. Extracted protein names (symbols of at least 3 characters) are validated against mouse, rat, and human UniProt symbols, so the results are limited to these species. The workflow follows these basic steps:

  1. it retrieves documents relevant to the query string
  2. it discovers proteins in those documents that are considered relevant to the query string and related to the proteins mentioned in the query (colocation in text mining jargon)
  3. it stores the results in a semantic repository
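
The validation and colocation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow's actual implementation: the symbol set, the function names, and the sample documents are all hypothetical, and the real workflow calls remote AIDA and UniProt services instead of using a local set.

```python
from collections import Counter

# Hypothetical stand-in for the mouse, rat, and human UniProt symbol lists.
KNOWN_SYMBOLS = {"EZH2", "EED", "SUZ12", "HDAC1"}


def validate_symbols(candidates, known=KNOWN_SYMBOLS, min_length=3):
    """Keep candidates of at least min_length characters that match a known symbol."""
    return [c for c in candidates if len(c) >= min_length and c.upper() in known]


def cooccurring_proteins(query_symbol, documents, known=KNOWN_SYMBOLS):
    """Count validated symbols that appear in the same document as query_symbol."""
    counts = Counter()
    for candidates in documents:
        valid = {s.upper() for s in validate_symbols(candidates, known)}
        if query_symbol in valid:
            counts.update(valid - {query_symbol})
    return counts


# Each inner list holds the candidate names found in one made-up abstract.
docs = [["EZH2", "EED", "p5"], ["Ezh2", "SUZ12", "EED"], ["HDAC1", "XYZ"]]
result = cooccurring_proteins("EZH2", docs)
print(result)
```

Counting shared documents is the simplest possible notion of colocation; the workflow itself may weigh relevance differently.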

To support hypothesis formation, the results are added to a repository containing proto-ontologies with biological classes and procedural classes to log evidence. The models are based on RDF and OWL.
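
To give an idea of what logging a discovered relation together with its evidence might look like in RDF, here is a minimal sketch that writes N-Triples lines by hand. The namespace, the class name `ProteinRelation`, the properties `hasParticipant` and `foundInDocument`, and the document identifier are all assumptions made up for illustration; they are not taken from the workflow's actual proto-ontologies.

```python
BASE = "http://example.org/bioaid#"  # hypothetical namespace, not the real model
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(local_name):
    """Wrap a local name in the hypothetical namespace as an N-Triples IRI."""
    return f"<{BASE}{local_name}>"


def literal(text):
    """Format a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def relation_triples(protein_a, protein_b, document_id):
    """Return N-Triples lines for one relation and the document that supports it."""
    relation = iri(f"relation_{protein_a}_{protein_b}_{document_id}")
    return [
        f"{relation} <{RDF_TYPE}> {iri('ProteinRelation')} .",
        f"{relation} {iri('hasParticipant')} {iri(protein_a)} .",
        f"{relation} {iri('hasParticipant')} {iri(protein_b)} .",
        f"{relation} {iri('foundInDocument')} {literal(document_id)} .",
    ]


# "doc-0001" is a placeholder identifier, not a real Medline record.
triples = relation_triples("EZH2", "EED", "doc-0001")
print("\n".join(triples))
```

Keeping the relation as its own resource, rather than a direct link between two proteins, leaves room to attach evidence such as the source document to it.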


Synonyms and UniProt services: Martijn Schuemie, BioSemantics Group, University of Rotterdam, The Netherlands (BioRange project)

Known issues

Occasionally the workflow fails when an intermediate step returns no results (e.g. because of a timeout or a bug in the workflow). This problem will be addressed in Taverna 2 using its stricter list iteration mechanism and the AIDA plugin for Taverna 2. The workflow also contains some elements that are not yet functional; these show as failed when the workflow runs and can be ignored.
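
The failure mode described here, a step that breaks on an empty or timed-out intermediate result, can be illustrated outside Taverna with a short Python sketch. It does not show Taverna's list iteration mechanism; `collect_results` and `fake_protein_lookup` are hypothetical names, and the lookup function merely simulates a remote service.

```python
def collect_results(step, items):
    """Apply step to each item; set aside items that time out or yield nothing."""
    results, skipped = [], []
    for item in items:
        try:
            output = step(item)
        except TimeoutError:
            skipped.append(item)
            continue
        if output:
            results.append((item, output))
        else:
            skipped.append(item)
    return results, skipped


def fake_protein_lookup(doc_id):
    """Hypothetical stand-in for a remote service call."""
    if doc_id == "doc2":
        raise TimeoutError("service did not answer")
    return {"doc1": ["EZH2", "EED"], "doc3": []}[doc_id]


results, skipped = collect_results(fake_protein_lookup, ["doc1", "doc2", "doc3"])
print(results)
print(skipped)
```

Recording the skipped items instead of aborting lets the remaining documents pass through to the next step.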

Please contact us if you have any questions about the workflow, our approach, or if you experience technical difficulties.

Workflow Components

Inputs (4)
Processors (32)
Beanshells (87)
Outputs (7)
Links (112)
Coordinations (8)

Workflow Type

Taverna 1

Version 7 (latest) (of 7)

Comments (1)

  • Friday 27 November 2009 11:43:02 (UTC)

    This workflow may need some work because of a recent server migration... Our apologies.

Other workflows that use similar services (1)

BioAID_ProteinDiscovery_filterOnHumanUnipr... (11)

This workflow finds proteins relevant to the query string via the following steps:

  1. A user query: a single gene/protein name. E.g.: (EZH2 OR "Enhancer of Zeste").
  2. Retrieve documents: finds 'maximumNumberOfHits' relevant documents (abstract+title) based on query (the AIDA service inside is based on Apache's Lucene).
  3. Discover proteins: extract proteins discovered in the set of relevant abstracts with a 'named entity recognizer' trained on genomic terms using a Bayesian approach; the AIDA serv...

Created: 2009-05-28

Credits: Marco Roos (user), Martijn Schuemie (user), AID (network), AID_myGrid_collaboration (network)

Attributions: BioAID_DiseaseDiscovery_RatHumanMouseUniprotFilter (workflow)