BioAID_EnirchBioModelWithProteinsFromText

Created: 2009-05-16 01:06:26 Last updated: 2009-05-16 01:13:18

This workflow is for demonstration purposes only. Please contact the authors if you wish to try it. We will gladly collaborate with you.

Summary

This workflow extracts proteins and protein relations from Medline. Extracted protein names (symbols of at least 3 characters) are validated against mouse, rat, and human UniProt symbols, so the results are limited to these species. The workflow follows these basic steps:

It retrieves documents relevant to the query string.
It discovers proteins in those documents that are considered relevant to the query string and related to the proteins mentioned in the query (colocation in text-mining jargon).
It stores the results in a semantic repository.

To support hypothesis formation, the results are added to a repository containing proto-ontologies with biological classes and procedural classes to log evidence. The models are based on RDF and OWL.
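
The beanshells listed under Workflow Components take URIs and strings and return N-Triples statements (for example DefineSemanticRelation and DefinePropertyOfInstance). The sketch below shows roughly what such helpers might produce. It is an illustration only: the function names and the example.org URIs are assumptions, not the workflow's own code or ontology, and the literal escaping covers only backslashes and double quotes.

```python
def ntriple_relation(domain_instance_uri, relation_uri, range_instance_uri):
    """Return one N-Triples statement that links two instances."""
    return f"<{domain_instance_uri}> <{relation_uri}> <{range_instance_uri}> ."


def ntriple_property(instance_uri, property_uri, property_string):
    """Return one N-Triples statement that attaches a plain string literal."""
    # Escape backslashes first, then double quotes (other escapes are omitted).
    escaped = property_string.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{instance_uri}> <{property_uri}> "{escaped}" .'


if __name__ == "__main__":
    # Hypothetical URIs, used only to show the shape of the output.
    print(ntriple_relation(
        "http://example.org/instances#doc_1",
        "http://example.org/ontology#references",
        "http://example.org/instances#protein_EZH2",
    ))
    print(ntriple_property(
        "http://example.org/instances#protein_EZH2",
        "http://www.w3.org/2000/01/rdf-schema#label",
        "EZH2",
    ))
```

Statements of this form can be concatenated with newlines into one document before being added to a triple store.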

Acknowledgements:

Synonyms and UniProt services: Martijn Schuemie, BioSemantics Group, University of Rotterdam, The Netherlands (BioRange project)

Known issues

Occasionally the workflow fails when an intermediate step returns no results (e.g. after a timeout or because of a bug in the workflow). This problem will be addressed in Taverna 2 using its stricter list iteration mechanism and the AIDA plugin for Taverna 2. The workflow also contains some elements that are not yet functional. These show as failed when the workflow is run and can be ignored.

Please contact us if you have any questions about the workflow, our approach, or if you experience technical difficulties.

Run

Run this Workflow in the Taverna Workbench...

Option 1:

Copy and paste this link into File > 'Open workflow location...'
http://www.myexperiment.org/workflows/379/download?version=7

Workflow Components

Inputs (4)

Name	Description
query	Biological query, e.g. a protein of interest. See the Lucene documentation (http://lucene.apache.org/) for advanced queries. For terms within double quotes, synonyms of protein names are looked up and added to the query.
max_document_nr	Limits the maximum number of hits the search will return. In Taverna 1, '100' works well, while 1000 and above is likely to halt Taverna 1 due to memory problems. This also depends on the memory setting of the Java virtual machine on the client (usually your local Taverna).
aida_magic_word	A magic word is required to make use of the AIDA semantic repository for BioAID workflows. Please ask Scott Marshall (marshall@science.uva.nl) or Marco Roos (M.Roos1@uva.nl) for the magic word. NB: this semantic repository is for temporary data only. You should expect the repository to be cleared often and without warning.
clear_knowledge_repository	'true' if you would like the knowledge base (triple store) to be cleared and the proto-ontologies reloaded; 'false' otherwise.
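
To make the query input more concrete, here is a minimal sketch of how a quoted term could be expanded with synonyms and then combined with boosts for recent publication years, using Lucene's `OR`, `+` and `^` query syntax. The function names, the `year` field name and the boost values are assumptions chosen for illustration; the workflow's own beanshells (Concat_synonyms, Prioritise_lucene_query) may work differently.

```python
def concat_synonyms(query_term, synonyms):
    """Build a Lucene OR group from a term and its synonyms (duplicates dropped)."""
    terms = [query_term]
    for synonym in synonyms:
        if synonym not in terms:
            terms.append(synonym)
    # Quote every term so that multi-word names are searched as phrases.
    quoted = ['"' + term.replace('"', '\\"') + '"' for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def prioritise_by_year(query, newest=2009, oldest=2002, field="year"):
    """Require the query and add optional, boosted clauses for recent years."""
    boosts = []
    for year in range(newest, oldest - 1, -1):
        boost = year - oldest + 1  # the newest year gets the highest boost
        boosts.append(f"{field}:{year}^{boost}")
    return f"+{query} ({' OR '.join(boosts)})"


if __name__ == "__main__":
    expanded = concat_synonyms("EZH2", ["Enhancer of Zeste"])
    print(prioritise_by_year(expanded))
```

The required clause keeps the result set limited to documents that match the protein terms, while the optional year clauses only influence ranking.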

Processors (32)

Name	Type	Description
false	stringconstant
MedLineTotalDocCount	stringconstant
MinLogLikelhoodScoreDescription	stringconstant
true	stringconstant
Fail	local
s06_AddProteinRelationToSemanticModel	workflow
negate	beanshell
CountProteinsPerDocument	beanshell
CloneStringToList_NERserviceRunInstance	beanshell
CloneStringToList_DocumentInstance	beanshell
CloneStringToList_NERprocessRunInstance	beanshell
Timestamp	beanshell
s07_AddScoreToSemanticModel	workflow	Add ranking score for discovered protein terms to the semantic model.
s02_AddOriginalQueryToSemanticModel	workflow	Add Query to Semantic model with Sesame service cf example Biological Query
02_RetrieveDocumentsFromMedline	workflow	This workflow applies the search web service from the AIDA toolbox. Comments: this search service is based on Lucene defaults; it may be necessary to optimize the query string to adapt its behaviour to what is most relevant in a particular domain (e.g. for Medline, prioritizing by publication date is useful). Lucene favours shorter sentences, which may be bad for subsequent information extraction.
p03_AddProteinDiscoveryToSemanticModel	workflow
p02_AddDocumentDiscoveryToSemanticModel	workflow
s03b_AddQueryProteinsToSemanticModel	workflow	Add Protein to Semantic model with Sesame service cf example Discovered Proteins
01b_UniProtXrefURLs	workflow	Adds URL cross references to various protein information resources.
s03_AddExpandedQueryToSemanticModel	workflow	Add automatically expanded query to Semantic model.
p01_AddWorkflowToSemanticModel	workflow
05_ScoreExtractedProteins	workflow	This workflow calculates a min log likelihood score for the combination of a discovered protein and a protein of interest (the query protein). Note that at the moment the total count of Medline papers, which is part of the formula, is hard-coded and not exact. Given its size this should not matter much, and certainly not when comparing with other likelihoods calculated using the same value.
s08_AddRdfToRepository	workflow
01b_CalculateQueryFrequency	workflow
s01_AddBiologicalModelToSemanticModel	workflow	Add Query to Semantic model with Sesame service cf example Biological Query
s00_InitializeSemanticStorage	workflow
06_UniProtXrefURLs_iHopBYPASS	workflow	Adds URL cross references to various protein information resources.
04_ExtractProteinRelations_HomoSapiens	workflow	Workflow to extract protein-protein interactions from text, followed by filtering for protein names known to be human protein names. The protein-protein interaction service takes the output in 'IOB' format from applyCRF, which annotates proteins as such in text.
s05_AddDiscoveredProteinToSemanticModel	workflow	Add Protein to Semantic model with Sesame service cf example Discovered Proteins
s04_AddDocToSemanticModel	workflow	Add Document to Semantic model with Sesame service cf example discovered document
01_ProcessQuery	workflow	Workflow to optimize a Lucene document retrieval query to (1) increase the priority of recent years (in decreasing order from 2009 down to 2002) and (2) limit a subsequent search to a specific organism using a MeSH organism tag.
03_ExtractProteins_UniProtValidation	workflow	Workflow to extract proteins from text, followed by filtering for protein names known to be human protein names.
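
The exact formula used by 05_ScoreExtractedProteins is not given on this page. As a stand-in, the sketch below uses the negative log of the ratio between observed and expected co-occurrence (negated pointwise mutual information), with the same four inputs as the MinLogLikelihood beanshell. Treat the formula and the function name as assumptions. Under this assumed formula, an inexact corpus total shifts every score by the same constant, which is consistent with the remark that it matters little when scores are compared with each other.

```python
import math


def neg_log_association(query_frequency, discovered_frequency,
                        query_discovered_frequency, total_frequency):
    """Negative log of observed/expected co-occurrence (assumed formula)."""
    counts = (query_frequency, discovered_frequency,
              query_discovered_frequency, total_frequency)
    if min(counts) <= 0:
        raise ValueError("all counts must be positive")
    # Probability that a document mentions both proteins.
    observed = query_discovered_frequency / total_frequency
    # Probability expected if the two proteins were mentioned independently.
    expected = (query_frequency / total_frequency) * (discovered_frequency / total_frequency)
    return -math.log(observed / expected)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(neg_log_association(100, 50, 10, 1000))
```

Lower (more negative) values indicate that the two proteins co-occur more often than chance would predict under this stand-in measure.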

Beanshells (87)

Name	Inputs	Outputs
InstantiateDiscoveryScore	instance_ontology_url instance_name type_uri label_string comment_string datetime	NTriple_InstanceOf_statement instance_uri
ConcatenateStringList	stringlist delimiter	output
DefineHasDiscoveryScoreRelation	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
DefineTypedPropertyOfInstance	instance_uri property_uri property_string property_type	NTriple_PropertyOfInstance_statement
DefinePropertyOfInstance	property_uri property_string instance_uri	NTriple_PropertyOfInstance_statement
InstantiateQueryInstance	instance_ontology_url instance_name type_uri label_string comment_string datetime	NTriple_InstanceOf_statement instance_uri
ConcatenateRDFstatements	stringlist delimiter	output
ReplaceCharsForQueryID	input	output
SlashDoubleQuotes	input findstring replacestring	output
DefineSemanticRelation	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
DefineBooleanPropertyOfInstance	instance_uri property_uri property_string	NTriple_PropertyOfInstance_statement
DefineSemanticRelationRunComponent	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
InstantiateRunOfNERProcessAndService	instance_ontology_url datetime timestamp_shortstring executed_service_instance_uri executed_process_instance_uri process_service_relation_uri run_of_process_property_uri run_of_service_property_uri process_run_type_uri service_run_type_uri input	NTriple_statements process_run_instance_uri service_run_instance_uri
ConcatenateStringList	stringlist delimiter	output
ConcatenateStringList	stringlist delimiter	output
DefineSemanticRelationRunComponent	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
InstantiateRunOfDocRetrProcessAndService	instance_ontology_url datetime timestamp_shortstring executed_service_instance_uri executed_process_instance_uri process_service_relation_uri run_of_process_property_uri run_of_service_property_uri process_run_type_uri service_run_type_uri input	NTriple_statements process_run_instance_uri service_run_instance_uri
DefineSeeAlsoEntrezUniprotForInstance	instance_uri annotation_property_uri seeAlso_url	NTriple_PropertyOfInstance_statement
DefineSeeAlsoiHopForInstance	instance_uri annotation_property_uri seeAlso_url	NTriple_PropertyOfInstance_statement
InstantiateSemanticType_Protein	instance_ontology_url instance_name type_uri label_string comment_string datetime	NTriple_InstanceOf_statement instance_uri
DefineSeeAlsoiHopQueryForInstance	instance_uri annotation_property_uri seeAlso_url	NTriple_PropertyOfInstance_statement
DefineSemanticRelation_isModelComponent	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
DefineSemanticRelation_references	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
DefineSeeAlsoExpasyForInstance	instance_uri annotation_property_uri seeAlso_url	NTriple_PropertyOfInstance_statement
ConcatenateStringList	stringlist delimiter	output
InstantiateSemanticType_ProteinTerm	instance_ontology_url instance_name type_uri label_string comment_string datetime	NTriple_InstanceOf_statement instance_uri
ProteinTermAnnotations	protein_name uniprot_id	protein_label protein_comment protein_term_label protein_term_comment
ReplaceCharsForQueryID	input	output
DefineSemanticRelation_expansion_of	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
ConcatenateRDFstatements	stringlist delimiter	output
DefinePropertyOfInstance	property_uri property_string instance_uri	NTriple_PropertyOfInstance_statement
DefineSemanticRelation_references	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
SlashDoubleQuotes	input findstring replacestring	output
InstantiateQueryInstance	instance_ontology_url instance_name type_uri label_string comment_string datetime	NTriple_InstanceOf_statement instance_uri
DefineBooleanPropertyOfInstance	instance_uri property_uri property_string	NTriple_PropertyOfInstance_statement
ConcatenateStringList	stringlist delimiter	output
InstantiateRunOfTextMiningProcessAndWorkflow	instance_ontology_url datetime timestamp_shortstring executed_service_instance_uri executed_process_instance_uri process_service_relation_uri run_of_process_property_uri run_of_service_property_uri process_run_type_uri service_run_type_uri input	NTriple_statements process_run_instance_uri service_run_instance_uri
PoiAndDpQuery	poi_query dp_query	poi_and_dp_query
RelativeFrequencyPoiInCorpus	corpus_total poi_count_in_corpus	relative_frequency
validate_query	query	validated_query count_brackets
CloneFrequencies	copy_number input	clones
MinLogLikelihood	query_frequency discovered_frequency query_discovered_frequency total_frequency	minloglikelihood
CountListElements	list	count
RelativeFrequencyPoiInCorpus	corpus_total poi_count_in_corpus	relative_frequency
InstantiateSemanticType	instance_ontology_url instance_name type_uri label_string comment_string	NTriple_InstanceOf_statement instance_uri
ParseToBioModelID	input	output
RepositoryRef	sesame_url repository	bioaid_repository_url
AddQueryToFilename	input_query input_filename	new_filename
FilterTrueProteinPairsByUniProtID	protein1 uniprot1 protein2 uniprot2	true_protein1 true_uniprot1 true_protein2 true_uniprot2
UniProtOrNot2	uniprotIDlist	uniprotID_or_False
UniProtOrNot1	uniprotIDlist	uniprotID_or_False
example_interaction_doc		interaction_doc
ConcatenateRelation	protein_name1 protein_name2 interaction_term uniprot_id1 uniprot_id2	relation id_relation
DefineSeeAlsoiHopForInstance	instance_uri annotation_property_uri seeAlso_url	NTriple_PropertyOfInstance_statement
InstantiateSemanticType_ProteinTerm	instance_ontology_url instance_name type_uri label_string comment_string datetime	NTriple_InstanceOf_statement instance_uri
DefineSemanticRelation_discovered_by	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
DefineSeeAlsoiHopQueryForInstance	instance_uri annotation_property_uri seeAlso_url	NTriple_PropertyOfInstance_statement
DefineDocComponentRelation	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
DefineRelation_NERCRFhasInput	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
DefineSemanticRelation_isModelComponent	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
DefineSeeAlsoExpasyForInstance	instance_uri annotation_property_uri seeAlso_url	NTriple_PropertyOfInstance_statement
DefineRelation_NERCRFhasOutput	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
ProteinTermAnnotations	protein_name uniprot_id	protein_label protein_comment protein_term_label protein_term_comment
InstantiateSemanticType_Protein	instance_ontology_url instance_name type_uri label_string comment_string datetime	NTriple_InstanceOf_statement instance_uri
DefineSemanticRelation_references	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
ConcatenateStringList	stringlist delimiter	output
DefineSeeAlsoEntrezUniprotForInstance	instance_uri annotation_property_uri seeAlso_url	NTriple_PropertyOfInstance_statement
DefineRelation_DocRetr_hasInput	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
ConcatenateStringList	stringlist delimiter	output
DefineSemanticRelation_has_output	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
DefineSemanticRelation_discovered_by	domain_instance_uri relation_uri range_instance_uri	NTriple_Relation_statement
InstantiateSemanticType	instance_ontology_url instance_name type_uri label_string comment_string datetime	NTriple_InstanceOf_statement instance_uri
validatePotentialProteinName	uniProtIDlist potentialProteinName	validatedUniProtIDlist validatedUniProtID validatedProteinName validated
QueryToStringIfNotList	input_list	output
RemoveFalseProteins	validated uniProtIDlist proteinNameList uniProtIDlistList	cleanUniProtIDlist cleanProteinNamelist cleanUniProtIDlistList
Concat_synonyms	synonymlist query_term	new_query
ListFindAndReplace	input findstringlist replacestringlist	output
ReListEmpty	listOfLists_in	listOfLists_out
Prioritise_lucene_query	query_string priority_string	lucene_query
validatePotentialProteinName	uniProtIDlist potentialProteinName	validatedUniProtIDlist validatedUniProtID validatedProteinName validated
Trim_protein_gene	input	output
negate	true_or_false	false_or_true
CountProteinsPerDocument	list	count
CloneStringToList_NERserviceRunInstance	copy_number input	clones
CloneStringToList_DocumentInstance	copy_number input	clones
CloneStringToList_NERprocessRunInstance	copy_number input	clones
Timestamp		now_RFC822 now_short now_ISO8601
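
The validation step described in the Summary (symbols of at least 3 characters, checked against UniProt) could look roughly like the sketch below. The names echo the validatePotentialProteinName and RemoveFalseProteins beanshells, but the logic is an assumption, and the lookup dictionary with placeholder IDs stands in for the UniProt web service used by the workflow.

```python
MIN_SYMBOL_LENGTH = 3  # from the Summary: symbols of at least 3 characters


def validate_potential_protein_name(potential_protein_name, uniprot_id_list):
    """Accept a name only if it is long enough and has at least one UniProt ID."""
    name = potential_protein_name.strip()
    validated = len(name) >= MIN_SYMBOL_LENGTH and len(uniprot_id_list) > 0
    return {
        "validated": validated,
        "validatedProteinName": name if validated else None,
        "validatedUniProtIDlist": list(uniprot_id_list) if validated else [],
        "validatedUniProtID": uniprot_id_list[0] if validated else None,
    }


def remove_false_proteins(candidates, lookup):
    """Keep (name, first UniProt ID) pairs for candidates that validate."""
    kept = []
    for candidate in candidates:
        ids = lookup.get(candidate.strip(), [])
        result = validate_potential_protein_name(candidate, ids)
        if result["validated"]:
            kept.append((result["validatedProteinName"], result["validatedUniProtID"]))
    return kept


if __name__ == "__main__":
    # Placeholder IDs, not real UniProt accessions.
    lookup = {"EZH2": ["PLACEHOLDER_1"], "AR": ["PLACEHOLDER_4"]}
    print(remove_false_proteins(["EZH2", "AR", "cell"], lookup))
```

In this sketch "AR" is rejected because it is shorter than three characters, and "cell" is rejected because the lookup returns no IDs for it.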

Outputs (7)

Name	Description
ProteinURL
PubMed_URL
protein_discovery_score
Protein_name
Relation
RDFtriples_doc_url
BioAID_RDFrepository_URL

Links (112)

Source	Sink
aida_magic_word	s00_InitializeSemanticStorage:aida_magic_word
aida_magic_word	s08_AddRdfToRepository:aida_magic_word
max_document_nr	02_RetrieveDocumentsFromMedline:maxHits
query	01_ProcessQuery:query_string
01_ProcessQuery:UniProtID	s03b_AddQueryProteinsToSemanticModel:uniprot_id
01_ProcessQuery:UniProtIDlist	01b_UniProtXrefURLs:UniProtID
01_ProcessQuery:extended_lucene_query	01b_CalculateQueryFrequency:query
01_ProcessQuery:extended_lucene_query	02_RetrieveDocumentsFromMedline:queryString
01_ProcessQuery:extended_lucene_query	05_ScoreExtractedProteins:query
01_ProcessQuery:extended_lucene_query	s03_AddExpandedQueryToSemanticModel:query
01_ProcessQuery:protein_name	s03b_AddQueryProteinsToSemanticModel:protein_name
01b_CalculateQueryFrequency:poi_count_in_corpus	05_ScoreExtractedProteins:query_frequency
01b_UniProtXrefURLs:EntrezUniProtURL	s03b_AddQueryProteinsToSemanticModel:entrez_pubmed_URL
01b_UniProtXrefURLs:ExpasyUniProtURL	s03b_AddQueryProteinsToSemanticModel:expasy_URL
01b_UniProtXrefURLs:iHopSearchURL	s03b_AddQueryProteinsToSemanticModel:iHop_search_URL
01b_UniProtXrefURLs:iHopSentencesURL	s03b_AddQueryProteinsToSemanticModel:iHop_sentence_URL
02_RetrieveDocumentsFromMedline:abstract	03_ExtractProteins_UniProtValidation:input_text
02_RetrieveDocumentsFromMedline:abstract	04_ExtractProteinRelations_HomoSapiens:input_text
02_RetrieveDocumentsFromMedline:pubmed_URL	s04_AddDocToSemanticModel:pubmed_URL
02_RetrieveDocumentsFromMedline:pubmed_id	s04_AddDocToSemanticModel:pubmed_id
03_ExtractProteins_UniProtValidation:protein_name	05_ScoreExtractedProteins:discovered_protein
03_ExtractProteins_UniProtValidation:protein_name	CountProteinsPerDocument:list
03_ExtractProteins_UniProtValidation:protein_name	s05_AddDiscoveredProteinToSemanticModel:protein_name
03_ExtractProteins_UniProtValidation:uniProtIDlist	06_UniProtXrefURLs_iHopBYPASS:UniProtID
03_ExtractProteins_UniProtValidation:uniprotID	s05_AddDiscoveredProteinToSemanticModel:uniprot_id
04_ExtractProteinRelations_HomoSapiens:protein1	s06_AddProteinRelationToSemanticModel:protein_name1
04_ExtractProteinRelations_HomoSapiens:protein2	s06_AddProteinRelationToSemanticModel:protein_name2
04_ExtractProteinRelations_HomoSapiens:relation_term	s06_AddProteinRelationToSemanticModel:interaction_term
04_ExtractProteinRelations_HomoSapiens:uniprot_id1	s06_AddProteinRelationToSemanticModel:uniprot_id1
04_ExtractProteinRelations_HomoSapiens:uniprot_id2	s06_AddProteinRelationToSemanticModel:uniprot_id2
05_ScoreExtractedProteins:min_log_likelihood	s07_AddScoreToSemanticModel:score_value
06_UniProtXrefURLs_iHopBYPASS:EntrezUniProtURL	s05_AddDiscoveredProteinToSemanticModel:entrez_pubmed_URL
06_UniProtXrefURLs_iHopBYPASS:ExpasyUniProtURL	s05_AddDiscoveredProteinToSemanticModel:expasy_URL
06_UniProtXrefURLs_iHopBYPASS:iHopSearchURL	s05_AddDiscoveredProteinToSemanticModel:iHop_search_URL
06_UniProtXrefURLs_iHopBYPASS:iHopSentencesURL	s05_AddDiscoveredProteinToSemanticModel:iHop_sentence_URL
CloneStringToList_DocumentInstance:clones	s05_AddDiscoveredProteinToSemanticModel:doc_instance_uri
CloneStringToList_NERprocessRunInstance:clones	s05_AddDiscoveredProteinToSemanticModel:protein_discovery_process_run_instance_uri
CloneStringToList_NERserviceRunInstance:clones	s05_AddDiscoveredProteinToSemanticModel:protein_discovery_service_run_instance_uri
CountProteinsPerDocument:count	CloneStringToList_DocumentInstance:copy_number
CountProteinsPerDocument:count	CloneStringToList_NERprocessRunInstance:copy_number
CountProteinsPerDocument:count	CloneStringToList_NERserviceRunInstance:copy_number
MedLineTotalDocCount:value	01b_CalculateQueryFrequency:corpus_total_doc_count
MedLineTotalDocCount:value	05_ScoreExtractedProteins:PubMedTotalDocCount
MinLogLikelhoodScoreDescription:value	s07_AddScoreToSemanticModel:discovery_score_method
Timestamp:now_ISO8601	p01_AddWorkflowToSemanticModel:timestamp_iso8601
Timestamp:now_ISO8601	p02_AddDocumentDiscoveryToSemanticModel:timestamp_iso8601
Timestamp:now_ISO8601	p03_AddProteinDiscoveryToSemanticModel:timestamp_iso8601
Timestamp:now_ISO8601	s02_AddOriginalQueryToSemanticModel:datetime
Timestamp:now_ISO8601	s03_AddExpandedQueryToSemanticModel:datetime
Timestamp:now_ISO8601	s03b_AddQueryProteinsToSemanticModel:datetime
Timestamp:now_ISO8601	s04_AddDocToSemanticModel:datetime
Timestamp:now_ISO8601	s05_AddDiscoveredProteinToSemanticModel:datetime
Timestamp:now_ISO8601	s07_AddScoreToSemanticModel:datetime
Timestamp:now_short	p01_AddWorkflowToSemanticModel:timestamp_shortstring
Timestamp:now_short	p02_AddDocumentDiscoveryToSemanticModel:timestamp_shortstring
Timestamp:now_short	p03_AddProteinDiscoveryToSemanticModel:timestamp_shortstring
clear_knowledge_repository	negate:true_or_false
false:value	s00_InitializeSemanticStorage:do_not_add_to_repository
false:value	s00_InitializeSemanticStorage:do_not_clear_tmp_rdf_file
false:value	s08_AddRdfToRepository:do_not_add_to_repository
query	s00_InitializeSemanticStorage:original_input_query
query	s01_AddBiologicalModelToSemanticModel:BioModelComment
query	s01_AddBiologicalModelToSemanticModel:ModelIdentifyingName
query	s02_AddOriginalQueryToSemanticModel:query
negate:false_or_true	s00_InitializeSemanticStorage:do_not_clear_repository
p01_AddWorkflowToSemanticModel:AIDA_TextMiningWorkflowRun_instance	p02_AddDocumentDiscoveryToSemanticModel:workflow_run_instance_uri
p01_AddWorkflowToSemanticModel:AIDA_TextMiningWorkflowRun_instance	p03_AddProteinDiscoveryToSemanticModel:workflow_run_instance_uri
p02_AddDocumentDiscoveryToSemanticModel:AIDA_DocRetrProcessRun_instance	s04_AddDocToSemanticModel:docretrieval_process_run_instance_uri
p02_AddDocumentDiscoveryToSemanticModel:AIDA_DocRetrServiceRun_instance	s04_AddDocToSemanticModel:docretrieval_service_run_instance_uri
p03_AddProteinDiscoveryToSemanticModel:AIDA_NERProcessRun_instance	CloneStringToList_NERprocessRunInstance:input
p03_AddProteinDiscoveryToSemanticModel:AIDA_NERserviceRun_instance	CloneStringToList_NERserviceRunInstance:input
s00_InitializeSemanticStorage:BioAIDinstances_ontology_url	p01_AddWorkflowToSemanticModel:instance_ontology_url
s00_InitializeSemanticStorage:BioAIDinstances_ontology_url	p02_AddDocumentDiscoveryToSemanticModel:instance_ontology_url
s00_InitializeSemanticStorage:BioAIDinstances_ontology_url	p03_AddProteinDiscoveryToSemanticModel:instance_ontology_url
s00_InitializeSemanticStorage:BioAIDinstances_ontology_url	s01_AddBiologicalModelToSemanticModel:InstanceOntologyURL
s00_InitializeSemanticStorage:BioAIDinstances_ontology_url	s02_AddOriginalQueryToSemanticModel:InstanceOntologyURL
s00_InitializeSemanticStorage:BioAIDinstances_ontology_url	s03_AddExpandedQueryToSemanticModel:InstanceOntologyURL
s00_InitializeSemanticStorage:BioAIDinstances_ontology_url	s03b_AddQueryProteinsToSemanticModel:instance_ontology_url
s00_InitializeSemanticStorage:BioAIDinstances_ontology_url	s04_AddDocToSemanticModel:instance_ontology_url
s00_InitializeSemanticStorage:BioAIDinstances_ontology_url	s05_AddDiscoveredProteinToSemanticModel:instance_ontology_url
s00_InitializeSemanticStorage:BioAIDinstances_ontology_url	s07_AddScoreToSemanticModel:instance_ontology_url
s00_InitializeSemanticStorage:RDFoutput_doc_url	p01_AddWorkflowToSemanticModel:tmp_RDFdoc_fileref
s00_InitializeSemanticStorage:RDFoutput_doc_url	p02_AddDocumentDiscoveryToSemanticModel:tmp_RDFdoc_fileref
s00_InitializeSemanticStorage:RDFoutput_doc_url	p03_AddProteinDiscoveryToSemanticModel:tmp_RDFdoc_fileref
s00_InitializeSemanticStorage:RDFoutput_doc_url	s08_AddRdfToRepository:rdfFile_url
s01_AddBiologicalModelToSemanticModel:biomodel_instance_uri	s02_AddOriginalQueryToSemanticModel:model_instance_uri
s01_AddBiologicalModelToSemanticModel:biomodel_instance_uri	s03_AddExpandedQueryToSemanticModel:model_instance_uri
s01_AddBiologicalModelToSemanticModel:biomodel_instance_uri	s03b_AddQueryProteinsToSemanticModel:model_instance_uri
s01_AddBiologicalModelToSemanticModel:biomodel_instance_uri	s05_AddDiscoveredProteinToSemanticModel:model_instance_uri
s02_AddOriginalQueryToSemanticModel:query_instance	p01_AddWorkflowToSemanticModel:workflow_input
s02_AddOriginalQueryToSemanticModel:query_instance	p02_AddDocumentDiscoveryToSemanticModel:computation_input
s02_AddOriginalQueryToSemanticModel:query_instance	s03_AddExpandedQueryToSemanticModel:original_query_instance_uri
s03_AddExpandedQueryToSemanticModel:query_instance	s04_AddDocToSemanticModel:query_instance_uri
s04_AddDocToSemanticModel:doc_instance_uri	CloneStringToList_DocumentInstance:input
s04_AddDocToSemanticModel:doc_instance_uri	p03_AddProteinDiscoveryToSemanticModel:computation_input
s04_AddDocToSemanticModel:doc_instance_uri	s06_AddProteinRelationToSemanticModel:doc_instance
02_RetrieveDocumentsFromMedline:pubmed_URL	PubMed_URL
03_ExtractProteins_UniProtValidation:protein_name	Protein_name
04_ExtractProteinRelations_HomoSapiens:relation	Relation
05_ScoreExtractedProteins:min_log_likelihood	protein_discovery_score
06_UniProtXrefURLs_iHopBYPASS:EntrezUniProtURL	ProteinURL
s00_InitializeSemanticStorage:BioAID_Repository_url	BioAID_RDFrepository_URL
s00_InitializeSemanticStorage:RDFoutput_doc_url	RDFtriples_doc_url
s00_InitializeSemanticStorage:RDFoutput_doc_url	s01_AddBiologicalModelToSemanticModel:RDF_doc_filename
s00_InitializeSemanticStorage:RDFoutput_doc_url	s02_AddOriginalQueryToSemanticModel:RDF_doc_filename
s00_InitializeSemanticStorage:RDFoutput_doc_url	s03_AddExpandedQueryToSemanticModel:RDF_doc_filename
s00_InitializeSemanticStorage:RDFoutput_doc_url	s03b_AddQueryProteinsToSemanticModel:tmp_rdf_output_fileref
s00_InitializeSemanticStorage:RDFoutput_doc_url	s04_AddDocToSemanticModel:rdf_output_doc_url
s00_InitializeSemanticStorage:RDFoutput_doc_url	s05_AddDiscoveredProteinToSemanticModel:tmp_rdf_output_fileref
s00_InitializeSemanticStorage:RDFoutput_doc_url	s07_AddScoreToSemanticModel:tmp_rdf_doc_ref
s00_InitializeSemanticStorage:Rdf_NTriple_format	s08_AddRdfToRepository:rdf_format
s05_AddDiscoveredProteinToSemanticModel:protein_term_instance	s07_AddScoreToSemanticModel:discovered_instance_uri
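
Several links above route CountProteinsPerDocument:count into the copy_number port of the CloneStringToList processors, whose clones then feed s05_AddDiscoveredProteinToSemanticModel. One plausible reading is that a document instance URI is repeated once per discovered protein so that the two lists line up element by element. The sketch below illustrates that idea; the function names and the example URI are assumptions.

```python
def clone_string_to_list(copy_number, value):
    """Return a list containing `value` repeated `copy_number` times."""
    return [value] * copy_number


def pair_proteins_with_document(doc_instance_uri, protein_names):
    """Pair every discovered protein with a copy of its document URI."""
    count = len(protein_names)  # the role played by CountProteinsPerDocument
    clones = clone_string_to_list(count, doc_instance_uri)
    return list(zip(protein_names, clones))


if __name__ == "__main__":
    # Hypothetical document URI.
    print(pair_proteins_with_document("http://example.org/doc/1", ["EZH2", "SUZ12"]))
```

A document with no discovered proteins yields an empty list here, which is the kind of empty intermediate result mentioned under Known issues.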

Coordinations (8)

Controller	Target
s07_AddScoreToSemanticModel	s08_AddRdfToRepository
Fail	s06_AddProteinRelationToSemanticModel
Fail	04_ExtractProteinRelations_HomoSapiens
05_ScoreExtractedProteins	06_UniProtXrefURLs_iHopBYPASS
s00_InitializeSemanticStorage	01_ProcessQuery
s03_AddExpandedQueryToSemanticModel	s03b_AddQueryProteinsToSemanticModel
p01_AddWorkflowToSemanticModel	s03_AddExpandedQueryToSemanticModel
s03b_AddQueryProteinsToSemanticModel	p02_AddDocumentDiscoveryToSemanticModel
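
Assuming that a coordination means the target may start only after its controller has completed, the table above defines a partial order over processors. The sketch below derives one order that respects these eight constraints, using Python's standard graphlib module (Python 3.9 or later). It ignores the data links, which impose further ordering, and it does not model the Fail processor's special role.

```python
from graphlib import TopologicalSorter

# (controller, target) pairs copied from the Coordinations table.
COORDINATIONS = [
    ("s07_AddScoreToSemanticModel", "s08_AddRdfToRepository"),
    ("Fail", "s06_AddProteinRelationToSemanticModel"),
    ("Fail", "04_ExtractProteinRelations_HomoSapiens"),
    ("05_ScoreExtractedProteins", "06_UniProtXrefURLs_iHopBYPASS"),
    ("s00_InitializeSemanticStorage", "01_ProcessQuery"),
    ("s03_AddExpandedQueryToSemanticModel", "s03b_AddQueryProteinsToSemanticModel"),
    ("p01_AddWorkflowToSemanticModel", "s03_AddExpandedQueryToSemanticModel"),
    ("s03b_AddQueryProteinsToSemanticModel", "p02_AddDocumentDiscoveryToSemanticModel"),
]


def execution_order(coordinations):
    """Return one processor order in which every controller precedes its target."""
    sorter = TopologicalSorter()
    for controller, target in coordinations:
        sorter.add(target, controller)  # the target depends on the controller
    return list(sorter.static_order())


if __name__ == "__main__":
    for name in execution_order(COORDINATIONS):
        print(name)
```

Many valid orders exist; the sorter returns just one of them.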

Information Workflow Type

Taverna 1

Information Uploader

Marco Roos

Information License

All versions of this Workflow are licensed under:

Information Version 7 (latest) (of 7)

Information Credits (8)

(People/Groups)

Information Attributions (0)

(Workflows/Files)

None

Information Tags (10)

Uploader tags

Information Shared with Groups (11)

Information Featured In Packs (2)

Information Attributed By (0)

(Workflows/Files)

None

Information Favourited By (0)

No one

Information Statistics

8448 viewings

3503 downloads

Citations (0)

None

Version History

In chronological order:

BioAID_ProteinDiscovery_HomoSapiens

Created by Marco Roos on Friday 22 August 2008 11:00:29 (UTC)

Last edited by Marco Roos on Tuesday 26 August 2008 15:56:58 (UTC)

BioAID_EnirchBioModelWithProteinsFromText

Created by Marco Roos on Friday 22 August 2008 11:00:29 (UTC)

Last edited by Marco Roos on Tuesday 26 August 2008 16:05:11 (UTC)

Revision comment:

Protein discovery workflow that stores instances in a semantic model that separates biology (intensional) knowledge from procedural (extensional) knowledge.
Semantic types in the semantic sub workflows are obtained provisionally using strings produced by a workflow (GetFromSesame.xml) that gets types from a Sesame repository containing the template ontologies.

BioAID_EnirchBioModelWithProteinsFromText

Created by Marco Roos on Friday 22 August 2008 11:00:29 (UTC)

Last edited by Marco Roos on Wednesday 27 August 2008 23:15:01 (UTC)

Revision comment:

Minor update. Changed the description.

BioAID_EnirchBioModelWithProteinsFromText

Created by Marco Roos on Wednesday 29 October 2008 09:14:51 (UTC)

Revision comment:

Adjustments for semantic model updates.

BioAID_EnirchBioModelWithProteinsFromText

Created by Marco Roos on Wednesday 29 October 2008 09:27:18 (UTC)

Last edited by Marco Roos on Friday 15 May 2009 16:38:18 (UTC)

Revision comment:

Minor updates to get the syncing of document instances and proteins right.

BioAID_EnirchBioModelWithProteinsFromText

Created by Marco Roos on Saturday 16 May 2009 00:57:14 (UTC)

Revision comment:

- Temporary switch to development server because of minor unresolved issues with production server
- Changed location of SynSets service.

BioAID_EnirchBioModelWithProteinsFromText

Created by Marco Roos on Saturday 16 May 2009 01:06:26 (UTC)

Last edited by Marco Roos on Saturday 16 May 2009 01:13:19 (UTC)

Revision comment:

1. Temporary return to AIDA development server due to minor issues with production server
2. Changed URL of SynSets service (it moved)

Reviews (0)

No reviews yet

Comments (1)

Marco Roos	Friday 27 November 2009 11:43:02 (UTC)
	This workflow may need some work because of a recent server migration... Our apologies.

Other workflows that use similar services (1)

Taverna 1

Uploader

Marco Roos

BioAID_ProteinDiscovery_filterOnHumanUnipr... (11)

This workflow finds proteins relevant to the query string via the following steps: A user query: a single gene/protein name. E.g.: (EZH2 OR "Enhancer of Zeste"). Retrieve documents: finds 'maximumNumberOfHits' relevant documents (abstract+title) based on query (the AIDA service inside is based on Apache's Lucene) Discover proteins: extract proteins discovered in the set of relevant abstracts with a 'named entity recognizer' trained on genomic terms using a Bayesian approach; the AIDA serv...

Created: 2009-05-28

Credits: Marco Roos Martijn Schuemie AID AID_myGrid_collaboration

Attributions: BioAID_DiseaseDiscovery_RatHumanMouseUniprotFilter