User: Marco Roos
Name: Marco Roos
Joined: Saturday 21 July 2007 @ 10:43:23 (GMT)
Last seen: Sunday 22 January 2012 @ 16:17:42 (GMT)
Email (public): m.roos [at] lumc.nl
Website: http://www.lumc.nl/con/6020/39898/909290026392525
Location: Leiden, Netherlands
Marco Roos has been credited 39 times.
Marco Roos has an average rating of 4.0 / 5 (7 ratings in total) for their items.
My role as a biologist and bioinformatician in e-science is to help increase the usefulness of emerging information technologies for biology, while experimenting with new ways to increase insight into mechanisms related to structure and function of DNA in the cell. I experiment with technologies such as workflow, knowledge extraction from text, semantic web and virtual research environments such as myExperiment.
More information on the blog below (originally uploaded as an example for the 'NBIC on workflows' workshop in Lunteren, the Netherlands, March 2008).
Other contact details:
Also see my web page at the University of Amsterdam: http://home.medewerker.uva.nl/m.roos1
Interests:
Structure/function relationship of DNA in the cell, e-science, automated support for modeling biological mechanisms by knowledge extraction and semantic web technology.
Field/Industry: Biology
Occupation/Role(s): PhD, e-(bio)scientist (biology 'power-user'), biology e-Science liaison for NBIC and e-Science organisations
Organisation(s):
Leiden University Medical Centre
University of Amsterdam
NBIC
OMII-UK / myGrid
Note: some items may not be visible to you, due to viewing permissions.
1. Gene expression interpretation by the Global Test
2. BioAID_ProteinDiscovery
3. BioAID_ProteinDiscovery_filterOnHumanUniprot_perDoc_html
4. BioAID_EnirchBioModelWithProteinsFromText
5. BioAID_DiseaseDiscovery_RatHumanMouseUniprotFilter
6. Demo_DiseaseDiscovery_byHumanUniprot_scaffold
7. Retrieve_documents_MR1
8. Retrieve_bio_documents
9. Lucene_bioquery_optimizer_MR1
10. Link_protein_to_OMIM_disease
11. Flatten_and_make_unique
12. Extract_proteins
13. Discover_entities
14. TestIteratorStrategy_withCloning
15. CloneItemsInList
16. TestIteratorStrategy_withNesting
17. TestIterator
18. BioAID_Discover_proteins_from_text_plus_synonyms
19. Discover_proteins_from_text
20. BioAID_ProteinToDiseases
21. CountListElements
22. DiscoverProteinLink
23. ProteinSynonymsToQuery
Original Uploader |
Created: 26/04/11 @ 08:31:51 | Last updated: 26/04/11 @ 08:31:52
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow adds meaning to gene expression values by performing a standard and a literature-weighted Global Test. Gene expression values are expected to come from Affymetrix microarrays, for which RMA normalization and Entrez Gene ID mapping/summation are performed.
Original workflow is by Dennis Leenheer, edits by Marco Roos. Scripts by Kristina Hettne, acknowledging Rob Jellier, Jelle Goeman, and Peter-Bram 't Hoen.
The workflow was created for the LUMC BioSemantics group, part of the Human Gen...
Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 19 times | Downloaded: 0 times Tags (8):
Original Uploader |
Created: 10/05/10 @ 16:21:09 | Last updated: 12/01/12 @ 14:39:37
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
The workflow extracts protein names from documents retrieved from Medline based on a user query (cf. Apache Lucene syntax). The protein names are filtered by checking whether a valid UniProt ID exists for the given protein name.
Rating: 0.0 / 5 (0 ratings) | Versions: 7 | Reviews: 0 | Comments: 1 | Citations: 0 Viewed: 258 times | Downloaded: 116 times Tags (12):
Original Uploader |
Created: 28/05/09 @ 12:21:05
Credits:
Attributions:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow finds proteins relevant to the query string via the following steps:
A user query: a single gene/protein name. E.g.: (EZH2 OR "Enhancer of Zeste").
Retrieve documents: finds 'maximumNumberOfHits' relevant documents (abstract+title) based on query (the AIDA service inside is based on Apache's Lucene)
Discover proteins: extract proteins discovered in the set of relevant abstracts with a 'named entity recognizer' trained on genomic terms using a Bayesian approach; the AIDA serv...
Rating: 0.0 / 5 (0 ratings) | Versions: 11 | Reviews: 0 | Comments: 1 | Citations: 0 Viewed: 441 times | Downloaded: 155 times Tags (9):
Original Uploader |
Created: 16/05/09 @ 01:06:26 | Last updated: 16/05/09 @ 01:13:18
Credits:
License: Creative Commons Attribution-No Derivative Works 3.0 Unported License
This workflow is for demonstration purposes only. Please contact the authors if you wish to try it. We will gladly collaborate with you.
Summary
This workflow extracts proteins and protein relations from Medline. Extracted protein names (symbols of at least 3 characters) are validated against mouse, rat, and human UniProt symbols, so the results are limited to these species. The workflow takes the following basic steps:
it retrieves documents relevant for the query string
i...
Rating: 0.0 / 5 (0 ratings) | Versions: 7 | Reviews: 0 | Comments: 1 | Citations: 0 Viewed: 142 times | Downloaded: 26 times Tags (10):
Original Uploader |
Created: 15/12/08 @ 20:46:09 | Last updated: 11/08/11 @ 09:22:23
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow finds diseases relevant to the query string via the following steps:
1. A user query: a list of terms or a boolean query - look at the Apache Lucene project for all details. E.g.: (EZH2 OR "Enhancer of Zeste" +(mutation chromatin) -clinical); consider adding 'ProteinSynonymsToQuery' in front of the input if your query is a protein.
2. Retrieve documents: finds 'maximumNumberOfHits' relevant documents (abstract+title) based on query (the AIDA service inside is based on Apa...
Rating: 4.0 / 5 (2 ratings) | Versions: 4 | Reviews: 0 | Comments: 3 | Citations: 0 Viewed: 3851 times | Downloaded: 571 times Tags (9):
Original Uploader |
Created: 10/12/07 @ 23:10:00
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow finds diseases relevant to the query string via the following steps:
A user query: a list of terms or boolean query - look at the Apache Lucene project for all details. E.g.: (EZH2 OR "Enhancer of Zeste" +(mutation chromatin) -clinical); consider adding 'ProteinSynonymsToQuery' in front of the input if your query is a protein.
Retrieve documents: finds 'maximumNumberOfHits' relevant documents (abstract+title) based on query (the AIDA service inside is based on Apache's Lucene)...
Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 84 times | Downloaded: 63 times Tags (4):
Original Uploader |
Created: 10/12/07 @ 22:17:00
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow applies the search web service from the AIDA toolbox.
Comments:
This search service is based on Lucene defaults; it may be necessary to optimize the query string to adapt the behaviour to what is most relevant in a particular domain (e.g. for Medline, prioritizing based on publication date is useful). Lucene favours shorter sentences, which may be bad for subsequent information extraction.
Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 56 times | Downloaded: 34 times Tags (4):
Original Uploader |
Created: 10/12/07 @ 22:15:54 | Last updated: 10/12/07 @ 22:45:49
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow retrieves relevant documents, based on a query optimized by adding a string to the original query that will rank the search output according to the most recent years. The added string adds years with priorities (most recent is highest); it starts at 2007.
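As a rough illustration of the year-prioritizing idea described above, the following Python sketch appends boosted year terms to a query using Lucene's `term^boost` syntax. The field name `year`, the number of years, and the boost values are assumptions made for this example; the string actually added by the workflow is not shown on this page.

```python
def add_year_boosts(query, start_year=2007, n_years=5, field="year"):
    """Return a Lucene query that ranks recent documents higher.

    The original query becomes a required clause (+), and each year is
    added as an optional clause with a boost. `field`, `n_years` and the
    boost values are hypothetical, chosen only for illustration.
    """
    clauses = []
    for offset in range(n_years):
        year = start_year - offset
        boost = n_years - offset  # most recent year gets the highest boost
        clauses.append(f"{field}:{year}^{boost}")
    return f"+({query}) {' '.join(clauses)}"


optimized = add_year_boosts('EZH2 OR "Enhancer of Zeste"', n_years=3)
print(optimized)
```

Making the original query a required clause keeps the year terms from matching documents on their own; they only influence the ranking.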
Rating: 4.5 / 5 (2 ratings) | Versions: 2 | Reviews: 0 | Comments: 2 | Citations: 0 Viewed: 114 times | Downloaded: 42 times Tags (5):
Original Uploader |
Created: 10/12/07 @ 22:14:26
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow does four things:
it retrieves documents relevant for the query string
it discovers entities in those documents; these are considered relevant entities
it filters proteins from those entities (on the tag protein_molecule)
it removes all terms from the list produced by 3 (query terms temporarily considered proteins)
ToDo
Replace step 4 by the following procedure:
1. remove the query terms from the output of NER (probably by a regexp matching on what is inside the tag, ...
Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 46 times | Downloaded: 30 times Tags (4):
Original Uploader |
Created: 10/12/07 @ 22:13:35
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
No description
Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 134 times | Downloaded: 53 times Tags (4):
Original Uploader |
Created: 10/12/07 @ 22:12:00
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
No description
Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 49 times | Downloaded: 29 times Tags (4): |
Original Uploader |
Created: 10/12/07 @ 22:10:51 | Last updated: 10/12/07 @ 22:30:53
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow filters protein_molecule-labeled terms from an input string (or list of strings). The result is a tagged list of proteins (disregarding false positives in the input).
Internal information:
This workflow is a copy of 'filter_protein_molecule_MR3' used for the NBIC poster (now in Archive).
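A minimal Python sketch of this kind of filtering follows. It assumes that entities are wrapped in tags such as `<protein_molecule>...</protein_molecule>`; the exact output format of the AIDA service is not shown on this page, and the `<other_name>` tag in the sample input is hypothetical. Plain string matching is used rather than an XML parser, so terms containing `/` characters do not cause problems.

```python
import re

TAG = "protein_molecule"
# Non-greedy match of one tagged term, including its tags.
PATTERN = re.compile(rf"<{TAG}>.*?</{TAG}>", re.DOTALL)


def extract_proteins(tagged_texts):
    """Return all protein_molecule-tagged terms, tags included."""
    if isinstance(tagged_texts, str):
        tagged_texts = [tagged_texts]  # accept a single string or a list
    proteins = []
    for text in tagged_texts:
        proteins.extend(PATTERN.findall(text))
    return proteins


sample = [
    "<protein_molecule>EZH2</protein_molecule> binds <other_name>chromatin</other_name>",
    "<protein_molecule>CD4/CD8</protein_molecule> was measured",
    "no entities here",
]
print(extract_proteins(sample))
```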
Rating: 0.0 / 5 (0 ratings) | Versions: 2 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 106 times | Downloaded: 53 times Tags (4):
Original Uploader |
Created: 10/12/07 @ 21:48:33 | Last updated: 10/12/07 @ 22:54:42
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow contains the 'Named Entity Recognize' web service from the AIDA toolbox, created by Sophia Katrenko. It can be used to discover entities of a certain type (determined by 'learned_model') in documents provided in a Lucene output format.
Known issues:
The output of NErecognize contains concepts with / characters, which break the XML. For post-processing its results it is better to use string manipulation than XML manipulation.
The output is per document, which means entities will ...
Rating: 5.0 / 5 (1 rating) | Versions: 2 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 94 times | Downloaded: 47 times Tags (4):
Original Uploader |
Created: 29/11/07 @ 15:35:48 | Last updated: 29/11/07 @ 15:40:30
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow implements a strategy for this problem:
> I would like to perform an iteration including a dot product between
> a list and a list of lists; example:
> Input:
>
> [1] (1)
> [A,B,C] (2)
> [[a,b],[c,d],[e,f]] (3)
>
> Desired output:
>
> [1Aa, 1Ab, 1Bc, 1Bd, 1Ce, 1Cf]
In this implementation a Java BeanShell script is used to clone the items in list 2 as many times per item as there are items in the sublists of list 3. The iteration stra...
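The cloning strategy can be sketched outside Taverna in a few lines of Python. This is an illustration of the idea only, not the BeanShell script used in the workflow; the function name `clone_and_dot` is hypothetical.

```python
def clone_and_dot(prefixes, items, sublists):
    """Pair each item with every element of its own sublist.

    Each item of `items` is cloned once per element of the matching
    sublist; the clones are then combined element by element (a dot
    product) with the flattened sublists.
    """
    if len(items) != len(sublists):
        raise ValueError("need exactly one sublist per item")
    cloned = [item for item, sub in zip(items, sublists) for _ in sub]
    flat = [x for sub in sublists for x in sub]
    return [f"{p}{a}{b}" for p in prefixes for a, b in zip(cloned, flat)]


result = clone_and_dot(
    ["1"],
    ["A", "B", "C"],
    [["a", "b"], ["c", "d"], ["e", "f"]],
)
print(result)  # ['1Aa', '1Ab', '1Bc', '1Bd', '1Ce', '1Cf']
```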
Rating: 5.0 / 5 (1 rating) | Versions: 2 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 64 times | Downloaded: 48 times Tags (6): |
Original Uploader |
Created: 29/11/07 @ 15:34:52
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
Utility workflow that clones an item copy_number times. You can use this to work around standard iteration strategies, e.g. in combination with the CountListElements workflow.
Workflow examples: TestIteratorStrategy_withCloning. For an alternative approach see TestIteratorStrategy_withNesting.
Example I/O:
input: A
copy_number: 3
result: [A,A,A]
input: [A,B,C]
copy_number: 3
result: [[A,A,A],[B,B,B],[C,C,C]]
input: [A,B,C]
copy_number: [3,2]
result: [[[A,A,A],[A,A]],[[B,B,B],[B,B]],[[C,C,C],...
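The example I/O above can be reproduced with a short recursive Python function. This is a sketch under the assumption that list inputs are handled item by item, as Taverna's implicit iteration would do; the function name `clone_items` is hypothetical, and the truncated part of the third result is not asserted here.

```python
def clone_items(value, copy_number):
    """Clone `value` copy_number times, iterating over list inputs."""
    if isinstance(value, list):
        # Iterate over the input list first, one result per item.
        return [clone_items(v, copy_number) for v in value]
    if isinstance(copy_number, list):
        # Then iterate over the list of copy numbers.
        return [clone_items(value, n) for n in copy_number]
    return [value] * copy_number


print(clone_items("A", 3))              # ['A', 'A', 'A']
print(clone_items(["A", "B", "C"], 3))  # [['A', 'A', 'A'], ['B', 'B', 'B'], ['C', 'C', 'C']]
nested = clone_items(["A", "B", "C"], [3, 2])
print(nested[0])                        # [['A', 'A', 'A'], ['A', 'A']]
```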
Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 1 | Citations: 0 Viewed: 52 times | Downloaded: 35 times Tags (5):
Original Uploader |
Created: 29/11/07 @ 15:31:32
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
Implementation of the iteration workaround by Tom Oinn, as described in the Q&A below. The nested workflow is named 'NestedProcessor' to match Tom's explanation. For an alternative solution that uses a Java BeanShell script to clone list items, see 'TestIteratorStrategy_withCloning'.
This workflow implements the following Q&A:
Marco Roos wrote:
> Dear Taverna user,
>
> Issue 1: Complex iteration
>
> I would like to perform an iteration including a dot product between
> a list and a list of li...
Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 56 times | Downloaded: 41 times Tags (5):
Original Uploader |
Created: 28/11/07 @ 15:28:43
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
Workflow to experiment with list iteration strategies. Look at metadata of nested workflow 'Concatenate' to see the current iteration strategy.
Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 80 times | Downloaded: 41 times Tags (2): |
Original Uploader |
Created: 15/11/07 @ 09:40:24
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow discovers proteins from plain text and adds synonyms using Martijn Schuemie's protein synonym service. Proteins are discovered with the AIDA 'Named Entity Recognize' web service by Sophia Katrenko (service based on LingPipe), from whose output the workflow filters the proteins. The Named Entity Recognizer service uses the pre-learned genomics model, named 'MedLine', to find genomics concepts in plain text.
Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 86 times | Downloaded: 2 times Tags (7):
Original Uploader |
Created: 15/11/07 @ 08:58:00 | Last updated: 15/11/07 @ 09:12:34
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow discovers proteins from plain text. It is built around the AIDA 'Named Entity Recognize' web service by Sophia Katrenko (service based on LingPipe), from whose output it filters the proteins. The Named Entity Recognizer service uses the pre-learned genomics model, named 'MedLine', to find genomics concepts in plain text.
Rating: 0.0 / 5 (0 ratings) | Versions: 2 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 172 times | Downloaded: 65 times Tags (7):
Original Uploader |
Created: 14/11/07 @ 12:47:57 | Last updated: 15/11/07 @ 09:00:44
Credits:
Attributions:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow is based on BioAID_DiseaseDiscovery (changes: expects only one protein name, adds protein synonyms).
This workflow finds diseases relevant to the query string via the following steps:
A user query: a single protein name
Add synonyms (service courtesy of Martijn Schuemie, Erasmus University Rotterdam)
Retrieve documents: finds relevant documents (abstract+title) based on query
Discover proteins: extract proteins discovered in the set of relevant abstracts
5. Link proteins ...
Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 175 times | Downloaded: 83 times Tags (8):
Original Uploader |
Created: 17/10/07 @ 14:44:27 | Last updated: 17/10/07 @ 16:13:54
License: Creative Commons Attribution-No Derivative Works 3.0 Unported License
Very simple workflow to count the number of items in a list (top level only in case of nested lists). Does no more than
count = list.size();
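For comparison, the same top-level count looks like this in Python. This is only an illustration of the "top level only" behaviour for nested lists; the workflow itself uses the one-line script shown above.

```python
def count_list_elements(items):
    """Count top-level items only; nested lists count as one item each."""
    return len(items)


nested = [["a", "b"], ["c", "d", "e"], []]
print(count_list_elements(nested))  # 3, not 5
```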
Rating: 0.0 / 5 (0 ratings) | Versions: 5 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 91 times | Downloaded: 61 times Tags (3): |
Original Uploader |
Created: 03/10/07 @ 18:36:12 | Last updated: 15/11/07 @ 09:02:44
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
COMPETITION:
For friends only:
If you find any two topics that return true positives with this workflow I will buy you a bottle of wine (or equivalent).
Terms: if we confirm that the protein was indeed never mentioned together with both input topics in one article, we will publish this together.
----
This workflow implements Swanson's principle with services from the AIDA toolbox. It tries to find proteins that link two topics, while they are never mentioned together with both topics in ...
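The selection logic can be sketched with plain Python sets. This is a toy illustration under assumed inputs, not the workflow's implementation: each document is represented as a dictionary with a set of topics and a set of protein names, and the topic and protein labels are made up.

```python
def linking_proteins(docs, topic_a, topic_b):
    """Proteins seen with topic A and with topic B, but never with both at once."""
    with_a, with_b, with_both = set(), set(), set()
    for doc in docs:
        has_a = topic_a in doc["topics"]
        has_b = topic_b in doc["topics"]
        if has_a and has_b:
            with_both |= doc["proteins"]
        elif has_a:
            with_a |= doc["proteins"]
        elif has_b:
            with_b |= doc["proteins"]
    # Keep candidates that link the topics only indirectly.
    return (with_a & with_b) - with_both


# Hypothetical toy corpus.
docs = [
    {"topics": {"A"}, "proteins": {"P1", "P2"}},
    {"topics": {"B"}, "proteins": {"P1", "P2", "P3"}},
    {"topics": {"A", "B"}, "proteins": {"P2"}},
]
print(linking_proteins(docs, "A", "B"))  # {'P1'}
```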
Rating: 0.0 / 5 (0 ratings) | Versions: 2 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 279 times | Downloaded: 58 times Tags (8):
Original Uploader |
Created: 03/10/07 @ 18:36:10 | Last updated: 13/11/07 @ 23:47:41
Credits:
License: Creative Commons Attribution-Share Alike 3.0 Unported License
This workflow uses Martijn Schuemie's protein synonym service to produce synonyms and a new query string from the input query term. The service is limited to proteins, enzymes and genes. An input query that is a boolean string will be split and processed, but the boolean logic of the input query will be lost.
Workflow URL:
http://rdf.adaptivedisclosure.org/~marco/BioAID/Public/Workflows/BioAID/ProteinSynonymsToQuery.xml
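The behaviour described above (splitting a boolean query, looking up synonyms, and building a new query string without the original boolean logic) might look roughly like the following Python sketch. The synonym table stands in for the remote synonym service and its contents are hypothetical, as is the choice to join all terms with OR.

```python
import re

BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def synonyms_to_query(query, lookup):
    """Expand each query term with its synonyms and join everything with OR.

    `lookup` maps a term to a list of synonyms; it is a stand-in for the
    synonym service. The boolean structure of `query` is discarded.
    """
    # Quoted phrases stay together; parentheses are dropped.
    tokens = re.findall(r'"[^"]+"|[^\s()]+', query)
    expanded = []
    for token in tokens:
        term = token.lstrip("+-").strip('"')
        if not term or term.upper() in BOOLEAN_OPERATORS:
            continue
        for synonym in lookup.get(term, [term]):
            if synonym not in expanded:
                expanded.append(synonym)
    return " OR ".join(f'"{s}"' for s in expanded)


toy_lookup = {"EZH2": ["EZH2", "KMT6"]}  # hypothetical synonym table
new_query = synonyms_to_query('(EZH2 AND "Enhancer of Zeste")', toy_lookup)
print(new_query)  # "EZH2" OR "KMT6" OR "Enhancer of Zeste"
```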
Rating: 0.0 / 5 (0 ratings) | Versions: 2 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 165 times | Downloaded: 67 times Tags (5):
Copyright © 2007 - 2011 The University of Manchester and University of Southampton