Content from the e-LICO group

Showing 37 results.

Workflow Image Mining with RapidMiner (1)

This is an image mining process using the image mining Web service provided by NHRF within e-LICO. It first uploads a set of images found in a directory, then preprocesses the images and visualizes the result. Furthermore, references to the uploaded images are stored in the local RapidMiner repository so they can later be used for further processing without uploading the images again.

Created: 2010-04-28 | Last updated: 2012-01-16

Pack Core text mining workflows


Created: 2010-02-19 10:12:33 | Last updated: 2011-12-13 16:03:17

This pack contains workflows we have created to support core text mining tasks. We currently provide workflows for these tasks: loading documents (text or PDF), PDF to text conversion, sentence splitting, text cleaning (ASCII or XML-valid), and term recognition (using the NaCTeM service TerMine).

7 items in this pack

Comments: 0 | Viewed: 727 times | Downloaded: 136 times

Workflow Termine with c-value threshold (1)

This workflow accepts a list of sentences from a single document and returns the terms found by the TerMine web service. It also allows you to set a threshold c-value score so that only terms with a user-controlled probability (of being a real term) are returned as output. To get sentences to supply to this workflow you can use the sentence splitting workflow. The TerMine service (used in this workflow) only accepts text in ASCII encoding, so you should also use the Clean p...

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: User James Eales
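
The thresholding step described above can be sketched in Python. This is a minimal illustration, not the workflow's own code: the function name `filter_terms`, the (term, score) pair layout, the "at or above" comparison, and the example scores are all assumptions made for this sketch.

```python
from typing import Iterable, List, Tuple


def filter_terms(
    scored_terms: Iterable[Tuple[str, float]], threshold: float
) -> List[Tuple[str, float]]:
    """Keep (term, score) pairs whose score is at or above the threshold.

    Assumption: the comparison is inclusive; the original workflow may differ.
    """
    return [(term, score) for term, score in scored_terms if score >= threshold]


# Hypothetical scores, invented for illustration only.
example_terms = [("text mining", 4.5), ("web service", 2.0), ("single document", 1.0)]
kept = filter_terms(example_terms, threshold=2.0)
```

Raising the threshold returns fewer, higher-scoring candidate terms; lowering it returns more.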

Workflow PDF to plain text (1)

This workflow will extract the plain text content of PDF files supplied to the input port. You can connect the Load PDF from directory workflow to this workflow's input. We recommend you send the output from this workflow to the Clean plain text workflow, because the PDF to text process can add characters to the text that are XML-invalid and therefore cannot be sent to most services as plain text. Another way around this problem is to encode the text as Base64 using the handy loc...

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: User James Eales

Workflow Sentence splitting (1)

This workflow will attempt to split up text into sentences, returning a list of sentences to the output port. The sentence splitting service makes use of the OpenNLP sentence detector and has been trained to work on English text. This workflow can be used to provide input to the Termine with c-value threshold workflow. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: User James Eales

Workflow Terms from collection of PDF files (2)

This workflow will give you a set of candidate terms for each PDF document in a user-specified directory. You can also specify a c-value threshold that will restrict the terms to those with higher scores. This workflow was created using only nested workflows.  These workflow components work on their own and can be linked together to form more complex workflows such as this. You can view the text mining workflow components in this pack. If you receive errors when running this workflow t...

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: User James Eales

Pack Who Wants to be a Data Miner?


Created: 2011-11-02 17:54:07 | Last updated: 2013-09-09 16:22:11

One of the most fun events at the annual RapidMiner Community Meeting and Conference (RCOMM) is the live data mining process design competition "Who Wants to be a Data Miner?". In this competition, participants must design RapidMiner processes for a given goal within a few minutes. The tasks are related to data mining and data analysis, but are rather uncommon. In fact, most of the challenges ask for things RapidMiner was never supposed to do. This pack contains solutions for these...

12 items in this pack

Comments: 0 | Viewed: 260 times | Downloaded: 142 times

Workflow Terms from collection of text files (1)

This workflow will give you a set of candidate terms for each text file in a user-specified directory. You can also specify a c-value threshold that will restrict the terms to those with higher scores. This workflow was created using only nested workflows.  These workflow components work on their own and can be linked together to form more complex workflows such as this. You can view the text mining workflow components in this pack. If you receive errors when running this workflow then...

Created: 2010-02-22 | Last updated: 2011-12-13

Credits: User James Eales

Workflow Load PDF from directory (1)

This workflow will automate the reading of a set of PDF files stored in a single directory (the path to which should be supplied as a single input value). This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.  

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: User James Eales

Workflow Load plain text from directory (1)

This workflow will automate the reading of a set of text files stored in a single directory (the path to which should be supplied as a single input value).  It will assume that the text files are saved using the default character encoding for the system that Taverna is running on.  This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.  

Created: 2010-02-18 | Last updated: 2011-12-13

Credits: User James Eales

Workflow Clean plain text (ASCII) (1)

This workflow will remove any XML-invalid and non-ASCII characters (e.g. for sending to the ASCII-only Termine service) from any text supplied to the input port. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

Created: 2010-02-18 | Last updated: 2011-12-13

Credits: User James Eales
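
The cleaning this workflow describes might look roughly like the following Python sketch. It assumes "XML-invalid" means any character outside the XML 1.0 `Char` production; the function name `clean_text` and its `ascii_only` flag are hypothetical and do not come from the workflow itself.

```python
import re

# Matches any character NOT allowed by the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML_INVALID = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def clean_text(text: str, ascii_only: bool = False) -> str:
    """Remove XML-invalid characters; optionally drop non-ASCII characters too."""
    cleaned = _XML_INVALID.sub("", text)
    if ascii_only:
        # Drop (rather than transliterate) anything outside 7-bit ASCII.
        cleaned = cleaned.encode("ascii", errors="ignore").decode("ascii")
    return cleaned
```

With `ascii_only=False` the sketch only strips XML-invalid characters; with `ascii_only=True` it also drops non-ASCII characters, which suits an ASCII-only service. Note that dropping characters loses information (for example, accented letters disappear instead of being converted).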

Pack e-LICO recommender workflows


Created: 2011-03-15 15:33:48 | Last updated: 2012-01-28 19:39:06

This pack contains recommender system workflows created for the e-LICO project.

6 items in this pack

Comments: 0 | Viewed: 306 times | Downloaded: 162 times

Workflow Clean plain text (1)

This workflow will remove any XML-invalid characters (these characters often appear in the output of PDF to text software) from any text supplied to the input port. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.  

Created: 2010-02-18 | Last updated: 2011-12-13

Credits: User James Eales

Pack Recommender systems workflow templates 2012


Created: 2012-01-08 12:27:43 | Last updated: 2012-06-03 19:48:22

The Recommender Extension can be downloaded from the Rapid-I Marketplace at http://rapidupdate.de:8180/UpdateServer/faces/product_details.xhtml?productId=rmx_irbrecommender . More details can be found at http://elico.rapid-i.com/recommender-extension.html

12 items in this pack

Comments: 0 | Viewed: 478 times | Downloaded: 145 times

Workflow miRNA GFF to entrez gene (1)

This workflow reads a GFF file of miRNA coordinates and uses BioMart to search human Ensembl genes for the gene that codes for the miRNA. The workflow returns a list of miRNAid, chromosome, start, stop, strand, entrez gene id, gene name, gene strand. Example input file here: ftp://mirbase.org/pub/mirbase/CURRENT/genomes/hsa.gff

Created: 2011-01-26 | Last updated: 2012-01-11
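
The first step of this workflow, reading coordinates from a GFF file, can be sketched in Python. This is an illustration under assumptions, not the workflow's code: it relies on the standard nine tab-separated GFF columns, assumes the miRNA identifier is stored under an `ID` attribute written as `key=value`, and uses an invented example record. The BioMart lookup is not shown.

```python
from typing import Dict, Iterable, Iterator


def parse_gff(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one coordinate record per GFF feature line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed feature line
        seqid, _source, _type, start, end, _score, strand, _phase, attributes = fields
        attrs: Dict[str, str] = {}
        for item in attributes.split(";"):
            key, sep, value = item.strip().partition("=")
            if sep:
                attrs[key.strip()] = value.strip().strip('"')
        yield {
            "mirna_id": attrs.get("ID"),  # assumption: identifier lives under "ID"
            "chromosome": seqid,
            "start": int(start),
            "stop": int(end),
            "strand": strand,
        }


# Invented record for illustration; it is not taken from the real hsa.gff file.
example_lines = [
    "# example header\n",
    "chr1\t.\tmiRNA\t100\t180\t.\t-\t.\tID=EXAMPLE1;Name=example-mir-1\n",
]
records = list(parse_gff(example_lines))
```

Each record carries the coordinates that a subsequent gene lookup would need: chromosome, start, stop, and strand.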

Workflow Using Remember / Recall for "tunneling" re... (1)

This process shows how the Remember and Recall operators can be used for passing results from one position to another position in the process when it's impossible to make a direct connection. This process introduces another advanced RapidMiner technique: macro handling. We have used the predefined macro a, accessed by %{a}, that gives the apply count of the operator. So we are remembering each application of the models that are generated in the learning subprocess of the Split validation. Af...

Created: 2010-04-29 | Last updated: 2012-01-16

Blob Data supplementary to meta-mining workflows

Created: 2012-03-05 22:22:33 | Last updated: 2012-03-05 22:23:51

Credits: User Lawrynka

License: Creative Commons Attribution-Share Alike 3.0 Unported License

- Repositories of RapidMiner baseline workflows and the datasets used
- DMOP ontology files from ver5.2
- Input files to meta-mining workflows

File type: ZIP archive

Comments: 0 | Viewed: 74 times | Downloaded: 39 times

This File has no tags!

Blob Digital Multimedia Repositories Ontology (DMRO) and ...

Created: 2012-01-29 16:35:26 | Last updated: 2012-01-29 16:38:25

Credits: User Lawrynka

License: Creative Commons Attribution-Share Alike 3.0 Unported License

For information on the ontology see: http://www.e-lico.eu/?q=node/288 . For information on the original dataset see: http://www.ecmlpkdd2011.org/challenge.php . The ontology and KB files are zipped into one file.

File type: ZIP archive

Comments: 0 | Viewed: 226 times | Downloaded: 44 times

Workflow Loading OWL files (RDF version of videolec... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/). The operator "Build knowledge base" is responsible for collecting data from OWL files, SPARQL endpoints, or RDF repositories and providing it to the subsequent operators in a workflow. In this workflow it is parameterized so that it builds a Sesame/OWLIM repository from the files specified in the "Load file" operators. Paths to OWL files are specified as parameter va...

Created: 2012-01-29 | Last updated: 2012-01-29

Workflow Semantic clustering (with alpha-clustering... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/) to perform clustering of SPARQL query results based on a chosen semantic similarity measure. The measure used in this particular workflow is a kernel that exploits membership of clustered individuals in OWL classes from a background ontology ("Epistemic" kernel from [1]). Since the semantics of the background ontology is used in this way, we use the name "semantic clustering". This ...

Created: 2012-01-29 | Last updated: 2012-01-30

Pack RMonto pack


Created: 2012-01-29 09:47:09 | Last updated: 2012-03-05 22:24:02

RMonto is an ontological extension to RapidMiner that enables machine learning with formal ontologies. RMonto is an easily extendable framework, currently providing support for unsupervised clustering with kernel methods and (frequent) pattern mining in knowledge bases. One important feature of RMonto is that it enables working directly on structured, relational data. Additionally, its custom algorithm implementations may be combined with the power of RapidMiner thr...

9 items in this pack

Comments: 0 | Viewed: 145 times | Downloaded: 31 times

This Pack has no tags!

Workflow Semantic clustering (with AHC) of SPARQL q... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/) to perform clustering of SPARQL query results based on a chosen semantic similarity measure. The measure used in this particular workflow is a kernel that exploits membership of clustered individuals in OWL classes from a background ontology ("Common classes" kernel from [1]). Since the semantics of the background ontology is used in this way, we use the name "semantic clustering". ...

Created: 2012-01-29 | Last updated: 2012-01-29

Workflow Semantic clustering (with k-medoids) of SP... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/) to perform clustering of SPARQL query results based on a chosen semantic similarity measure. Since the semantics of the background ontology is used in this way, we use the name "semantic clustering". The SPARQL query is entered in a parameter of the "SPARQL selector" operator. The clustering operator (k-medoids) lets you specify which of the query variables are to be used as clustering criteria. If more ...

Created: 2012-01-29

Workflow Discretize by Binning (1)

Choosing an attribute ('a1') with an attribute filter type condition (single).

Created: 2011-12-20 | Last updated: 2011-12-20

Credits: User Rishi Ramgolam
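
As a rough illustration of what discretizing a single numeric attribute by binning means, here is a minimal Python sketch. It assumes equal-width bins over the observed value range; it is not RapidMiner's implementation, and the function name, the bin indexing, and the example values for attribute 'a1' are invented for this sketch.

```python
from typing import List, Sequence


def discretize_by_binning(values: Sequence[float], n_bins: int) -> List[int]:
    """Map each value to a zero-based index of an equal-width bin."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # all values identical: a single bin
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Invented values for a hypothetical attribute 'a1'.
a1 = [1.0, 2.0, 5.0, 9.0, 10.0]
a1_bins = discretize_by_binning(a1, n_bins=3)
```

Only the selected attribute is binned; other attributes in a dataset would be left unchanged.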

Pack Creating a focused corpus of factual outcomes from b...


Created: 2011-06-28 11:19:04 | Last updated: 2011-12-13 16:02:16

This pack contains resources and supplementary files for the submission to the MIND2011 workshop titled "Creating a focused corpus of factual outcomes from biomedical experiments" by James Eales, George Demetriou and Robert Stevens.

1 item in this pack

Comments: 0 | Viewed: 77 times | Downloaded: 45 times

Pack RapidMiner plugin for Taverna videos and descriptions


Created: 2011-06-06 10:17:52 | Last updated: 2011-12-13 16:02:04

This pack contains videos that show how to use various parts of the RapidMiner plugin for Taverna. The videos demonstrate how to build a Taverna workflow which collects a GEO dataset, uploads it to RapidAnalytics, trains a classifier on one half of the data and tests it on the other half. This classification process can be used to gauge how well mutant and control assays agree across experimental repeats.

5 items in this pack

Comments: 0 | Viewed: 172 times | Downloaded: 66 times

Pack Data mining on KUP data


Created: 2011-05-23 12:35:37 | Last updated: 2011-05-24 12:31:58

Various RapidMiner workflows and R scripts that analyse and visualize KUP-related data extracted from the KUP knowledge base.

9 items in this pack

Comments: 0 | Viewed: 72 times | Downloaded: 23 times

This Pack has no tags!

Workflow One sentence per line (1)

This workflow accepts a plain text input and provides a single text document per input containing one sentence per line. Newline characters are removed from the original input. The OpenNLP sentence splitter is used to split the text; it is provided by University of Manchester Web Services.

Created: 2011-05-06 | Last updated: 2011-12-13

Credits: User James Eales

Workflow PUT data into RapidAnalytics (1)

Use the example value for the input port

Created: 2011-04-27 | Last updated: 2011-12-13

Pack Article title classification: kidney factomics


Created: 2011-04-21 15:20:20 | Last updated: 2011-12-13 16:02:52

This pack contains files relevant to our work on extracting facts from article titles.

7 items in this pack

Comments: 0 | Viewed: 51 times | Downloaded: 16 times

This Pack has no tags!

Workflow Text stemming with Porter Stemmer (1)

This workflow does text stemming. Stemming removes the inflected endings of words. It is often used as text preprocessing for text mining, since stemmed words can be easily matched and counted. The input to the workflow is the text to be stemmed; the output is the stemmed text.

Created: 2011-01-11 | Last updated: 2011-01-11

Credits: User Petra Kralj Novak

Workflow Text preprocessing (1)

The input to this workflow is plain text. The text is preprocessed so that non-alphanumeric symbols are removed, the text is transformed to lower case, and stop words are removed. The workflow first removes the characters from this set: `~!@#$%^&*()_+=-{}|\][":;'?><,./. Then it transforms the text to lower case. The user will be prompted to select a dictionary for stop words from a list. The workflow will, based on the selected list, remove the stop words. Stop words are...

Created: 2011-01-07 | Last updated: 2011-01-07

Credits: User Petra Kralj Novak
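
The three preprocessing steps described above (remove the listed characters, lower-case the text, drop stop words) can be sketched in Python. This is a minimal illustration under assumptions: the character set is copied from the description, but the function name `preprocess`, whitespace tokenization, and the tiny `EXAMPLE_STOP_WORDS` list are hypothetical stand-ins for the dictionaries the workflow lets the user choose.

```python
from typing import AbstractSet

# Characters the description says are removed (backslash and quote are escaped).
REMOVED_CHARS = "`~!@#$%^&*()_+=-{}|\\][\":;'?><,./"
_STRIP_TABLE = str.maketrans("", "", REMOVED_CHARS)

# Hypothetical stop-word list, for illustration only.
EXAMPLE_STOP_WORDS = frozenset({"the", "a", "is", "of", "and"})


def preprocess(text: str, stop_words: AbstractSet[str] = EXAMPLE_STOP_WORDS) -> str:
    """Strip the listed symbols, lower-case, then drop stop words."""
    stripped = text.translate(_STRIP_TABLE)  # step 1: remove symbols
    lowered = stripped.lower()  # step 2: lower case
    # Step 3: split on whitespace (an assumption) and filter stop words.
    return " ".join(word for word in lowered.split() if word not in stop_words)
```

Because the symbols are deleted rather than replaced by spaces, a word such as "don't" becomes "dont" in this sketch; the original workflow may handle that case differently.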

Workflow Lemmatization (3)

The workflow lemmatizes the text in the input port. It takes text as input and returns (language-dependent) lemmatized text as output. All the words in the resulting text are in the same order as in the original text, but they are transformed to their dictionary form. The workflow asks for the language of lemmatization. Currently, 12 languages are supported: en,sl,ge,bg,cs,et,fr,hu,ro,sr,it,sp.

Created: 2010-12-17 | Last updated: 2010-12-23

Credits: User Petra Kralj Novak

Attributions: Workflow Select from a list of possible web service parameter values

Pack RCOMM2011 recommender systems workflow templates


Created: 2011-04-07 14:59:37 | Last updated: 2012-01-28 19:37:47

No description

6 items in this pack

Comments: 0 | Viewed: 495 times | Downloaded: 166 times

Workflow Classification of GEO assays using RapidAn... (2)

No description

Created: 2011-05-04 | Last updated: 2011-12-13

Credits: User James Eales

Workflow Distance heatmap of GEO dataset produced b... (2)

No description

Created: 2011-04-28 | Last updated: 2011-12-13

Credits: User James Eales

Workflow Agglomerative clustering of a GEO dataset ... (2)

No description

Created: 2011-04-28 | Last updated: 2011-12-13

Credits: User James Eales
