myExperiment - Groups

RapidMiner

Uploader

Simon Fischer

This is an image mining process using the image mining Web service provided by NHRF within e-Lico. It first uploads a set of images found in a directory, then preprocesses the images and visualizes the result. Furthermore, references to the uploaded images are stored in the local RapidMiner repository so they can later be used for further processing without uploading images a second time.

Created: 2010-04-28 | Last updated: 2012-01-16

Creator

James Eales

Core text mining workflows

Created: 2010-02-19 10:12:33 | Last updated: 2011-12-13 16:03:17

This pack contains workflows we have created to support core text mining tasks. We currently provide workflows for the following tasks: loading documents (text or PDF), PDF to text conversion, sentence splitting, text cleaning (ASCII or XML-valid), and term recognition (using the NaCTeM service TerMine).

7 items in this pack

Comments: 0 | Viewed: 727 times | Downloaded: 136 times

Taverna 2

Uploader

James Eales

Termine with c-value threshold (1)

This workflow accepts a list of sentences from a single document and returns the terms found by the TerMine web service. It also allows you to set a threshold c-value score so that only terms with a user-controlled probability (of being a real term) are returned as an output. To get sentences to supply to this workflow you can use the sentence splitting workflow. The TerMine service (used in this workflow) only accepts text in ASCII encoding, so you should also use the Clean p...

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: James Eales
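
The c-value thresholding described for the Termine with c-value threshold workflow can be illustrated with a minimal Python sketch. It is not the Taverna workflow itself: the `ScoredTerm` type, the `filter_by_c_value` function, the example scores and the choice of an inclusive (`>=`) comparison are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class ScoredTerm(NamedTuple):
    """A candidate term with its c-value score (hypothetical structure)."""
    term: str
    c_value: float


def filter_by_c_value(terms: List[ScoredTerm], threshold: float) -> List[ScoredTerm]:
    """Keep only the terms whose c-value is at least the threshold.

    Whether the real workflow uses >= or > is not stated; >= is assumed here.
    """
    return [t for t in terms if t.c_value >= threshold]


# Made-up candidate terms and scores, for illustration only.
candidates = [
    ScoredTerm("gene expression", 12.5),
    ScoredTerm("the results", 1.2),
    ScoredTerm("kinase inhibitor", 4.0),
]

kept = filter_by_c_value(candidates, threshold=4.0)
print([t.term for t in kept])
```

Raising the threshold trades recall for precision: fewer candidate terms are returned, but those that remain have higher scores.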

Taverna 2

Uploader

James Eales

PDF to plain text (1)

This workflow will extract the plain text content of PDF files supplied to the input port. You can connect the Load PDF from directory workflow to this workflow's input. We recommend you send the output from this workflow to the Clean plain text workflow, because the PDF to text process can add characters to the text that are XML-invalid and therefore cannot be sent to most services as plain text. Another way round this problem is to encode the text as Base64 using the handy loc...

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: James Eales
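
The Base64 alternative mentioned for the PDF to plain text workflow can be sketched in Python as follows. This is only an illustration of the idea, not the Taverna component the description refers to; the function names and the choice of UTF-8 as the byte encoding are assumptions.

```python
import base64


def encode_text(text: str) -> str:
    """Encode text as Base64 so it contains only XML-safe ASCII characters.

    UTF-8 is assumed as the underlying byte encoding.
    """
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_text(encoded: str) -> str:
    """Reverse encode_text on the receiving side."""
    return base64.b64decode(encoded).decode("utf-8")


# A form feed (\x0c) is a typical XML-invalid character left by PDF extraction.
raw = "Results\x0c continue on the next page"
encoded = encode_text(raw)
print(encoded)
print(decode_text(encoded) == raw)
```

Unlike cleaning, this approach loses no characters, but the receiving service has to decode the payload before it can use the text.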

Taverna 2

Uploader

James Eales

Sentence splitting (1)

This workflow will attempt to split up text into sentences, returning a list of sentences to the output port. The sentence splitting service makes use of the OpenNLP sentence detector and has been trained to work on English text. This workflow can be used to provide input to the Termine with c-value threshold workflow. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: James Eales

Taverna 2

Uploader

James Eales

Terms from collection of PDF files (2)

This workflow will give you a set of candidate terms for each PDF document in a user-specified directory. You can also specify a c-value threshold that will restrict the terms to those with higher scores. This workflow was created using only nested workflows. These workflow components work on their own and can be linked together to form more complex workflows such as this. You can view the text mining workflow components in this pack. If you receive errors when running this workflow t...

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: James Eales

Creator

Simon Fischer

Who Wants to be a Data Miner?

Created: 2011-11-02 17:54:07 | Last updated: 2013-09-09 16:22:11

One of the most fun events at the annual RapidMiner Community Meeting and Conference (RCOMM) is the live data mining process design competition "Who Wants to be a Data Miner?". In this competition, participants must design RapidMiner processes for a given goal within a few minutes. The tasks are related to data mining and data analysis, but are rather uncommon. In fact, most of the challenges ask for things RapidMiner was never supposed to do. This pack contains solutions for these...

12 items in this pack

Comments: 0 | Viewed: 260 times | Downloaded: 142 times

Taverna 2

Uploader

James Eales

Terms from collection of text files (1)

This workflow will give you a set of candidate terms for each text file in a user-specified directory. You can also specify a c-value threshold that will restrict the terms to those with higher scores. This workflow was created using only nested workflows. These workflow components work on their own and can be linked together to form more complex workflows such as this. You can view the text mining workflow components in this pack. If you receive errors when running this workflow then...

Created: 2010-02-22 | Last updated: 2011-12-13

Credits: James Eales

Taverna 2

Uploader

James Eales

Load PDF from directory (1)

This workflow will automate the reading of a set of PDF files stored in a single directory (the path to which should be supplied as a single input value). This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: James Eales

Taverna 2

Uploader

James Eales

Load plain text from directory (1)

This workflow will automate the reading of a set of text files stored in a single directory (the path to which should be supplied as a single input value). It will assume that the text files are saved using the default character encoding for the system that Taverna is running on. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

Created: 2010-02-18 | Last updated: 2011-12-13

Credits: James Eales
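
The behaviour described for the Load plain text from directory workflow (read every text file in one directory, using the system default encoding) might look like this in Python. The `load_text_files` name and the `*.txt` filter are assumptions; the workflow description does not say how files are selected.

```python
import locale
from pathlib import Path
from typing import Dict


def load_text_files(directory: str) -> Dict[str, str]:
    """Read every *.txt file in a directory using the system default encoding.

    The *.txt filter is an assumption made for this sketch.
    """
    encoding = locale.getpreferredencoding(False)
    texts = {}
    for path in sorted(Path(directory).glob("*.txt")):
        texts[path.name] = path.read_text(encoding=encoding)
    return texts
```

Relying on the default encoding means the same files can be read differently on different machines, so files saved in another encoding may need converting first.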

Taverna 2

Uploader

James Eales

Clean plain text (ASCII) (1)

This workflow will remove any XML-invalid and non-ASCII characters (e.g. for sending to the ASCII-only Termine service) from any text supplied to the input port. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

Created: 2010-02-18 | Last updated: 2011-12-13

Credits: James Eales
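
The cleaning step described for the Clean plain text (ASCII) workflow can be sketched in Python as below. The character ranges follow the XML 1.0 definition of a valid character; the function names, and the choice to drop offending characters rather than replace them, are assumptions rather than details taken from the workflow.

```python
def is_xml_valid(ch: str) -> bool:
    """Return True if the character is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def clean_xml(text: str) -> str:
    """Drop XML-invalid characters, keeping everything else (including non-ASCII)."""
    return "".join(ch for ch in text if is_xml_valid(ch))


def clean_ascii(text: str) -> str:
    """Drop XML-invalid characters and then any remaining non-ASCII characters."""
    return "".join(ch for ch in clean_xml(text) if ord(ch) < 128)


sample = "caf\u00e9\x0c ok\n"
print(repr(clean_xml(sample)))
print(repr(clean_ascii(sample)))
```

The XML-only variant corresponds to the plain Clean plain text workflow in this pack, while the ASCII variant is the stricter form needed before calling an ASCII-only service.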

Creator

Matko Bošnjak

e-LICO recommender workflows

Created: 2011-03-15 15:33:48 | Last updated: 2012-01-28 19:39:06

This pack contains recommender system workflows created for the e-LICO project.

6 items in this pack

Comments: 0 | Viewed: 306 times | Downloaded: 162 times

Taverna 2

Uploader

James Eales

Clean plain text (1)

This workflow will remove any XML-invalid characters (these characters often appear in the output of PDF to text software) from any text supplied to the input port. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

Created: 2010-02-18 | Last updated: 2011-12-13

Credits: James Eales

Creator

Ninoaf

Recommender systems workflow templates 2012

Created: 2012-01-08 12:27:43 | Last updated: 2012-06-03 19:48:22

The Recommender Extension can be downloaded from the Rapid-I Marketplace at http://rapidupdate.de:8180/UpdateServer/faces/product_details.xhtml?productId=rmx_irbrecommender and more details can be found at http://elico.rapid-i.com/recommender-extension.html

12 items in this pack

Comments: 0 | Viewed: 478 times | Downloaded: 145 times

Taverna 2

Uploader

Simon Jupp

miRNA GFF to entrez gene (1)

This workflow reads a GFF file of miRNA coordinates and uses BioMart to search human Ensembl genes for the gene that codes for each miRNA. The workflow returns a list of miRNA ID, chromosome, start, stop, strand, Entrez Gene ID, gene name and gene strand. Example input file here: ftp://mirbase.org/pub/mirbase/CURRENT/genomes/hsa.gff

Created: 2011-01-26 | Last updated: 2012-01-11
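
The first step of the miRNA GFF to entrez gene workflow, reading coordinates from a GFF file, can be sketched in Python as follows. It assumes the standard nine tab-separated GFF columns and keeps the attribute column as raw text; the `GffRecord` type, the `parse_gff` function and the example line are hypothetical, and the BioMart lookup is not shown.

```python
from typing import Iterable, List, NamedTuple


class GffRecord(NamedTuple):
    """One feature line from a GFF file (standard nine columns)."""
    seqid: str
    source: str
    feature: str
    start: int
    end: int
    score: str
    strand: str
    frame: str
    attributes: str


def parse_gff(lines: Iterable[str]) -> List[GffRecord]:
    """Parse GFF lines, skipping comments, blank lines and malformed rows."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue
        seqid, source, feature, start, end, score, strand, frame, attrs = fields
        records.append(
            GffRecord(seqid, source, feature, int(start), int(end),
                      score, strand, frame, attrs)
        )
    return records


# Hypothetical example line, not taken from the real miRBase file.
example = ["# comment line\n", "1\t.\tmiRNA\t1000\t1080\t.\t+\t.\tID=example-mir-1\n"]
records = parse_gff(example)
print(records[0].seqid, records[0].start, records[0].end, records[0].strand)
```

The chromosome, start, end and strand of each record are the values a gene lookup by genomic region would need.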

RapidMiner

Uploader

Sebastian Land

Using Remember / Recall for "tunneling" re... (1)

This process shows how the Remember and Recall operators can be used to pass results from one position in the process to another when it is impossible to make a direct connection. This process introduces another advanced RapidMiner technique: macro handling. We have used the predefined macro a, accessed by %{a}, that gives the apply count of the operator. So we are remembering each application of the models that are generated in the learning subprocess of the Split validation. Af...

Created: 2010-04-29 | Last updated: 2012-01-16

Uploader

Lawrynka

Data supplementary to meta-mining workflows

Created: 2012-03-05 22:22:33 | Last updated: 2012-03-05 22:23:51

Credits: Lawrynka

License: Creative Commons Attribution-Share Alike 3.0 Unported License

- Repositories of RapidMiner baseline workflows and the datasets used
- DMOP ontology files from ver5.2
- Input files to meta-mining workflows

File type: ZIP archive

Comments: 0 | Viewed: 74 times | Downloaded: 39 times

This File has no tags!

Uploader

Lawrynka

Digital Multimedia Repositories Ontology (DMRO) and ...

Created: 2012-01-29 16:35:26 | Last updated: 2012-01-29 16:38:25

Credits: Lawrynka

License: Creative Commons Attribution-Share Alike 3.0 Unported License

For information on the ontology, see: http://www.e-lico.eu/?q=node/288 For information on the original dataset, see: http://www.ecmlpkdd2011.org/challenge.php The ontology and KB files are zipped into one file.

File type: ZIP archive

Comments: 0 | Viewed: 226 times | Downloaded: 44 times

RapidMiner

Uploader

Lawrynka

Loading OWL files (RDF version of videolec... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/). The "Build knowledge base" operator is responsible for collecting data from OWL files, SPARQL endpoints or RDF repositories and providing it to the subsequent operators in a workflow. In this workflow it is parametrized so that it builds a Sesame/OWLIM repository from the files specified in the "Load file" operators. Paths to OWL files are specified as parameter va...

Created: 2012-01-29 | Last updated: 2012-01-29

RapidMiner

Uploader

Lawrynka

Semantic clustering (with alpha-clustering... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/) to cluster SPARQL query results based on a chosen semantic similarity measure. The measure used in this particular workflow is a kernel that exploits the membership of clustered individuals in OWL classes from a background ontology ("Epistemic" kernel from [1]). Since the semantics of the background ontology is used in this way, we use the name "semantic clustering". This ...

Created: 2012-01-29 | Last updated: 2012-01-30

Creator

Lawrynka

RMonto pack

Created: 2012-01-29 09:47:09 | Last updated: 2012-03-05 22:24:02

RMonto is an ontological extension to RapidMiner that enables machine learning with formal ontologies. RMonto is an easily extendable framework, currently providing support for unsupervised clustering with kernel methods and (frequent) pattern mining in knowledge bases. One important feature of RMonto is that it enables working directly on structured, relational data. Additionally, its custom algorithm implementations may be combined with the power of RapidMiner thr...

9 items in this pack

Comments: 0 | Viewed: 145 times | Downloaded: 31 times

This Pack has no tags!

RapidMiner

Uploader

Lawrynka

Semantic clustering (with AHC) of SPARQL q... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/) to cluster SPARQL query results based on a chosen semantic similarity measure. The measure used in this particular workflow is a kernel that exploits the membership of clustered individuals in OWL classes from a background ontology ("Common classes" kernel from [1]). Since the semantics of the background ontology is used in this way, we use the name "semantic clustering". ...

Created: 2012-01-29 | Last updated: 2012-01-29

RapidMiner

Uploader

Lawrynka

Semantic clustering (with k-medoids) of SP... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/) to cluster SPARQL query results based on a chosen semantic similarity measure. Since the semantics of the background ontology is used in this way, we use the name "semantic clustering". The SPARQL query is entered in a parameter of the "SPARQL selector" operator. The clustering operator (k-medoids) allows you to specify which of the query variables are to be used as clustering criteria. If more ...

Created: 2012-01-29

Taverna 2

Uploader

Rishi Ramgolam

Discretize by Binning (1)

Choosing an attribute ('a1') with an attribute filter type condition (single).

Created: 2011-12-20 | Last updated: 2011-12-20

Credits: Rishi Ramgolam
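
As a rough illustration of what discretizing a single attribute such as 'a1' by binning involves, the Python sketch below uses equal-width bins. Equal-width binning, the function name, the row layout and the sample values are all assumptions made for this example; the entry itself does not describe the binning method or its parameters.

```python
from typing import Dict, List


def equal_width_bins(values: List[float], n_bins: int) -> List[int]:
    """Assign each value to one of n_bins equal-width bins (0-based index)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls into the first bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical rows; only attribute 'a1' is selected for discretization.
rows: List[Dict[str, float]] = [
    {"a1": 1.0, "a2": 0.3},
    {"a1": 2.0, "a2": 0.9},
    {"a1": 5.0, "a2": 0.1},
    {"a1": 9.0, "a2": 0.5},
    {"a1": 10.0, "a2": 0.7},
]

bins = equal_width_bins([row["a1"] for row in rows], n_bins=3)
print(bins)
```

Selecting a single attribute leaves the other columns (here 'a2') untouched, which is the effect of a single-attribute filter.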

Creator

James Eales

Creating a focused corpus of factual outcomes from b...

Created: 2011-06-28 11:19:04 | Last updated: 2011-12-13 16:02:16

This pack contains resources and supplementary files for the submission to the MIND2011 workshop titled "Creating a focused corpus of factual outcomes from biomedical experiments" by James Eales, George Demetriou and Robert Stevens

1 item in this pack

Comments: 0 | Viewed: 77 times | Downloaded: 45 times

Creator

James Eales

RapidMiner plugin for Taverna videos and descriptions

Created: 2011-06-06 10:17:52 | Last updated: 2011-12-13 16:02:04

This pack contains videos that show how to use various parts of the RapidMiner plugin for Taverna. The videos demonstrate how to build a Taverna workflow which collects a GEO dataset, uploads it to RapidAnalytics, trains a classifier on one half of the data and tests it on the other half. This classification process can be used to gauge how well mutant and control assays agree across experimental repeats.

5 items in this pack

Comments: 0 | Viewed: 172 times | Downloaded: 66 times

Tags:

assays | e-lico | elico | geo | rapidminer | svm

Creator

Adam Woznica

Data mining on KUP data

Created: 2011-05-23 12:35:37 | Last updated: 2011-05-24 12:31:58

Various RapidMiner workflows and R scripts that analyse and visualize KUP-related data extracted from the KUP knowledge base.

9 items in this pack

Comments: 0 | Viewed: 72 times | Downloaded: 23 times

This Pack has no tags!

Taverna 2

Uploader

James Eales

One sentence per line (1)

This workflow accepts a plain text input and provides a single text document per input containing one sentence per line. Newline characters are removed from the original input. The text is split using the OpenNLP sentence splitter, which is provided by University of Manchester Web Services.

Created: 2011-05-06 | Last updated: 2011-12-13

Credits: James Eales
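
The steps described for the One sentence per line workflow (remove newlines, split into sentences, write one sentence per line) can be sketched in Python. Note that this sketch swaps in a naive punctuation-based regular expression for the OpenNLP sentence splitter, and it replaces newlines with spaces rather than deleting them outright; both choices, and the function name, are assumptions.

```python
import re


def one_sentence_per_line(text: str) -> str:
    """Flatten the text, split it naively into sentences, and join with newlines.

    This uses a simple regex (split after '.', '!' or '?'), not OpenNLP, so it
    will mis-split abbreviations such as "e.g." or "Dr.".
    """
    # Collapse newlines and runs of whitespace into single spaces.
    flat = " ".join(text.split())
    sentences = re.split(r"(?<=[.!?])\s+", flat)
    return "\n".join(s for s in sentences if s)


sample = "First sentence.\nSecond one is\nsplit. Third?"
print(one_sentence_per_line(sample))
```

A trained sentence detector handles abbreviations and unusual punctuation far better than this rule, which is why the workflow delegates the splitting to a dedicated service.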

Taverna 2

Uploader

James Eales

PUT data into RapidAnalytics (1)

Use the example value for the input port

Created: 2011-04-27 | Last updated: 2011-12-13

Creator

James Eales

Article title classification: kidney factomics

Created: 2011-04-21 15:20:20 | Last updated: 2011-12-13 16:02:52

This pack contains files relevant to our work on extracting facts from article titles

7 items in this pack

Comments: 0 | Viewed: 51 times | Downloaded: 16 times

This Pack has no tags!

Taverna 2

Uploader

Petra Kralj Novak

Text stemming with Porter Stemmer (1)

This workflow does text stemming. Stemming removes the inflected endings of words. It is often used as text preprocessing for text mining, since stemmed words can be easily matched and counted. The input to the workflow is the text to be stemmed, the output is the stemmed text.

Created: 2011-01-11 | Last updated: 2011-01-11

Credits: Petra Kralj Novak

Taverna 2

Uploader

Petra Kralj Novak

Text preprocessing (1)

The input to this workflow is plain text. The text is preprocessed so that non-alphanumeric symbols are removed, the text is transformed to lower case and stop words are removed. The workflow first removes the characters from this set: `~!@#$%^&*()_+=-{}|\][":;'?><,./. Then it transforms the text to lower case. The user will be prompted to select a dictionary for stop words from a list. The workflow will, based on the selected list, remove the stop words. Stop words are...

Created: 2011-01-07 | Last updated: 2011-01-07

Credits: Petra Kralj Novak
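
The three steps described for the Text preprocessing workflow (strip the listed symbols, lower-case the text, remove stop words) can be sketched in Python. The symbol set is copied from the description; the function name and the tiny stop word set are hypothetical stand-ins for the dictionary the user selects in the real workflow.

```python
from typing import Set

# Symbol set copied from the workflow description.
SYMBOLS = "`~!@#$%^&*()_+=-{}|\\][\":;'?><,./"
_STRIP_TABLE = str.maketrans("", "", SYMBOLS)

# Hypothetical stop word list; the workflow lets the user pick a dictionary.
EXAMPLE_STOP_WORDS = {"the", "a", "is", "and", "of"}


def preprocess(text: str, stop_words: Set[str]) -> str:
    """Remove the listed symbols, lower-case the text, then drop stop words."""
    stripped = text.translate(_STRIP_TABLE)
    lowered = stripped.lower()
    return " ".join(word for word in lowered.split() if word not in stop_words)


sample = "The cat, and the DOG: a tale of [two] pets!"
print(preprocess(sample, EXAMPLE_STOP_WORDS))
```

Because the symbols are deleted rather than replaced with spaces, a hyphenated word such as "well-known" becomes "wellknown"; lower-casing before the stop word lookup ensures that "The" and "the" are treated alike.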

Taverna 2

Uploader

Petra Kralj Novak

Lemmatization (3)

The workflow lemmatizes the text in the input port. It takes text as input and returns (language-dependent) lemmatized text as output. All the words in the resulting text are in the same order as in the original text, but they are transformed to their dictionary form. The workflow asks for the language of lemmatization. Currently, 12 languages are supported: en,sl,ge,bg,cs,et,fr,hu,ro,sr,it,sp.