Workflows

Showing 294 results.

Workflow Using Remember / Recall for "tunneling" re... (1)

This process shows how the Remember and Recall operators can be used to pass results from one position in the process to another when it is impossible to make a direct connection. This process introduces another advanced RapidMiner technique: macro handling. We have used the predefined macro a, accessed by %{a}, which gives the apply count of the operator. So we are remembering each application of the models that are generated in the learning subprocess of the Split Validation. Af...
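The Remember/Recall idea above can be sketched in plain Python — a shared store keyed by name plus an apply count standing in for the %{a} macro. This is an illustrative analogue, not the RapidMiner implementation; all names are hypothetical.

```python
# Hypothetical sketch of Remember/Recall: a shared store keyed by name plus
# an apply count, which plays the role of RapidMiner's %{a} macro.
store = {}
apply_count = 0

def remember(name, result):
    """Stash a result under a key that includes the current apply count."""
    global apply_count
    apply_count += 1
    store[f"{name}_{apply_count}"] = result

def recall(name, count):
    """Fetch a result remembered during an earlier application."""
    return store[f"{name}_{count}"]

# Each 'training' iteration remembers its model; a later stage recalls one.
for fold in range(3):
    remember("model", {"fold": fold, "weights": [fold, fold + 1]})

print(recall("model", 2))  # the model remembered on the second application
```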

Created: 2010-04-29 | Last updated: 2012-01-16

Workflow Operator testing workflow (1)

This workflow is used for operator testing. It joins dataset metafeatures with execution times and performance measures of the selected recommendation operator. In the Extract train and Extract test Execute Process operators, the user should open the Metafeature extraction workflow. In the Loop operator, train/test data are used to evaluate the performance of the selected operator. The result is remembered and joined with the timing and metafeature information. This workflow can be used both for Item Recommend...

Created: 2012-01-29

Credits: User Matej Mihelčić User Matko Bošnjak

Workflow Model saving workflow (RP) (1)

This workflow trains and saves a model for a selected rating prediction operator.

Created: 2012-01-29 | Last updated: 2012-01-30

Credits: User Matej Mihelčić

Workflow Model testing workflow (RP) (1)

This workflow measures the performance of three models: a model learned on the train data and upgraded using online model updates; a model learned on the train data plus all query update sets; and a model learned on the train data only.

Created: 2012-01-29 | Last updated: 2012-01-30

Credits: User Matej Mihelčić

Workflow Connect to twitter and analyze the key words (1)

Hi all, this workflow connects RapidMiner to Twitter and downloads the timeline. It then creates a wordlist from the tweets and breaks them into the key words that are mentioned in the tweets. You can then visualize the key words mentioned in the tweets. This workflow can be further modified to review various key events that have been talked about in the twitterland. Do let me know your feedback and feel free to ask me any questions that you may have. Shaily web: http://advanced-analyti...

Created: 2010-07-26 | Last updated: 2010-07-26

Workflow Looping over Examples for doing de-aggrega... (1)

This process is based on (artificially generated) data that looks like it has been aggregated before. The integer attribute Qty specifies the quantity of the given item that is represented by the rest of the example. The process loops over every example and performs, on each example, another loop that appends the current example to a new example set. This example set has been created as an empty copy of the original example set, so that the attributes are equal. To get access to and rem...
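The de-aggregation loop described above can be sketched in Python: each row carries a Qty count, and we emit that many copies of the row into a new, initially empty set with the same attributes. The data and attribute names here are illustrative, not from the workflow.

```python
# Minimal sketch of de-aggregation: expand each row Qty times (with Qty
# reset to 1) into a new example set that is an empty copy of the original.
aggregated = [
    {"Item": "apple", "Qty": 3},
    {"Item": "pear", "Qty": 2},
]

deaggregated = []  # empty copy of the original structure
for example in aggregated:
    for _ in range(example["Qty"]):
        row = dict(example)
        row["Qty"] = 1
        deaggregated.append(row)

print(len(deaggregated))  # 5 rows: three apples and two pears
```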

Created: 2010-04-29

Workflow Iterate through datasets (1)

This is a dataset iteration workflow. It is part of the Experimentation workflow for the Recommender extension. The Loop Files operator iterates through datasets from a specified directory using the Read AML operator. Only datasets matching a proper regular expression are considered. Train and test data filenames must correspond, e.g. (train1.aml, test1.aml). In each iteration, Loop Files calls the specified operator testing workflow with the Execute Subprocess operator. Information about training and t...

Created: 2012-01-29

Credits: User Matej Mihelčić User Matko Bošnjak

Workflow Metafeature extraction (1)

This is a metafeature extraction workflow used in the Experimentation workflow for Recommender extension operators. This workflow extracts metadata from the train/test datasets (user/item counts, rating count, sparsity, etc.). This workflow is called from the operator testing workflow using the Execute Process operator.

Created: 2012-01-29 | Last updated: 2012-01-30

Credits: User Matko Bošnjak

Workflow Data iteration workflow (RP) (1)

This is a data iteration workflow used to iterate through query update sets.

Created: 2012-01-29

Credits: User Matej Mihelčić User Matko Bošnjak

Workflow Transforming user/item description dataset... (1)

This workflow transforms a user/item description attribute set into the format required by the attribute-based k-NN operators of the Recommender extension. See: http://zel.irb.hr/wiki/lib/exe/fetch.php?media=del:projects:elico:recsys_manual_v1.1.pdf to learn about the dataset formats required by the Recommender extension.

Created: 2012-01-30

Workflow Random recommender (1)

This process performs a random item recommendation: for a given item ID, it randomly recommends a desired number of items from the example set of items. The purpose of this workflow is to produce a random-recommendation baseline for comparison with different recommendation solutions on different retrieval measures. The inputs to the process are context-defined macros: %{id} defines an item ID for which we would like to obtain recommendations and %{recommender_no} defines the required number of ...
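The random baseline above can be sketched in Python, with an item id and a desired count standing in for the %{id} and %{recommender_no} macros. Names and data are illustrative.

```python
import random

# Sketch of a random-recommendation baseline: for a given item id, sample
# a desired number of other items uniformly at random.
def random_recommend(items, item_id, n, seed=None):
    candidates = [i for i in items if i != item_id]
    rng = random.Random(seed)  # seeded for reproducible experiments
    return rng.sample(candidates, n)

items = list(range(10))
recs = random_recommend(items, item_id=3, n=4, seed=42)
print(recs)  # four random items, never including item 3 itself
```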

Created: 2011-03-15 | Last updated: 2011-03-15

Workflow RCOMM 2011 Challenge 2: Vodka or President? (1)

This is a solution for Challenge 2 of a live data mining process design competition, "Who Wants to be a Data Miner?", held at RCOMM 2011 in Dublin. Those of you who loved "You Don't Know Jack" will remember this task: to tell whether a certain word is the name of a vodka or the name of a leader of the Soviet Union. The RapidMiner process was allowed to download data from Wikipedia to make this decision. One input file contains a list of words for which two attributes "Vodka" or "Leader" wi...

Created: 2011-11-02

Workflow RCOMM Challenge 3: Fibonacci Numbers (Inte... (1)

At RCOMM 2010 (www.rcomm2010.org), an unusual competition was held. Titled "Who Wants to Be a Data Miner?", three challenges were issued to the participants of the conference. In all challenges, participants had to design RapidMiner processes as quickly as possible. This is the original solution I had in mind for Challenge 2: "Fibonacci Numbers". It defines a macro n, recurses by applying itself using an "Embed Process" operator on n-1 and n-2, appends the results (so the length is F(n-1)...
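The recursive trick described above — appending the results for n-1 and n-2 so the row count itself is F(n) — can be sketched in Python:

```python
# Sketch of the recursion: the result for n is the concatenation of the
# results for n-1 and n-2, so len(fib_rows(n)) equals the n-th Fibonacci
# number, with F(0)=0 and F(1)=1.
def fib_rows(n):
    if n == 0:
        return []
    if n == 1:
        return [1]
    return fib_rows(n - 1) + fib_rows(n - 2)

print(len(fib_rows(10)))  # 55 = F(10)
```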

Created: 2010-09-17 | Last updated: 2010-09-17

Workflow Change Class Distribution of Your Training... (1)

This example process shows how to change the class distribution of your training data set (in this case the training data is whatever comes out of the "myData reader"). The given training set has a distribution of 10 "Iris-setosa" examples, 40 "Iris-versicolor" examples and 50 "Iris-virginica" examples. The aim is to get a data set with a different class distribution for the label, let's say 10 "Iris-setosa", 20 "Iris-versicolor" and 20 "Iris-virginica". Beware that this may change some propert...
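The downsampling step can be sketched in Python: keep examples of each label only up to a per-class target count. This is a simplified analogue of the process, with the label names and targets taken from the description above.

```python
# Sketch of changing the class distribution by downsampling each label
# to a target count.
def resample(examples, targets):
    out, seen = [], {}
    for ex in examples:
        label = ex["label"]
        if seen.get(label, 0) < targets[label]:
            seen[label] = seen.get(label, 0) + 1
            out.append(ex)
    return out

data = ([{"label": "Iris-setosa"}] * 10
        + [{"label": "Iris-versicolor"}] * 40
        + [{"label": "Iris-virginica"}] * 50)
targets = {"Iris-setosa": 10, "Iris-versicolor": 20, "Iris-virginica": 20}
sample = resample(data, targets)
print(len(sample))  # 50 examples with the 10/20/20 distribution
```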

Created: 2011-01-21 | Last updated: 2011-01-21

Uploader: Ch Juk

Workflow kddcup98 direct marketing (1)

RapidMiner supports meta learning by embedding one or several basic learners as children into a parent meta learning operator. In this example we generate a data set with the ExampleSetGenerator operator and apply an improved version of Stacking on this data set. The Stacking operator contains four inner operators; the first one is the learner which should learn the stacked model from the predictions of the other three child operators (base learners). Other meta learning schemes like Boosting ...

Created: 2012-03-15

Workflow RCOMM Challenge 3: Fibonacci Numbers (Impr... (1)

At RCOMM 2010 (www.rcomm2010.org), an unusual competition was held. Titled "Who Wants to Be a Data Miner?", three challenges were issued to the participants of the conference. In all challenges, participants had to design RapidMiner processes as quickly as possible. This is the winning process of Challenge 2, "Fibonacci Numbers", by Matko Bošnjak. This was the task: the n-th Fibonacci number is F(n)=F(n-1)+F(n-2), with F(0)=0 and F(1)=1. Create a process that creates an example set with F(n)...

Created: 2010-09-17 | Last updated: 2010-09-17

Workflow 2. Getting Started: Retrieve and Apply a M... (1)

This getting-started process demonstrates how to load (retrieve) a model from the repository and apply it to a data set. The result is a data set (at the lab output for "labeled data") which has a new "prediction" attribute that indicates the prediction for each example (i.e. row/record). You will need to adjust the path of the Retrieve operator to the actual location where the model was stored by a previous execution of the "1. Getting Started: Learn and Store a...

Created: 2011-01-17 | Last updated: 2011-01-19

Workflow Content based recommender system template (1)

As input, this workflow takes two distinct example sets: a complete set of items with IDs and appropriate textual attributes (the item example set) and a set of IDs of items our user had interaction with (the user example set). Also, a macro %{recommendation_no} is defined in the process context as the required number of outputted recommendations. The first steps of the workflow preprocess those example sets: select only the textual attributes of the item example set, and set ID roles on both of th...

Created: 2011-05-05 | Last updated: 2011-05-09

Credits: User Matko Bošnjak User Ninoaf

Attributions: Datasets for the pack: RCOMM2011 recommender systems workflow templates

Workflow Crossvalidation with SVM (1)

Performs a cross-validation on a given data set with a nominal label, using a Support Vector Machine as the learning algorithm. Inside the cross-validation, the first subprocess generates an SVM model, and the second subprocess evaluates it by applying it to a so-far unused subset of the data and counting the misclassifications.
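The cross-validation structure — train on k-1 folds, evaluate on the held-out fold — can be sketched in plain Python (the split logic only; the SVM itself is out of scope here):

```python
# Minimal k-fold split sketch: every example lands in exactly one test fold,
# and the corresponding training set is everything else.
def kfold_indices(n, k):
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

n, k = 10, 5
splits = list(kfold_indices(n, k))
print(len(splits))           # 5 train/test pairs
print(sorted(splits[0][1]))  # the first held-out fold
```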

Created: 2010-04-29

Workflow Item to item similarity matrix -based reco... (1)

This process executes a recommendation based on an item-to-item similarity matrix. The inputs to the process are context-defined macros: %{id} defines an item ID for which we would like to obtain recommendations and %{recommender_no} defines the required number of recommendations. The process internally uses an item-to-item similarity matrix written in pairwise form (id1, id2, similarity). The process essentially filters out appearances of the required ID in both of the columns of the pairwis...
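The pairwise-matrix lookup can be sketched in Python: keep rows where the query id appears in either column, take the other id, and rank by similarity. The data here is illustrative.

```python
# Sketch of the pairwise-form recommender: (id1, id2, similarity) triples.
pairs = [
    ("A", "B", 0.9), ("A", "C", 0.4), ("B", "C", 0.7), ("C", "D", 0.8),
]

def recommend(pairs, item_id, n):
    scored = []
    for id1, id2, sim in pairs:
        if id1 == item_id:
            scored.append((sim, id2))
        elif id2 == item_id:
            scored.append((sim, id1))
    scored.sort(reverse=True)          # highest similarity first
    return [item for _, item in scored[:n]]

print(recommend(pairs, "C", 2))  # the two items most similar to C
```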

Created: 2011-03-15 | Last updated: 2011-03-15

Workflow Collaborative filtering recommender (1)

This process executes a collaborative filtering recommender based on a user-to-item score matrix. This recommender predicts a user's score on some of his non-scored items based on similarity with other users. The inputs to the process are context-defined macros: %{id} defines an item ID for which we would like to obtain recommendations, %{recommender_no} defines the required number of recommendations, and %{number_of_neighbors} defines the number of the most similar users taken into a...

Created: 2011-03-15 | Last updated: 2012-03-06

Workflow Przykład metody Stacking (1)

The following workflow shows the use of the Stacking operator to create meta-classifiers. The Stacking operator allows nesting an arbitrary number of base models, which are trained in parallel on the training set. The second nested operator is the classifier model, which learns from the responses of the base models (i.e. it builds a model of the models' responses). In the example, the base models used are a decision tree, the k-NN algorithm, a neural netw...

Created: 2011-05-25 | Last updated: 2011-05-25

Workflow Iterate over Attribute Subsets and Store A... (1)

This process iterates over all possible feature subsets and stores a) the names of all attribute subsets, b) the number of used features, and c) the achieved performance in a log table which can then be further analyzed.

Created: 2011-07-07

Workflow Semantic clustering (with k-medoids) of SP... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/) to perform clustering of SPARQL query results based on a chosen semantic similarity measure. Since the semantics of the background ontology is used in this way, we use the name "semantic clustering". The SPARQL query is entered in a parameter of the "SPARQL selector" operator. The clustering operator (k-medoids) allows specifying which of the query variables are to be used as clustering criteria. If more ...

Created: 2012-01-29

Workflow Semantic clustering (with AHC) of SPARQL q... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/) to perform clustering of SPARQL query results based on a chosen semantic similarity measure. The measure used in this particular workflow is a kernel that exploits the membership of clustered individuals in OWL classes from a background ontology (the "Common classes" kernel from [1]). Since the semantics of the background ontology is used in this way, we use the name "semantic clustering". ...

Created: 2012-01-29 | Last updated: 2012-01-29

Workflow Semantic clustering (with alpha-clustering... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/) to perform clustering of SPARQL query results based on a chosen semantic similarity measure. The measure used in this particular workflow is a kernel that exploits the membership of clustered individuals in OWL classes from a background ontology (the "Epistemic" kernel from [1]). Since the semantics of the background ontology is used in this way, we use the name "semantic clustering". This ...

Created: 2012-01-29 | Last updated: 2012-01-30

Workflow Loading OWL files (RDF version of videolec... (1)

The workflow uses the RapidMiner extension RMonto (http://semantic.cs.put.poznan.pl/RMonto/). The "Build knowledge base" operator is responsible for collecting data from OWL files, SPARQL endpoints, or RDF repositories and providing it to the subsequent operators in a workflow. In this workflow it is parameterized so that it builds a Sesame/OWLIM repository from the files specified in the "Load file" operators. Paths to OWL files are specified as parameter va...

Created: 2012-01-29 | Last updated: 2012-01-29

Workflow RCOMM Challenge 1: 99 bottles of beer (1)

At RCOMM 2010 (www.rcomm2010.org), an unusual competition was held. Titled "Who Wants to Be a Data Miner?", three challenges were issued to the participants of the conference. In all challenges, participants had to design RapidMiner processes as quickly as possible. This is the winning process of Challenge 1, "99 bottles of beer", by Sebastian Land. This was the task: design a process that produces an example set whose rows form the lyrics of the well-known song "99 bottles of beer...
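The task can be sketched in Python — generate the verses as rows of an example set. This sketch ignores the singular "bottle" edge case at the end of the song for brevity.

```python
# Sketch of the challenge: build the 99 rows of "99 bottles of beer"
# (simplified: no singular/plural handling).
def bottles(n=99):
    rows = []
    for i in range(n, 0, -1):
        rows.append(f"{i} bottles of beer on the wall, {i} bottles of beer. "
                    f"Take one down and pass it around, "
                    f"{i - 1} bottles of beer on the wall.")
    return rows

lyrics = bottles()
print(len(lyrics))  # 99 rows, one verse per row
```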

Created: 2010-09-17

Workflow Extended Operations for Nominal Values (1)

This process shows examples of the extended operations for nominal values coming with one of the next RapidMiner updates (5.0.011 or 5.1.000). The operations are performed with the operator "Generate Attributes" and can be used directly within the expressions for the new attributes. The supported functions include Number to String [str(x)], String to Number [parse(text)], Substring [cut(text, start, length)], Concatenation [concat(text1, text2, text3...)], Replace [replace(text, what, by)],...

Created: 2010-10-05

Workflow Content based recommender (1)

This process is a special case of the item-to-item similarity matrix based recommender, where the item-to-item similarity is calculated as cosine similarity over TF-IDF word vectors obtained from textual analysis of all the available textual data. The inputs to the process are context-defined macros: %{id} defines an item ID for which we would like to obtain recommendations and %{recommender_no} defines the required number of recommendations. The process internally uses an example set of...
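The cosine-similarity step can be sketched in Python over small word-vector dicts standing in for TF-IDF vectors; the documents and weights are illustrative.

```python
import math

# Sketch of cosine similarity over sparse word vectors (dicts), standing in
# for the TF-IDF vectors produced by the text-processing steps.
def cosine(u, v):
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vectors = {
    "doc1": {"data": 1.0, "mining": 0.5},
    "doc2": {"data": 0.8, "mining": 0.6},
    "doc3": {"sports": 1.0},
}

query = "doc1"
ranked = sorted((cosine(vectors[query], vectors[k]), k)
                for k in vectors if k != query)
print(ranked[-1][1])  # the item most similar to doc1
```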

Created: 2011-03-15 | Last updated: 2011-03-15

Workflow Semantic clustering with k-Medoids and ALC... (1)

This workflow loads data from a configuration file for DL-Learner (http://dl-learner.org) and uses the ALCN Semantic Kernel [1] to cluster those data with the k-Medoids algorithm. [1] N. Fanizzi, C. d'Amato, F. Esposito. Learning with Kernels in Description Logics. ILP 2008

Created: 2012-05-30 | Last updated: 2012-06-07

Workflow Evaluating Multiple Models with Looped X-V... (1)

This process shows how multiple different models can be evaluated with cross-validation runs. This allows for the comparison of, for example, three prediction model types (e.g. ANN, SVM, and DT) all at once under an x-fold cross-validation. The process actually performs the same cross-validation several times, using a different modeling scheme each time. It makes use of loops, collections, subprocess selection and macros and is therefore also an interesting showcase for more ...

Created: 2010-05-15

Workflow Pivoting with Consideration of Example Wei... (1)

This process shows how the Pivoting operator can also consider example weights encoded by a special attribute with the role 'weight'. The first operators only create a demo data set for this purpose; the actual pivoting is performed by the last operator only. I recommend placing a breakpoint before the Pivoting operator and checking the data (structure) in order to see how the Pivoting handles the weights.

Created: 2010-07-07

Uploader: Gb Awc

Workflow Find and replace missing values with other... (1)

Sometimes it is helpful to replace a missing value in a numerical attribute with a value from another attribute in the same example row, rather than using a constant such as an average or maximum value. This process generates some missing values, detects them, and replaces them with a calculation based on the value of another attribute. (Purists will be alarmed at the mathematical impossibility of taking the square root of a negative number.) The process detects missing values by subtracti...
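The replacement step can be sketched in Python: where attribute 'a' is missing (None stands in for a missing value), fill it with a value computed from another attribute in the same row. Attribute names and the square-root formula are illustrative, matching the description's example.

```python
import math

# Sketch: fill a missing value from a calculation on another attribute
# in the same example row, instead of a constant like the mean.
data = [{"a": None, "b": 16.0}, {"a": 2.0, "b": 9.0}]

for row in data:
    if row["a"] is None:
        row["a"] = math.sqrt(row["b"])  # derived from the other attribute

print(data[0]["a"])  # 4.0, computed from b = 16.0
```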

Created: 2010-08-25 | Last updated: 2010-08-25

Workflow Cross tabulation via aggregation and pivoting (1)

Creates a contingency table using the Aggregate and Pivot operators.
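The aggregate-then-pivot idea can be sketched in Python: count (row, column) value pairs, then lay the counts out as a contingency table. The data is illustrative.

```python
from collections import Counter

# Sketch of cross tabulation: aggregate pair counts, then pivot them into
# a nested dict acting as the contingency table.
records = [("yes", "sunny"), ("yes", "rain"), ("no", "sunny"), ("yes", "sunny")]
counts = Counter(records)

rows = sorted({r for r, _ in records})
cols = sorted({c for _, c in records})
table = {r: {c: counts.get((r, c), 0) for c in cols} for r in rows}
print(table["yes"]["sunny"])  # 2
```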

Created: 2010-08-25 | Last updated: 2010-08-25

Workflow Constructing user defined linear regressio... (1)

This process shows how one can construct a user-defined linear regression model using the Execute Script operator.

Created: 2010-09-02 | Last updated: 2010-09-02

Workflow 111 (1)

This process shows how several different classifiers could be graphically compared by means of multiple ROC curves.

Created: 2010-09-04 | Last updated: 2010-09-04

Workflow 222 (1)

This process shows how several different classifiers could be graphically compared by means of multiple ROC curves.

Created: 2010-09-04

Workflow 11111 (1)

Reads collections of text from a set of directories, assigning each directory to a class (as specified by parameter text_directories), and transforms them into a TF-IDF or other word vector. Finally, an SVM is applied to model the input texts.

Created: 2010-09-04

Workflow Defining positive class with Remap Binominal (1)

This process shows how one can use the Remap Binominal operator to define which label value is treated as the positive class.

Created: 2010-05-11

Workflow Setting an attribute value in a specific E... (1)

This process first generates an artificial dataset and tags it with IDs. Then a Filter Examples operator is used to get a dataset with exactly one example, identified by its ID. Then a value is set in this example. Since the change in the data will be reflected in all views of the example set, a simple copy is passed to the process output. If you take a look at the attributes of the example with ID 5, you will find the value 12 there.

Created: 2010-05-14

Workflow Pivoting (1)

This process shows the basics of pivoting. A data set with three columns is loaded and partially generated. Afterwards, the data is rotated and missing values are replaced by zero.

Created: 2010-05-15

Workflow Convert Nominal to Binominal to Numerical (1)

This is a standard preprocessing subprocess that takes nominal (categorical) attributes and introduces binominal dummy attributes, which are then transformed to numerical attributes that can be used by learning schemes like SVM or Logistic Regression.
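The dummy-coding chain can be sketched in Python: each nominal value becomes a binominal indicator attribute, encoded numerically as 0/1. Attribute names and values are illustrative.

```python
# Sketch of nominal -> binominal -> numerical: replace a categorical
# attribute with one 0/1 indicator per observed value.
def dummy_encode(rows, attr):
    values = sorted({row[attr] for row in rows})
    out = []
    for row in rows:
        new = {k: v for k, v in row.items() if k != attr}
        for v in values:
            new[f"{attr}={v}"] = 1 if row[attr] == v else 0
        out.append(new)
    return out

rows = [{"Outlook": "sunny"}, {"Outlook": "rain"}, {"Outlook": "sunny"}]
print(dummy_encode(rows, "Outlook")[0])  # {'Outlook=rain': 0, 'Outlook=sunny': 1}
```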

Created: 2010-05-15

Workflow Discard Attribute with More than x% Missin... (1)

This process loops over all attributes and calculates the fraction of missing values for each attribute. If this fraction is larger than the fraction defined in the first "Set Macro" operator (macro: max_unknown), the attribute is removed from the example set.
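The attribute loop can be sketched in Python: compute the missing fraction per attribute (None stands in for a missing value) and drop those above max_unknown. Data and threshold are illustrative.

```python
# Sketch: drop every attribute whose fraction of missing values exceeds
# the max_unknown threshold (here, attribute 'b' is 100% missing).
data = [
    {"a": 1, "b": None, "c": 3},
    {"a": None, "b": None, "c": 4},
    {"a": 2, "b": None, "c": 5},
]
max_unknown = 0.5

attrs = list(data[0])
keep = [a for a in attrs
        if sum(row[a] is None for row in data) / len(data) <= max_unknown]
filtered = [{a: row[a] for a in keep} for row in data]
print(keep)  # ['a', 'c'] — 'b' is dropped
```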

Created: 2010-05-15

Workflow Write/Store Correlation Matrix to an Excel... (1)

This process demonstrates how to write/store/export the correlation matrix of the "Correlation Matrix" operator to an Excel file. One needs to generate a report with the "Generate Report" operator and export the report to an Excel file.

Created: 2010-05-19

Workflow Apply Same Preprocessing to Learning and V... (1)

This process demonstrates how to apply the same preprocessing workflow to learning data and test/validation data. For clarity, the preprocessing is hidden in a Preprocessing subprocess. To perform the same preprocessing workflow on the learning and testing data, both are collected into a collection of data. Then the "Loop Collection" operator loops over each collection. The actual preprocessing is done inside the "Loop Collection" operator. In the example only a "Rename" is done...

Created: 2010-05-19

Workflow Filter Wrong Predicted/Classified Samples (1)

This process demonstrates how to filter validation samples which are predicted incorrectly. The key operator is "Filter Examples", set up with the condition "wrong_predictions".

Created: 2010-05-19 | Last updated: 2010-06-08

Workflow Exclude a Attribute/Variable (1)

This simple process demonstrates how to exclude an attribute or a set of attributes from an example set/data set. In particular, the first "Select Attributes" filter excludes the Outlook attribute. The second filter excludes the subset {Outlook, Humidity}.

Created: 2010-05-20

Workflow Missing value count (1)

This workflow enables users to filter examples by the number of missing values they contain; it inserts the numeric attribute 'Missings', representing, for each example, the count of attributes with missing values.
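The 'Missings' attribute can be sketched in Python: count missing values (None here) per example, then filter on that count. Data and threshold are illustrative.

```python
# Sketch: add a per-example 'Missings' count, then filter examples by it.
data = [
    {"a": 1, "b": None},
    {"a": None, "b": None},
    {"a": 3, "b": 4},
]

for row in data:
    row["Missings"] = sum(v is None for k, v in row.items() if k != "Missings")

filtered = [row for row in data if row["Missings"] <= 1]
print(len(filtered))  # 2 rows survive the at-most-one-missing filter
```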

Created: 2010-05-27 | Last updated: 2010-07-13

Workflow Linear Regression of Italian bookshops se... (1)

Could someone help me? Why is the correlation between the actual value and the predicted value of the attribute "Quantita" so low?

Created: 2010-05-28
