User: Ingo Mierswa

Name: Ingo Mierswa

Joined: Thursday 05 November 2009 09:04:28 (UTC)

Last seen: Thursday 17 January 2013 19:19:53 (UTC)

Email (public): Not specified

Website: http://www.rapid-i.com

Location: Dortmund, Germany

Ingo Mierswa has been credited 0 times

Ingo Mierswa has an average rating of 0.0 / 5 (0 ratings in total) for their items

Description/summary not set


Other contact details:

Not specified

Interests:

Not specified

Field/Industry: IT

Occupation/Role(s): CEO

Organisation(s):

Rapid-I GmbH
e-LICO

 

Note: some items may not be visible to you, due to viewing permissions.


Workflow Handling data and time example (1)

This process shows how date and time formats can be converted from arbitrary formats to other arbitrary formats with the operators "Nominal to Date" and "Date to Nominal".

Created: 2011-11-28
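
Outside RapidMiner, the same parse-then-format idea can be sketched in Python. This is only an illustration: the helper name `convert_date_format` and the sample format strings are assumptions made for this example and are not taken from the process itself.

```python
from datetime import datetime


def convert_date_format(text, source_format, target_format):
    """Parse text with one format (like "Nominal to Date") and
    write it back with another (like "Date to Nominal")."""
    parsed = datetime.strptime(text, source_format)
    return parsed.strftime(target_format)


# Hypothetical input: a German-style timestamp converted to an ISO-like layout.
converted = convert_date_format(
    "28.11.2011 14:30", "%d.%m.%Y %H:%M", "%Y-%m-%d %H:%M"
)
print(converted)
```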

Workflow Plotting Training vs Testing Error (Loop +... (1)

This process increases the parameter C of a support vector machine and hence also the risk for overfitting. It uses an outer loop operator for increasing the parameter value and an inner log operator for storing the current number of applications together with the current errors on the training and the testing data set. At the end of the process, the log data can be plotted (for example with the plotter "Scatter Multiple" with "Count" on the x-axis and both "Training Error" and "Testing Error...

Created: 2011-08-15

Workflow Iterate over Attribute Subsets and Store A... (1)

This process iterates over all possible feature subsets and stores a) the names of all attribute subsets, b) the number of used features, and c) the achieved performance in a log table which can then be further analyzed.

Created: 2011-07-07
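
The enumeration and logging described above might look roughly like this in Python. The function names, the column names of the log table, and the placeholder scoring function are all hypothetical; a real run would plug in a learner and a validation scheme, and this sketch assumes the empty subset is skipped.

```python
from itertools import combinations


def log_attribute_subsets(attributes, evaluate):
    """Try every non-empty attribute subset and log its name, size and score."""
    log = []
    for size in range(1, len(attributes) + 1):
        for subset in combinations(attributes, size):
            log.append({
                "subset": ", ".join(subset),
                "num_features": size,
                "performance": evaluate(subset),
            })
    return log


def toy_score(subset):
    # Placeholder performance measure (assumption): total length of the names.
    return sum(len(name) for name in subset)


log_table = log_attribute_subsets(["a1", "a2", "a3"], toy_score)
for row in log_table:
    print(row)
```

With n attributes the log has 2^n - 1 rows, so this brute-force approach is only practical for small attribute sets.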

Workflow Trim and Replace White Space (1)

This process removes leading and trailing white spaces from an attribute and replaces inner white spaces of any type by a single space. Might be useful for text processing (if you are not using tokenization anyway...)

Created: 2011-07-07
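
For comparison, the same clean-up can be sketched in a few lines of Python. The function name `normalize_whitespace` is an assumption for this example; the RapidMiner process itself may be built differently.

```python
import re


def normalize_whitespace(text):
    """Strip leading/trailing white space and collapse inner runs
    of any white space (spaces, tabs, newlines) to one space."""
    return re.sub(r"\s+", " ", text.strip())


print(normalize_whitespace("  hello \t\n  world  "))
```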

Workflow Create Linear Combinations (1)

This process creates new attributes based on all combinations of coefficients for two attributes, i.e. it produces all linear combinations c1 * att1 + c2 * att2 by looping through a set of possible values for c1 and c2 and all their different combinations.

Created: 2011-04-08
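
A minimal Python sketch of the nested loop over c1 and c2 follows. The function name, the naming scheme for the generated attributes, and the sample coefficient values are assumptions for illustration.

```python
from itertools import product


def linear_combinations(att1, att2, coefficients):
    """Build one new attribute c1 * att1 + c2 * att2 per (c1, c2) pair."""
    generated = {}
    for c1, c2 in product(coefficients, repeat=2):
        name = f"{c1}*att1 + {c2}*att2"  # hypothetical naming scheme
        generated[name] = [c1 * a + c2 * b for a, b in zip(att1, att2)]
    return generated


combos = linear_combinations([1, 2], [3, 4], coefficients=[0, 1])
for name, values in combos.items():
    print(name, values)
```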

Workflow Rocchio (1)

This process calculates the average values for all attributes grouped by the class (label). The resulting prototypes or centroids are then used for a nearest neighbor model which is applied again on the full data set for demonstration purposes.

Created: 2011-02-28
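
The centroid idea can be illustrated with a short Python sketch. It assumes purely numerical attributes and Euclidean distance; the function names and the toy data are hypothetical, and the actual process may use a different distance measure.

```python
import math
from collections import defaultdict


def class_centroids(rows, labels):
    """Average every attribute per class to get one prototype per label."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {
        label: [sum(column) / len(members) for column in zip(*members)]
        for label, members in grouped.items()
    }


def predict(row, centroids):
    """Assign the label of the nearest centroid (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(row, centroids[label]))


rows = [[0, 0], [0, 2], [10, 10], [10, 12]]
labels = ["a", "a", "b", "b"]
centroids = class_centroids(rows, labels)
# Apply the prototypes to the full data set again, as the process does.
predictions = [predict(row, centroids) for row in rows]
print(centroids, predictions)
```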

Workflow Count co-occurences in matrix with Aggrega... (1)

This process counts all combinations of values of two attributes and shows them in a count matrix. The process uses the operators Aggregate and Pivot for this purpose.

Created: 2011-01-15
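
The aggregate-then-pivot pattern can be sketched in plain Python as follows. The function name and the sample values are assumptions; combinations that never occur are filled with zero here.

```python
from collections import Counter


def count_matrix(values_a, values_b):
    """Count each (a, b) value pair, then pivot the counts into a matrix."""
    counts = Counter(zip(values_a, values_b))  # the "Aggregate" step
    row_names = sorted(set(values_a))
    col_names = sorted(set(values_b))
    # The "Pivot" step: one row per value of a, one column per value of b.
    matrix = [[counts[(r, c)] for c in col_names] for r in row_names]
    return row_names, col_names, matrix


row_names, col_names, matrix = count_matrix(
    ["x", "x", "y", "x"], ["p", "q", "p", "p"]
)
print(row_names, col_names, matrix)
```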

Workflow Using R to add two columns (1)

This process shows a simple R script which adds two columns of a data set with R. Of course this is much simpler by using the operator "Generate Attributes" which is done in parallel, but maybe some of you find this short process helpful in order to get started with R.

Created: 2010-12-10 | Last updated: 2010-12-10

Workflow Automatical Disabling / Enabling of Operat... (1)

This meta process shows another possibility for automatically optimizing the process layout. The operator "OperatorEnabler" can be used to enable or disable one of its children. Together with one of the parameter optimization operators this can be used to check which operators should be used for optimal results. This is especially useful in order to determine which preprocessing operators should be used for a particular data set - learner combination or if you want to automatically test diffe...

Created: 2010-10-13

Workflow Extended Operations for Nominal Values (1)

This process shows examples for the extended operations for nominal values coming with one of the next RapidMiner updates (5.0.011 or 5.1.000). The operations are performed with the operator "Generate Attributes" and can be used directly within the expressions for the new attributes. The supported functions include Number to String [str(x)], String to Number [parse(text)], Substring [cut(text, start, length)], Concatenation [concat(text1, text2, text3...)], Replace [replace(text, what, by)],...

Created: 2010-10-05

Workflow Pivoting with Consideration of Example Wei... (1)

This process shows how the Pivoting operator can also consider example weights encoded by a special attribute with the role 'weight'. The first operators only create a demo data set for this purpose; the actual pivoting is performed by the last operator only. I recommend placing a breakpoint before the Pivoting and checking the data (structure) in order to see how the Pivoting handles the weights.

Created: 2010-07-07

Workflow Discretization into Deviation Interval aro... (1)

This process shows how numerical attributes can be discretized into intervals based on the standard deviation for each attribute around their mean values.

Created: 2010-06-22 | Last updated: 2010-06-22
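
One possible reading of this discretization is sketched below in Python. The description does not state the exact interval boundaries, so this sketch assumes bins that are one standard deviation wide and anchored at the mean, labeled by an integer offset. The function name is hypothetical.

```python
import math
from statistics import mean, pstdev


def discretize_by_deviation(values):
    """Map each value to the index of its one-sigma-wide interval.

    Index 0 covers [mean, mean + sigma), index -1 covers
    [mean - sigma, mean), and so on (an assumed binning scheme).
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0 for _ in values]  # constant attribute: a single bin
    return [math.floor((v - mu) / sigma) for v in values]


bins = discretize_by_deviation([2, 4, 4, 4, 5, 5, 7, 9])
print(bins)
```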

Workflow Same Number of Examples per Class (Stratif... (1)

This process can be used to sample examples from each class of the data set so that the number of examples per class is the same for all classes. The name of the label attribute is defined in the first "Set Macro" operator within the subprocess "Stratification". The result will be a stratified data set where each class is represented by the minimum number of examples for a single class minus 1 (due to calculation reasons in absolute sampling which is used here). The first two operators just...

Created: 2010-06-10
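
The balancing idea can be sketched in Python as below. Note one deliberate difference: this sketch samples exactly the size of the smallest class, whereas the process described above ends up with that size minus 1. The function name, the `label_of` accessor, and the fixed seed are assumptions for this example.

```python
import random
from collections import Counter, defaultdict


def balance_classes(examples, label_of, seed=0):
    """Sample the same number of examples from every class."""
    by_class = defaultdict(list)
    for example in examples:
        by_class[label_of(example)].append(example)
    # Every class is cut down to the size of the smallest class.
    target = min(len(members) for members in by_class.values())
    rng = random.Random(seed)  # fixed seed for reproducible samples
    balanced = []
    for members in by_class.values():
        balanced.extend(rng.sample(members, target))
    return balanced


data = [(1, "yes"), (2, "yes"), (3, "yes"), (4, "no"), (5, "no")]
balanced = balance_classes(data, label_of=lambda example: example[1])
print(Counter(label for _, label in balanced))
```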

Workflow Transform Attribute Names to lower Case (S... (1)

This process uses a Script operator which transforms the attribute names of the input example set into lower case.

Created: 2010-06-06

Workflow Transform Attribute Names to Upper Case (S... (1)

This process uses a Script operator which transforms the attribute names of the input example set into UPPER case.

Created: 2010-06-06

Workflow Creation of New Attribute Depending on Val... (1)

The process shows the usage of the operator "Generate Attributes" in combination with an "if - then - else" condition and nominal values. The values "value0" and "value1" are mapped to "T1", other values are mapped to "T2" for the new attribute.

Created: 2010-06-01

Workflow Discard Attribute with More than x% Missin... (1)

This process loops over all attributes and calculates the fraction of missing values for each attribute. If this fraction is larger than the fraction defined in the first "Set Macro" operator (macro: max_unknown), the attribute will be removed from the example set.

Created: 2010-05-15
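
The filtering rule can be sketched in Python as follows. The parameter name `max_unknown` mirrors the macro mentioned above; the function name, the column-dictionary layout, and the use of `None` for missing values are assumptions for illustration.

```python
def drop_sparse_attributes(columns, max_unknown):
    """Keep only attributes whose missing fraction is at most max_unknown.

    columns maps attribute names to equally long, non-empty value lists;
    None marks a missing value (an assumption of this sketch).
    """
    kept = {}
    for name, values in columns.items():
        missing_fraction = sum(v is None for v in values) / len(values)
        if missing_fraction <= max_unknown:
            kept[name] = values
    return kept


columns = {
    "a": [1, None, None, None],  # 75% missing
    "b": [1, 2, None, 4],        # 25% missing
}
filtered = drop_sparse_attributes(columns, max_unknown=0.5)
print(list(filtered))
```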

Workflow Convert Nominal to Binominal to Numerical (1)

This is a standard preprocessing subprocess which takes nominal (categorical) attributes and introduces binominal dummy attributes. These are then transformed to numerical attributes, which can be used by learning schemes like SVM or Logistic Regression.

Created: 2010-05-15
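
The dummy coding step can be illustrated with a small Python sketch. The function name and the naming pattern for the generated attributes are assumptions; the names RapidMiner generates may differ.

```python
def nominal_to_numerical(attribute_name, values):
    """Create one 0/1 dummy attribute per distinct nominal value."""
    categories = sorted(set(values))
    return {
        f"{attribute_name} = {category}": [
            1 if value == category else 0 for value in values
        ]
        for category in categories
    }


dummies = nominal_to_numerical("color", ["red", "green", "red"])
for name, column in dummies.items():
    print(name, column)
```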

Workflow Pivoting (1)

This process shows the basics of Pivoting. A data set with three columns is loaded and partially generated. Afterwards, the data is rotated and missings are replaced by zero.

Created: 2010-05-15

Workflow Evaluating Multiple Models with Looped X-V... (1)

This process shows how multiple different models can be evaluated with cross validation runs. This allows for the comparison of, for example, three prediction model types (e.g., ANN, SVM and DT) all at once under an x-fold cross validation. Having said that, the process actually performs the same cross validation several times using each time a different modeling scheme. It makes use of loops, collections, subprocess selection and macros and is therefore also an interesting showcase for more ...

Created: 2010-05-15

Workflow Weighted Score Tables (1)

This process calculates a weighted score as known from SAP BW. The first operator generates data similar to that used in the links you provided. The next operator (Discretize) defines the different age groups. The next three operators map each age group to the value used for scoring. The last operator finally calculates the score and adds it as a new column to your data set.

Created: 2010-05-05
