Workflow Entry: Change Class Distribution of Your Training Data Set by Filtering and Sampling / Normalize Class Distribution / Stratification

Created at: 21/01/11 @ 14:57:11      Last updated: 21/01/11 @ 14:57:12
Information Version 1 (of 1)

Version created on: 21/01/11 @ 14:57:11 by: Sebastian Loh

Title: Change Class Distribution of Your Training Data Set by Filtering and Sampling / Normalize Class Distribution / Stratification

Type: RapidMiner


Information Description

This example process shows how to change the class distribution of your training data set (in this case the training data is whatever comes out of the "myData reader").

The given training set has a distribution of 10 "Iris-setosa" examples, 40 "Iris-versicolor" examples and 50 "Iris-virginica" examples. The aim is to obtain a data set with a chosen class distribution for the label, let's say 10 "Iris-setosa", 20 "Iris-versicolor" and 20 "Iris-virginica".
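The filter-and-sample idea can be sketched outside RapidMiner as well. The following Python snippet is only an illustration under assumptions, not the uploaded process itself: the function name resample_by_class, the toy (label, id) tuples and the fixed seed are all hypothetical stand-ins for the output of the "myData reader". It splits the examples by label and then draws the requested number from each class without replacement.

```python
import random
from collections import Counter, defaultdict


def resample_by_class(examples, label_of, target_counts, seed=0):
    """Filter examples per class, then sample a fixed number from each class.

    Hypothetical helper for illustration; not part of RapidMiner.
    """
    rng = random.Random(seed)

    # "Filter" step: group the examples by their label.
    by_class = defaultdict(list)
    for example in examples:
        by_class[label_of(example)].append(example)

    # "Sample" step: draw the requested number per class, without replacement.
    result = []
    for label, wanted in target_counts.items():
        pool = by_class.get(label, [])
        if wanted > len(pool):
            raise ValueError(
                f"class {label!r} has only {len(pool)} examples, need {wanted}"
            )
        result.extend(rng.sample(pool, wanted))

    # Shuffle so the classes are not stored in contiguous blocks.
    rng.shuffle(result)
    return result


# Toy data with the distribution from the description: 10 / 40 / 50.
data = (
    [("Iris-setosa", i) for i in range(10)]
    + [("Iris-versicolor", i) for i in range(40)]
    + [("Iris-virginica", i) for i in range(50)]
)

# Desired distribution: 10 / 20 / 20.
target = {"Iris-setosa": 10, "Iris-versicolor": 20, "Iris-virginica": 20}

subset = resample_by_class(data, lambda example: example[0], target)
counts = Counter(label for label, _ in subset)
print(counts)
```

Sampling without replacement means a class can only be reduced or kept at its current size, never enlarged; the sketch raises an error rather than silently returning fewer examples than requested.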

Beware that this may change some properties of the data, so a model trained on this subset but applied to data with the original class distribution may be biased.
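To make the caveat concrete, the share of each class before and after resampling can be compared directly. The snippet below is a small arithmetic sketch using only the counts given in the description; the variable and function names are hypothetical. A ratio above 1 means the class is over-represented in the subset relative to the original data, a ratio below 1 means it is under-represented.

```python
# Class counts taken from the description of the process.
original = {"Iris-setosa": 10, "Iris-versicolor": 40, "Iris-virginica": 50}
resampled = {"Iris-setosa": 10, "Iris-versicolor": 20, "Iris-virginica": 20}


def proportions(counts):
    """Turn absolute class counts into relative class shares."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


p_orig = proportions(original)
p_new = proportions(resampled)

# Ratio of the new share to the original share, per class.
shift = {label: p_new[label] / p_orig[label] for label in original}

for label in original:
    print(f"{label}: {p_orig[label]:.2f} -> {p_new[label]:.2f} (x{shift[label]:.2f})")
```

In this example "Iris-setosa" doubles its share of the data (from 10% to 20%), while "Iris-virginica" drops from 50% to 40%, which is the kind of shift a model trained on the subset will have absorbed.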

See also the "Same Number of Examples per Class" process here on myExperiment http://www.myexperiment.org/workflows/1315.html

Tags: Rapidminer, sample, filter, class distribution, normalization, data set, label, label distribution, training, training data, Stratification



Information Workflow Components

Inputs (0)
Operators (17)
Outputs (0)

Information License

All versions of this Workflow are licensed under:

Version History

Earliest Version:
[1] - Change Class Distribution of Your Training Data Set by Filtering and Sampling / Normalize Class Distribution / Stratification

Created on: Friday 21 January 2011 @ 14:57:11 (GMT)

Created by: Sebastian Loh

Revision comments:

None

This Workflow only has one version.



Linked Data

Non-Information Resource URI: http://www.myexperiment.org/workflows/1775

