Change Class Distribution of Your Training Data Set by Filtering and Sampling / Normalize Class Distribution / Stratification

Created: 2011-01-21 14:57:11 Last updated: 2011-01-21 14:57:12

This example process shows how to change the class distribution of your training data set (in this case, the training data is whatever comes out of the "myData reader").

The given training set contains 10 "Iris-setosa" examples, 40 "Iris-versicolor" examples and 50 "Iris-virginica" examples. The aim is to obtain a data set with a chosen class distribution for the label, let's say 10 "Iris-setosa", 20 "Iris-versicolor" and 20 "Iris-virginica".
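Outside RapidMiner, the same filter-then-sample idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow itself: the function name `resample_by_class`, the toy `(id, label)` tuples and the fixed seed are all hypothetical, and sampling is done without replacement.

```python
import random
from collections import Counter, defaultdict


def resample_by_class(examples, label_of, target_counts, seed=0):
    """Filter examples by class, then draw target_counts[label] from each class.

    Hypothetical helper for illustration; samples without replacement.
    """
    rng = random.Random(seed)

    # Step 1: filter, i.e. split the data set into one pool per class label.
    by_class = defaultdict(list)
    for example in examples:
        by_class[label_of(example)].append(example)

    # Step 2: sample the requested number of examples from each pool.
    subset = []
    for label, wanted in target_counts.items():
        pool = by_class.get(label, [])
        if wanted > len(pool):
            raise ValueError(
                f"class {label!r} has only {len(pool)} examples, {wanted} requested"
            )
        subset.extend(rng.sample(pool, wanted))

    # Step 3: shuffle so the classes are not stored in blocks.
    rng.shuffle(subset)
    return subset


# Toy stand-in for the training data: (id, label) pairs with the
# 10 / 40 / 50 distribution described in the text.
original_counts = {"Iris-setosa": 10, "Iris-versicolor": 40, "Iris-virginica": 50}
data = [(i, label) for label, n in original_counts.items() for i in range(n)]

target_counts = {"Iris-setosa": 10, "Iris-versicolor": 20, "Iris-virginica": 20}
subset = resample_by_class(data, lambda example: example[1], target_counts)
counts = Counter(label for _, label in subset)
```

Asking for more examples than a class contains raises an error here instead of silently oversampling; sampling with replacement would be a different design choice.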

Beware that this may change some properties of the data, so a model trained on this subset but applied to a set with the initial structure may be biased.
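To make the source of that bias concrete, the short Python sketch below compares the class proportions before and after resampling: "Iris-setosa" grows from 10% to 20% of the data, while "Iris-virginica" drops from 50% to 40%. The `adjust` function shows one common correction, rescaling predicted class probabilities by the ratio of original to training priors. It is an assumption of this sketch, not part of the workflow, and it only holds if the per-class feature distributions are unchanged by the sampling.

```python
def proportions(counts):
    """Turn absolute class counts into class proportions."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


original_counts = {"Iris-setosa": 10, "Iris-versicolor": 40, "Iris-virginica": 50}
resampled_counts = {"Iris-setosa": 10, "Iris-versicolor": 20, "Iris-virginica": 20}

p_original = proportions(original_counts)   # class priors of the initial set
p_training = proportions(resampled_counts)  # class priors the model is trained on

# Ratio of original to training prior for each class.
prior_ratio = {label: p_original[label] / p_training[label] for label in p_original}


def adjust(posterior):
    """Rescale predicted class probabilities towards the original priors.

    Hypothetical helper: assumes only the class priors differ between
    the resampled training set and the data the model is applied to.
    """
    scaled = {label: p * prior_ratio[label] for label, p in posterior.items()}
    norm = sum(scaled.values())
    return {label: value / norm for label, value in scaled.items()}


# Example: a prediction made by a model trained on the resampled subset.
adjusted = adjust(
    {"Iris-setosa": 0.2, "Iris-versicolor": 0.4, "Iris-virginica": 0.4}
)
```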

See also the "Same Number of Examples per Class" process on myExperiment: http://www.myexperiment.org/workflows/1315.html

Tags: Rapidminer, sample, filter, class distribution, normalization, data set, label, label distribution, training, training data, Stratification

Information Uploader

Sebastian Loh

Information License

All versions of this Workflow are licensed under:

Information Version 1 (of 1)

Information Statistics

2024 viewings

1258 downloads

Version History

In chronological order:

Change Class Distribution of Your Training Data Set by Filtering and Sampling / Normalize Class Distribution / Stratification

Created by Sebastian Loh on Friday 21 January 2011 14:57:11 (UTC)
