Change Class Distribution of Your Training Data Set by Filtering and Sampling / Normalize Class Distribution / Stratification
This example process shows how to change the class distribution of your training data set (in this case the training data is whatever comes out of the "myData reader").
The given training set contains 10 "Iris-setosa" examples, 40 "Iris-versicolor" examples and 50 "Iris-virginica" examples. The aim is to get a data set with a chosen class distribution for the label, for example 10 "Iris-setosa", 20 "Iris-versicolor" and 20 "Iris-virginica" examples.
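As a rough illustration outside RapidMiner, the same filter-and-sample idea can be sketched in plain Python: split the examples by label, then draw a fixed number from each group without replacement. The function name sample_per_class, the label_of accessor, the seed, and the toy (id, label) data are assumptions made for this sketch; they are not part of the workflow itself.

```python
import random
from collections import Counter, defaultdict


def sample_per_class(examples, label_of, target_counts, seed=0):
    """Return a subset holding exactly target_counts[label] examples per label."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable

    # "Filter" step: group the examples by their label.
    by_label = defaultdict(list)
    for ex in examples:
        by_label[label_of(ex)].append(ex)

    # "Sample" step: draw the requested number from each group, without replacement.
    subset = []
    for label, n in target_counts.items():
        pool = by_label.get(label, [])
        if n > len(pool):
            raise ValueError(f"only {len(pool)} examples of {label!r}, need {n}")
        subset.extend(rng.sample(pool, n))

    rng.shuffle(subset)  # avoid leaving the subset sorted by class
    return subset


# Toy stand-in for the training set: (id, label) pairs with the 10/40/50 distribution.
original_counts = {"Iris-setosa": 10, "Iris-versicolor": 40, "Iris-virginica": 50}
data = [(f"{label}-{i}", label) for label, n in original_counts.items() for i in range(n)]

target = {"Iris-setosa": 10, "Iris-versicolor": 20, "Iris-virginica": 20}
subset = sample_per_class(data, label_of=lambda ex: ex[1], target_counts=target)
print(Counter(label for _, label in subset))
```

Requesting more examples of a class than are available raises an error here rather than silently oversampling; that is a design choice of the sketch.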
Beware that this changes the class proportions (and possibly other properties) of the data, so a model trained on this subset but applied to a set with the initial distribution may be biased.
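To make the caveat concrete, the short Python sketch below compares the class proportions before and after sampling for the counts used in this example. The per-class weight (original proportion divided by sampled proportion) is one common way to compensate for such a shift; it is an assumption added here for illustration and is not something the workflow does.

```python
# Class counts from the example: before and after sampling.
original = {"Iris-setosa": 10, "Iris-versicolor": 40, "Iris-virginica": 50}
sampled = {"Iris-setosa": 10, "Iris-versicolor": 20, "Iris-virginica": 20}


def priors(counts):
    """Turn absolute class counts into class proportions."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


p_orig = priors(original)
p_samp = priors(sampled)

# Hypothetical correction weights: how much each class would need to be
# up- or down-weighted to match the original proportions again.
weights = {label: p_orig[label] / p_samp[label] for label in original}

for label in original:
    print(f"{label}: original {p_orig[label]:.2f}, "
          f"sampled {p_samp[label]:.2f}, weight {weights[label]:.2f}")
```

In the sampled set "Iris-setosa" makes up 20% of the examples instead of 10%, while "Iris-virginica" drops from 50% to 40%, so a model trained on the subset sees "Iris-setosa" twice as often as it would in data with the initial distribution.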
See also the "Same Number of Examples per Class" process here on myExperiment http://www.myexperiment.org/workflows/1315.html
Tags: Rapidminer, sample, filter, class distribution, normalization, data set, label, label distribution, training, training data, Stratification