This process determines the best value of the parameter k for k-NN classification of the Breast Cancer Wisconsin (Diagnostic) data set, available in the UCI Machine Learning Repository. The optimal k is found using 10-fold cross-validation; to get more reliable results, each cross-validation is repeated 10 times and the results are averaged over the runs. Finally, a k-NN classifier is built with the optimal k and evaluated on the entire data set. During the process the resulting average ...

Created: 2012-09-28 | Last updated: 2012-09-28
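
The selection of k described above can be sketched roughly as follows. This is only an illustration in plain Python, not the workflow itself: the function names (knn_predict, cv_accuracy, best_k, make_toy_data) are hypothetical, Euclidean distance and accuracy are assumed as the distance measure and score, and a small synthetic two-class data set stands in for the Breast Cancer Wisconsin data.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points nearest to x (Euclidean)."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_accuracy(X, y, k, folds=10, seed=0):
    """Accuracy of k-NN under one shuffled cross-validation with `folds` folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]              # every folds-th shuffled index
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test)
    return correct / len(X)


def best_k(X, y, candidates, folds=10, repeats=10):
    """Return the k with the highest accuracy averaged over repeated CV runs."""
    scores = {
        k: sum(cv_accuracy(X, y, k, folds, seed=r) for r in range(repeats)) / repeats
        for k in candidates
    }
    return max(scores, key=scores.get), scores


def make_toy_data(n_per_class=40, seed=1):
    """Synthetic stand-in data (assumption): two Gaussian clusters in 2D."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    k, scores = best_k(X, y, candidates=range(1, 16, 2))
    print("best k:", k, "mean accuracy:", round(scores[k], 3))
```

Each repetition reshuffles the data with a different seed, so the averaged score depends less on one particular split into folds. Only odd values of k are tried here to avoid tied votes between the two classes.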


Workflow AIT Matchbox Scenario Find Duplicates usin... (1)

In this scenario, Matchbox finds duplicates in the digital collection passed to it. All Matchbox workflow steps are defined separately using the input parameter sequence: clean, extract, train, bowhist and compare. The user gets a list of duplicates as the result. In this scenario, Matchbox is installed on a remote Linux VM, and the digital collection is stored on a Windows machine.

Created: 2012-11-24

Credits: User Roman Network-member SCAPE

