Matchbox Evaluation

Created: 2012-10-02 12:37:08      Last updated: 2012-10-02 12:40:04

Evaluates Matchbox output against ground truth. The evaluation process first creates the Matchbox output list and the ground truth list. Each page tuple in the Matchbox output that also appears in the ground truth is counted as a correctly identified tuple (true positive). Tuples in the Matchbox output that are not in the ground truth are counted as incorrectly identified tuples (false positives), and tuples in the ground truth that are missing from the Matchbox output are counted as missed tuples (false negatives).

Precision is the number of true positives (i.e. page pairs correctly labeled as duplicates) divided by the total number of pairs labeled as duplicates (i.e. the sum of true positives and false positives, the latter being pairs incorrectly labeled as duplicates). Recall is the number of true positives divided by the total number of actual duplicate page pairs (i.e. the sum of true positives and false negatives, the latter being pairs that were not labeled as duplicates but should have been).

The ground truth contains single page instances without duplicates and n-tuples (pairs, triples, quadruples, etc.). n-tuples with n > 2 are expanded into a list of 2-tuples, which is used to determine the number of missed duplicates (false negatives).
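The counting scheme described above can be illustrated with a minimal Python sketch. It is not the workflow's actual Beanshell code. The function names (`expand_to_pairs`, `evaluate`), the page identifiers, and the assumption that page pairs are unordered are all hypothetical choices made for illustration.

```python
from itertools import combinations


def expand_to_pairs(groups):
    """Expand ground truth n-tuples into a set of unordered 2-tuples."""
    pairs = set()
    for group in groups:
        # Single pages (n == 1) have no duplicates and contribute no pairs.
        for a, b in combinations(sorted(set(group)), 2):
            pairs.add((a, b))
    return pairs


def evaluate(matchbox_pairs, ground_truth_groups):
    """Count TP/FP/FN and derive precision and recall.

    Assumption: pairs are unordered, so ("a", "b") equals ("b", "a").
    """
    predicted = {tuple(sorted(pair)) for pair in matchbox_pairs}
    truth = expand_to_pairs(ground_truth_groups)

    tp = len(predicted & truth)   # in output and in ground truth
    fp = len(predicted - truth)   # in output, not in ground truth
    fn = len(truth - predicted)   # in ground truth, missed by output

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}


# Hypothetical data: one single page, one pair, one triple.
ground_truth = [("p1",), ("p2", "p3"), ("p4", "p5", "p6")]
matchbox_output = [("p3", "p2"), ("p4", "p5"), ("p1", "p7")]

result = evaluate(matchbox_output, ground_truth)
print(result)
```

In this example the triple expands into three pairs, so the ground truth holds four duplicate pairs in total. Two of them are found, one reported pair is wrong, and two pairs are missed.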

Run

Run this Workflow in the Taverna Workbench...

Option 1:

Copy and paste this link into File > 'Open workflow location...'
http://www.myexperiment.org/workflows/3212/download?version=1

Run this Workflow on the cloud with OnlineHPC...

Click the link below to visit OnlineHPC
http://onlinehpc.com/workflows/editor?provider=myexperiment&workflowId=3212


Workflow Components

Authors (1)
Titles (1)
Descriptions (1)
Dependencies (2)
Inputs (2)
Processors (10)
Beanshells (3)
Outputs (6)
Datalinks (24)
Coordinations (0)

Workflow Type

Taverna 2

Version 1 (of 1)
