ONB Web Archive Fits Characterisation using ToMaR

Created: 2013-12-09 15:58:54      Last updated: 2013-12-10 17:06:09

Hadoop-based workflow for applying FITS to the files contained in ARC web archive container files and ingesting the FITS output into MongoDB using C3PO.


- Spacip (https://github.com/shsdev/spacip)
- ToMaR (https://github.com/openplanets/tomar)
- C3PO (https://github.com/peshkira/c3po)


- hdfs_input_path: Path to a directory containing text file(s) that list absolute HDFS paths to ARC files
- num_files_per_invokation: Number of items to be processed per invocation
- fits_local_tmp_dir: Local directory where the FITS output XML files will be stored
- c3po_collection_name: Name of the C3PO collection
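As a rough illustration of how the four inputs fit together, the sketch below bundles them in a small Python record and splits a list of ARC paths into groups of `num_files_per_invokation`. The field names come from the input list above; the class `WorkflowInputs`, the helper `batches`, all example values, and the reading of "items per invocation" as simple batching are assumptions made for this sketch and are not part of the workflow itself.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class WorkflowInputs:
    """Hypothetical container for the four workflow input ports."""
    hdfs_input_path: str
    num_files_per_invokation: int  # spelling kept as in the workflow port name
    fits_local_tmp_dir: str
    c3po_collection_name: str

    def __post_init__(self) -> None:
        # A batch size below 1 would never make progress.
        if self.num_files_per_invokation < 1:
            raise ValueError("num_files_per_invokation must be at least 1")


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# All values below are made-up placeholders, not taken from the workflow.
inputs = WorkflowInputs(
    hdfs_input_path="/user/example/arc-path-lists",
    num_files_per_invokation=2,
    fits_local_tmp_dir="/tmp/fits-out",
    c3po_collection_name="example_collection",
)
arc_paths = ["/data/a.arc.gz", "/data/b.arc.gz", "/data/c.arc.gz"]
groups = list(batches(arc_paths, inputs.num_files_per_invokation))
```

With three placeholder paths and a batch size of 2, `groups` holds one group of two paths followed by one group with the remaining path.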

The workflow uses Spacip to unpack the ARC container files into HDFS and to create input files that ToMaR can use. After the mapper output files from Spacip are merged into one single file (MergeTomarInput), ToMaR invokes the FITS characterisation process as a MapReduce job. The tool invocation depends on a tool specification file, which must be available in HDFS; this is explained in the ToMaR documentation.
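The merge step can be pictured with a minimal local sketch. It assumes the mapper outputs are plain text files with one ToMaR input line each, named in the usual Hadoop `part-*` style, and it works on the local filesystem as a stand-in for HDFS. The function name `merge_part_files` and the file names are hypothetical; this is not the workflow's actual MergeTomarInput component.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def merge_part_files(part_files: Iterable[Path], merged: Path) -> int:
    """Concatenate the non-empty lines of all part files into `merged`.

    Part files are processed in sorted order; returns the number of
    lines written.
    """
    count = 0
    with merged.open("w", encoding="utf-8") as out:
        for part in sorted(part_files):
            with part.open("r", encoding="utf-8") as src:
                for line in src:
                    line = line.rstrip("\n")
                    if line:  # skip blank lines
                        out.write(line + "\n")
                        count += 1
    return count


# Demo with placeholder content in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "part-00000").write_text("line a\nline b\n", encoding="utf-8")
    (root / "part-00001").write_text("line c\n", encoding="utf-8")
    merged_path = root / "tomar_input.txt"
    written = merge_part_files(root.glob("part-*"), merged_path)
    merged_text = merged_path.read_text(encoding="utf-8")
```

Producing a single merged file means the subsequent ToMaR job reads one input file rather than many small ones.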


Workflow Components

Authors (1)
Titles (1)
Descriptions (1)
Dependencies (0)
Inputs (4)
Processors (6)
Beanshells (0)
Outputs (2)
Datalinks (10)
Coordinations (3)

Workflow Type

Taverna 2


Version 2 (latest) (of 2)
