Workflow: BowtieToPileup

Workflow inputs

  forewardReadsFileNames (depth 1)
    Description: A list of all forward reads files. These are generally the files starting with s_N_1, where N is the number of the pair.
    Example: /data2/Nasonia/s_1_1_sequence.txt

  reverseReadsFileNames (depth 1)
    Description: A list of all reverse reads files. These are generally the files starting with s_N_2, where N is the number of the pair.
    Example: /data2/Nasonia/s_1_2_sequence.txt

  alignmentBasename (depth 0)
    Description: Full path and name of the desired alignment file, but without an extension. A log file with the same name (with .log) will be created.
    Example: bowtie_alignment

  referenceGenome (depth 0)
    Description: Full path and name of the reference genome file.
    Example: /data2/Nasonia/ref_genome/Nvit_2.0.linear.fa

  indexBasename (depth 0)
    Description: Desired Bowtie index base name. Bowtie-build will generate 6 files with this base name: .1.ebwt, .2.ebwt, .3.ebwt, .4.ebwt, .rev.1.ebwt, and .rev.2.ebwt. Bowtie will use this index; the original reference genome file is no longer used by Bowtie.
    Example: Nvitripennis

  relativeIndexLocation (depth 0)
    Description: Location to write the index to, relative to the reference genome. Do not use leading slashes unless you wish to move up in the directory structure using '../'.
    Example: bowtie_index

  pileupBasename (depth 0)
    Description: Desired name for the pileup file (without path).
    Example: raw.pileup

Workflow outputs

  bowtie_err
    String that may contain some of Bowtie's error output.
  bowtieBuild_err
    String that may contain some of Bowtie's error output.
  Pileup_rawPileup
    Final workflow output: location and name of the pileup file.
  intermediate_bowtieIndexBasename
    Intermediate output for bowtie-build: the base name of the index files.
  intermediate_bowtie_samLocationAndBasename
    Intermediate output for Bowtie: the path and base name of the created SAM alignment file.
  intermediate_samToBam_bamName
    Intermediate output for samToBam: the resulting BAM file.
  intermediate_filterAndSort_sortedBamName
    Intermediate output for filterAndSort: the resulting BAM file.
  intermediate_refIndex
    Intermediate output for indexReference: the SAMtools index.

Processors

  Bowtie_build
    Inputs: referenceGenome (depth 0), indexBasename (depth 0), indexLocation (depth 0)
    Outputs: bowtieBuild_bowtieIndexBasename, err
    Description: bowtieBuild creates a Bowtie-specific index of the reference genome.
    Activity: net.sf.taverna.t2.activities.dataflow.DataflowActivity (dataflow-activity 1.2)
    Dispatch stack: Parallelize 1, ErrorBounce, Failover, Retry 1.0 1000 5000 0, Invoke

  Bowtie
    Inputs: bowtie_readsFileNames (depth 1), bowtie_reverseReadsFileNames (depth 1), bowtie_alignmentBasename (depth 0), bowtie_indexBasename (depth 0)
    Outputs: bowtie_err, bowtie_samLocationAndBasename
    Description: Bowtie aligns the reads to the reference genome.
    Activity: net.sf.taverna.t2.activities.dataflow.DataflowActivity (dataflow-activity 1.2)
    Dispatch stack: Parallelize 1, ErrorBounce, Failover, Retry 1.0 1000 5000 0, Invoke

  samToBam
    Inputs: samToBam_samName (depth 0)
    Outputs: samToBam_bamName
    Description: SAMtools SAM to BAM conversion.
    Activity: net.sf.taverna.t2.activities.dataflow.DataflowActivity (dataflow-activity 1.2)
    Dispatch stack: Parallelize 1, ErrorBounce, Failover, Retry 1.0 1000 5000 0, Invoke

  FilterAndSort
    Inputs: filterAndSort_bamName (depth 0)
    Outputs: sortedBamName
    Description: Filters all unaligned reads from the input BAM file and sorts the rest. Outputs the name and location of the filtered and sorted BAM file.
    Activity: net.sf.taverna.t2.activities.dataflow.DataflowActivity (dataflow-activity 1.2)
    Dispatch stack: Parallelize 1, ErrorBounce, Failover, Retry 1.0 1000 5000 0, Invoke

  Pileup
    Inputs: faidx (depth 0), pileupName (depth 0), referenceGenomeFilename (depth 0), sortedBamName (depth 0)
    Outputs: rawPileup
    Description: Generates a pileup (list with information on each genomic position) from a filtered and sorted BAM file.
    Activity: net.sf.taverna.t2.activities.dataflow.DataflowActivity (dataflow-activity 1.2)
    Dispatch stack: Parallelize 1, ErrorBounce, Failover, Retry 1.0 1000 5000 0, Invoke

  indexReference
    Inputs: referenceGenomeFilename (depth 0)
    Outputs: refIndex
    Description: indexReference creates a SAMtools-specific index of the reference genome.
    Activity: net.sf.taverna.t2.activities.dataflow.DataflowActivity (dataflow-activity 1.2)
    Dispatch stack: Parallelize 1, ErrorBounce, Failover, Retry 1.0 1000 5000 0, Invoke

Data links (source -> sink)

  referenceGenome -> Bowtie_build.referenceGenome
  indexBasename -> Bowtie_build.indexBasename
  relativeIndexLocation -> Bowtie_build.indexLocation
  forewardReadsFileNames -> Bowtie.bowtie_readsFileNames
  reverseReadsFileNames -> Bowtie.bowtie_reverseReadsFileNames
  alignmentBasename -> Bowtie.bowtie_alignmentBasename
  Bowtie_build.bowtieBuild_bowtieIndexBasename -> Bowtie.bowtie_indexBasename
  Bowtie.bowtie_samLocationAndBasename -> samToBam.samToBam_samName
  samToBam.samToBam_bamName -> FilterAndSort.filterAndSort_bamName
  indexReference.refIndex -> Pileup.faidx
  pileupBasename -> Pileup.pileupName
  referenceGenome -> Pileup.referenceGenomeFilename
  FilterAndSort.sortedBamName -> Pileup.sortedBamName
  referenceGenome -> indexReference.referenceGenomeFilename
  Bowtie.bowtie_err -> bowtie_err
  Bowtie_build.err -> bowtieBuild_err
  Pileup.rawPileup -> Pileup_rawPileup
  Bowtie_build.bowtieBuild_bowtieIndexBasename -> intermediate_bowtieIndexBasename
  Bowtie.bowtie_samLocationAndBasename -> intermediate_bowtie_samLocationAndBasename
  samToBam.samToBam_bamName -> intermediate_samToBam_bamName
  FilterAndSort.sortedBamName -> intermediate_filterAndSort_sortedBamName
  indexReference.refIndex -> intermediate_refIndex

Annotations

  Title: BowtieToPileup
  Author: WBKoetsier (2010-07-14 15:53:30.84 CEST)
  Earlier description (2010-07-14 15:54:07.365 CEST): Takes raw reads and reference genome and returns pileup.
  Description (latest revision, 2010-09-16 11:51:08.88 CEST):

    This example workflow aligns short sequencing reads to a reference genome using Bowtie and generates a SAMtools pileup file.

    By analysing an actual data set (SNP detection in N. vitripennis) and translating this analysis pipeline into a Taverna workflow, I was able to come up with an easy way of using Taverna for such analysis. I created a Java API (with my limited Java experience) that wraps the command line programs used in the analysis pipeline: Bowtie and some of the SAMtools. The data itself is not passed through Taverna or the API; only references to files are used. The API does not have a main entry point; instead, each step in the analysis pipeline is represented by a short Beanshell script that calls the appropriate method of the API. These scripts are used as services.

    This workflow is part of my bachelor's thesis (bioinformatics at the Hanze University Groningen, the Netherlands).

    Please note that Bowtie and the SAMtools need to be installed and in the path. The API needs to be present in the .taverna/lib directory; please check the dependencies of the Beanshell services. Assumes Linux.

Nested workflow: Pileup

  Inputs

    faidx (depth 0)
      Description: Name and full path of an existing SAMtools reference genome index (*.fai)
      Example: /path/to/refGenome.fa.fai

    pileupName (depth 0)
      Description: Desired base name for the pileup. Defaults to raw.pileup.
      Example: raw.pileup

    referenceGenomeFilename (depth 0)
      Description: Full path and name of reference genome.
      Example: /path/to/Nvit_2.0.linear.fa

    sortedBamName (depth 0)
      Description: Name and full path of alignment file in BAM format.
      Example: /path/to/sorted.bam
  Outputs

    rawPileup
      Description: Full path and name of pileup file.

  Processor: pileup
    Inputs: faidx (depth 0), pileupName (depth 0), referenceGenomeFilename (depth 0), sortedBamName (depth 0)
    Outputs: rawPileup (depth 0)
    Activity: net.sf.taverna.t2.activities.beanshell.BeanshellActivity (beanshell-activity 1.2)
    Class loader sharing: workflow
    Local dependency: NGSPipeline-0.3.jar
    Input types: sortedBamName, referenceGenomeFilename, pileupName, faidx are all java.lang.String, text/plain, depth 0
    Dispatch stack: Parallelize 1, ErrorBounce, Failover, Retry 1.0 1000 5000 0, Invoke

  Data links (source -> sink)
    faidx -> pileup.faidx
    pileupName -> pileup.pileupName
    referenceGenomeFilename -> pileup.referenceGenomeFilename
    sortedBamName -> pileup.sortedBamName
    pileup.rawPileup -> rawPileup

  Annotations
    Title: Pileup
    Author: WBKoetsier (2010-07-14 13:39:41.639 CEST)
    Description: Creates a pileup from the alignment file using 'samtools pileup -c'.
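The Pileup service is documented only through its ports and the command it wraps, so a small Python sketch of the equivalent command lines may help. It assumes the legacy SAMtools 0.1.x command line (the 'pileup' subcommand was removed from later SAMtools releases). The function names are hypothetical and are not taken from NGSPipeline-0.3.jar; passing the reference with -f and writing the pileup next to the sorted BAM file are both assumptions.

```python
import posixpath
import shlex


def faidx_command(reference_genome):
    """'samtools faidx' writes the index next to the FASTA as <reference>.fai."""
    return ["samtools", "faidx", reference_genome]


def pileup_path(sorted_bam, pileup_name="raw.pileup"):
    """Where the pileup would go. Assumption: same directory as the sorted BAM."""
    # The port description says the name defaults to raw.pileup when left blank.
    name = pileup_name or "raw.pileup"
    return posixpath.join(posixpath.dirname(sorted_bam), name)


def pileup_command(sorted_bam, reference_genome):
    """Legacy 'samtools pileup -c'; the pileup itself is written to stdout."""
    return ["samtools", "pileup", "-c", "-f", reference_genome, sorted_bam]


def as_shell(command, stdout_file):
    """Render a command list as one shell line with stdout redirected to a file."""
    quoted = " ".join(shlex.quote(arg) for arg in command)
    return quoted + " > " + shlex.quote(stdout_file)


if __name__ == "__main__":
    bam = "/path/to/sorted.bam"
    ref = "/path/to/Nvit_2.0.linear.fa"
    print(" ".join(faidx_command(ref)))
    print(as_shell(pileup_command(bam, ref), pileup_path(bam)))
```

The commands are only built and printed here, not executed, so the sketch can be read without SAMtools installed.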
Nested workflow: samToBam

  Inputs

    samToBam_samName (depth 0)
      Description: Full path and name of the SAM file. It will be converted into a BAM file with the same name.
      Example: /path/to/alignment.sam

  Outputs

    samToBam_bamName
      Description: Full path and name of the created BAM file.
      Example: /path/to/alignment.bam

  Processor: samToBam
    Inputs: samName (depth 0)
    Outputs: bamName (depth 0)
    Activity: net.sf.taverna.t2.activities.beanshell.BeanshellActivity (beanshell-activity 1.2)
    Class loader sharing: workflow
    Local dependency: NGSPipeline-0.3.jar
    Input types: samName is java.lang.String, text/plain, depth 0
    Dispatch stack: Parallelize 1, ErrorBounce, Failover, Retry 1.0 1000 5000 0, Invoke

  Data links (source -> sink)
    samToBam_samName -> samToBam.samName
    samToBam.bamName -> samToBam_bamName

  Annotations
    Title: samToBam
    Description: Converts the given SAM file to BAM using 'samtools view -bS'.
    Author: WBKoetsier (2010-07-13 12:27:57.568 CEST)

Nested workflow: indexReference

  Inputs

    referenceGenomeFilename (depth 0)
      Description: Reference genome filename
      Example: /path/to/Nvit_2.0.linear.fa

  Outputs

    refIndex
      Description: Index of reference genome in

  Processor: IndexReference
    Inputs: referenceGenomeFilename (depth 0)
    Outputs: refIndex (depth 0)
    Activity: net.sf.taverna.t2.activities.beanshell.BeanshellActivity (beanshell-activity 1.2)
    Class loader sharing: workflow
    Local dependency: NGSPipeline-0.3.jar
    Input types: referenceGenomeFilename is java.lang.String, text/plain, depth 0
    Dispatch stack: Parallelize 1, ErrorBounce, Failover, Retry 1.0 1000 5000 0, Invoke

  Data links (source -> sink)
    referenceGenomeFilename -> IndexReference.referenceGenomeFilename
    IndexReference.refIndex -> refIndex

  Annotations
    Title: indexReference
    Author: WBKoetsier (2010-07-14 13:10:59.628 CEST)
    Description: Index reference genome using 'samtools faidx'

Nested workflow: Bowtie

  Inputs

    bowtie_alignmentBasename (depth 0)
      Description: Full path and name of the desired alignment file, but without an extension. A log file with the same name (with .log) will be created.
      Example: /path/to/desired/alignmentname

    bowtie_indexBasename (depth 0)
      Description: Full path and base name of the bowtie-build index. This base name is the name of the 'ebwt' files, but without .N.ebwt.
      Example: /path/to/genome

    bowtie_readsFileNames (depth 1)
      Description: A list of all forward reads files. These are generally the files starting with s_N_1, where N is the number of the pair.
      Example: /path/to/s_1_1_sequence.txt

    bowtie_reverseReadsFileNames (depth 1)
      Description: A list of all reverse reads files. These are generally the files starting with s_N_2, where N is the number of the pair.
      Example: /path/to/s_1_2_sequence.txt

  Outputs

    bowtie_samLocationAndBasename
      Description: This service returns the full path and name of the resulting alignment file (in SAM format). If an alignment base name was given, this is used. Otherwise the file will be called alignment.sam and written to the same directory as the reads files.
      Example: /path/to/alignment.sam

    bowtie_err
      Description: Error string should the service fail.
  Processor: bowtie
    Inputs: alignmentBasename (depth 0), indexBasename (depth 0), readsFileNames (depth 1), reverseReadsFileNames (depth 1)
    Outputs: samLocationAndBasename (depth 0), err (depth 0)
    Activity: net.sf.taverna.t2.activities.beanshell.BeanshellActivity (beanshell-activity 1.2)
    Class loader sharing: workflow
    Local dependency: NGSPipeline-0.3.jar
    Input types: readsFileNames and reverseReadsFileNames are java.lang.String, text/plain, depth 1; indexBasename and alignmentBasename are java.lang.String, text/plain, depth 0
    Dispatch stack: Parallelize 1, ErrorBounce, Failover, Retry 1.0 1000 5000 0, Invoke

  Data links (source -> sink)
    bowtie_alignmentBasename -> bowtie.alignmentBasename
    bowtie_indexBasename -> bowtie.indexBasename
    bowtie_readsFileNames -> bowtie.readsFileNames
    bowtie_reverseReadsFileNames -> bowtie.reverseReadsFileNames
    bowtie.samLocationAndBasename -> bowtie_samLocationAndBasename
    bowtie.err -> bowtie_err

  Annotations
    Description (latest revision, 2010-07-13 15:04:35.865 CEST):
      Bowtie: runs bowtie using these set options: -a -3 5 -n 2 -l 28 -y -t --sam -p 16 --best
      If necessary, these options can be changed by editing the bowtie.properties file in $HOME/.taverna/lib.
      If bowtie should fail, the error will be written to the error output port 'err' and port 'samLocationAndBasename' will come up empty.
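The Bowtie service description lists a fixed option string and the port descriptions explain how the SAM file is named. The following Python sketch shows how such a paired-end Bowtie 1 command line could be assembled from the four inputs. The function names are hypothetical and do not come from NGSPipeline-0.3.jar. Appending '.sam' to the alignment base name is an assumption, and the option list is only the documented default, since bowtie.properties may override it.

```python
import posixpath

# Default options quoted in the service description.
BOWTIE_OPTIONS = ["-a", "-3", "5", "-n", "2", "-l", "28",
                  "-y", "-t", "--sam", "-p", "16", "--best"]


def sam_output_path(forward_reads, alignment_basename=""):
    """Name of the SAM file, following the output port description."""
    if alignment_basename:
        # Assumption: the extension is simply appended to the base name.
        return alignment_basename + ".sam"
    # No base name given: alignment.sam in the directory of the reads files.
    return posixpath.join(posixpath.dirname(forward_reads[0]), "alignment.sam")


def bowtie_command(index_basename, forward_reads, reverse_reads,
                   alignment_basename=""):
    """Build a paired-end bowtie call: bowtie [options] <ebwt> -1 m1 -2 m2 <hits>."""
    if not forward_reads or len(forward_reads) != len(reverse_reads):
        raise ValueError("forward and reverse read lists must pair up")
    sam = sam_output_path(forward_reads, alignment_basename)
    return ["bowtie", *BOWTIE_OPTIONS, index_basename,
            "-1", ",".join(forward_reads),   # bowtie takes comma-separated lists
            "-2", ",".join(reverse_reads),
            sam]


if __name__ == "__main__":
    print(" ".join(bowtie_command(
        "/path/to/genome",
        ["/path/to/s_1_1_sequence.txt"],
        ["/path/to/s_1_2_sequence.txt"],
    )))
```

Keeping the options in one list mirrors the idea of a single properties file that can be edited without touching the rest of the call.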
    Title: Bowtie
    Author: WBKoetsier (2010-07-13 10:01:30.895 CEST)

Nested workflow: Bowtie_build

  Inputs

    indexBasename (depth 0)
      Description: Desired Bowtie index base name. Bowtie-build will generate 6 files with this base name: .1.ebwt, .2.ebwt, .3.ebwt, .4.ebwt, .rev.1.ebwt, and .rev.2.ebwt. Bowtie will use this index; the original reference genome file is no longer used by Bowtie.
      Example: Nvitripennis

    indexLocation (depth 0)
      Description: Location to write the index to, relative to the reference genome. Do not use leading slashes unless you wish to move up in the directory structure using '../'.
      Example: bowtie_index

    referenceGenome (depth 0)
      Description: Full path and name of the reference genome file (must be in FASTA format)
      Example: /data2/Nasonia/ref_genome/Nvit_2.0.linear.fa

  Outputs

    bowtieBuild_bowtieIndexBasename
      Description: Full path and base name of the created index.

    err
      Description: Error string should anything go wrong.

  Processor: bowtieBuild
    Inputs: indexBasename (depth 0), indexLocation (depth 0), referenceGenome (depth 0)
    Outputs: bowtieIndexBasename (depth 0), err (depth 0)
    Activity: net.sf.taverna.t2.activities.beanshell.BeanshellActivity (beanshell-activity 1.2)
    Class loader sharing: workflow
    Local dependency: NGSPipeline-0.3.jar
    Input types: referenceGenome, indexBasename, indexLocation are all java.lang.String, text/plain, depth 0
    Dispatch stack: Parallelize 1, ErrorBounce, Failover, Retry 1.0 1000 5000 0, Invoke

  Data links (source -> sink)
    indexBasename -> bowtieBuild.indexBasename
    indexLocation -> bowtieBuild.indexLocation
    referenceGenome -> bowtieBuild.referenceGenome
    bowtieBuild.bowtieIndexBasename -> bowtieBuild_bowtieIndexBasename
    bowtieBuild.err -> err

  Annotations
    Title: Bowtie-build
    Author: WBKoetsier (2010-07-13 17:57:26.246 CEST)
    Description: Bowtie-build: runs bowtie-build. Takes reference genome filename, index basename and index location as input. When index basename and location are left blank, default values will be read from the bowtie.properties file in $HOME/.taverna/lib.
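The rule that the index location is relative to the reference genome, and the warning about leading slashes, can be illustrated with a short Python sketch. The function names are hypothetical and are not taken from NGSPipeline-0.3.jar. The sketch assumes that "relative to the reference genome" means relative to the directory that contains the FASTA file, and it does not model the fallback to bowtie.properties for blank inputs.

```python
import posixpath

# The six files bowtie-build is documented to write for one base name.
EBWT_SUFFIXES = [".1.ebwt", ".2.ebwt", ".3.ebwt", ".4.ebwt",
                 ".rev.1.ebwt", ".rev.2.ebwt"]


def index_basename_path(reference_genome, index_location, index_basename):
    """Resolve the full index base name next to the reference genome.

    Assumption: index_location is relative to the FASTA file's directory.
    A leading slash would make posixpath.join discard that directory, which
    is one reading of the warning in the port description.
    """
    ref_dir = posixpath.dirname(reference_genome)
    joined = posixpath.join(ref_dir, index_location, index_basename)
    return posixpath.normpath(joined)  # collapses any '../' components


def bowtie_build_command(reference_genome, index_location, index_basename):
    """bowtie-build <reference_in> <ebwt_base>"""
    base = index_basename_path(reference_genome, index_location, index_basename)
    return ["bowtie-build", reference_genome, base]


def expected_index_files(base_path):
    """File names that should exist after a successful bowtie-build run."""
    return [base_path + suffix for suffix in EBWT_SUFFIXES]


if __name__ == "__main__":
    ref = "/data2/Nasonia/ref_genome/Nvit_2.0.linear.fa"
    cmd = bowtie_build_command(ref, "bowtie_index", "Nvitripennis")
    print(" ".join(cmd))
    for name in expected_index_files(cmd[-1]):
        print(name)
```

With the example values from the workflow inputs, the resolved base name is the kind of value the bowtieBuild_bowtieIndexBasename output is described as returning.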
Nested workflow: FilterAndSort

  Inputs

    filterAndSort_bamName (depth 0)
      Description: Name and location of BAM file
      Example: /path/to/my_alignment.bam

  Outputs

    sortedBamName

  Processor: filterAndSort
    Inputs: bamName (depth 0)
    Outputs: sortedBamName (depth 0)
    Activity: net.sf.taverna.t2.activities.beanshell.BeanshellActivity (beanshell-activity 1.2)
    Class loader sharing: workflow
    Local dependency: NGSPipeline-0.3.jar
    Input types: bamName is java.lang.String, text/plain, depth 0
    Dispatch stack: Parallelize 1, ErrorBounce, Failover, Retry 1.0 1000 5000 0, Invoke

  Data links (source -> sink)
    filterAndSort_bamName -> filterAndSort.bamName
    filterAndSort.sortedBamName -> sortedBamName

  Annotations
    Title: FilterAndSort
    Author: WBKoetsier (2010-07-14 12:54:46.967 CEST)
    Description (latest revision, 2010-07-14 13:00:46.737 CEST): Filters and sorts the given BAM file. First, all unaligned reads are removed using the Picard program 'ViewSam'. The unwanted header is then removed using Unix 'sed', after which the reads are sorted according to chromosome position. Output is a BAM file.
    Pipeline given in an earlier revision of the description (2010-07-14 12:57:08.554 CEST):
      java -jar ViewSam.jar ALIGNMENT_STATUS=Aligned I=alignment.bam 2> viewsam.err | sed '1d' | samtools view -bS - | samtools sort - sorted
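The FilterAndSort description names a four-stage pipeline: Picard ViewSam, sed, samtools view and samtools sort. A Python sketch of how those stages could be assembled and rendered as one shell line follows. The function names and the default jar path are hypothetical, and the sketch only builds the command text; it does not reproduce how NGSPipeline-0.3.jar actually launches the programs. The 'samtools sort - sorted' form follows the legacy SAMtools 0.1.x syntax, where the last argument is an output prefix.

```python
import shlex


def filter_and_sort_stages(bam_name, sorted_prefix="sorted",
                           viewsam_jar="ViewSam.jar"):
    """The four pipeline stages, each as an argument list."""
    return [
        # Keep only aligned reads (Picard ViewSam prints SAM text to stdout).
        ["java", "-jar", viewsam_jar, "ALIGNMENT_STATUS=Aligned", "I=" + bam_name],
        # Drop the first line of that output (the unwanted header).
        ["sed", "1d"],
        # Convert the SAM stream on stdin back to BAM.
        ["samtools", "view", "-bS", "-"],
        # Sort by position; legacy syntax writes <sorted_prefix>.bam.
        ["samtools", "sort", "-", sorted_prefix],
    ]


def as_shell_pipeline(stages, first_stage_stderr="viewsam.err"):
    """Join the stages with pipes, sending the first stage's stderr to a file."""
    rendered = [" ".join(shlex.quote(arg) for arg in stage) for stage in stages]
    rendered[0] += " 2> " + shlex.quote(first_stage_stderr)
    return " | ".join(rendered)


if __name__ == "__main__":
    print(as_shell_pipeline(filter_and_sort_stages("alignment.bam")))
```

Building the stages as lists first keeps file names with spaces safe, because quoting is applied only when the line is rendered.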