Sequences_Alignment_

Workflow input ports

FASTA_R1 0 0
    FASTQ file from the forward strand (2015-11-25 17:24:48.255 UTC)
    /home/murilo/Dropbox/Doutorado/Taverna/fastq/P1.R1.fastq.gz (2016-02-15 23:48:46.192 UTC)

FASTA_R2 0 0
    FASTQ file from the reverse strand (2016-02-16 16:52:46.542 UTC)
    /home/murilo/Dropbox/Doutorado/Taverna/fastq/P1.R2.fastq.gz (2016-02-15 23:49:02.474 UTC)

Out_Path 0 0
    /home/murilo/Dropbox/Doutorado/Taverna/results/Example (2016-02-16 16:52:14.822 UTC)
    The path in which files will be written. If it does not exist, the folder will be created. (2016-02-16 16:52:07.485 UTC)

Workflow output ports

g_VCF 0
    Final g.vcf file. (2016-02-16 16:54:49.174 UTC)

STDERR 1

BAM 0
    Final BAM file (2016-02-16 16:55:59.448 UTC)

Processor: Sequences_Alignment

Input ports: FASTA_R2 0, FASTA_R1 0, REF 0, PICARD 0
Output ports: Sorted_bam 0 0, STDERR 0 0

Description (2016-02-22 17:19:37.176 UTC):
# Sequences Alignment and Sorting by Coordinate
#
# We automatically retrieve BAM filenames based on the FASTA_R1 port name
# We assume FASTA_R1 name to be SAMPLENAME.R1.FASTQ or SAMPLENAME.R1.fastq.gz
# Alignment via the Burrows-Wheeler transform using the BWA-MEM algorithm
# Sorts the input SAM or BAM
# Input and output formats are determined by file extension

Activity: net.sf.taverna.t2.activities, external-tool-activity 1.5, net.sf.taverna.t2.activities.externaltool.ExternalToolActivity
789663B8-DA91-428A-9F7D-B3F3DA185FD4 default local
<?xml version="1.0" encoding="UTF-8"?> <localInvocation><shellPrefix>/bin/sh -c</shellPrefix><linkCommand>/bin/ln -s %%PATH_TO_ORIGINAL%% %%TARGET_NAME%%</linkCommand></localInvocation>
57f35e8c-c384-4869-9bf6-c2e8178a2878

Script:

################################################################################
# Sequences Alignment and Sorting by Coordinate ################################
################################################################################

# We automatically retrieve BAM filenames based on the FASTA_R1 port name
# We assume FASTA_R1 name to be SAMPLENAME.R1.FASTQ or SAMPLENAME.R1.fastq.gz
fastaName=%%FASTA_R1%%
filename=$(basename "$fastaName")
filename=${fastaName##*/}
sample=`echo $filename | awk -F ".R1" '{print $1}'`
bam=${sample}.bam

# Alignment via the Burrows-Wheeler transform using the BWA-MEM algorithm
bwa mem %%REF%% %%FASTA_R1%% %%FASTA_R2%% > $bam.mem

# Sorts the input SAM or BAM
# Input and output formats are determined by file extension
java -jar %%PICARD%% SortSam I=$bam.mem O=Sorted_bam SO=coordinate

Tool settings:
1200 1800
FASTA_R1 FASTA_R2 PICARD REF
Inputs:
    FASTA_R1 FASTA_R1 false false false UTF-8 false false false
    FASTA_R2 FASTA_R2 false false false UTF-8 false false false
    PICARD PICARD false false false UTF-8 false false false
    REF REF false false false UTF-8 false false false
Outputs:
    Sorted_bam Sorted_bam true
false false true 0 false

Dispatch stack (net.sf.taverna.t2.core, workflowmodel-impl 1.5, net.sf.taverna.t2.workflowmodel.processor.dispatch.layers): Parallelize 1; ErrorBounce; Loop; Failover; Retry 1.0 1000 5000 0; Invoke

Processor: REF

Output port: value 0 0

Description:
Reference genome. Note that you must index the FASTA file as described at http://gatkforums.broadinstitute.org/gatk/discussion/1601/how-can-i-prepare-a-fasta-file-to-use-as-reference and with the "bwa index" command.
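The indexing requirement on the REF input can be checked before a run. The following is a minimal sketch, not part of the workflow: the function name missing_reference_indexes is hypothetical, and the suffix list assumes the usual outputs of "bwa index" (.amb, .ann, .bwt, .pac, .sa), "samtools faidx" (.fai), and Picard CreateSequenceDictionary (.dict, conventionally named after the FASTA file without its extension).

```shell
#!/bin/sh
# Hypothetical helper (not part of the workflow): print every index file
# that appears to be missing next to a reference FASTA.
# Assumption: indexes follow the default naming of "bwa index",
# "samtools faidx", and Picard CreateSequenceDictionary.
missing_reference_indexes() {
    ref=$1

    # bwa index and samtools faidx append their suffix to the full file name
    for suffix in amb ann bwt pac sa fai; do
        [ -e "$ref.$suffix" ] || printf '%s\n' "$ref.$suffix"
    done

    # the sequence dictionary conventionally replaces the FASTA extension
    dict=${ref%.*}.dict
    [ -e "$dict" ] || printf '%s\n' "$dict"
}
```

Calling missing_reference_indexes with the path given to the REF constant prints one line per missing file and prints nothing when all seven files are present.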
(2016-02-16 16:54:17.473 UTC)

Activity: net.sf.taverna.t2.activities, stringconstant-activity 1.5, net.sf.taverna.t2.activities.stringconstant.StringConstantActivity
Value: /home/murilo/Dropbox/Doutorado/Taverna/tavernaTeste/REF/genome.fa

Dispatch stack (net.sf.taverna.t2.core, workflowmodel-impl 1.5, net.sf.taverna.t2.workflowmodel.processor.dispatch.layers): Parallelize 1; ErrorBounce; Failover; Retry 1.0 1000 5000 0; Invoke

Processor: PICARD

Output port: value 0 0

Activity: net.sf.taverna.t2.activities, stringconstant-activity 1.5, net.sf.taverna.t2.activities.stringconstant.StringConstantActivity
Value: /home/murilo/Dropbox/Doutorado/Taverna/tavernaTeste/picard-tools-1.141/picard.jar

Dispatch stack (net.sf.taverna.t2.core, workflowmodel-impl 1.5, net.sf.taverna.t2.workflowmodel.processor.dispatch.layers): Parallelize 1; ErrorBounce; Failover; Retry 1.0 1000 5000 0; Invoke

Processor: GATK

Output port: value 0 0

Description (2016-02-16 16:53:47.128 UTC):
Complete path to the GATK jar file. https://www.broadinstitute.org/gatk/download/

Activity: net.sf.taverna.t2.activities, stringconstant-activity 1.5, net.sf.taverna.t2.activities.stringconstant.StringConstantActivity
Value: /home/murilo/Dropbox/Doutorado/Taverna/tavernaTeste/picard-tools-1.141/GenomeAnalysisTK.jar

Dispatch stack (net.sf.taverna.t2.core, workflowmodel-impl 1.5, net.sf.taverna.t2.workflowmodel.processor.dispatch.layers): Parallelize 1; ErrorBounce; Failover; Retry 1.0 1000 5000 0; Invoke

Processor: DBSNP

Output port: value 0 0

Description (2016-02-16 16:53:13.287 UTC):
Location of a VCF file containing variants from dbSNP. ftp://ftp.ncbi.nih.gov/snp/

Activity: net.sf.taverna.t2.activities, stringconstant-activity 1.5, net.sf.taverna.t2.activities.stringconstant.StringConstantActivity
Value: /home/murilo/Dropbox/Doutorado/Taverna/tavernaTeste/dbsnp.vcf

Dispatch stack (net.sf.taverna.t2.core, workflowmodel-impl 1.5, net.sf.taverna.t2.workflowmodel.processor.dispatch.layers): Parallelize 1; ErrorBounce; Failover; Retry 1.0 1000 5000 0; Invoke

Processor: Post_Alignment_Processing

Input ports: PICARD 0, Sorted_bam 0, FASTA 0
Output ports: STDERR 0 0, Validated_bam 0 0

Description:
# Post Alignment File Processing
# We automatically retrieve BAM filenames based on the FASTA_R1 port name
# We assume FASTA_R1 name to be SAMPLENAME.R1.FASTQ or SAMPLENAME.R1.fastq.gz
#
# MarkDuplicates examines aligned records in the supplied SAM or BAM file to
# locate duplicate molecules. All records are then written to the output file
# with the duplicate records flagged
# Here we retrieve sequencing information from the FASTA port
# Notice that the parameters RGPL and RGLB may be adjusted according to your
# biological experiment and sequencing platform
# AddOrReplaceReadGroups replaces all read groups in the INPUT file with a single
# new read group and assigns all reads to this read group in the OUTPUT BAM
# BuildBamIndex generates a BAM index (.bai) file
# ValidateSamFile reads a SAM or BAM file and reports on its validity.
(2016-02-22 17:20:56.79 UTC)

Activity: net.sf.taverna.t2.activities, external-tool-activity 1.5, net.sf.taverna.t2.activities.externaltool.ExternalToolActivity
789663B8-DA91-428A-9F7D-B3F3DA185FD4 default local
<?xml version="1.0" encoding="UTF-8"?> <localInvocation><shellPrefix>/bin/sh -c</shellPrefix><linkCommand>/bin/ln -s %%PATH_TO_ORIGINAL%% %%TARGET_NAME%%</linkCommand></localInvocation>
22fad5f7-bd8b-4297-9b1f-9bc03bffdbe6

Script:

################################################################################
# Post Alignment File Processing ###############################################
################################################################################

# We automatically retrieve BAM filenames based on the FASTA_R1 port name
# We assume FASTA_R1 name to be SAMPLENAME.R1.FASTQ or SAMPLENAME.R1.fastq.gz
fastaName=%%FASTA%%
filename=$(basename "$fastaName")
filename=${fastaName##*/}
sample=`echo $filename | awk -F ".R1" '{print $1}'`
bam=${sample}.bam

# MarkDuplicates examines aligned records in the supplied SAM or BAM file to
# locate duplicate molecules. All records are then written to the output file
# with the duplicate records flagged
java -jar %%PICARD%% MarkDuplicates INPUT=Sorted_bam OUTPUT=$bam.MarkDuplicates METRICS_FILE=$bam.MarkDuplicates.metrics

# Here we retrieve sequencing information from the FASTA port
# Notice that the parameters RGPL and RGLB may be adjusted according to your
# biological experiment and sequencing platform
fasta=%%FASTA%%
id=$(zcat ${fasta} | head -n 1 | grep "@" | cut -f3-4 -d: | uniq)
SM=`echo $sample | awk -F _ '{print $1}'`
FLOWCELL=`zcat ${fasta} | head -n 1 | grep "@" | cut -f 3 -d:`
BARCODE=`zcat ${fasta} | head -n 1 | grep "@" | cut -f 10 -d:`
LANE=`zcat ${fasta} | head -n 1 | grep "@" | cut -f 4 -d:`
RGPL=ILLUMINA
RGLB=Nextera
RGCN=$(zcat ${fasta} | head -n 1 | grep "@" | cut -f1 -d: | uniq)

# AddOrReplaceReadGroups replaces all read groups in the INPUT file with a single
# new read group and assigns all reads to this read group in the OUTPUT BAM
java -jar %%PICARD%% AddOrReplaceReadGroups I=$bam.MarkDuplicates O=Validated_bam RGID=$bam.AddOrReplaceReadGroups RGSM=$SM RGCN=$RGCN RGPL=$RGPL RGPU=$FLOWCELL-$BARCODE.$LANE RGLB=$RGLB

# BuildBamIndex generates a BAM index (.bai) file
java -jar %%PICARD%% BuildBamIndex INPUT=Validated_bam OUTPUT=Validated_bam.bai

# ValidateSamFile reads a SAM or BAM file and reports on its validity.
java -jar %%PICARD%% ValidateSamFile INPUT=Validated_bam OUTPUT=Validated_bam.val VALIDATE_INDEX=true MODE=SUMMARY

Tool settings:
1200 1800
FASTA PICARD
Inputs:
    Sorted_bam Sorted_bam true false true UTF-8 false false false
    PICARD PICARD false false false UTF-8 false false false
    FASTA FASTA false false false UTF-8 false false false
Outputs:
    Validated_bam Validated_bam true
false false true 0 false

Dispatch stack (net.sf.taverna.t2.core, workflowmodel-impl 1.5, net.sf.taverna.t2.workflowmodel.processor.dispatch.layers): Parallelize 1; ErrorBounce; Failover; Retry 1.0 1000 5000 0; Invoke

Processor: Genome_Analysis_Toolkit

Input ports: DBSNP 0, GATK 0, PICARD 0, REF 0, Validated_bam 0, FASTA 0
Output ports: STDERR 0 0, BAM_OUT 0 0, g_VCF_OUT 0 0

Description:
# Genome Analysis Toolkit
# We automatically retrieve BAM filenames based on the FASTA_R1 port name
# We assume FASTA_R1 name to be SAMPLENAME.R1.FASTQ or SAMPLENAME.R1.fastq.gz
# BuildBamIndex generates a BAM index (.bai) file.
# RealignerTargetCreator defines intervals to target for local realignment
# IndelRealigner applies the realignment to the targets from RealignerTargetCreator
# BuildBamIndex generates a BAM index (.bai) file.
# BaseRecalibrator generates a base recalibration table to compensate for systematic errors in base-calling confidences
# PrintReads writes out sequence read data (for filtering, merging, subsetting, etc.)
# HaplotypeCaller calls germline SNPs and indels via local re-assembly of haplotypes
(2016-02-22 17:21:56.917 UTC)

Activity: net.sf.taverna.t2.activities, external-tool-activity 1.5, net.sf.taverna.t2.activities.externaltool.ExternalToolActivity
789663B8-DA91-428A-9F7D-B3F3DA185FD4 default local
<?xml version="1.0" encoding="UTF-8"?> <localInvocation><shellPrefix>/bin/sh -c</shellPrefix><linkCommand>/bin/ln -s %%PATH_TO_ORIGINAL%% %%TARGET_NAME%%</linkCommand></localInvocation>
585785f2-7547-4ec5-a540-ab13c46c7bb8

Script:

################################################################################
# Genome Analysis Toolkit ######################################################
################################################################################

# We automatically retrieve BAM filenames based on the FASTA_R1 port name
# We assume FASTA_R1 name to be SAMPLENAME.R1.FASTQ or SAMPLENAME.R1.fastq.gz
fastaName=%%FASTA%%
filename=$(basename "$fastaName")
filename=${fastaName##*/}
sample=`echo $filename | awk -F ".R1" '{print $1}'`
bam=${sample}.bam
mv Validated_bam $bam

# BuildBamIndex generates a BAM index (.bai) file.
java -jar %%PICARD%% BuildBamIndex INPUT=$bam OUTPUT=$bam.bai

# RealignerTargetCreator defines intervals to target for local realignment
java -jar %%GATK%% -T RealignerTargetCreator -R %%REF%% -I $bam -o $bam.AddOrReplaceReadGroups.intervals -known %%DBSNP%%

# IndelRealigner applies the realignment to the targets from RealignerTargetCreator
java -jar %%GATK%% -T IndelRealigner -R %%REF%% -I $bam -targetIntervals $bam.AddOrReplaceReadGroups.intervals -o $bam.IndelRealigner.bam -known %%DBSNP%% -LOD 0.4 -model USE_READS -compress 0 --disable_bam_indexing

# BuildBamIndex generates a BAM index (.bai) file.
java -jar %%PICARD%% BuildBamIndex INPUT=$bam.IndelRealigner.bam OUTPUT=$bam.IndelRealigner.bai

# BaseRecalibrator generates a base recalibration table to compensate for systematic errors in base-calling confidences
java -jar %%GATK%% -T BaseRecalibrator -R %%REF%% -I $bam.IndelRealigner.bam -o $bam.BaseRecalibrator.csv -knownSites %%DBSNP%% -l INFO

# PrintReads writes out sequence read data (for filtering, merging, subsetting, etc.)
java -jar %%GATK%% -T PrintReads -R %%REF%% -I $bam.IndelRealigner.bam -BQSR $bam.BaseRecalibrator.csv -o $bam.PrintReads.bam

# HaplotypeCaller calls germline SNPs and indels via local re-assembly of haplotypes
java -jar %%GATK%% -T HaplotypeCaller -R %%REF%% -I $bam.PrintReads.bam -o g_VCF_OUT -D %%DBSNP%% -ERC GVCF --variant_index_type LINEAR --variant_index_parameter 128000
mv $bam.PrintReads.bam BAM_OUT

Tool settings:
1200 1800
DBSNP FASTA GATK PICARD REF
Inputs:
    DBSNP DBSNP false false false UTF-8 false false false
    GATK GATK false false false UTF-8 false false false
    Validated_bam Validated_bam true false false UTF-8 false false false
    PICARD PICARD false false false UTF-8 false false false
    FASTA FASTA false false false UTF-8 false false false
    REF REF false false false UTF-8 false false false
Outputs:
    g_VCF_OUT g_VCF_OUT false
    BAM_OUT BAM_OUT true
false false true 0 false

Dispatch stack (net.sf.taverna.t2.core, workflowmodel-impl 1.5, net.sf.taverna.t2.workflowmodel.processor.dispatch.layers): Parallelize 1; ErrorBounce; Failover; Retry 1.0 1000 5000 0;
Invoke (net.sf.taverna.t2.core, workflowmodel-impl 1.5, net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke)

Processor: Save_BAM

Input ports: FILENAME 0, FASTA 0, PATH_OUT 0

Activity: net.sf.taverna.t2.activities, external-tool-activity 1.5, net.sf.taverna.t2.activities.externaltool.ExternalToolActivity
789663B8-DA91-428A-9F7D-B3F3DA185FD4 default local
<?xml version="1.0" encoding="UTF-8"?> <localInvocation><shellPrefix>/bin/sh -c</shellPrefix><linkCommand>/bin/ln -s %%PATH_TO_ORIGINAL%% %%TARGET_NAME%%</linkCommand></localInvocation>
22bfa75c-8c8d-4193-962f-57785bd57037

Script:

PATH_OUT=%%PATH_OUT%%
FILE=`realpath FILENAME`
INDEX=`echo $FILE | awk -F BAM_OUT '{print $1}'`
fastaName=%%FASTA%%
filename=$(basename "$fastaName")
filename=${fastaName##*/}
sample=`echo $filename | awk -F ".R1" '{print $1}'`
if [ -e "$PATH_OUT" ]
then
    echo moving $FILE to $PATH_OUT
    cp $FILE $PATH_OUT/$sample.bam
    cp $INDEX*PrintReads.bai $PATH_OUT/$sample.bai
else
    echo creating $PATH_OUT and moving $FILE to $PATH_OUT
    mkdir $PATH_OUT
    cp $FILE $PATH_OUT/$sample.bam
    cp $INDEX*PrintReads.bai $PATH_OUT/$sample.bai
fi

Tool settings:
1200 1800
FASTA PATH_OUT
Inputs:
    PATH_OUT PATH_OUT false false false UTF-8 false false false
    FILENAME FILENAME true false false UTF-8 false false false
    FASTA FASTA false false false UTF-8 false false false
false false false 0 false

Dispatch stack (net.sf.taverna.t2.core, workflowmodel-impl 1.5, net.sf.taverna.t2.workflowmodel.processor.dispatch.layers): Parallelize 1; ErrorBounce; Failover; Retry 1.0 1000 5000 0; Invoke

Processor: Save_g_vcf

Input ports: FASTA 0, FILENAME 0, PATH_OUT 0

Activity: net.sf.taverna.t2.activities, external-tool-activity 1.5, net.sf.taverna.t2.activities.externaltool.ExternalToolActivity
789663B8-DA91-428A-9F7D-B3F3DA185FD4 default local
<?xml version="1.0" encoding="UTF-8"?> <localInvocation><shellPrefix>/bin/sh -c</shellPrefix><linkCommand>/bin/ln -s %%PATH_TO_ORIGINAL%% %%TARGET_NAME%%</linkCommand></localInvocation>
22bfa75c-8c8d-4193-962f-57785bd57037

Script:

PATH_OUT=%%PATH_OUT%%
FILE=`realpath FILENAME`
fastaName=%%FASTA%%
filename=$(basename "$fastaName")
filename=${fastaName##*/}
sample=`echo $filename | awk -F ".R1" '{print $1}'`
if [ -e "$PATH_OUT" ]
then
    echo moving $FILE to $PATH_OUT
    cp $FILE $PATH_OUT/$sample.g.vcf
else
    echo creating $PATH_OUT and moving $FILE to $PATH_OUT
    mkdir $PATH_OUT
    cp $FILE $PATH_OUT/$sample.g.vcf
fi

Tool settings:
1200 1800
FASTA PATH_OUT
Inputs:
    PATH_OUT PATH_OUT false false false UTF-8 false false false
    FILENAME FILENAME true false false UTF-8 false false false
    FASTA FASTA false false false UTF-8 false false false
false false false 0 false

Dispatch stack (net.sf.taverna.t2.core, workflowmodel-impl 1.5, net.sf.taverna.t2.workflowmodel.processor.dispatch.layers): Parallelize 1; ErrorBounce; Failover; Retry 1.0 1000 5000 0;
Invoke (net.sf.taverna.t2.core, workflowmodel-impl 1.5, net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke)

Data links

Sink                                     Source
Sequences_Alignment.FASTA_R2             FASTA_R2
Sequences_Alignment.FASTA_R1             FASTA_R1
Sequences_Alignment.REF                  REF.value
Sequences_Alignment.PICARD               PICARD.value
Post_Alignment_Processing.PICARD         PICARD.value
Post_Alignment_Processing.Sorted_bam     Sequences_Alignment.Sorted_bam
Post_Alignment_Processing.FASTA          FASTA_R1
Genome_Analysis_Toolkit.DBSNP            DBSNP.value
Genome_Analysis_Toolkit.GATK             GATK.value
Genome_Analysis_Toolkit.PICARD           PICARD.value
Genome_Analysis_Toolkit.REF              REF.value
Genome_Analysis_Toolkit.Validated_bam    Post_Alignment_Processing.Validated_bam
Genome_Analysis_Toolkit.FASTA            FASTA_R1
Save_BAM.FILENAME                        Genome_Analysis_Toolkit.BAM_OUT
Save_BAM.FASTA                           FASTA_R1
Save_BAM.PATH_OUT                        Out_Path
Save_g_vcf.FASTA                         FASTA_R1
Save_g_vcf.FILENAME                      Genome_Analysis_Toolkit.g_VCF_OUT
Save_g_vcf.PATH_OUT                      Out_Path
STDERR                                   Post_Alignment_Processing.STDERR
STDERR                                   Sequences_Alignment.STDERR
STDERR                                   Genome_Analysis_Toolkit.STDERR
g_VCF                                    Genome_Analysis_Toolkit.g_VCF_OUT
BAM                                      Genome_Analysis_Toolkit.BAM_OUT
Workflow annotations

Description (2016-05-16 19:22:50.25 UTC):

We automatically retrieve BAM filenames based on the FASTA_R1 port name.
We assume FASTA_R1 name to be SAMPLENAME.R1.FASTQ or SAMPLENAME.R1.fastq.gz.
Alignment via the Burrows-Wheeler transform using the BWA-MEM algorithm.
Sorts the input SAM or BAM.

Post Alignment File Processing

We automatically retrieve BAM filenames based on the FASTA_R1 port name.
We assume FASTA_R1 name to be SAMPLENAME.R1.FASTQ or SAMPLENAME.R1.fastq.gz.
MarkDuplicates examines aligned records in the supplied SAM or BAM file to locate duplicate molecules. All records are then written to the output file with the duplicate records flagged.
Here we retrieve sequencing information from the FASTA port. Notice that the parameters RGPL and RGLB may be adjusted according to your biological experiment and sequencing platform.
AddOrReplaceReadGroups replaces all read groups in the INPUT file with a single new read group and assigns all reads to this read group in the OUTPUT BAM.
BuildBamIndex generates a BAM index (.bai) file.
ValidateSamFile reads a SAM or BAM file and reports on its validity.

Genome Analysis Toolkit

We automatically retrieve BAM filenames based on the FASTA_R1 port name.
We assume FASTA_R1 name to be SAMPLENAME.R1.FASTQ or SAMPLENAME.R1.fastq.gz.
BuildBamIndex generates a BAM index (.bai) file.
RealignerTargetCreator defines intervals to target for local realignment.
IndelRealigner applies the realignment to the targets from RealignerTargetCreator.
BaseRecalibrator generates a base recalibration table to compensate for systematic errors in base-calling confidences.
PrintReads writes out sequence read data (for filtering, merging, subsetting, etc.).
HaplotypeCaller calls germline SNPs and indels via local re-assembly of haplotypes.

Author (2016-05-16 19:22:58.379 UTC): Murilo Guimarães Borges

Title (2016-05-16 19:22:51.361 UTC): Sequences Alignment and Sorting by Coordinate
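The workflow annotations rely on two conventions: the sample name is whatever precedes ".R1" in the FASTQ file name, and the read-group fields are taken from the colon-separated first header line of the FASTQ file. The following is a minimal sketch of both, under stated assumptions. The function names sample_from_fastq and read_group_unit are hypothetical, the header is assumed to follow the Illumina layout that the workflow's cut field numbers imply (flowcell in field 3, lane in field 4, barcode in field 10), and the sample name is derived with a literal parameter expansion instead of the awk split used in the workflow scripts.

```shell
#!/bin/sh
# Hypothetical helper: derive SAMPLENAME from SAMPLENAME.R1.fastq.gz.
# Uses literal suffix removal, not the awk -F ".R1" split of the workflow.
sample_from_fastq() {
    name=${1##*/}                    # drop the directory part
    printf '%s\n' "${name%%.R1*}"    # drop ".R1" and everything after it
}

# Hypothetical helper: build the FLOWCELL-BARCODE.LANE string that the
# workflow passes as RGPU, from a single FASTQ header line.
# Assumption: colon-separated Illumina header, fields 3, 4 and 10.
read_group_unit() {
    header=$1
    flowcell=$(printf '%s\n' "$header" | cut -f 3 -d:)
    lane=$(printf '%s\n' "$header" | cut -f 4 -d:)
    barcode=$(printf '%s\n' "$header" | cut -f 10 -d:)
    printf '%s\n' "$flowcell-$barcode.$lane"
}
```

For example, sample_from_fastq applied to /home/murilo/Dropbox/Doutorado/Taverna/fastq/P1.R1.fastq.gz prints P1. In a real run the header would come from the first line of the compressed FASTQ file, as the workflow does with zcat and head -n 1.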