Workflow Entry: BlastandParse2
Created at: 19/03/10 @ 12:20:29
Version 1 (of 1)
Title: BlastandParse2
Type: Taverna 2
Description
This workflow allows you to configure a BioMart query to fetch the sequences you want from Ensembl. These sequences are retrieved and a BLAST database of them is created (by default, in the directory you ran Taverna from).
Warning: This workflow assumes that blastall and formatdb are installed on the machine and that, by default, both are found or linked in /usr/local/bin. It also assumes that you have write permission to the directory you ran Taverna from. If the default locations are not appropriate for you, edit the beanshells "create_blastall_cmdArgs" and "create_formatdb_cmdArgs".
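As a rough illustration of what the two beanshells assemble, the sketch below builds argument lists for formatdb and blastall with a configurable binary directory. It is written in Python rather than Beanshell and is not the workflow's own code: the function names, the `blastp` program, the protein flag and the example file names are all hypothetical. Only the `/usr/local/bin` default and the tabular report format come from this entry.

```python
import os

# Default taken from the warning above; change it if your binaries live elsewhere.
DEFAULT_BIN_DIR = "/usr/local/bin"


def formatdb_command(db_fasta, bin_dir=DEFAULT_BIN_DIR, protein=True):
    """Return the argument list that formats db_fasta as a BLAST database."""
    return [
        os.path.join(bin_dir, "formatdb"),
        "-i", db_fasta,                 # FASTA file to index
        "-p", "T" if protein else "F",  # T = protein, F = nucleotide (protein is assumed)
    ]


def blastall_command(query_fasta, db_fasta, evalue, out_file,
                     bin_dir=DEFAULT_BIN_DIR, program="blastp"):
    """Return the argument list that searches db_fasta with query_fasta."""
    return [
        os.path.join(bin_dir, "blastall"),
        "-p", program,       # BLAST program (assumed blastp)
        "-d", db_fasta,      # database name, here the formatted FASTA file
        "-i", query_fasta,   # query sequences
        "-e", str(evalue),   # expectation value cutoff
        "-m", "8",           # tabular report
        "-o", out_file,      # report file
    ]


if __name__ == "__main__":
    # Hypothetical file names, printed as shell-style command lines.
    print(" ".join(formatdb_command("ensembl_seqs.fasta")))
    print(" ".join(blastall_command("query.fasta", "ensembl_seqs.fasta",
                                    0.001, "blast_out.txt")))
```

Keeping the binary directory as a single parameter means a non-default installation only needs one value changed, which mirrors the edit the warning asks you to make in the beanshells.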
Shortcomings:
The names of all the files created and used are hard-coded in this workflow. This means that if you run this workflow more than once without editing anything, you will overwrite files you have previously created.
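One possible way around the overwriting problem is to derive each file name from a per-run timestamp. The sketch below is a Python illustration under that assumption and is not something the workflow currently does. The helper name `run_specific_path` and the example file name are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def run_specific_path(base_name, work_dir=".", now=None):
    """Return work_dir/<stem>_<YYYYmmdd_HHMMSS><suffix> for the given base name."""
    now = now or datetime.now()
    base = Path(base_name)
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return Path(work_dir) / f"{base.stem}_{stamp}{base.suffix}"


if __name__ == "__main__":
    # Hypothetical report name; each run started in a different second gets its own file.
    print(run_specific_path("blast_out.txt"))
```

Two runs started within the same second would still collide, so a stricter scheme would add a counter or random suffix.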
The results of the blastall run are parsed to keep the hits that fall within the maximum e-value and meet the minimum percent identity given by the user, and an output of UniProt accession numbers is created that can be searched against KEGG.
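The filtering step can be sketched as follows, assuming the standard 12-column tabular BLAST report (percent identity in column 3, e-value in column 11). This is a Python illustration, not the code of the "parse_blast_results" beanshell. The function name, the inclusive thresholds, the sample rows and the use of the raw hit id as the accession are assumptions. The exact format expected for the KEGG search is not stated in this entry, so the ids are simply joined one per line.

```python
def parse_blast_tabular(report, min_identity, max_evalue):
    """Filter tabular BLAST lines; return (kept_records, unique_hit_ids)."""
    records = []
    hit_ids = []
    for line in report.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip lines that are not full tabular records
        identity = float(fields[2])   # percent identity
        evalue = float(fields[10])    # expectation value
        if identity >= min_identity and evalue <= max_evalue:
            records.append(line)
            if fields[1] not in hit_ids:
                hit_ids.append(fields[1])  # keep first-seen order, no duplicates
    return records, hit_ids


# Made-up rows for illustration only.
SAMPLE_REPORT = "\n".join([
    "q1\tP12345\t98.50\t200\t3\t0\t1\t200\t1\t200\t1e-50\t400",
    "q1\tQ99999\t45.00\t180\t99\t2\t1\t180\t5\t184\t1e-50\t120",
    "q2\tP12345\t99.00\t150\t1\t0\t1\t150\t1\t150\t1e-60\t310",
    "q3\tA00001\t99.00\t20\t0\t0\t1\t20\t1\t20\t0.5\t30",
])

if __name__ == "__main__":
    kept, ids = parse_blast_tabular(SAMPLE_REPORT, min_identity=90.0, max_evalue=1e-5)
    print("\n".join(ids))  # one id per line, as a merged string
```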
Workflow Components
Authors (2)
| Baywatch Solutions |
| Bela Tiwari |
Titles (2)
| BlastandParse2 |
| fetchEnsemblSeqsAndBlast |
Descriptions (2)
| This workflow allows you to configure a BioMart query to fetch the sequences you want from Ensembl. These sequences are retrieved and a BLAST database of them is created (by default, in the directory you ran Taverna from). Warning: This workflow assumes that blastall and formatdb are installed on the machine and that, by default, both are found or linked in /usr/local/bin. It also assumes that you have write permission to the directory you ran Taverna from. If the default locations are not appropriate for you, edit the beanshells "create_blastall_cmdArgs" and "create_formatdb_cmdArgs". Shortcomings: The names of all the files created and used are hard-coded in this workflow. This means that if you run this workflow more than once without editing anything, you will overwrite files you have previously created. The results of the blastall run are parsed to keep the hits that fall within the maximum e-value and meet the minimum percent identity given by the user, and an output of UniProt accession numbers is created that can be searched against KEGG. |
| This workflow allows you to configure a BioMart query to fetch the sequences you want from Ensembl. These sequences are retrieved and a BLAST database of them is created (by default, in the directory you ran Taverna from). Warning: This workflow assumes that blastall and formatdb are installed on the machine and that, by default, both are found or linked in /usr/local/bin. It also assumes that you have write permission to the directory you ran Taverna from. If the default locations are not appropriate for you, edit the beanshells "create_blastall_cmdArgs" and "create_formatdb_cmdArgs". Shortcomings: The names of all the files created and used are hard-coded in this workflow. This means that if you run this workflow more than once without editing anything, you will overwrite files you have previously created. The workflow does not yet delete any of the files it creates in the working directory. Ideally, a user option would control whether the files are kept or deleted after use. |
Inputs (5)
| Name | Description |
| --- | --- |
| sequenceFileName | Name of the file of FASTA sequence(s) that you wish to use to search the BLAST database created in this workflow. Include the location if the file is not in your working directory. |
| database_input | Path to the FASTA file used to format the database that blastall searches. |
| evalue | E-value to be used for blastall. |
| minP | Minimum percent identity used to parse the BLAST results. |
| maxE | Maximum expectation value used to parse the BLAST results. |
Processors (8)
| Name | Type | Description |
| --- | --- | --- |
| runBlastSearch | localworker | |
| local_create_blastdb | localworker | |
| create_blastall_cmdArgs | beanshell | |
| create_formatdb_cmdArgs | beanshell | |
| runBlastSearch_command_defaultValue | stringconstant | |
| local_create_blastdb_command_defaultValue | stringconstant | |
| parse_blast_results | beanshell | |
| Merge_String_List_to_a_String | localworker | |
Beanshells (3)
| Name | Description | Inputs | Outputs |
| --- | --- | --- | --- |
| create_blastall_cmdArgs | | sequenceFileName, evalue | cmdArgsList |
| create_formatdb_cmdArgs | | dbsequences | cmdArgsList |
| parse_blast_results | | blast_results, minP, maxE | protein_list, records |
Outputs (4)
| Name | Description |
| --- | --- |
| Result | Resulting BLAST report in tabular format |
| queryEntered | A list of BLAST results containing the usual information such as query id, hit id, percent identity etc. |
| proteinList | List of UniProt accession numbers in a format that can be searched against KEGG |
| Blast_hits | A list of BLAST results containing the usual information such as query id, hit id, percent identity etc. |
Datalinks (15)
| Source | Sink |
| --- | --- |
| create_blastall_cmdArgs:cmdArgsList | runBlastSearch:args |
| runBlastSearch_command_defaultValue:value | runBlastSearch:command |
| local_create_blastdb_command_defaultValue:value | local_create_blastdb:command |
| create_formatdb_cmdArgs:cmdArgsList | local_create_blastdb:args |
| sequenceFileName | create_blastall_cmdArgs:sequenceFileName |
| evalue | create_blastall_cmdArgs:evalue |
| database_input | create_formatdb_cmdArgs:dbsequences |
| runBlastSearch:result | parse_blast_results:blast_results |
| maxE | parse_blast_results:maxE |
| minP | parse_blast_results:minP |
| parse_blast_results:protein_list | Merge_String_List_to_a_String:stringlist |
| runBlastSearch:result | Result |
| runBlastSearch:theQuery | queryEntered |
| Merge_String_List_to_a_String:concatenated | proteinList |
| parse_blast_results:records | Blast_hits |
Coordinations (1)
| Controller | Target |
| --- | --- |
| local_create_blastdb | runBlastSearch |
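The datalinks and the coordination above together imply an order in which the processors can run: the coordination adds the constraint that local_create_blastdb finishes before runBlastSearch, even though no data flows between them. The sketch below derives one valid order with Python's standard `graphlib` module (Python 3.9 or later). It only illustrates the dependency structure and makes no claim about how Taverna schedules processors internally. The names `DATALINKS`, `COORDINATIONS` and `execution_order` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Processor-to-processor datalinks from the table above (ports omitted).
DATALINKS = [
    ("create_blastall_cmdArgs", "runBlastSearch"),
    ("runBlastSearch_command_defaultValue", "runBlastSearch"),
    ("local_create_blastdb_command_defaultValue", "local_create_blastdb"),
    ("create_formatdb_cmdArgs", "local_create_blastdb"),
    ("runBlastSearch", "parse_blast_results"),
    ("parse_blast_results", "Merge_String_List_to_a_String"),
]

# Control link: the target may only start after the controller has finished.
COORDINATIONS = [("local_create_blastdb", "runBlastSearch")]


def execution_order(edges):
    """Return one ordering in which every 'before' precedes its 'after'."""
    sorter = TopologicalSorter()
    for before, after in edges:
        sorter.add(after, before)
    return list(sorter.static_order())


if __name__ == "__main__":
    for name in execution_order(DATALINKS + COORDINATIONS):
        print(name)
```

Without the coordination edge, a valid order could start the search before the database exists, which is the situation the control link rules out.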
Original Uploader
License
All versions of this Workflow are
licensed under:
Credits (1)
(People/Groups)
Attributions (1)
(Workflows/Files)