Workflows

Showing 40 results.
Tag: text mining

Workflow EBI_Whatizit (1)

Perform a text-mining analysis of an input text document using the EBI's Whatizit tool (http://www.ebi.ac.uk/webservices/whatizit/info.jsf). Whatizit provides a number of text-mining pipelines which can detect various terms of biological interest in text documents. For example, a pipeline can find gene names and map them to UniProtKB identifiers, or find chemical terms and map them to ChEBI.

Created: 2008-07-09

Credits: User Hamish McWilliam

Workflow Termine Webservice (1)

Termine is a service provided by the National Centre for Text Mining (NaCTeM) to assist in the discovery of terms in text. More information on the Termine service can be found here. This workflow represents the simplest method of using Termine. The input is a text string and the output is a string containing the list of terms, with their C-value scores (representing significance in the text), in a simple XML format. Other variations of this tool will be adde...

Created: 2008-05-19 | Last updated: 2008-05-19

Credits: User Brian Rea Network-member National Centre for Text Mining (NaCTeM)

Workflow Terms from collection of text files (1)

This workflow will give you a set of candidate terms for each text file in a user-specified directory. You can also specify a C-value threshold that will restrict the output to higher-scoring terms. This workflow was created using only nested workflows. These workflow components work on their own and can be linked together to form more complex workflows such as this one. You can view the text mining workflow components in this pack. If you receive errors when running this workflow then...

Created: 2010-02-22 | Last updated: 2011-12-13

Credits: User James Eales

Workflow Load PDF from directory (1)

This workflow will automate the reading of a set of PDF files stored in a single directory (the path to which should be supplied as a single input value). This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.  

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: User James Eales

Workflow Load plain text from directory (1)

This workflow will automate the reading of a set of text files stored in a single directory (the path to which should be supplied as a single input value).  It will assume that the text files are saved using the default character encoding for the system that Taverna is running on.  This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.  

Created: 2010-02-18 | Last updated: 2011-12-13

Credits: User James Eales
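
The directory-reading step described for "Load plain text from directory" can be pictured with a minimal Python sketch. This is not the Taverna workflow itself; the function name `load_text_files` and the `*.txt` pattern are assumptions made for illustration. It takes a single directory path and decodes each file with the system's default character encoding, as the description states.

```python
import locale
from pathlib import Path
from typing import Dict


def load_text_files(directory: str, pattern: str = "*.txt") -> Dict[str, str]:
    """Return {file name: file contents} for matching files in `directory`.

    The "*.txt" pattern is an assumption; the workflow description does not
    say how text files are recognised.
    """
    # Use the system default encoding, mirroring the workflow's stated behaviour.
    encoding = locale.getpreferredencoding(False)
    texts = {}
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            texts[path.name] = path.read_text(encoding=encoding)
    return texts
```

Files saved in a different encoding from the system default may fail to decode or decode incorrectly, which is the same caveat the workflow description implies.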

Workflow Clean plain text (ASCII) (1)

This workflow will remove any XML-invalid and non-ASCII characters (e.g. for sending to the ASCII-only Termine service) from any text supplied to the input port. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

Created: 2010-02-18 | Last updated: 2011-12-13

Credits: User James Eales
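
As a rough illustration of the filtering described for "Clean plain text (ASCII)", the sketch below drops every character that is either outside ASCII or not allowed in XML 1.0. The function name `clean_ascii` is hypothetical, and the actual workflow may handle such characters differently (for example by replacing rather than deleting them).

```python
import re

# Within ASCII, XML 1.0 permits only tab, line feed, carriage return
# and the range U+0020..U+007F; everything else is removed.
_NOT_ASCII_XML = re.compile(r"[^\x09\x0A\x0D\x20-\x7F]")


def clean_ascii(text: str) -> str:
    """Delete characters that are non-ASCII or invalid in XML 1.0."""
    return _NOT_ASCII_XML.sub("", text)
```

For example, `clean_ascii("caf\u00e9\x00 ok\n")` yields `"caf ok\n"`: the accented letter and the NUL character are both removed.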

Workflow Rank Phenotype Terms (1)

This workflow counts the number of articles in the PubMed database in which each term occurs, and identifies the total number of articles in the entire PubMed database, so that a term enrichment score may be calculated. The workflow also takes in a document containing abstracts that are related to a particular phenotype. Scientific terms are then extracted from this text and given a weighting according to the number of terms that ...

Created: 2009-08-10

Credits: User Paul Fisher

Workflow Cosine vector space (1)

This workflow calculates the cosine vector space between two sets of corpora. The workflow then removes any null values from the output. The result is a cosine vector score between 0 and 1, showing the significance of any links between one concept (e.g. pathway) and another (e.g. phenotype). A score of 0 means there is no correlation between the two concepts, or that the correlation is undetermined. A score approaching 1 represents positive correlation.

Created: 2009-08-10 | Last updated: 2009-08-10

Credits: User Paul Fisher
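
The score described for "Cosine vector space" can be illustrated with a standard cosine similarity over term-frequency vectors. The sketch below is an assumption about the general technique, not the workflow's own code: the function name `cosine_similarity` is hypothetical, and returning `None` for an empty vector is one possible reading of the "null values" the workflow removes.

```python
import math
from collections import Counter
from typing import Optional


def cosine_similarity(a: Counter, b: Counter) -> Optional[float]:
    """Cosine of the angle between two term-frequency vectors.

    Returns None when either vector is empty (assumed to correspond to
    the null values that the workflow filters out).
    """
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return None
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    return dot / (norm_a * norm_b)
```

With non-negative term counts the result lies between 0 and 1: corpora sharing no terms score 0, and corpora with identical term distributions score 1.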

Workflow Clean plain text (1)

This workflow will remove any XML-invalid characters (these characters often appear in the output of PDF to text software) from any text supplied to the input port. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.  

Created: 2010-02-18 | Last updated: 2011-12-13

Credits: User James Eales
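
For "Clean plain text", the notion of an XML-invalid character can be made concrete using the `Char` production of the XML 1.0 specification. The sketch below is one way to apply it in Python; the function name `clean_xml` is hypothetical, and the workflow may use a different rule or replacement strategy.

```python
import re

# Complement of the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML_INVALID = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def clean_xml(text: str) -> str:
    """Delete characters that are not allowed in an XML 1.0 document."""
    return _XML_INVALID.sub("", text)
```

Unlike an ASCII-only filter, this keeps legitimate non-ASCII text such as accented letters and removes only control characters and other code points that XML forbids.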

Workflow Author to Wordcloud (3)

This small workflow demonstrates how to connect to and use Europe PMC (http://europepmc.org/RestfulWebService). The workflow searches the publications of an author, extracts the abstracts, counts the word frequencies and plots a wordcloud using the R package of the same name. The Rshell plot_wordcloud also applies text mining operations (transformation to lower case, removing punctuation, stripping whitespace and removing English stopwords) using the R package tm.

Created: 2015-12-02 | Last updated: 2015-12-07

Credits: User Magnus Palmblad
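
The preprocessing and counting steps listed for "Author to Wordcloud" (lower-casing, removing punctuation, stripping whitespace, removing English stopwords, counting words) can be sketched in Python as follows. The workflow itself uses R and the tm package; this is only an analogue, the function name `word_frequencies` is hypothetical, and the tiny `STOPWORDS` set is an illustrative stand-in for tm's English stopword list.

```python
import string
from collections import Counter
from typing import Iterable

# Illustrative subset only; NOT the stopword list used by the R package tm.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "for", "with"}

# Translation table that deletes ASCII punctuation characters.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def word_frequencies(abstracts: Iterable[str]) -> Counter:
    """Count words across abstracts after simple text normalisation."""
    counts = Counter()
    for text in abstracts:
        cleaned = text.lower().translate(_STRIP_PUNCT)
        # str.split() with no arguments also collapses runs of whitespace.
        counts.update(w for w in cleaned.split() if w not in STOPWORDS)
    return counts
```

The resulting counts are the kind of input a word cloud renderer needs: each word paired with its frequency across the author's abstracts.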
