PhD Supplementary Information
2011-11-22T12:35:04+00:00
/groups/191

2010-12-08T11:55:12+00:00
Paul Fisher shared Text Mining Workflows
This pack contains workflows to navigate from candidate Quantitative Trait genes and pathways to a given phenotype.
urn:uuid:ab0cc886-8655-4852-9e29-9249eb5a8f8c
Paul Fisher

2010-12-08T11:50:04+00:00
Paul Fisher shared Extract Scientific Terms
This workflow takes in a document containing text and removes any non-ASCII characters. The cleaned text is then sent to a service in Dresden to extract all scientific terms. These terms represent a profile for the input document. Any null values are also removed.
urn:uuid:1417aa63-7161-4586-8d5d-300ac111442d
Paul Fisher

2010-12-08T11:47:14+00:00
Paul Fisher shared Pathway to Pubmed
This workflow takes in a list of KEGG pathway descriptions and searches the PubMed database for corresponding articles. Any matches to the pathways are then retrieved (abstracts only). These abstracts are then returned to the user.
urn:uuid:7095a7c3-efb1-46c2-82ad-15c6062de72b
Paul Fisher

2010-12-08T11:38:40+00:00
Paul Fisher shared Rank Phenotype Terms
This workflow counts the number of articles in the PubMed database in which each term occurs, and identifies the total number of articles in the entire PubMed database, so that a term enrichment score may be calculated. The workflow also takes in a document containing abstracts that are related to a particular phenotype. Scientific terms are then extracted from this text and given a weighting according to the number of times they appear in the document. The higher the value, the better the score.
This is given as: X = log((a / b) / (c / d)), where:
a = number of occurrences of individual terms in phenotype corpus
b = number of abstracts in entire phenotype corpus
c = number of occurrences of individual terms in entire PubMed
d = numb …
urn:uuid:0243059e-2911-4522-9d4e-16f6657eeb24
Paul Fisher

2010-12-08T11:35:21+00:00
Paul Fisher shared Cosine vector space
This workflow calculates the cosine vector space between two sets of corpora. The workflow then removes any null values from the output.
urn:uuid:2c62d4c0-815b-4f7f-9686-0199b75b50dd
Paul Fisher

2010-11-15T12:30:31+00:00
Paul Fisher shared KEGG pathways common to both QTL and microarray based investigations
This workflow takes in two lists of KEGG pathway IDs. These are designed to come from pathways found from genes in a QTL (Quantitative Trait Loci) region, and from pathways found from genes differentially expressed in a microarray study. By identifying the intersecting pathways from both studies, a more informative picture is obtained of the candidate processes involved in the expression of a phenotype.
urn:uuid:23105c9a-77f7-4689-ac51-858451e79489
Paul Fisher

2010-11-15T12:25:17+00:00
Paul Fisher shared Pathways and Gene annotations for RefSeq ids
This workflow searches for genes which were found to be differentially expressed from a microarray study in the mouse, Mus musculus. The workflow requires an input of gene ref_seq identifiers. Data is then extracted from BioMart to annotate each of the genes found for each ref_seq id. The Entrez and UniProt identifiers are then sent to KEGG to obtain KEGG gene identifiers. The KEGG gene identifiers are then used to search for pathways in the KEGG pathway database.
urn:uuid:98f8d66a-36c8-462b-8238-40373274e5e8
Paul Fisher

2010-11-15T12:08:34+00:00
Paul Fisher shared Pathways and Gene annotations for QTL region
This workflow searches for genes which reside in a QTL (Quantitative Trait Loci) region in the mouse, Mus musculus.
The workflow requires an input of: a chromosome name or number; a QTL start base pair position; a QTL end base pair position. Data is then extracted from BioMart to annotate each of the genes found in this region. The Entrez and UniProt identifiers are then sent to KEGG to obtain KEGG gene identifiers. The KEGG gene identifiers are then used to search for pathways in the KEGG pathway database.
urn:uuid:d0c5b518-db34-4152-8356-1b69664166cb
Paul Fisher

2010-07-05T14:07:39+00:00
Paul Fisher shared Phenotype to pubmed
This workflow takes in a phenotype search term and searches for abstracts in the PubMed database. The search term is passed to the eSearch function and searched for in PubMed. Those abstracts found are returned to the user.
urn:uuid:bb27f784-8fea-45b0-8304-994c3adce68a
Paul Fisher

2010-07-05T13:14:41+00:00
Paul Fisher shared Gene to Pubmed
This workflow takes in a list of gene names and searches the PubMed database for corresponding articles. Any matches to the genes are then retrieved (abstracts only). These abstracts are then returned to the user.
urn:uuid:10e2bffa-d34e-4546-994a-c14ea25aa17c
Paul Fisher

2009-08-11T14:52:13+00:00
Paul Fisher shared Genotype to Pathway
This pack is for investigating links between the genotype of an organism and possible pathways. This constitutes one half of the pathway-driven approach (genotype to pathway, and pathway to phenotype).
urn:uuid:8153526e-8fef-465e-a871-a7ba5d1f1a4f
Paul Fisher

2009-08-11T14:51:32+00:00
Paul Fisher shared Pathway to Phenotype using Text Mining
This pack contains a list of workflows and result files obtained from the analysis of candidate pathways believed to play a role in resistance to African Trypanosomiasis in the mouse model organism.
urn:uuid:3339a2f9-cf86-4633-a9e4-10db4c77932d
Paul Fisher
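
Illustrative note on the Rank Phenotype Terms entry: the score X = log((a / b) / (c / d)) can be sketched in a few lines of Python. This is only a minimal sketch, not the workflow's own implementation. The definition of d is cut off in the entry, so the code assumes d is the total number of abstracts in PubMed, by symmetry with b. The natural logarithm, the function names and the example counts are likewise assumptions made for illustration.

```python
import math


def enrichment_score(a, b, c, d):
    """Return X = log((a / b) / (c / d)).

    a: occurrences of the term in the phenotype corpus
    b: number of abstracts in the phenotype corpus
    c: occurrences of the term in all of PubMed
    d: ASSUMED to be the total number of abstracts in PubMed
       (the source text truncates this definition)
    The natural logarithm is also an assumption; the source does not
    state the base.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all counts must be positive")
    return math.log((a / b) / (c / d))


def rank_terms(term_counts, b, d):
    """Rank terms by enrichment score, highest first.

    term_counts maps each term to a pair (a, c) as defined above.
    """
    scored = [
        (term, enrichment_score(a, b, c, d))
        for term, (a, c) in term_counts.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical counts, invented purely to exercise the functions.
    example = {
        "trypanosomiasis": (50, 5_000),
        "cell": (80, 4_000_000),
    }
    for term, score in rank_terms(example, b=100, d=20_000_000):
        print(f"{term}\t{score:.3f}")
```

A term that is common in the phenotype corpus but rare across PubMed gets a large positive score, while a term that is equally common in both scores close to zero, which matches the entry's statement that a higher value is better.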