e-LICO 2011-11-22T12:35:09+00:00 /groups/226

2013-08-24T07:32:54+00:00 Rahmi joined the e-LICO group

2013-05-08T14:52:01+00:00 Simon Fischer shared Image Mining with RapidMiner
This is an image mining process using the image mining Web service provided by NHRF within e-LICO. It first uploads a set of images found in a directory, then preprocesses the images and visualizes the result. References to the uploaded images are also stored in the local RapidMiner repository, so they can be used for further processing without uploading the images a second time.

2013-05-08T14:52:01+00:00 Sebastian land shared Using Remember / Recall for "tunneling" results
This process shows how the Remember and Recall operators can be used to pass results from one position in the process to another when it is impossible to make a direct connection. This process introduces another advanced RapidMiner technique: macro handling. We have used the predefined macro a, accessed by %{a}, which gives the apply count of the operator. So we are remembering each application of the models that are generated in the learning subprocess of the Split validation. After the Split validation operator has been executed (check the execution order to be sure: Menu Process / Operator Execution Order / Show...), we can recall the remembered objects by name. Note that we have replaced the macro here with the constant 2, since the complete model will be tr …

2013-05-08T14:52:00+00:00 James Eales shared Clean plain text
This workflow will remove any XML-invalid characters (these characters often appear in the output of PDF to text software) from any text supplied to the input port. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.
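
The cleaning step described for Clean plain text can be illustrated outside Taverna. The following is a minimal Python sketch, not the workflow's actual implementation: it assumes "XML-invalid" means any character outside the Char production of XML 1.0, and the function name clean_plain_text is hypothetical.

```python
import re

# Assumption: "XML-invalid" means anything outside the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML_INVALID = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def clean_plain_text(text: str) -> str:
    """Return text with every character that XML 1.0 does not allow removed."""
    return _XML_INVALID.sub("", text)


if __name__ == "__main__":
    # Form feeds (\x0c) are a typical leftover of PDF to text conversion.
    raw = "page one\x0cpage two\x00"
    print(repr(clean_plain_text(raw)))
```

Tabs, line feeds and carriage returns are kept because XML 1.0 allows them; other control characters, such as the form feed that PDF to text tools often emit between pages, are dropped.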

2013-05-08T14:52:00+00:00 James Eales shared Terms from collection of text files
This workflow will give you a set of candidate terms for each text file in a user-specified directory. You can also specify a c-value threshold that will restrict the terms to those with higher scores. This workflow was created using only nested workflows. These workflow components work on their own and can be linked together to form more complex workflows such as this. You can view the text mining workflow components in this pack. If you receive errors when running this workflow then check if you have access to the NaCTeM web services here. If you do not have access then you can request access from the same page.

2013-05-08T14:52:00+00:00 James Eales shared Load plain text from directory
This workflow will automate the reading of a set of text files stored in a single directory (the path to which should be supplied as a single input value). It will assume that the text files are saved using the default character encoding for the system that Taverna is running on. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

2013-05-08T14:52:00+00:00 James Eales shared Load PDF from directory
This workflow will automate the reading of a set of PDF files stored in a single directory (the path to which should be supplied as a single input value). This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

2013-05-08T14:52:00+00:00 James Eales shared PDF to plain text
This workflow will extract the plain text content of PDF files supplied to the input port. You can connect the Load PDF from directory workflow to this workflow's input.
We recommend you send the output from this workflow to the Clean plain text workflow, because the PDF to text process can add characters into the text that are XML-invalid and therefore cannot be sent to most services as plain text. Another way round this problem is to encode the text as Base64 using the handy local service ("Encode Byte Array to Base 64") included with Taverna, although this requires a service that knows to decode the Base 64 back to text, which is not common. The PDF to text service makes use of the "pdftotext" executable from Xpdf. This is a workflow component, designed to be used as a neste …

2013-05-08T14:52:00+00:00 James Eales shared Sentence splitting
This workflow will attempt to split up text into sentences, returning a list of sentences to the output port. The sentence splitting service makes use of the OpenNLP sentence detector and has been trained to work on English text. This workflow can be used to provide input to the Termine with c-value threshold workflow. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

2013-05-08T14:52:00+00:00 James Eales shared Termine with c-value threshold
This workflow accepts a list of sentences from a single document and returns the terms found by the TerMine web service. It also allows you to set a threshold c-value score so that only terms with a user-controlled probability (of being a real term) are returned as an output. To get sentences to supply to this workflow you can use the sentence splitting workflow. The TerMine service (used in this workflow) only accepts text in ASCII encoding, so you should also use the Clean plain text (ASCII) workflow before splitting sentences. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.
Unfortunately there are some restrictions on IP access to the TerMine web service at the NaCTeM. These can be viewed here. …

2013-05-08T14:52:00+00:00 James Eales shared Clean plain text (ASCII)
This workflow will remove any XML-invalid and non-ASCII characters (e.g. for sending to the ASCII-only Termine service) from any text supplied to the input port. This is a workflow component, designed to be used as a nested workflow inside a larger text mining or text processing workflow.

2013-05-08T14:52:00+00:00 James Eales shared Terms from collection of PDF files
This workflow will give you a set of candidate terms for each PDF document in a user-specified directory. You can also specify a c-value threshold that will restrict the terms to those with higher scores. This workflow was created using only nested workflows. These workflow components work on their own and can be linked together to form more complex workflows such as this. You can view the text mining workflow components in this pack. If you receive errors when running this workflow then check if you have access to the NaCTeM web services here. If you do not have access then you can request access from the same page.
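
The c-value threshold mentioned in the term extraction workflows amounts to a simple filter over scored candidate terms. The sketch below is only an illustration in Python: the (term, score) pair layout, the function name filter_terms, the sample scores, and the choice to keep scores equal to the threshold are all assumptions, since the feed does not describe the TerMine output format.

```python
from typing import Iterable, List, Tuple

ScoredTerm = Tuple[str, float]  # hypothetical layout: (candidate term, c-value)


def filter_terms(terms: Iterable[ScoredTerm], threshold: float) -> List[ScoredTerm]:
    """Keep candidate terms whose c-value is at or above the threshold.

    Assumption: a score equal to the threshold is kept. The result is
    sorted with the highest c-value first.
    """
    kept = [(term, score) for term, score in terms if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up candidate terms and scores, for illustration only.
    candidates = [("text mining", 4.0), ("workflow", 1.5), ("web service", 2.75)]
    for term, score in filter_terms(candidates, threshold=2.0):
        print(f"{score:.2f}\t{term}")
```

Raising the threshold returns fewer, higher-scoring candidates; lowering it returns more candidates at the cost of more noise.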