James Eales' Workflows

Showing 2 results.
Tag: e-lico Wsdl: http://gnode1.mib.man.ac.uk:8080/FullTextWebServices/PdfToTextService?wsdl

Workflow PDF to plain text (1)

This workflow extracts the plain text content of PDF files supplied to the input port. You can connect the Load PDF from directory workflow to this workflow's input. We recommend sending the output of this workflow to the Clean plain text workflow, because the PDF-to-text process can introduce characters that are invalid in XML and therefore cannot be sent to most services as plain text. Another way round this problem is to encode the text as Base64 using the handy loc...

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: User James Eales
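
The two workarounds mentioned in the description above (removing XML-invalid characters, or Base64-encoding the text) can be illustrated outside the workflow system. The following is only a minimal Python sketch, not the implementation used by the Clean plain text workflow. The function names strip_xml_invalid and to_base64 are hypothetical, and the sketch assumes the target is XML 1.0 and that the text is encoded as UTF-8 before Base64 encoding.

```python
import base64
import re

# Matches any character outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML_INVALID = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_xml_invalid(text: str) -> str:
    """Remove characters that XML 1.0 does not allow (hypothetical helper)."""
    return _XML_INVALID.sub("", text)


def to_base64(text: str) -> str:
    """Encode text as Base64 over its UTF-8 bytes (assumed encoding)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # A form feed (\x0c) is a typical leftover of PDF extraction.
    extracted = "Page one\x0cPage two"
    print(strip_xml_invalid(extracted))  # Page onePage two
    print(to_base64("abc"))              # YWJj
```

Stripping changes the text but keeps it readable. Base64 preserves every character, but the receiving service has to decode it.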

Workflow Terms from collection of PDF files (2)

This workflow gives you a set of candidate terms for each PDF document in a user-specified directory. You can also specify a c-value threshold that restricts the output to higher-scoring terms. This workflow was created using only nested workflows. These workflow components work on their own and can be linked together to form more complex workflows such as this one. You can view the text mining workflow components in this pack. If you receive errors when running this workflow t...

Created: 2010-02-19 | Last updated: 2011-12-13

Credits: User James Eales
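
The effect of the c-value threshold described above can be pictured as a simple filter over scored candidate terms. The sketch below is only an illustration in Python. The function name filter_by_c_value, the dictionary layout, and the example scores are all made up, and keeping terms whose score equals the threshold is an assumption, since the description does not say whether the cut-off is inclusive.

```python
from typing import Dict


def filter_by_c_value(scored_terms: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Keep only candidate terms whose score is at least `threshold`.

    Hypothetical helper: the inclusive comparison (>=) is an assumption.
    """
    return {
        term: score
        for term, score in scored_terms.items()
        if score >= threshold
    }


if __name__ == "__main__":
    # Made-up scores for one document, for illustration only.
    candidates = {"gene expression": 5.2, "cell line": 3.0, "expression": 1.0}
    print(filter_by_c_value(candidates, 3.0))
    # {'gene expression': 5.2, 'cell line': 3.0}
```

A higher threshold returns fewer candidate terms per document.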
