Pack: Core text mining workflows

Created: 2010-02-19 10:12:33      Last updated: 2011-12-13 16:03:17
Title: Core text mining workflows

Description

This pack contains workflows we have created to support core text mining tasks.

We currently provide workflows for the following tasks:

  • Loading documents (text or PDF)
  • PDF to text conversion
  • Sentence splitting
  • Text cleaning (ASCII or XML-valid)
  • Term recognition (using the NaCTeM TerMine service)
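
To make the cleaning and splitting tasks above concrete, here is a minimal Python sketch. It is not the code of the workflows in this pack; the function names (to_ascii, to_xml_valid, split_sentences) are hypothetical, and the sentence splitter is a deliberately naive regular expression that assumes sentences end in ".", "!" or "?" followed by whitespace.

```python
import re
import unicodedata

# Characters allowed by the XML 1.0 "Char" production; everything else is dropped.
_XML_INVALID = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def to_ascii(text: str) -> str:
    """Reduce text to ASCII (hypothetical helper, not the pack's workflow).

    Accented letters are decomposed first so the base letter survives;
    anything still outside ASCII is discarded.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def to_xml_valid(text: str) -> str:
    """Remove characters that may not appear in an XML 1.0 document."""
    return _XML_INVALID.sub("", text)


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


if __name__ == "__main__":
    raw = "Na\u00efve caf\u00e9 text.\x00 It has a control character! Is it clean?"
    cleaned = to_ascii(to_xml_valid(raw))
    for sentence in split_sentences(cleaned):
        print(sentence)
```

A splitter this simple will break on abbreviations such as "e.g." or "Fig. 2", so it only illustrates what the task involves; the workflow in this pack may handle such cases differently.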


Items (7)

  • Workflow: Clean plain text (James Eales)
    Added by James Eales on 2010-02-19 10:13:42

  • Workflow: Clean plain text (ASCII) (James Eales)
    Added by James Eales on 2010-02-19 10:14:16

  • Workflow: Load PDF from directory (James Eales)
    Added by James Eales on 2010-02-19 10:13:30

  • Workflow: Load plain text from directory (James Eales)
    Added by James Eales on 2010-02-19 10:13:22

  • Workflow: PDF to plain text (James Eales)
    Added by James Eales on 2010-02-19 10:14:38

  • Workflow: Sentence splitting (James Eales)
    Added by James Eales on 2010-02-19 10:14:47

  • Workflow: Termine with c-value threshold (James Eales)
    Added by James Eales on 2010-02-19 10:14:55


Relationships (0)

There are no relationships.

License

No version of this Pack is licensed.

Statistics

3763 viewings

