Version 1 (of 1), created on 15/03/11 @ 15:24:43 by Matko Bošnjak
Last edited on: 15/03/11 @ 15:29:48 by: Matko Bošnjak
Title: Content based recommender
Type: RapidMiner
Description
This process is a special case of the item-to-item similarity-matrix-based recommender, in which item-to-item similarity is computed as the cosine similarity between TF-IDF word vectors obtained by text analysis of all available textual data.
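For reference, the similarity measure named above can be written as follows. This is the textbook form: the exact TF-IDF weighting variant used inside the process is not stated in this description, so the weight definition (and the symbols N, tf and df) should be read as an assumption.

```latex
% Cosine similarity between the TF-IDF vectors of items a and b,
% summing over all tokens t in the vocabulary.
\[
\mathrm{sim}(a, b)
  = \frac{\sum_{t} w_{t,a}\, w_{t,b}}
         {\sqrt{\sum_{t} w_{t,a}^{2}}\;\sqrt{\sum_{t} w_{t,b}^{2}}},
\qquad
w_{t,d} = \mathrm{tf}(t, d)\cdot \log\frac{N}{\mathrm{df}(t)}
\]
% N     : number of items (documents)
% tf    : frequency of token t in item d
% df(t) : number of items containing token t
```

Because TF-IDF weights are non-negative, the similarity lies between 0 (no shared tokens) and 1 (identical direction).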
The inputs to the process are macros defined in the process context: %{id} specifies the ID of the item for which recommendations are requested, and %{recommender_no} specifies the number of recommendations to return. Internally, the process uses an example set of items that contains an item ID and an arbitrary number of textual attributes.
The process first selects only the textual attributes and passes them to the text mining operator Process Documents from Data. This operator converts the text to lower case, tokenizes it, filters out tokens that are too short or too long, removes stopwords, and finally stems the tokens with Porter's algorithm. The resulting tokens are then pruned by how often they appear in the data: tokens that appear in more than 30% or in less than 1% of the documents are discarded. The result of the analysis is an example set of TF-IDF word vectors and a bag of words (the word list). The bag of words is used to create a TF-IDF vector for the requested item. We then compute the cosine similarity (or, equivalently, distance) between the requested item's TF-IDF vector and the vectors of all other items. The first %{recommender_no} items, together with their scores, are returned as the final result.
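The steps above can be sketched outside RapidMiner in plain Python. This is a minimal illustration under stated assumptions, not the workflow itself: the function names, the tiny stopword list, the token length limits and the TF-IDF formula are all hypothetical choices, and Porter stemming is omitted because the standard library does not provide it. Only the 30% and 1% pruning thresholds and the two parameters (item ID, number of recommendations) come from the description.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (assumption; the process uses its own filter).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "for", "with"}


def tokenize(text, min_len=3, max_len=25):
    """Lowercase, tokenize, drop stopwords and too short/long tokens.
    The length limits are assumed values; stemming is not performed here."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens
            if min_len <= len(t) <= max_len and t not in STOPWORDS]


def build_vectors(items, min_df=0.01, max_df=0.30):
    """Map each item ID to a sparse TF-IDF vector {token: weight}.
    Tokens whose document frequency falls outside [min_df, max_df] are pruned."""
    docs = {item_id: Counter(tokenize(text)) for item_id, text in items.items()}
    n = len(docs)
    df = Counter(tok for counts in docs.values() for tok in counts)
    vocab = {tok for tok, c in df.items() if min_df <= c / n <= max_df}
    vectors = {}
    for item_id, counts in docs.items():
        total = sum(counts.values()) or 1
        vectors[item_id] = {
            tok: (c / total) * math.log(n / df[tok])
            for tok, c in counts.items() if tok in vocab
        }
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(items, item_id, recommender_no, min_df=0.01, max_df=0.30):
    """Return the recommender_no items most similar to item_id,
    as (recommendation, score) pairs sorted by descending similarity."""
    vectors = build_vectors(items, min_df, max_df)
    target = vectors[item_id]
    scores = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != item_id]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:recommender_no]
```

With a handful of items the 30% upper threshold removes almost every token, so a toy run needs a looser value, for example `recommend(items, "1", 2, max_df=0.5)`, where `items` is a dictionary mapping item IDs to their concatenated text.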
The output of the process is an example set with two attributes: the recommendation and its score.
Linked Data
Non-Information Resource URI: http://www.myexperiment.org/workflows/1947
Copyright © 2007 - 2011 The University of Manchester and University of Southampton