User-Based and Item-Based Collaborative Filtering
Resource Overview
MATLAB implementation of user-based and item-based collaborative filtering, evaluated on the MovieLens dataset. The data is stored in a single file, ga.mat, which holds a MATLAB struct: ga.train is the training set and ga.test is the test set. MovieLens is distributed already split into training and test subsets; both have been merged into ga.mat for convenience. Both approaches need the similarities between items or between users, which are computed offline and saved to .mat files so they do not have to be recomputed during testing.
Detailed Documentation
User-based and item-based collaborative filtering are widely used recommendation algorithms. In this implementation the data is stored in ga.mat, which contains a MATLAB struct: ga.train is the training set and ga.test is the test set. For my experiments I used the MovieLens dataset, which is distributed already split into training and test subsets, and I merged both subsets into ga.mat to simplify processing.
Collaborative filtering requires a similarity measure between pairs of items or pairs of users. This computation is typically performed offline: the similarity matrix is computed once, stored in a .mat file, and loaded during testing. The similarity matrices are very large, so it is impractical to upload them to the forum. Instead, you can compute your own similarity matrices with the SimilitudItems.m function, which implements standard similarity metrics such as cosine similarity or the Pearson correlation coefficient for item-item or user-user similarities.
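To make the offline similarity step concrete, here is a minimal sketch in Python (standard library only). It is not the code in SimilitudItems.m, which is MATLAB and is not reproduced here. The function names, the toy rating matrix, the convention that 0 marks a missing rating, and the choice to compare only co-rated entries are all assumptions made for illustration.

```python
import math

def co_rated(a, b):
    """Pairs of ratings where both vectors have a rating (0 = missing, an assumption)."""
    return [(x, y) for x, y in zip(a, b) if x > 0 and y > 0]

def cosine_similarity(a, b):
    """Cosine similarity restricted to co-rated entries."""
    pairs = co_rated(a, b)
    if not pairs:
        return 0.0
    dot = sum(x * y for x, y in pairs)
    norm_a = math.sqrt(sum(x * x for x, _ in pairs))
    norm_b = math.sqrt(sum(y * y for _, y in pairs))
    return dot / (norm_a * norm_b)

def pearson_similarity(a, b):
    """Pearson correlation restricted to co-rated entries."""
    pairs = co_rated(a, b)
    if len(pairs) < 2:
        return 0.0
    mean_a = sum(x for x, _ in pairs) / len(pairs)
    mean_b = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in pairs)
    var_a = sum((x - mean_a) ** 2 for x, _ in pairs)
    var_b = sum((y - mean_b) ** 2 for _, y in pairs)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_a * var_b)

def similarity_matrix(vectors, metric=cosine_similarity):
    """Symmetric matrix of pairwise similarities between the given vectors."""
    n = len(vectors)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sim[i][i] = 1.0
        for j in range(i + 1, n):
            sim[i][j] = sim[j][i] = metric(vectors[i], vectors[j])
    return sim

# Toy user-by-item rating matrix (rows = users, columns = items).
ratings = [
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
]

user_sim = similarity_matrix(ratings)                     # user-user
item_columns = [list(col) for col in zip(*ratings)]
item_sim = similarity_matrix(item_columns)                # item-item
```

The same routine serves both variants: rows of the rating matrix give user-user similarities, and columns give item-item similarities. In an offline workflow the resulting matrix would be saved to disk once and reloaded at test time.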
The data comes from MovieLens, a web-based movie recommendation system; the dataset used here contains 43,000 users and over 3,500 movies. Because the test set is large and evaluating all of it takes a long time, I randomly selected a subset of the test data for evaluation. For details of the calculation, see probar.m, which contains the core collaborative filtering logic and the performance evaluation.
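The evaluation on a random subset of the test set can be sketched as follows, again in Python with the standard library only. This is not the logic in probar.m, which is not shown here. The item-based weighted-average predictor, the neighbourhood size k, mean absolute error as the metric, the toy data, and all function names are assumptions chosen for illustration.

```python
import math
import random

def cosine(a, b):
    """Cosine similarity over co-rated entries (0 = missing rating, an assumption)."""
    pairs = [(x, y) for x, y in zip(a, b) if x > 0 and y > 0]
    if not pairs:
        return 0.0
    dot = sum(x * y for x, y in pairs)
    return dot / (math.sqrt(sum(x * x for x, _ in pairs)) *
                  math.sqrt(sum(y * y for _, y in pairs)))

def item_similarities(train):
    """Offline step: item-item similarity matrix from the training ratings."""
    cols = [list(col) for col in zip(*train)]
    n = len(cols)
    return [[cosine(cols[i], cols[j]) for j in range(n)] for i in range(n)]

def predict_item_based(train, sim, user, item, k=2):
    """Similarity-weighted average of the user's ratings on the k most similar items."""
    neighbours = [
        (sim[item][other], rating)
        for other, rating in enumerate(train[user])
        if other != item and rating > 0 and sim[item][other] > 0
    ]
    top = sorted(neighbours, reverse=True)[:k]
    if not top:
        return None  # no usable neighbours for this (user, item) pair
    return sum(s * r for s, r in top) / sum(s for s, _ in top)

def mae_on_random_subset(train, sim, test, sample_size, seed=0):
    """Mean absolute error on a random sample of (user, item, rating) test triples."""
    rng = random.Random(seed)
    sample = rng.sample(test, min(sample_size, len(test)))
    errors = []
    for user, item, actual in sample:
        predicted = predict_item_based(train, sim, user, item)
        if predicted is not None:
            errors.append(abs(predicted - actual))
    return sum(errors) / len(errors) if errors else float("nan")

# Toy training matrix (rows = users, columns = items) and held-out test triples.
train = [
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
]
test = [(0, 2, 4), (1, 1, 3), (2, 2, 5), (3, 0, 2)]

sim = item_similarities(train)            # computed once, reused for every prediction
mae = mae_on_random_subset(train, sim, test, sample_size=3)
print(f"MAE on sampled test ratings: {mae:.3f}")
```

Fixing the random seed keeps the sampled subset reproducible between runs, which makes results from different similarity metrics comparable on the same test ratings.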