
To understand the hybrid recommender, let’s first review the two common types of recommender systems: content-based filtering and collaborative filtering. Afterward, we’ll explore how the hybrid recommender combines both to create a more data-efficient recommender that overcomes the cold start problem.

Cold Start Problem

The cold start problem refers to the challenge recommendation systems face when a new user (or a new item) arrives. Approaches that rely on past actions or purchases are susceptible to it, because a new user has no historical data to learn from.

Background

The main source for this background is a tutorial by Google Developers.

Content-based Filtering

Image Credit: Google Developers Tutorial

Content-based filtering uses item features to recommend other items similar to what the user likes, based on their previous actions or explicit feedback. Users themselves may also be represented in the same feature space as the items. Recommendations are made using a similarity metric between users and items, such as the dot product.
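As a minimal sketch of the idea above, the snippet below scores items with a dot product between a user profile and hand-engineered item features. The feature columns, the item catalogue, and the helper names `user_profile` and `recommend` are all hypothetical, and averaging the liked items is only one simple way to place a user in the item feature space.

```python
import numpy as np

# Hypothetical hand-engineered item features.
# Columns: [action, comedy, documentary]
item_features = np.array([
    [1.0, 0.0, 0.0],  # item 0: action
    [1.0, 1.0, 0.0],  # item 1: action comedy
    [0.0, 0.0, 1.0],  # item 2: documentary
    [0.0, 1.0, 0.0],  # item 3: comedy
])


def user_profile(liked_item_ids, item_features):
    """Represent the user in item-feature space as the mean of liked items."""
    return item_features[liked_item_ids].mean(axis=0)


def recommend(liked_item_ids, item_features, k=2):
    """Rank items the user has not liked yet by dot-product similarity."""
    scores = item_features @ user_profile(liked_item_ids, item_features)
    scores[liked_item_ids] = -np.inf  # do not re-recommend liked items
    return np.argsort(-scores, kind="stable")[:k]


# A user who liked only the action item (item 0).
top = recommend([0], item_features)
print(top)
```

Note that the profile is built purely from this user's own history, so no other user's data is needed to produce the ranking.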

Advantages include scalability to a large number of users, since recommendations for one user don’t depend on data about other users. However, the item features must be hand-engineered, which may limit performance: the model can only be as good as the engineered features.

Collaborative Filtering

Image Credit: Google Developers Tutorial

Collaborative filtering, in contrast, doesn’t rely on hand-engineered features. It utilizes similarities between users and items simultaneously to provide recommendations, allowing for serendipitous suggestions based on similar users’ interests. The embeddings, or latent features, are learned automatically without manual feature engineering.
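One common way to learn such embeddings is matrix factorization. The sketch below assumes a small, made-up ratings matrix and plain full-batch gradient descent with light L2 regularization. The function name `factorize` and all hyperparameters are illustrative choices, not the setup used in this project.

```python
import numpy as np

# Hypothetical feedback matrix: rows are users, columns are items,
# NaN marks a user-item pair with no observed rating.
ratings = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, np.nan, 1.0, 1.0],
    [1.0, 1.0, 5.0, np.nan],
    [np.nan, 1.0, 4.0, 5.0],
])


def factorize(ratings, dim=2, lr=0.01, reg=0.01, steps=5000, seed=0):
    """Learn one embedding per user and per item from observed ratings."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(ratings)
    filled = np.where(observed, ratings, 0.0)
    n_users, n_items = ratings.shape
    U = rng.normal(scale=0.1, size=(n_users, dim))  # user embeddings
    V = rng.normal(scale=0.1, size=(n_items, dim))  # item embeddings
    for _ in range(steps):
        # Prediction error, counted only on observed entries.
        err = np.where(observed, U @ V.T - filled, 0.0)
        grad_U = err @ V + reg * U
        grad_V = err.T @ U + reg * V
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V


U, V = factorize(ratings)
predictions = U @ V.T  # includes scores for the unobserved pairs
observed = ~np.isnan(ratings)
mse = np.mean((predictions[observed] - ratings[observed]) ** 2)
print(f"training MSE on observed ratings: {mse:.4f}")
```

Every user needs their own row in `U`, so a user with no observed ratings has no meaningful embedding. That is the gap the hybrid approach addresses.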

Advantages include not requiring domain knowledge and the ability to provide serendipitous recommendations. However, collaborative filtering struggles with fresh items and new users that have no interactions yet, which is the cold start problem described above.

Hybrid Recommender

The hybrid recommender combines the two approaches. Instead of learning one embedding per user, we learn a mapping from user features to the user’s latent representation. For items, we still learn one embedding per item. If item features are available, we can also learn a mapping from item features to item embeddings. The mapping can be a single matrix multiplication (equivalent to a single-layer neural network) or a more complex network; in either case it is trained with backpropagation.
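To make the mapping concrete, here is a minimal sketch in which it is a single matrix `W` trained jointly with per-item embeddings `V`. The user features (a one-hot age bucket), the ratings, the function name `train_hybrid`, and the hyperparameters are all assumptions made for illustration; gradients are written out by hand rather than using an autodiff framework.

```python
import numpy as np

# Hypothetical user features. Columns: [age_under_30, age_30_plus]
user_features = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
])

# Hypothetical ratings (users x items), NaN = unobserved.
ratings = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, np.nan, 1.0, 1.0],
    [1.0, 1.0, 5.0, np.nan],
    [np.nan, 1.0, 4.0, 5.0],
])


def train_hybrid(user_features, ratings, dim=2, lr=0.01, steps=5000, seed=0):
    """Learn a feature-to-latent mapping W and one embedding per item V."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(ratings)
    filled = np.where(observed, ratings, 0.0)
    n_features = user_features.shape[1]
    n_items = ratings.shape[1]
    W = rng.normal(scale=0.1, size=(n_features, dim))  # shared mapping
    V = rng.normal(scale=0.1, size=(n_items, dim))     # item embeddings
    for _ in range(steps):
        U = user_features @ W  # user latents are computed, not stored
        err = np.where(observed, U @ V.T - filled, 0.0)
        grad_W = user_features.T @ (err @ V)  # chain rule through U = X W
        grad_V = err.T @ U
        W -= lr * grad_W
        V -= lr * grad_V
    return W, V


W, V = train_hybrid(user_features, ratings)

# A brand-new user with no ratings: only their features are known.
new_user = np.array([1.0, 0.0])
scores = (new_user @ W) @ V.T
print(scores)
```

The new user never appears in the training data, yet they still receive a score for every item, because their latent vector comes from `W` rather than from a per-user embedding table.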

This approach simplifies training, as the size of the mapping doesn’t scale with the number of users, which makes it suitable for scenarios with many users. Furthermore, it allows recommendations for new users who have no purchase history, as long as their features are known.
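The scaling argument can be checked with simple arithmetic. The counts below (one million users, 50 user features, 32 latent dimensions) are made-up numbers, and the linear mapping is assumed to be a single matrix without a bias term.

```python
def per_user_embedding_params(n_users, dim):
    """Parameters needed when every user has their own embedding."""
    return n_users * dim


def feature_mapping_params(n_user_features, dim):
    """Parameters in a single linear map from user features to latents."""
    return n_user_features * dim


# Hypothetical sizes for illustration only.
embedding_table = per_user_embedding_params(n_users=1_000_000, dim=32)
linear_mapping = feature_mapping_params(n_user_features=50, dim=32)
print(embedding_table, linear_mapping)
```

Adding another million users doubles the first count and leaves the second unchanged.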
