In search of a simple recommendation engine

What is a Recommendation System?

Recommendation engines are a class of algorithms and models created to suggest items or content to users based on their preferences. These algorithms utilize user behavior patterns to identify and present the most likely items of interest to the user. An early and widely embraced manifestation of recommendation engines is the “people also bought” algorithm, developed through a market basket analysis approach.
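To make the “people also bought” idea concrete, here is a minimal sketch that counts how often items share a basket. The baskets and the `also_bought` helper are invented purely for illustration; a real market basket analysis would also weigh support and confidence.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, invented for illustration only.
baskets = [
    {"console", "controller", "racing game"},
    {"console", "controller"},
    {"controller", "racing game"},
    {"console", "headset"},
]

# Count how often each ordered pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        pair_counts[(a, b)] += 1
        pair_counts[(b, a)] += 1

def also_bought(item, top_n=1):
    """Return up to top_n items most often bought together with `item`."""
    scores = Counter(
        {other: n for (first, other), n in pair_counts.items() if first == item}
    )
    return [other for other, _ in scores.most_common(top_n)]

print(also_bought("console"))  # ['controller']
```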

Today, recommendation engines vary widely in complexity, because organizations design and deploy them in different ways. Some are built to refine their understanding of a user's preferences over time, so that recommendations keep improving. Others use “collaborative filtering,” which draws on the aggregated behavior and preferences of a larger user base to find the most relevant products for a specific user.

The choice of an appropriate approach for constructing recommendation engines depends on organizational factors, including data availability, industry context, user demographics, and desired functionalities. Implementing a recommendation engine doesn’t have to be a singular, burdensome project; instead, it can be modularized into distinct phases, starting with a specific use case and corresponding recommender systems approach, and evolving over time.

A recommendation system is a subclass of information filtering systems that seeks to predict the rating or preference a user would give to an item. In simple terms, it is an algorithm that suggests relevant items to users: which movie to watch on Netflix, which product to buy on an e-commerce site, or which book to read on a Kindle.

Here’s a brief overview of how you can create a simple recommendation engine:

  1. Define the problem: Before you start building the recommendation engine, it’s important to understand what you want to achieve. You need to define the problem you’re trying to solve and what kind of recommendations you want to generate.
  2. Choose a recommendation algorithm: There are different types of recommendation algorithms, such as content-based filtering, collaborative filtering, and hybrid filtering. You need to choose the one that best suits your needs.
  3. Gather data: You need to gather data about the items you want to recommend and the users who will be using the recommendation engine.
  4. Preprocess the data: Once you have the data, you need to preprocess it to make it suitable for the recommendation algorithm you’ve chosen.
  5. Implement the recommendation algorithm: You need to implement the recommendation algorithm using a programming language.
  6. Test the recommendation engine: Finally, you need to test the recommendation engine to make sure it’s generating accurate recommendations.

Let’s build a simple project 

This project recommends games based on your friends' preferences. The assumption is that if you have friends who like similar categories (i.e., genres) to yours, games from those categories will be recommended to you. It also has a function to promote the least popular games, to give them a boost.

Although you can get all the categories from the following list, I am keeping it simple for this example. I am not using real game titles here, although the placeholders can be replaced very easily.

Players vs Game categories

Who likes which category?

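|                      | Player 1 | Player 2 | Player 3 | Player 4 | Player 5 | Totals |
|----------------------|----------|----------|----------|----------|----------|--------|
| First person shooter | x        |          |          | x        | x        | 3      |
| Racing               |          | x        | x        |          | x        | 3      |
| Puzzle               | x        | x        | x        | x        |          | 4      |
| Sports               | x        | x        |          |          |          | 2      |
| RPG                  |          |          |          |          |          | 0      |
| Totals               | 3        | 3        | 2        | 2        | 2        |        |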
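As a rough sketch, the table above can be held in a plain dictionary and the totals recomputed from it. The variable names here are my own choice for illustration, not taken from the project.

```python
# Player -> liked categories, transcribed from the table above.
likes = {
    "Player 1": {"First person shooter", "Puzzle", "Sports"},
    "Player 2": {"Racing", "Puzzle", "Sports"},
    "Player 3": {"Racing", "Puzzle"},
    "Player 4": {"First person shooter", "Puzzle"},
    "Player 5": {"First person shooter", "Racing"},
}
categories = ["First person shooter", "Racing", "Puzzle", "Sports", "RPG"]

# Row totals: how many players like each category.
category_totals = {
    c: sum(c in liked for liked in likes.values()) for c in categories
}
# Column totals: how many categories each player likes.
player_totals = {p: len(liked) for p, liked in likes.items()}

print(category_totals)
print(player_totals)
```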

Games vs categories

Which game falls under which category?

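|                      | Game 1 | Game 2 | Game 3 | Game 4 | Game 5 | Totals |
|----------------------|--------|--------|--------|--------|--------|--------|
| First person shooter | x      |        | x      |        |        | 2      |
| Racing               |        | x      |        | x      | x      | 3      |
| Puzzle               |        |        |        |        | x      | 1      |
| Sports               |        |        | x      |        |        | 1      |
| RPG                  | x      |        |        |        |        | 1      |
| Totals               | 2      | 1      | 2      | 1      | 2      |        |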
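The game table can be sketched the same way. The `games_in` helper below is a hypothetical name used only for this example; it looks up the games tagged with a given category.

```python
# Game -> categories, transcribed from the table above.
game_categories = {
    "Game 1": {"First person shooter", "RPG"},
    "Game 2": {"Racing"},
    "Game 3": {"First person shooter", "Sports"},
    "Game 4": {"Racing"},
    "Game 5": {"Racing", "Puzzle"},
}

def games_in(category):
    """Return the games tagged with `category`, sorted by name."""
    return sorted(g for g, cats in game_categories.items() if category in cats)

print(games_in("Racing"))  # ['Game 2', 'Game 4', 'Game 5']
print(games_in("Puzzle"))  # ['Game 5']
```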

Player friendship matrix

Who is friends with whom?

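|          | Player 1 | Player 2 | Player 3 | Player 4 | Player 5 | Totals |
|----------|----------|----------|----------|----------|----------|--------|
| Player 1 |          | x        | x        |          | x        | 3      |
| Player 2 | x        |          |          | x        | x        | 3      |
| Player 3 | x        |          |          | x        | x        | 3      |
| Player 4 |          | x        | x        |          |          | 2      |
| Player 5 | x        | x        | x        |          |          | 3      |
| Totals   | 3        | 3        | 3        | 2        | 3        |        |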
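Because the friendship matrix is symmetric, one way to sketch it is as a list of undirected pairs expanded into an adjacency map. The structure below is an illustration under that assumption, with players identified by number.

```python
# Undirected friendships: one pair per 'x' above the diagonal of the matrix.
friend_pairs = [(1, 2), (1, 3), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5)]

# Build an adjacency map so each friendship is visible from both sides.
friends = {p: set() for p in range(1, 6)}
for a, b in friend_pairs:
    friends[a].add(b)
    friends[b].add(a)

# Friend counts per player; these should match the Totals in the matrix.
totals = {p: len(f) for p, f in friends.items()}
print(totals)  # {1: 3, 2: 3, 3: 3, 4: 2, 5: 3}
```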

Questions we need to ask:

  1. Which game category is liked by which players?
  2. Which game category is liked least? → RPG
  3. Who likes Racing? → Player 2, Player 3, Player 5
  4. From #3, which other category do they like most? → Puzzle
  5. Propose the result from #4 to the players from #3 who did not select it → Puzzle to Player 5
  6. Propose the result from #2 to random players
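The questions above can be answered in a few lines of plain Python. This is only a sketch over the sample data; the variable names are assumptions, and the random pick for question 6 uses two players simply as an example.

```python
import random
from collections import Counter

likes = {
    "Player 1": {"First person shooter", "Puzzle", "Sports"},
    "Player 2": {"Racing", "Puzzle", "Sports"},
    "Player 3": {"Racing", "Puzzle"},
    "Player 4": {"First person shooter", "Puzzle"},
    "Player 5": {"First person shooter", "Racing"},
}
categories = ["First person shooter", "Racing", "Puzzle", "Sports", "RPG"]

# Q2: the least-liked category (categories nobody likes count as 0).
popularity = Counter({c: 0 for c in categories})
for liked in likes.values():
    popularity.update(liked)
least_liked = min(categories, key=lambda c: popularity[c])

# Q3: who likes Racing?
racing_fans = sorted(p for p, liked in likes.items() if "Racing" in liked)

# Q4: which other category do the Racing fans like most?
others = Counter(c for p in racing_fans for c in likes[p] if c != "Racing")
top_other = others.most_common(1)[0][0]

# Q5: Racing fans who have not picked that category yet.
targets = [p for p in racing_fans if top_other not in likes[p]]

# Q6: promote the least-liked category to two randomly chosen players.
promoted_to = random.sample(sorted(likes), k=2)

print(least_liked)   # RPG
print(racing_fans)   # ['Player 2', 'Player 3', 'Player 5']
print(top_other)     # Puzzle
print(targets)       # ['Player 5']
```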

We can approach this project in two ways. The traditional tabular, relational database approach looks simple, but the queries become complicated. A graph database approach gives much more flexibility in queries. Neo4j is one of the pioneers in this field, and I am using its free tier for this project.

Using “relational database” system

Models/entities

  • players [id, name, email]
  • games [id, title, publisher, version, year]
  • categories [id, title]
  • game_categories [game_id, category_id]
  • player_categories [player_id, category_id]
  • player_friends [player_id, friend_id]

SQL
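As a minimal sketch of the relational approach, the snippet below uses Python's built-in sqlite3 module with an in-memory database. The choice of SQLite and the column types are assumptions, and only three of the six tables are created, which is enough to answer question 4 (the other categories Racing fans like most).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
CREATE TABLE categories (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE player_categories (
    player_id INTEGER REFERENCES players(id),
    category_id INTEGER REFERENCES categories(id)
);
""")

# Load the sample data from the "Players vs Game categories" table.
conn.executemany("INSERT INTO players (id, name) VALUES (?, ?)",
                 [(i, f"Player {i}") for i in range(1, 6)])
titles = ["First person shooter", "Racing", "Puzzle", "Sports", "RPG"]
conn.executemany("INSERT INTO categories (id, title) VALUES (?, ?)",
                 list(enumerate(titles, start=1)))
likes = {1: [1, 3, 4], 2: [2, 3, 4], 3: [2, 3], 4: [1, 3], 5: [1, 2]}
conn.executemany("INSERT INTO player_categories VALUES (?, ?)",
                 [(p, c) for p, cats in likes.items() for c in cats])

# Question 4: other categories liked by players who like Racing.
query = """
SELECT c.title, COUNT(*) AS fans
FROM player_categories AS pc
JOIN categories AS c ON c.id = pc.category_id
WHERE c.title <> 'Racing'
  AND pc.player_id IN (
      SELECT pc2.player_id
      FROM player_categories AS pc2
      JOIN categories AS c2 ON c2.id = pc2.category_id
      WHERE c2.title = 'Racing'
  )
GROUP BY c.title
ORDER BY fans DESC, c.title
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('Puzzle', 2), ('First person shooter', 1), ('Sports', 1)]
```

Even this one question needs a join plus a subquery, which illustrates how the queries grow as more hops (friends, games) are added.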

Using “graph database” system
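To illustrate the graph way of thinking without a running Neo4j instance, here is a pure-Python sketch that stores the sample data as (source, relationship, target) triples and walks player → friends → liked categories → games. The relationship names (`FRIEND_OF`, `LIKES`, `HAS_GAME`) and the scoring rule (one point per path reaching a game) are my assumptions for this example, not the project's actual model.

```python
from collections import Counter, defaultdict

friend_pairs = [(1, 2), (1, 3), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5)]
likes = {
    "Player 1": ["First person shooter", "Puzzle", "Sports"],
    "Player 2": ["Racing", "Puzzle", "Sports"],
    "Player 3": ["Racing", "Puzzle"],
    "Player 4": ["First person shooter", "Puzzle"],
    "Player 5": ["First person shooter", "Racing"],
}
game_categories = {
    "Game 1": ["First person shooter", "RPG"],
    "Game 2": ["Racing"],
    "Game 3": ["First person shooter", "Sports"],
    "Game 4": ["Racing"],
    "Game 5": ["Racing", "Puzzle"],
}

# Build the edge list; relationship names are hypothetical.
edges = []
for a, b in friend_pairs:
    edges.append((f"Player {a}", "FRIEND_OF", f"Player {b}"))
    edges.append((f"Player {b}", "FRIEND_OF", f"Player {a}"))
for player, cats in likes.items():
    edges += [(player, "LIKES", c) for c in cats]
for game, cats in game_categories.items():
    edges += [(c, "HAS_GAME", game) for c in cats]

# Index outgoing edges by (node, relationship) so each hop is a lookup.
out = defaultdict(list)
for src, rel, dst in edges:
    out[(src, rel)].append(dst)

def recommend(player, top_n=3):
    """Rank games by the number of friend -> category -> game paths."""
    scores = Counter()
    for friend in out[(player, "FRIEND_OF")]:
        for category in out[(friend, "LIKES")]:
            for game in out[(category, "HAS_GAME")]:
                scores[game] += 1
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [game for game, _ in ranked[:top_n]]

print(recommend("Player 1"))  # ['Game 5', 'Game 2', 'Game 4']
```

Each extra hop is just one more nested lookup here, which is the flexibility a graph database offers over chained joins.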

Sample code

I am sharing a GitHub repo, a work in progress. Please pay attention to the README.

Coming soon…


Here are some resources that can help you get started:

Graph Database
