Recommendation engines play an important role in user experience. Giant companies like Amazon, Google, and Facebook work hard to personalize recommendations for each customer. For example, YouTube gives you personalized recommendations based on your watch history, and Amazon shows you different sets of recommendations such as “Related to items you’ve viewed”, “Customers also bought”, and “Items you may like”. Amazon also sends you personalized emails based on your browsing and purchasing history. Most companies run their own recommendation engines to increase their customers’ conversion rates.
Giving personalized recommendations in real time is very difficult. For big companies, the user base and the item catalogue range from millions to billions. Handling such a huge amount of data and serving recommendations in real time is a nightmare. In this article, I will summarize the paper “Pixie: A System for Recommending 3+ Billion Items to 200+ Million Users in Real-Time”.
Pixie was developed by Pinterest and helped them increase user engagement by 58% by providing real-time recommendations. Pixie is a graph-based real-time recommendation engine. It is fast and scalable, runs in constant time, and is independent of the size of the graph. Let us understand the power of Pixie. Let’s get started.
Pinterest is a visual catalogue with several billion pins, which are visual bookmarks containing a description, a link, and an image or a video. Users can create boards by collating different pins together.
Each user can curate different pins to create a board, and a pin can appear in multiple boards. For example, we could curate pins of all the Indian cricket team players into a single board. With hundreds of millions of users on Pinterest, billions of boards have been created. We can think of Pinterest as a giant human-curated bipartite graph of 7 billion pins and boards, with over 100 billion edges.
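To make this concrete, here is a minimal in-memory sketch of such a pin-board bipartite graph using adjacency lists. The class and the pin/board names are illustrative, not from the paper:

```python
# A minimal sketch of the pin-board bipartite graph using adjacency lists.
# Names (BipartiteGraph, pin_dhoni, ...) are invented for illustration.
class BipartiteGraph:
    def __init__(self):
        self.pin_to_boards = {}   # E(p): boards connected to a pin
        self.board_to_pins = {}   # E(b): pins collated on a board

    def add_edge(self, pin, board):
        self.pin_to_boards.setdefault(pin, []).append(board)
        self.board_to_pins.setdefault(board, []).append(pin)

g = BipartiteGraph()
g.add_edge("pin_dhoni", "board_indian_cricket")
g.add_edge("pin_kohli", "board_indian_cricket")
g.add_edge("pin_dhoni", "board_captains")   # one pin can be in many boards
```

Every edge is stored in both directions, so we can hop pin → board → pin efficiently, which is exactly the access pattern the random walk needs.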
Bipartite graph
Can you imagine the size of the graph and the interaction data they have? From the bipartite graph, we can find the most popular boards, the most used pins, and so on. There are also many boards that were created only once and pins that were used only once. The main goal here is to recommend similar pins to users as they interact with pins. We create a query set Q of pins, each with a different weight based on user interaction, and query this graph in real time to recommend similar pins to the user. The query set Q is user-specific and changes dynamically as the user interacts with pins. Now that you have a high-level view of how it works, let us discuss how the Pixie random walk algorithm works.
First, I will summarize the basic random walk algorithm. Consider a bipartite graph G and an input query set Q that contains only one query pin q. Starting from q, we run many short random walks. We record a visit count for each candidate pin p (a pin in the graph), which tells us the number of times the random walks visited p. The higher the visit count, the more related p is to the query pin q.
Basic Random walk algorithm:
The random walk consists of a sequence of steps, and each step has three operations. Before diving into the explanation, let me define the notation: a pin is denoted by p, a board by b, E(p) is the set of edges connecting pin p to boards, and E(b) is the set of edges connecting board b to pins. When a query pin q is given, the current pin is initialized to q. Now, we are at pin q.
- We randomly pick an edge e from E(q) that connects q with a board b. Now, we are at board b.
- We randomly pick an edge e1 from E(b) that connects b with a pin q1. Now, we are at pin q1.
- The current pin is updated to q1.
Bipartite graph - Pins & Boards
We repeat the above steps. The walk length is controlled by a parameter alpha, which determines when each short walk stops. The total number of steps N is the sum of the steps across all the short random walks. After the walks complete, we sort the candidate pins p by visit count to find the pins most related to q.
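The pin → board → pin loop above can be sketched as follows. This is a toy, dict-based version of the basic walk with a fixed step budget (the graph and the restart/alpha logic are simplified; names are invented):

```python
import random
from collections import Counter

# A sketch of the basic random walk on a toy pin-board graph.
# Pin/board names are invented; alpha-based stopping is simplified
# to a fixed step budget for brevity.
pin_to_boards = {"q": ["b1", "b2"], "p1": ["b1"], "p2": ["b2"]}
board_to_pins = {"b1": ["q", "p1"], "b2": ["q", "p2"]}

def basic_random_walk(query_pin, total_steps, rng):
    visit_count = Counter()
    current = query_pin
    for _ in range(total_steps):
        board = rng.choice(pin_to_boards[current])   # random edge from E(p)
        current = rng.choice(board_to_pins[board])   # random edge from E(b)
        visit_count[current] += 1
    return visit_count

counts = basic_random_walk("q", 1000, random.Random(0))
ranked = counts.most_common()   # candidate pins sorted by visit count
```

Note that the cost depends only on the step budget, not on the size of the graph, which is what makes the approach viable at Pinterest's scale.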
Basic Random Walk
To summarize the random walk: given a query pin q, we walk randomly through the graph G starting from q, and based on the visit counts of candidate pins, we generate pins related to the query pin and recommend them to the user in real time. The running time depends only on N and is independent of the size of the input graph.
Pixie is an advanced version of the random walk algorithm. The following improvements were made to the basic random walk:
- Biasing the random walk towards user-specific pins
- Multiple query pins each with different weights
- Multi-hit booster that boosts pins that are related to multiple query pins
- Early stopping that minimizes the number of steps of the random walk while maintaining the quality of the results
1. Biasing the random walk towards user-specific pins:
For example, suppose you have a graph G and a query set Q. With the basic random walk, we would end up recommending the same pins to all users who share the query set Q. To generate personalized recommendations, the selection of the random edge is biased based on user-specific features. In simple terms, some edges are given more priority based on user interaction. This ensures that the random walk focuses on a set of pins that are more personalized to the user. The edge selection can change dynamically based on the user's recent interactions with pins.
Biasing the random walk. Thick edge: high priority; dashed edge: low priority
In practice, this idea helped Pinterest improve the personalization and quality of recommendations, which in turn increased user engagement. Instead of creating a different graph for each of their 200+ million users, they change edge priorities/weights during random edge selection for each user to generate personalized recommendations, which also saves a lot of memory. Refer to Algorithm 2 (image attached below, taken directly from the original paper) for details.
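One simple way to picture biased edge selection: instead of sampling uniformly from E(p), sample edges in proportion to user-specific priorities. This is a sketch of the idea, not the paper's exact edge-weighting scheme; the priority values and names are invented:

```python
import random

# A sketch of user-biased edge selection: edges are sampled in proportion
# to user-specific priorities instead of uniformly. Weights are invented.
def pick_biased_edge(edges, user_priority, rng):
    weights = [user_priority.get(e, 1.0) for e in edges]  # default weight 1.0
    return rng.choices(edges, weights=weights, k=1)[0]

boards = ["b_cricket", "b_cooking", "b_travel"]
user_priority = {"b_cricket": 10.0}   # this user mostly engages with cricket
rng = random.Random(0)
picks = [pick_biased_edge(boards, user_priority, rng) for _ in range(1000)]
# "b_cricket" dominates the sample for this user
```

Because the bias lives in the sampling step rather than in the graph itself, the same shared graph serves every user, which is where the memory savings come from.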
2. Multiple query pins each with different weights:
We cannot recommend to users based on just one interaction; we should consider the user's history to generate good recommendations. Instead of using one pin in the query set Q, we include multiple query pins based on the user's interaction history. Each pin q in Q is assigned a weight w_q based on how long ago the user interacted with it and the type of interaction. The random walk algorithm is run for each query pin q in Q, and for each candidate pin p we maintain a separate visit counter V_q[p] per query pin q.
Multiple query pins
Can we use the same walk length for all query pins? It would not be fair to use a constant walk length for every query pin q. For example, if a pin is very popular, it is used in many boards, so the number of edges connected to it is high. In that case, the walk length should be higher to obtain meaningful visit counts. So, the walk length should depend on the pin's degree (popularity).
We achieve this step distribution by allocating the number of steps according to a function that increases sub-linearly with the query pin degree: we scale the per-pin weights w_q by a scaling factor s_q.
Scaling factor: s_q = |E(q)| · (C − log |E(q)|), where |E(q)| is the degree of query pin q and C is the maximum pin degree in the graph.
Total number of steps for pin q: N_q = N · w_q · s_q / Σ_{r ∈ Q} w_r · s_r
The walk length N_q depends on recency (w_q) and degree/popularity (s_q): more steps are allocated to starting pins with high degrees. This is implemented in lines 1-3 of Algorithm 3.
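The allocation can be sketched as below, following the paper's formulation of s_q and N_q (the degrees, weights, and step budget here are invented numbers):

```python
import math

# A sketch of per-pin step allocation following the paper's formulation:
# s_q = |E(q)| * (C - log|E(q)|), with C the maximum pin degree, and
# N_q proportional to w_q * s_q. All numbers below are invented.
def allocate_steps(degrees, weights, total_steps, max_degree):
    s = {q: d * (max_degree - math.log(d)) for q, d in degrees.items()}
    z = sum(weights[q] * s[q] for q in s)
    return {q: int(total_steps * weights[q] * s[q] / z) for q in s}

degrees = {"q1": 10, "q2": 1000}     # q2 is far more popular than q1
weights = {"q1": 1.0, "q2": 1.0}     # equal recency weights for simplicity
steps = allocate_steps(degrees, weights, total_steps=100000, max_degree=10000)
# steps["q2"] > steps["q1"]: popular starting pins get more walk steps
```

Because s_q grows sub-linearly in the degree (the C − log|E(q)| factor shrinks as the degree rises), extremely popular pins get more steps, but not proportionally more.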
Pixie recommendations for multiple pins
3. Multi-hit booster that boosts pins that are related to multiple query pins:
If you remember, we maintain a separate visit counter per query pin q for each candidate pin p. How do we combine these counters? Rather than simply summing the visit counts V_q[p] for a given pin p over all query pins q ∈ Q, we reward pins that are visited multiple times from multiple different query pins q.
Formula for aggregating visit counts: V[p] = (Σ_{q ∈ Q} √V_q[p])²
For example, consider three query pins in the query set Q. For candidate pin p1, the visit counts are [16, 0, 0]; for candidate pin p2, they are [8, 8, 0]. If we simply sum the visit counts, the total is 16 for both candidate pins.
In Pixie, the algorithm gives more priority to candidate pin p2 than to p1, because p2 is visited from multiple query pins, which indicates that p2 is more related to the query set Q as a whole. Using the above formula, we end up with 16 for candidate pin p1 and 32 for candidate pin p2. This automatically gives higher priority to pins that are visited from multiple query pins q. This is implemented in line 5 of Algorithm 3.
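The booster is small enough to verify directly on the example above:

```python
import math

# The multi-hit booster: aggregate per-query visit counts as
# V[p] = (sum over q of sqrt(V_q[p]))^2, rewarding multi-query hits.
def boosted_count(visits_per_query):
    return sum(math.sqrt(v) for v in visits_per_query) ** 2

p1 = boosted_count([16, 0, 0])   # hit from a single query pin -> 16.0
p2 = boosted_count([8, 8, 0])    # hit from two query pins -> ~32.0
```

Even though both pins received 16 raw visits, the square of summed square roots doubles p2's score, so pins related to several query pins rise to the top.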
4. Early stopping that minimizes the number of steps of the random walk while maintaining the quality of the results:
We use N_q walk steps per query pin q to find similar pins for a query set Q. To generate real-time recommendations, Pixie's runtime is very important, so reducing the number of steps is a great win.
The solution is to terminate the walks once the set of top candidates becomes stable, i.e., it does not change much with more steps. We do this by maintaining two integers, n_p and n_v, and terminating the walk when at least n_p candidate pins have been visited at least n_v times. This is implemented in lines 10-13 of Algorithm 2.
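The stopping condition itself is a one-liner; a sketch of the check a walk loop would run periodically (the counts below are invented):

```python
# A sketch of the early-stopping check: terminate the walk once at least
# n_p candidate pins have each been visited at least n_v times.
def should_stop(visit_count, n_p, n_v):
    hits = sum(1 for v in visit_count.values() if v >= n_v)
    return hits >= n_p

visit_count = {"p1": 5, "p2": 3, "p3": 7}
early = should_stop(visit_count, n_p=2, n_v=4)   # True: p1 and p3 have >= 4 visits
late = should_stop(visit_count, n_p=3, n_v=4)    # False: only two pins qualify
```

In a real implementation this check would be evaluated every few steps rather than after every single one, so its cost stays negligible compared to the walk itself.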
Pixie Random Walk algorithm with early stopping
In practice, early stopping helped Pinterest achieve the same result quality in about half the number of steps, which is a great achievement: recommendations can be generated much more quickly.
These innovations on top of the basic random walk let Pinterest serve real-time recommendations while handling a huge number of users.
I hope you enjoyed reading the article. If you are interested in learning more about the algorithm, I highly recommend going through the original paper mentioned above. The paper also covers graph pruning, which saves memory by removing unnecessary nodes and edges from the graph, as well as implementation details (efficient graph data structures), recommendation quality, performance, and stability. After reading the paper, please let me know your thoughts in the comments section below.
Thank you so much for reading my blog and supporting me. Stay tuned for my next article. If you want to receive email updates, don’t forget to subscribe to my blog. If you have any queries, please do comment in the comment section below. I will be more than happy to help you. Keep learning and sharing!!
Follow me here:
GitHub: https://github.com/Abhishekmamidi123
LinkedIn: https://www.linkedin.com/in/abhishekmamidi/
Kaggle: https://www.kaggle.com/abhishekmamidi
If you are looking for any specific blog, please do comment in the comment section below.