Hybrid Graph Neural Networks
Why Recommendation Systems are Better Off Using Hybrid Graph Neural Networks
Hybrid GNNs: Transforming Recommendation Systems with Kumo AI
Recommendation systems have been a subject of great innovation over the past few decades, starting with matrix factorization in the early 2000s and evolving into two-tower and other deep learning approaches in the 2010s. More recently, graph neural networks (GNNs) have emerged as the leading approach to power recommender systems, enabling many well-known tech companies (such as Pinterest, Uber, Amazon, and others) to deliver magical customer experiences and double-digit lifts in business metrics. Kumo offers a robust architecture known as the hybrid graph neural network (hybrid GNN), which has been empirically shown to deliver outstanding performance on both public Kaggle data science challenges and real-world production deployments.
Understanding the complexity of recommendation systems
At their core, recommendation systems are responsible for recommending content or products to users, with the goal of providing inspiration to users. However, developing these systems is inherently challenging, due to fickle preferences and complex patterns of human behavior. Users vary significantly in preferences; some are explorers who constantly seek new experiences, while others are repeaters who prefer familiarity. This is made even more challenging in the face of big data, cold-start items, new users with very little interaction history, and lack of data diversity.
Because this problem is so challenging, many engineering teams have developed multi-stage recommendation pipelines involving numerous candidate generation steps followed by a complex ensemble of ranking models. These systems take tens of millions of dollars to build, in terms of human and infrastructure cost, and are challenging to maintain over time.
Introducing the hybrid GNN architecture
To address these challenges, Kumo developed a hybrid GNN approach that can create great recommendations with a single model, while capturing the nuanced behaviors of different users with remarkable accuracy. The hybrid GNN models two distinct user behaviors differently within a single backbone GNN model: repeated interactions and explorative interactions. This dual treatment is why the model is called a hybrid GNN. The hybrid GNN is the default model architecture for recommendation and personalization tasks at Kumo, and it can be fine-tuned to your specific dataset using the Model Planner.
Why GNNs are ideal for recommendations
GNNs are well suited to recommendation tasks because, unlike traditional models, they can leverage rich graph connectivity patterns to gain a deeper understanding of user preferences and surface insights that are often missed by other algorithms.
The recommendation problem forms a bipartite graph between users and items, where nodes represent the users and items, and edges represent the user-item interactions. Edges often come with timestamps. Moreover, multiple edges may exist between pairs of nodes, since a user may repeatedly interact with the same item (e.g., repeat ordering of the same product in e-commerce). Given the bipartite graph of the past interactions, a recommendation task can be cast as a link prediction task—one that calls for predicting future interactions between user nodes and item nodes.
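The graph formulation above can be illustrated with a minimal sketch. The interaction log, the function names (`build_bipartite_graph`, `temporal_split`), and the cutoff value are hypothetical and chosen only for illustration; they are not part of Kumo's API. Repeated user-item pairs are kept as separate timestamped edges, and edges after the cutoff become the links to predict.

```python
from collections import defaultdict

# Hypothetical interaction log: (user_id, item_id, timestamp).
interactions = [
    ("u1", "i1", 1), ("u1", "i1", 5), ("u1", "i2", 7),
    ("u2", "i2", 3), ("u2", "i3", 9),
]

def build_bipartite_graph(edges):
    """Map each user to a time-ordered list of (item, timestamp) edges.

    Repeated (user, item) pairs are kept as separate edges (a multigraph).
    """
    graph = defaultdict(list)
    for user, item, ts in edges:
        graph[user].append((item, ts))
    for user in graph:
        graph[user].sort(key=lambda edge: edge[1])
    return dict(graph)

def temporal_split(edges, cutoff):
    """Edges before the cutoff form the input graph; later edges are prediction targets."""
    past = [e for e in edges if e[2] < cutoff]
    future = [e for e in edges if e[2] >= cutoff]
    return past, future

past, future = temporal_split(interactions, cutoff=6)
graph = build_bipartite_graph(past)
# Link prediction targets: the (user, item) pairs that occur after the cutoff.
targets = {(user, item) for user, item, _ in future}
```

Here `u1` has two separate edges to `i1` in the input graph, which is the repeat-interaction signal a multigraph preserves.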
Model input: Processing data with the hybrid GNN
The hybrid GNN model is designed to capture fine-grained user behaviors by leveraging graph connectivity. Similar to a standard GNN model (e.g., GraphSAGE), the hybrid GNN model processes input through a subgraph centered around each user node. For simplicity and efficiency, consider a 1-hop neighbor sampler.
A 1-hop neighbor subgraph contains the items that a user previously interacted with, as well as features associated with the sampled users, items, and edges (e.g., timestamp, price, etc.). Given the subgraph, the hybrid GNN employs a heterogeneous GNN to compute embeddings of the user and the items.
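As a rough illustration of 1-hop sampling, the sketch below collects a user's past edges up to a prediction time. The function name `sample_one_hop`, the `max_neighbors` cap, and the choice to keep the most recent edges are assumptions for this example; the source does not describe the sampling strategy Kumo actually uses.

```python
def sample_one_hop(graph, user, as_of, max_neighbors=3):
    """Return up to `max_neighbors` of the user's most recent edges before `as_of`.

    `graph` maps user -> list of (item, timestamp) edges. Only edges strictly
    earlier than `as_of` are eligible, so no future information leaks in.
    """
    history = [(item, ts) for item, ts in graph.get(user, []) if ts < as_of]
    history.sort(key=lambda edge: edge[1], reverse=True)  # most recent first
    return history[:max_neighbors]

# Hypothetical graph for one user.
graph = {"u1": [("i1", 1), ("i1", 5), ("i2", 7), ("i4", 8), ("i5", 12)]}

subgraph = sample_one_hop(graph, "u1", as_of=10, max_neighbors=3)
sampled_items = {item for item, _ in subgraph}
```

In this toy case the edge to `i5` is excluded because it occurs after the prediction time, and the oldest eligible edge is dropped by the cap.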
Exploring the hybrid GNN model architecture
The key innovation of the hybrid GNN is its hybrid approach to computing item scores per user, which are then sorted to produce the top-K item recommendations for the user. Specifically, the hybrid GNN computes item scores differently based on whether or not items are sampled within the subgraph.
These differing scoring approaches are as follows:
(1) For items sampled in the subgraph, the hybrid GNN computes item scores by applying a multi-layer perceptron (MLP) to the GNN's item embeddings. Because a GNN item embedding contains information about historical interactions between the user and the item, the MLP can use it to predict whether the user will repeat the interaction with that item.
(2) For items not sampled in the subgraph, the hybrid GNN computes item scores by taking the inner product between the GNN's user embedding and shallow item embeddings. Because the low-dimensional item embeddings can capture similarity between items (à la matrix factorization), the hybrid GNN can use this method to recommend similar items that a user has never interacted with before.
The observation that user behaviors are diverse is the final element that makes the hybrid GNN work. Some users prefer to repeatedly interact with the same set of items (better captured by the first scoring approach), while others like to constantly explore new items (better captured by the second scoring approach). To accommodate this diversity across users, the hybrid GNN learns a user-specific repetition scalar, predicted from the GNN's user embedding by another MLP. The scalar is added to the scores of the first approach to capture how repetitive each user's behavior is. The more repetitive a user is, the higher the repetition scalar the hybrid GNN will predict. Conversely, for highly explorative users (i.e., those who frequently interact with new items), the hybrid GNN will predict a lower repetition scalar.
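The two scoring paths and the repetition scalar can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not Kumo's implementation: a single linear layer stands in for each MLP, the embeddings are made-up 2-dimensional vectors, and the names `linear`, `hybrid_scores`, and `top_k` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def linear(weights, bias):
    """Stand-in for an MLP: one linear layer mapping an embedding to a scalar."""
    return lambda emb: dot(weights, emb) + bias

def hybrid_scores(user_emb, gnn_item_embs, shallow_item_embs, repeat_mlp, scalar_mlp):
    """Score every item for one user.

    gnn_item_embs: item -> GNN embedding, only for items in the sampled subgraph.
    shallow_item_embs: item -> shallow (lookup-table) embedding, for all items.
    """
    repetition = scalar_mlp(user_emb)  # user-specific repetition scalar
    scores = {}
    for item, shallow_emb in shallow_item_embs.items():
        if item in gnn_item_embs:
            # Approach (1): MLP over the GNN item embedding, plus the scalar.
            scores[item] = repeat_mlp(gnn_item_embs[item]) + repetition
        else:
            # Approach (2): inner product with the shallow item embedding.
            scores[item] = dot(user_emb, shallow_emb)
    return scores

def top_k(scores, k):
    return sorted(scores, key=scores.get, reverse=True)[:k]

user_emb = [1.0, 0.5]
gnn_item_embs = {"i1": [0.2, 0.4]}  # i1 was sampled in the user's subgraph
shallow_item_embs = {"i1": [0.1, 0.1], "i2": [0.6, 0.2], "i3": [-0.3, 0.4]}
repeat_mlp = linear([1.0, 1.0], 0.0)

# A "repeater": the scalar MLP outputs a positive value for this user.
repeater = hybrid_scores(user_emb, gnn_item_embs, shallow_item_embs,
                         repeat_mlp, linear([0.5, 0.0], 0.0))
# An "explorer": the scalar MLP outputs a negative value for the same user.
explorer = hybrid_scores(user_emb, gnn_item_embs, shallow_item_embs,
                         repeat_mlp, linear([-1.0, 0.0], 0.0))
```

With these toy numbers, the previously seen item `i1` ranks first for the repeater and drops out of the top 2 for the explorer, while the scores of unseen items are unchanged.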
Training and optimization: Maximizing performance with the hybrid GNN
The hybrid GNN is trained end-to-end, jointly optimizing both types of item scores and the repetition scalar to maximize predictive performance on future user-item interactions. This way, the hybrid GNN learns user behaviors from the data on its own, producing highly accurate predictions that capture the complex trade-off between repetition and exploration.
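The source does not state which loss function is used. As one plausible illustration only, the sketch below assumes a softmax cross-entropy over all item scores for a user; because scores from both approaches enter the same normalization, a single loss value depends on both scoring paths and on the repetition scalar. The function name and the example scores are hypothetical.

```python
import math

def softmax_cross_entropy(scores, positive_item):
    """Negative log-probability of the positive item under a softmax over all scores."""
    max_score = max(scores.values())  # subtract the max for numerical stability
    log_norm = max_score + math.log(
        sum(math.exp(s - max_score) for s in scores.values())
    )
    return log_norm - scores[positive_item]

# Hypothetical combined scores for one user: i1 comes from approach (1),
# i2 and i3 from approach (2).
scores = {"i1": 1.1, "i2": 0.7, "i3": -0.1}

loss_if_repeat = softmax_cross_entropy(scores, "i1")   # user re-buys i1
loss_if_explore = softmax_cross_entropy(scores, "i3")  # user buys unseen i3
```

The loss is lower when the observed future interaction is the item the model ranked highest, which is the signal that training would push toward.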
Empirical studies: Assessing hybrid GNN performance
The hybrid GNN's performance was tested on the Kaggle H&M recommendation challenge. The challenge called for predicting the top 12 items each user would purchase in the next 7 days, with model performance measured by mean average precision at 12 (MAP@12). The dataset contains two years of historical data consisting of 1.4M users, 106K items, and 31.7M interactions between them. The challenge attracted 3,000+ teams that submitted results to the public Kaggle leaderboard over the course of the 3-month competition held in 2022.
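For readers unfamiliar with the metric, here is a minimal sketch of MAP@12 using the common definition of average precision at k. The function names are hypothetical, and the handling of edge cases (users with no purchases score 0, duplicate predictions are counted once) is an assumption that may differ from Kaggle's official scorer.

```python
def average_precision_at_k(actual, predicted, k=12):
    """Average precision at k for one user.

    actual: items the user really purchased; predicted: ranked recommendations.
    """
    actual_set = set(actual)
    if not actual_set:
        return 0.0
    hits = 0
    score = 0.0
    seen = set()
    for rank, item in enumerate(predicted[:k], start=1):
        if item in actual_set and item not in seen:
            hits += 1
            score += hits / rank  # precision at this rank
        seen.add(item)
    return score / min(len(actual_set), k)

def mean_average_precision_at_k(actuals, predictions, k=12):
    """Mean of per-user average precision at k."""
    per_user = [average_precision_at_k(a, p, k) for a, p in zip(actuals, predictions)]
    return sum(per_user) / len(per_user)

# Two hypothetical users.
actuals = [["a", "b"], ["c"]]
predictions = [["a", "x", "b"], ["x", "c"]]
map12 = mean_average_precision_at_k(actuals, predictions, k=12)
```

For the first user the hits at ranks 1 and 3 give (1/1 + 2/3) / 2, and for the second the single hit at rank 2 gives 1/2; the mean of the two is the MAP@12 for this toy set.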
Comparing hybrid GNN results to top Kaggle competitors
The following are the results of the hybrid GNN, evaluated on the hidden test set after the competition (Kaggle allows post-competition submissions). A comparison of the hybrid GNN results to the top Kaggle competitor submissions is also provided below:
Model | MAP@12 score on Kaggle public leaderboard |
Hybrid GNN | 0.031 |
Kaggle top 10% | 0.024 |
Kaggle Median | 0.021 |
The hybrid GNN placed in the top 1% of all submissions in the Kaggle H&M recommendation challenge, with a score 47% better than the median (roughly what an average data scientist could achieve with traditional techniques). Using Kumo, the entire hybrid GNN training and prediction run took approximately two hours on a single GPU, without any feature engineering. In contrast, the leading Kaggle competitors used complex model ensembling techniques and feature engineering code that would require months to develop and maintain in production.
Ablation studies
To confirm that the hybrid GNN produces better results than a traditional GNN approach, Kumo ran an ablation study in which only one ranking technique was used at prediction time. The hybrid GNN was more than 100% better than the “inner product” approach, which is the standard approach used by two-tower recommendation models.
Model | MAP@12 | Hybrid GNN improvement (hybrid GNN MAP@12 = 0.031) |
Approach (1) - Use MLP to score items in the sampled GNN subgraph. | 0.023 | 35% better |
Approach (2) - Inner product between user embedding and item embeddings. | 0.015 | 107% better |
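The improvement percentages in the ablation table follow from a simple relative-lift calculation against the hybrid GNN's MAP@12 of 0.031. The short sketch below reproduces them from the rounded table values; the helper name `relative_lift` is hypothetical.

```python
def relative_lift(score, baseline):
    """Percentage by which `score` exceeds `baseline`."""
    return (score / baseline - 1.0) * 100.0

hybrid_map12 = 0.031
ablations = {"Approach (1)": 0.023, "Approach (2)": 0.015}

# Rounded to whole percentages, as in the table.
lifts = {name: round(relative_lift(hybrid_map12, baseline))
         for name, baseline in ablations.items()}
```

Because the inputs are themselves rounded to three decimals, lifts computed this way can differ by about a percentage point from figures derived from unrounded scores.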
Real-world applications: Deploying the hybrid GNN in production
Kumo has deployed the hybrid GNN model architecture to production in many enterprises, resulting in significant boosts in model performance compared to internal baselines, as well as improvements in revenue and customer experience. The following illustrates Kumo’s recommendation performance at a large-scale local food delivery service, where the task is to recommend the restaurants that each customer is most likely to order from in the next 7 days (out of 600K+ restaurants). The Kumo recommendations powered by the hybrid GNN architecture generated over $100 million in additional sales for the food delivery company.
Model | MAP@12 score |
Kumo hybrid GNN | 0.32 |
Approach (1) | 0.31 |
Approach (2) | 0.27 |
Conclusion: The power of the hybrid GNN in recommendation systems
The hybrid GNN model is a testament to the power of innovative machine learning techniques in handling the intricacies of recommendation systems. By simplifying the recommendation process into a single model that adapts to various user behaviors, it not only enhances user satisfaction but also provides a scalable, efficient solution for businesses aiming to personalize their services. As enterprises continue to seek out technologies that can deliver precise recommendations in real time, the hybrid GNN stands out as a beacon of innovation and performance in the data science community and beyond.