Architecture and Security
Architecture
As a Snowflake native app, Kumo ships with its required binaries, container images, and packages as a single package in the Snowflake Marketplace. Kumo runs multiple containers (e.g., UI, REST, workers, and trainer) that let users connect to and process data and build the ML models needed for predictive analytics.
Kumo’s architecture consists of two key layers:
Control Plane
The control plane is a collection of services: a metadata manager, a compiler that translates predictive queries into an execution plan for the graph model and is also used when generating predictions, and workflow orchestrators that coordinate activities across Kumo.
Predictive Query Engine
The Predictive Query Engine is a heterogeneous distributed system that processes predictive queries. It contains multiple components, each responsible for a specific function in the ML pipeline.
The Data Engine processes the input relational data using Snowpark APIs and generates the graph and training data that are subsequently used to build the graph neural network (GNN) model. These outputs are materialized to an intermediate Snowflake stage owned by the client.
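To make the relational-to-graph step concrete, here is a minimal sketch in plain Python. It is not Kumo's Data Engine and does not use Snowpark; the table contents, column names, and the `build_edges` helper are all hypothetical. It only illustrates the general idea that rows can become nodes and foreign-key references can become edges.

```python
# Hypothetical toy tables (illustrative only): each row is a dict, and
# "customer_id" in `orders` is assumed to be a foreign key into `customers`.
customers = [
    {"customer_id": 1, "region": "EU"},
    {"customer_id": 2, "region": "US"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 25.0},
    {"order_id": 11, "customer_id": 1, "amount": 40.0},
    {"order_id": 12, "customer_id": 2, "amount": 15.0},
]


def build_edges(child_rows, child_key, fk_column):
    """Return one (child_id, parent_id) edge per non-null foreign-key reference."""
    return [
        (row[child_key], row[fk_column])
        for row in child_rows
        if row.get(fk_column) is not None
    ]


# Each order row links to the customer row it references.
edges = build_edges(orders, "order_id", "customer_id")
print(edges)
```

In this sketch every row is a node and every foreign-key reference is an edge; a production system would also need to handle typing of nodes, timestamps, and scale.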
The Graph Engine loads the graphs and node attributes generated by the Data Engine into the Graph Store and Column Store, respectively. Its primary role is to serve subgraph requests and node attributes for GNN training and batch inference.
- The Graph Store contains all entities in the data warehouse (rows).
- The Column Store contains all the attributes about the entities (columns).
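The split between the two stores can be sketched as follows. This is an illustrative toy under stated assumptions, not Kumo's implementation: the `GraphStore` and `ColumnStore` classes, their method names, and the node identifiers are hypothetical. The graph store answers "which nodes are near this seed?", and the column store answers "what are the attributes of these nodes?".

```python
from collections import defaultdict


class GraphStore:
    """Hypothetical in-memory adjacency structure over entity (row) identifiers."""

    def __init__(self, edges):
        self.neighbors = defaultdict(set)
        for src, dst in edges:
            # Treat edges as undirected for this sketch.
            self.neighbors[src].add(dst)
            self.neighbors[dst].add(src)

    def subgraph(self, seed, num_hops):
        """Return the set of nodes within `num_hops` of `seed` (breadth-first)."""
        visited = {seed}
        frontier = {seed}
        for _ in range(num_hops):
            next_frontier = set()
            for node in frontier:
                next_frontier |= self.neighbors.get(node, set()) - visited
            visited |= next_frontier
            frontier = next_frontier
        return visited


class ColumnStore:
    """Hypothetical column-oriented attribute lookup: column name -> {node: value}."""

    def __init__(self, columns):
        self.columns = columns

    def attributes(self, nodes, names):
        """Return the requested columns for each node; missing values are None."""
        return {
            node: {name: self.columns[name].get(node) for name in names}
            for node in nodes
        }


edges = [
    ("order:10", "customer:1"),
    ("order:11", "customer:1"),
    ("order:12", "customer:2"),
]
graph = GraphStore(edges)
columns = ColumnStore(
    {"amount": {"order:10": 25.0, "order:11": 40.0, "order:12": 15.0}}
)

# Serve a one-hop subgraph request, then fetch attributes for its nodes.
nodes = graph.subgraph("customer:1", num_hops=1)
attrs = columns.attributes(sorted(nodes), ["amount"])
```

Keeping structure and attributes in separate stores means a subgraph request only touches adjacency data, and attribute columns are fetched only for the nodes that were actually sampled.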
The GPU trainer is responsible for GNN training and batch inference. It uses the available GPU compute in Snowflake to build the core graph model.
Kumo’s GNN models learn from graph-structured data and run on PyTorch Geometric, a leading open-source framework built and maintained by Kumo members. Kumo leverages a variety of graph neural network architectures and training procedures specifically designed for learning on relational databases.
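As a rough intuition for how a GNN learns from graph-structured data, the sketch below shows one round of neighbor aggregation in plain Python. It deliberately does not use PyTorch Geometric and is not one of Kumo's architectures: the `mean_aggregate` function, the fixed 0.5 mixing weights, and the toy graph are assumptions made for illustration. A real GNN layer would use learned weights and vector-valued features.

```python
def mean_aggregate(features, neighbors):
    """One simplified message-passing step.

    Each node's new value mixes its own value with the mean of its
    neighbors' values, using fixed (not learned) weights of 0.5 each.
    """
    updated = {}
    for node, value in features.items():
        nbr_values = [features[n] for n in neighbors.get(node, [])]
        nbr_mean = sum(nbr_values) / len(nbr_values) if nbr_values else 0.0
        updated[node] = 0.5 * value + 0.5 * nbr_mean
    return updated


# Toy graph: node "a" is connected to "b" and "c".
features = {"a": 1.0, "b": 3.0, "c": 5.0}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}

updated = mean_aggregate(features, neighbors)
print(updated)
```

Stacking several such steps lets information flow across multiple hops, which is why the trainer needs multi-hop subgraphs and node attributes from the Graph Engine.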
Security
The Snowflake native app supports Kumo's end-to-end machine learning (ML) platform. Kumo enables enterprises to apply state-of-the-art predictive analytics: data scientists can tackle many prediction problems by first registering data sources and then issuing SQL-like predictive queries that specify their ML tasks. Kumo then executes each predictive query and automates the entire process of feature preparation, label engineering, training dataset creation, model optimization, and MLOps, making it easy for users to build multiple ML models.
The Kumo native app for Snowflake also provides security benefits for your organization, since all data resides within your Snowflake environment.

