Architecture and Security

Architecture

As a Snowflake native app, Kumo is distributed through the Snowflake Marketplace as a single package that bundles its required binaries, container images, and packages. Kumo runs multiple containers (e.g., UI, REST, workers, and trainer), which together let users connect to and process their data and build the ML models needed for predictive analytics.

Kumo’s architecture consists of two key layers:

Control Plane 

The control plane is a collection of services: the metadata manager, a compiler, and workflow orchestrators. The compiler translates predictive queries into an execution plan for the graph model and is also used to generate predictions. The workflow orchestrators coordinate activities across Kumo.

Predictive Query Engine

The Predictive Query Engine is a heterogeneous distributed system that processes predictive queries. It contains multiple components, each responsible for a specific function in the ML pipeline.

The Data Engine processes the input relational data using Snowpark APIs and generates the graph and training data that are subsequently used to build the GNN model. These outputs are materialized to an intermediate Snowflake stage owned by the client.

The Graph Engine loads the graphs and node attributes generated by the Data Engine into the Graph Store and the Column Store, respectively. Its primary role is to serve subgraph requests and node attributes for GNN training and batch inference.

  • The Graph Store contains all entities in the data warehouse (rows).

  • The Column Store contains all the attributes about the entities (columns).
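The split between the two stores can be illustrated with a minimal Python sketch. It is only a toy model of the idea described above, not Kumo's implementation: the `GraphStore` and `ColumnStore` classes, the `users` and `orders` tables, and all column names are hypothetical. Each row becomes a node, each foreign-key link becomes an edge, and attributes are looked up separately once a subgraph has been selected.

```python
from collections import defaultdict

# Hypothetical toy tables; table and column names are illustrative only.
users = [
    {"user_id": 1, "country": "US"},
    {"user_id": 2, "country": "DE"},
]
orders = [
    {"order_id": 10, "user_id": 1, "amount": 25.0},
    {"order_id": 11, "user_id": 1, "amount": 40.0},
    {"order_id": 12, "user_id": 2, "amount": 15.0},
]


class GraphStore:
    """Toy graph store: one node per row, one edge per foreign-key link."""

    def __init__(self):
        self.neighbors = defaultdict(set)

    def add_edge(self, a, b):
        # Edges are stored in both directions so traversal works either way.
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)

    def subgraph(self, seed, hops):
        """Return every node reachable from `seed` within `hops` links."""
        seen = {seed}
        frontier = {seed}
        for _ in range(hops):
            frontier = {n for f in frontier for n in self.neighbors[f]} - seen
            seen |= frontier
        return seen


class ColumnStore:
    """Toy column store: attribute values keyed by node."""

    def __init__(self):
        self.attributes = {}

    def put(self, node, attrs):
        self.attributes[node] = attrs

    def get(self, nodes):
        return {n: self.attributes[n] for n in nodes}


graph, columns = GraphStore(), ColumnStore()
for row in users:
    columns.put(("users", row["user_id"]), {"country": row["country"]})
for row in orders:
    node = ("orders", row["order_id"])
    columns.put(node, {"amount": row["amount"]})
    # The foreign key orders.user_id links each order to its user.
    graph.add_edge(node, ("users", row["user_id"]))

# Serve a one-hop subgraph request for user 1, then fetch its attributes.
nodes = graph.subgraph(("users", 1), hops=1)
features = columns.get(nodes)
```

Keeping structure and attributes apart means a subgraph request only touches the lightweight adjacency data, and attribute values are fetched only for the nodes that were actually selected.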

The GPU trainer is responsible for GNN training and batch inference. It uses the available GPU compute in Snowflake to build the core graph model.

Kumo’s GNN models learn from graph-structured data. For model execution they use PyTorch Geometric, the leading open-source framework for graph learning, which is built and maintained by Kumo members. Kumo leverages a variety of graph neural network architectures and training procedures designed specifically for learning on relational databases.
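To give a rough sense of what learning from graph-structured data means, the sketch below runs one round of mean-aggregation message passing in plain Python. It is a generic textbook illustration, not Kumo's architecture and not the PyTorch Geometric API; the function `mean_aggregate` and the node names are hypothetical, and a real GNN layer would also apply learned weights and a nonlinearity.

```python
def mean_aggregate(features, neighbors):
    """One message-passing round: each node averages its own feature
    vector with the feature vectors of its direct neighbors."""
    updated = {}
    for node, own in features.items():
        vectors = [own] + [features[n] for n in neighbors.get(node, [])]
        # zip(*vectors) groups the values of each feature dimension together.
        updated[node] = [sum(vals) / len(vectors) for vals in zip(*vectors)]
    return updated


# Hypothetical graph: one user row linked to two order rows.
features = {
    "user": [0.0, 1.0],
    "order_a": [2.0, 3.0],
    "order_b": [4.0, 5.0],
}
neighbors = {
    "user": ["order_a", "order_b"],
    "order_a": ["user"],
    "order_b": ["user"],
}

h1 = mean_aggregate(features, neighbors)
```

After one round, the `user` vector already reflects information from both linked orders; stacking further rounds lets information travel across more hops of the relational graph.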

Security

The native app for Snowflake supports Kumo's end-to-end machine learning (ML) platform. Kumo gives enterprises access to state-of-the-art predictive analytics: data scientists can tackle many prediction problems immediately by first registering their data sources and then issuing SQL-like predictive queries that specify their ML tasks. Kumo then executes each predictive query and automates the entire process of feature preparation, label engineering, training dataset creation, model optimization, and MLOps, making it easy for users to build multiple ML models.

With the Kumo native app for Snowflake:

  • There is no hardware (virtual or physical) to select, install, configure, or manage.
  • There is no software to install, configure, or manage.
  • Kumo handles ongoing maintenance, management, upgrades, and tuning.

The Kumo native app for Snowflake also provides security benefits for your organization, since all data resides within your Snowflake environment. Keep in mind, however, that setting up Kumo as a Snowflake native app requires assistance from Kumo during installation and the proof-of-concept (PoC) process.

For a simpler deployment mode that requires less maintenance overhead, please see Kumo's SaaS version.