Installing the Snowflake Native App

Overview

In just four easy steps, you can install Kumo as a native Snowflake app to generate high-quality predictions, while keeping your data safely inside your Snowflake environment.

  1. Sign up for a free trial
  2. Install Kumo from the Snowflake Marketplace
  3. Create databases, roles, network rules, and privileges for Kumo
  4. Create a compute pool and launch the Kumo container

1. Sign up for a free trial

Because the Kumo Snowflake Native App is in private preview, a Kumo team member must enable it for your Snowflake account ID. Once this is done, you can install the Kumo app from the Snowflake Marketplace.

To request a free trial, navigate to Kumo's Snowflake Marketplace Listing. Click the Get button, and then click Request.

2. Install Kumo from the Snowflake Marketplace

Prerequisites

Before installing and configuring Kumo as a native Snowflake application, confirm that your Snowflake account meets the following currently supported settings:

  • Private link is not enabled.
  • If SSO is configured, it uses SAML2 rather than legacy SAML (see Migrating to SAML2).
  • A "service account" user is provisioned with key-pair authentication or basic auth, and has access to the Snowflake tables that you wish to load into Kumo.
  • The Snowflake account is on AWS.

Install the Kumo Native App

Note that the following steps can only be performed with the ACCOUNTADMIN role or by a user or role with the appropriate privileges.

  1. You should see Kumo shared with you when navigating to “Data Products” → “Apps” in Snowsight. Click the Get button.

  2. Choose a warehouse for downloading the app and click the Get button in the next prompt. This warehouse is used only to install the app and can be changed later.

  3. The app will start installing. Installation takes 2 to 5 minutes to complete.

  4. Once the installation is complete, click the Done button and set up Kumo using the instructions in the next section.


3. Setting up the Kumo Native app (run by an administrator)

Step 1: Create required objects and roles

The following SQL script creates the databases, roles, network rules, and privileges required to set up Kumo in your account. The script switches to the ACCOUNTADMIN role, so it must be executed by a user who has been granted that role.

  1. You are required to provide the warehouse name. Note that ACCOUNTADMIN must have USAGE privilege on the warehouse provided.
  2. Set the name of the Kumo Native app if a different name was used when the app was installed from the marketplace. The app name defaults to KUMO.
-----------------------------
---  USER INPUT REQUIRED  ---
-----------------------------
SET USER_WAREHOUSE = '<ADD WAREHOUSE TO USE>';

-- (Optional) Modify the following if a different name was used to install the app.
SET KUMOAPP_NAME = 'KUMO';
SET KUMOAPP_DB = 'KUMO_APP_DB';

SET KUMOAPP_USR = CONCAT(($KUMOAPP_NAME),'.APP_USER');

USE ROLE ACCOUNTADMIN;
USE WAREHOUSE IDENTIFIER($USER_WAREHOUSE);

-- Grant BIND SERVICE ENDPOINT privilege to the Kumo application to enable 
-- network ingress and access to the Kumo UI for users of Kumo in your account.
-- Details of the privilege can be found here
-- https://other-docs.snowflake.com/LIMITEDACCESS/native-apps/na-spcs-consumer#set-up-access-to-network-objects
GRANT BIND SERVICE ENDPOINT ON ACCOUNT TO APPLICATION IDENTIFIER($KUMOAPP_NAME);

-- Share events from the application with the provider, KUMO.AI
ALTER APPLICATION IDENTIFIER($KUMOAPP_NAME) SET AUTHORIZE_TELEMETRY_EVENT_SHARING=true;

-- Database and schema to hold network rules for the Kumo application along
-- with usage for the Kumo application.
CREATE DATABASE IF NOT EXISTS IDENTIFIER($KUMOAPP_DB);
USE DATABASE IDENTIFIER($KUMOAPP_DB);
CREATE SCHEMA IF NOT EXISTS KUMO_SCHEMA;
USE SCHEMA KUMO_SCHEMA;

---------------------
-- EGRESS RULES FOR KUMO
---------------------

-- Network rule to allow access to Temporal Cloud.
CREATE OR REPLACE NETWORK RULE kumo_temporal_egress
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('preprod.put0h.tmprl.cloud', 'preprod.put0h.tmprl.cloud:7233');

-- Access to AWS resources used by Kumo
CREATE OR REPLACE NETWORK RULE kumo_aws_egress
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = (
    -- AWS Secrets Manager contains secrets Kumo uses to access other AWS services.
    -- No customer data or secrets are written to this.
    'secretsmanager.us-west-2.amazonaws.com', 
    -- The following bucket contains library dependencies used by Kumo. The Kumo Native app downloads
    -- these dependencies on startup and does not write any data to this bucket.
    'kumo-pyspark-venv.s3.amazonaws.com',
    'kumo-pyspark-venv.s3.us-west-2.amazonaws.com'
);
  
-- Network rule to allow access to Mixpanel for logging product usage stats. While optional,
-- enabling Mixpanel allows Kumo's customer success team to provide better support to
-- users by counting which pages were visited in the UI and how many errors users encountered.
-- No Snowflake data, metadata (e.g. predictive queries, table names, etc.), or error logs are
-- written to Mixpanel from the Kumo Native app.
CREATE OR REPLACE NETWORK RULE kumo_mixpanel
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('api.mixpanel.com');

-- Create network rules to allow access to customer's Snowflake account from Kumo running as a Native app.
-- Even when running as a Native app, the Snowflake connectors used by Kumo to access objects in customer's
-- account require using certain hostnames and these hostnames will need to be added to the firewall for access.
-- See https://docs.snowflake.com/en/sql-reference/functions/system_allowlist for more details.
CREATE OR REPLACE PROCEDURE CREATE_SNOWFLAKE_ALLOW_LIST()
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-snowpark-python')
HANDLER = 'main'
AS
$$
import json

def main(session):
    """
    Method to create network egress rules for SYSTEM$ALLOWLIST in the Snowflake account.
    """
    # get allow list
    allow_list = session.sql("SELECT SYSTEM$ALLOWLIST();").collect()[0].as_dict()

    allow_list_hosts = []
    if len(allow_list.values()) > 0:
        data = json.loads(list(allow_list.values())[0])

        # Iterate over each object in the array
        for entry in data:
            # Extract host and port from each object; entries missing either field are skipped
            if 'host' in entry and 'port' in entry:
                host = entry['host']
                port = entry['port']
                allow_list_hosts.append(f"'{host}:{port}'")
    
    session.sql(
        f"""CREATE OR REPLACE NETWORK RULE snowflake_allow_list
                MODE = EGRESS
                TYPE = HOST_PORT
                VALUE_LIST = ({', '.join(allow_list_hosts)})"""
    ).collect()

    return "Successfully created network rule snowflake_allow_list"
$$;

-- Call the procedure above to create a network rule to access Snowflake endpoints.
CALL CREATE_SNOWFLAKE_ALLOW_LIST();

-- Create an External access integration with the allowed network rules.
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION KUMO_EXTERNAL_ACCESS_INTEGRATION
    ALLOWED_NETWORK_RULES = (kumo_temporal_egress, kumo_aws_egress, kumo_mixpanel, snowflake_allow_list)
    ENABLED = true;

-- Grant usage on the KUMO_EXTERNAL_ACCESS_INTEGRATION to the Kumo application.
GRANT USAGE ON INTEGRATION KUMO_EXTERNAL_ACCESS_INTEGRATION TO APPLICATION IDENTIFIER($KUMOAPP_NAME);

------------------------

-- Create KUMO_USER_ROLE and grant necessary privileges.
CREATE ROLE IF NOT EXISTS KUMO_USER_ROLE;

-- Grant the Application role to KUMO_USER_ROLE.
GRANT APPLICATION ROLE IDENTIFIER($KUMOAPP_USR) TO ROLE KUMO_USER_ROLE;

-- Grant USAGE on the database, schema and warehouse to KUMO_USER_ROLE.
GRANT USAGE ON DATABASE IDENTIFIER($KUMOAPP_DB) TO ROLE KUMO_USER_ROLE;
GRANT USAGE ON ALL SCHEMAS IN DATABASE IDENTIFIER($KUMOAPP_DB) TO ROLE KUMO_USER_ROLE;
GRANT USAGE ON WAREHOUSE IDENTIFIER($USER_WAREHOUSE) TO ROLE KUMO_USER_ROLE;

-- Grant the CREATE COMPUTE POOL privilege to KUMO_USER_ROLE.
-- A compute pool is required to start the Kumo application.
GRANT CREATE COMPUTE POOL ON ACCOUNT TO ROLE KUMO_USER_ROLE;
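
To make the CREATE_SNOWFLAKE_ALLOW_LIST procedure in the script above easier to follow, here is a minimal standalone sketch of the same transformation in plain Python: it turns SYSTEM$ALLOWLIST-style JSON into the 'host:port' entries of a HOST_PORT network rule. The function names and the sample hostnames are illustrative assumptions, not part of the Kumo script, and the sample JSON only imitates the shape of the real function's output. Unlike the procedure, this sketch raises an error when no usable entries are found; that is a design choice of the sketch.

```python
import json
from typing import List


def allowlist_to_host_ports(allowlist_json: str) -> List[str]:
    """Convert SYSTEM$ALLOWLIST-style JSON into 'host:port' strings."""
    entries = json.loads(allowlist_json)
    host_ports = []
    for entry in entries:
        # Skip entries that lack either field, as the stored procedure does.
        if "host" in entry and "port" in entry:
            host_ports.append(f"{entry['host']}:{entry['port']}")
    return host_ports


def build_network_rule_sql(rule_name: str, host_ports: List[str]) -> str:
    """Build the CREATE NETWORK RULE statement for the given entries."""
    if not host_ports:
        # Assumption of this sketch: fail early instead of emitting an empty list.
        raise ValueError("no host:port entries to add to the network rule")
    values = ", ".join(f"'{hp}'" for hp in host_ports)
    return (
        f"CREATE OR REPLACE NETWORK RULE {rule_name} "
        f"MODE = EGRESS TYPE = HOST_PORT VALUE_LIST = ({values})"
    )


# Made-up sample data; real output comes from SELECT SYSTEM$ALLOWLIST().
sample = json.dumps([
    {"type": "SNOWFLAKE_DEPLOYMENT", "host": "example.snowflakecomputing.com", "port": 443},
    {"type": "STAGE", "host": "example-stage.s3.amazonaws.com", "port": 443},
    {"type": "OTHER"},  # no host/port, so it is skipped
])
print(build_network_rule_sql("snowflake_allow_list", allowlist_to_host_ports(sample)))
```

Running the sketch locally against a copy of your own SYSTEM$ALLOWLIST output is one way to preview which hosts the network rule would contain before executing the setup script.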

Step 2: Grant user privilege to use Kumo

The Snowflake administrator can provide access to the Kumo app to other Snowflake users with the following:

-----------------------------
---  USER INPUT REQUIRED  ---
-----------------------------
USE ROLE ACCOUNTADMIN;
-- Grant the KUMO_USER_ROLE to the default role of every user who is expected to use the Kumo application.
GRANT ROLE KUMO_USER_ROLE TO ROLE <DEFAULT_USER_ROLE>;

4. Launch and Use Kumo

Note: This step is intended to be done by the user of Kumo.

Step 1: Create a compute pool and launch Kumo

  1. Provide the warehouse to use, the Kumo app name, and the instance type, then create the compute pool for Kumo. The instance type must be one with a GPU.

    Note: For processing tables with more than 50 million rows, Kumo requires a “Large” Snowflake warehouse and a GPU_NV_M instance for optimal performance. Using smaller warehouses or instances (GPU_NV_S) can result in longer training and prediction times.

-----------------------------
---  USER INPUT REQUIRED  ---
-----------------------------
-- Note that KUMO_USER_ROLE should be granted usage on this warehouse.
SET USER_WAREHOUSE = '<ADD WAREHOUSE TO USE>';
SET KUMOAPP_NAME = 'KUMO';
SET INSTANCE_TYPE = 'GPU_NV_M'; -- use 'GPU_NV_S' only for tables with fewer than 50M rows.

USE ROLE KUMO_USER_ROLE;

CREATE COMPUTE POOL IF NOT EXISTS KUMO_COMPUTE_POOL
  FOR APPLICATION IDENTIFIER($KUMOAPP_NAME)
  min_nodes = 1
  max_nodes = 1
  instance_family = $INSTANCE_TYPE;

GRANT USAGE ON COMPUTE POOL KUMO_COMPUTE_POOL TO APPLICATION IDENTIFIER($KUMOAPP_NAME);
  2. The compute pool created above takes about 10 to 15 minutes to reach the IDLE or ACTIVE state. You can check the state of the compute pool by running the following command:
DESCRIBE COMPUTE POOL KUMO_COMPUTE_POOL;
  3. Start Kumo once the compute pool is in the IDLE or ACTIVE state (see step 2 above for how to check the state of the compute pool).
USE WAREHOUSE IDENTIFIER($USER_WAREHOUSE);
SET KUMOAPP_COMPUTE = CONCAT(($KUMOAPP_NAME),'.KUMO_APP_SCHEMA.START_APP');
CALL IDENTIFIER($KUMOAPP_COMPUTE)('KUMO_COMPUTE_POOL', 'USER_SCHEMA', $INSTANCE_TYPE);

The above command runs until all the containers are in the READY state, which can take up to 20 minutes, and then returns the URL for accessing Kumo. DO NOT abort this command, as doing so might leave the app in an inconsistent state.

  4. Navigate to the URL returned by the above command in your preferred browser. Log in using your Snowflake credentials and you can start using Kumo!
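
If you prefer to script the wait for the compute pool rather than re-running DESCRIBE COMPUTE POOL by hand, the check can be wrapped in a small polling loop. The following is a minimal sketch, not part of the Kumo tooling: the function name, the timeout and interval defaults, and the fetch_state callback are all assumptions. In practice fetch_state would read the state reported by DESCRIBE COMPUTE POOL KUMO_COMPUTE_POOL through whichever Snowflake client you use; here it is replaced by a stand-in so the example runs on its own.

```python
import time
from typing import Callable

# States in which the guide says Kumo can be started.
READY_STATES = {"IDLE", "ACTIVE"}


def wait_for_compute_pool(
    fetch_state: Callable[[], str],
    timeout_s: float = 20 * 60,
    poll_interval_s: float = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the pool reports IDLE or ACTIVE, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = fetch_state().upper()
        if state in READY_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"compute pool still in state {state!r}")
        sleep(poll_interval_s)


# Stand-in for a real lookup of the compute pool state (hypothetical values).
fake_states = iter(["STARTING", "STARTING", "IDLE"])
print(wait_for_compute_pool(lambda: next(fake_states), sleep=lambda s: None))
```

Injecting the sleep function keeps the loop easy to test without real delays; with a real client you would leave the default time.sleep in place.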

Step 2: Using Kumo

For detailed guidelines on how to use Kumo, visit the Kumo Quick Start Guide.

Step 3 (optional): Shutting down Kumo

  1. Once you are done using Kumo, you can stop the Kumo application using the following command:
SET KUMOAPP_STOP = CONCAT(($KUMOAPP_NAME),'.KUMO_APP_SCHEMA.SHUTDOWN_APP');
CALL IDENTIFIER($KUMOAPP_STOP)('USER_SCHEMA');
  2. Snowflake automatically suspends the compute pool when idle. To remove the compute pool entirely, run the following command:
DROP COMPUTE POOL IF EXISTS KUMO_COMPUTE_POOL;

Appendix

Deleting the Kumo Native app

You can reset your account and remove all objects associated with Kumo using the following script. Note that this will permanently delete all metadata, data and models trained with the Kumo Native app and this cannot be reverted. Please reach out to Kumo support if you have any questions regarding the deletion of your Kumo app.

-- Description: This script is used to reset the account to the initial state before the Kumo app setup.

-- NOTE: All the objects created for the Kumo app will be deleted.
--       Any models trained or data stored in the Kumo app will be lost.

USE ROLE ACCOUNTADMIN;
-----------------------------
---  USER INPUT REQUIRED  ---
-----------------------------

SET USER_WAREHOUSE = '<ADD WAREHOUSE TO USE>';
SET KUMOAPP_NAME = 'KUMO';
SET KUMOAPP_DB = 'KUMO_APP_DB';

USE WAREHOUSE IDENTIFIER($USER_WAREHOUSE);

-- Delete the Kumo application and all associated objects.
DROP APPLICATION IF EXISTS IDENTIFIER($KUMOAPP_NAME) CASCADE;

-- Drop the role created for the user of Kumo app.
DROP ROLE IF EXISTS KUMO_USER_ROLE;

-- Drop the external access integration created for the Kumo app.
DROP EXTERNAL ACCESS INTEGRATION IF EXISTS KUMO_EXTERNAL_ACCESS_INTEGRATION;

USE DATABASE IDENTIFIER($KUMOAPP_DB);

CREATE OR REPLACE PROCEDURE DROP_ALL_NETWORK_RULES(db_name varchar)
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-snowpark-python')
HANDLER = 'main'
AS
$$

def main(session, db_name):
    """
    Method to drop all network egress rules created for Kumo.
    """
    # Check that the database exists before listing its network rules.
    db_names = session.sql(f"SHOW DATABASES like '{db_name}'").collect()
    if len(db_names) == 0:
        return f"No Database found with name {db_name}. No rules to delete"

    rows = session.sql(f"SHOW NETWORK RULES IN DATABASE {db_name}").collect()
    dropped_rules = []
    for row in rows:
        row_dict = row.as_dict()
        if 'name' in row_dict:
            rule = f"{db_name}.{row_dict['schema_name']}.{row_dict['name']}"
            session.sql(f"DROP NETWORK RULE IF EXISTS {rule}").collect()
            dropped_rules.append(rule)
    return f"Successfully deleted {len(dropped_rules)} network rules in Database {db_name}: {', '.join(dropped_rules)}"
$$;

CALL DROP_ALL_NETWORK_RULES($KUMOAPP_DB);

-- Drop Database created for the Kumo app
DROP DATABASE IF EXISTS IDENTIFIER($KUMOAPP_DB);
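
The DROP_ALL_NETWORK_RULES procedure above assembles each DROP statement by interpolating names directly into SQL. As a hedged illustration of a more defensive variant, the sketch below wraps each name part in double quotes, doubling any embedded double quote, before joining them. The helper names are hypothetical and not part of the Kumo script. Note that quoted identifiers are case-sensitive in Snowflake, so this approach assumes the names are passed exactly as SHOW commands report them (for example KUMO_APP_DB in upper case).

```python
def quote_identifier(name: str) -> str:
    """Wrap a name in double quotes, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def drop_rule_statement(db_name: str, schema_name: str, rule_name: str) -> str:
    """Build a DROP statement for one fully qualified network rule."""
    qualified = ".".join(
        quote_identifier(part) for part in (db_name, schema_name, rule_name)
    )
    return f"DROP NETWORK RULE IF EXISTS {qualified}"


# Names as they would appear in SHOW NETWORK RULES output for the setup script.
print(drop_rule_statement("KUMO_APP_DB", "KUMO_SCHEMA", "KUMO_MIXPANEL"))
```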

Enabling mandatory event sharing for Kumo

Be sure to share ALL events from the native app so that Kumo can provide any necessary operational support. This is required when using Kumo's Snowflake native app. You can examine the logs and events shared with Kumo using the event table configured for your account. See Snowflake's documentation for more details about event sharing.

Instructions for configuring event sharing are below. Note that in both cases an event table must be configured for your Snowflake account. See Setting up an event table for details.

Configuring an existing Kumo Native App

Using Snowsight: If you have an existing Kumo app installation, you can enable mandatory events in Snowsight by going to Data Products -> Apps -> Kumo -> Events and Logs. On the Events and Logs tab, toggle the All events button in the Events and logs sharing section.

Using SQL: The above event sharing can also be enabled using the following commands:

-- Set app name if it's different from the default.
SET KUMOAPP_NAME = 'KUMO';
-- Enables event sharing for the Native app.
ALTER APPLICATION IDENTIFIER($KUMOAPP_NAME) SET AUTHORIZE_TELEMETRY_EVENT_SHARING=true;

Configuring a New Kumo installation

All mandatory events will be enabled by default when installing Kumo. You will be notified of this when attempting to install the Kumo app. Note that an event table must be configured for your account to ensure installation can proceed.