Kumo AI SaaS Security White Paper
Introduction
Kumo prioritizes the security of its services and is committed to ensuring the highest standards of protection. While technical security measures are important, equally important are the processes and people involved in keeping both the platform secure and your data as safe as possible.
Our security philosophy centers around layered security controls designed to protect and secure Kumo's AI SaaS cloud infrastructure. We believe in multiple logical and physical security control layers including access management, least privilege access, strong authentication, logging and monitoring, and vulnerability management including external penetration testing exercises.
An integral component of our information security strategy is the proactive monitoring and management of systems to identify critical security issues. When issues are identified, they are thoroughly evaluated and promptly resolved.
We rely on industry-standard information security best practices and compliance frameworks, such as NIST SP 800-53 and the ISO 27000 series, to support our security initiatives. Our goal is to make users feel confident using our service for their most sensitive workloads.
We firmly believe that maintaining transparency regarding our controls, environment, standards, and processes is of paramount importance. This document provides a deeper understanding of all the available security controls in the Kumo AI SaaS cloud infrastructure.
Kumo AI SaaS Cloud Infrastructure
The Kumo AI SaaS cloud infrastructure supports our end-to-end machine learning (ML) platform. Kumo enables enterprises to apply state-of-the-art predictive analytics: data scientists can immediately tackle many prediction problems by first registering data sources and then issuing SQL-like predictive queries that specify their ML tasks. Kumo then executes each predictive query and automates the entire process of feature preparation, label engineering, training dataset creation, model optimization, and MLOps, making it easy for users to build multiple ML models.
Kumo is a fully managed software-as-a-service (SaaS) platform. More specifically:
- There is no hardware (virtual or physical) to select, install, configure, or manage.
- There is no software to install, configure, or manage.
- Kumo handles ongoing maintenance, management, upgrades, and tuning.
Along with our AI SaaS, Kumo also offers data warehouse native deployment options, which may provide additional security benefits for your organization. See our deployment options page for more information.
High Level Architecture
The following architecture diagram describes the Kumo AI SaaS cloud infrastructure:
Users (i.e., customers) access Kumo only via the predictive query interface. ML predictions are returned via the user's data store, and all user data remains stored there. User-defined predictive queries are run against the cache, which is populated from the user's data store. Both the data resident in the cache and the cache retention time are user configurable. Users must provide Kumo with credentials/access policies so that Kumo can access (a subset of) the user's data store.
All data objects created/stored by Kumo are hidden and not directly accessible by users; they are only accessible through the predictive query interface run using Kumo. Kumo utilizes cloud service providers as further described in the service agreement and/or documentation and provides services to users using a VPC/VNET and storage hosted by the applicable cloud provider.
The Kumo AI SaaS cloud architecture consists of four key layers:
- Control Plane
- Predictive Query Engine
- Metadata DB
- Cache
Control Plane
The control plane is a collection of services that coordinate activities across Kumo. The control plane runs on compute instances provisioned by Kumo from the cloud provider.
Predictive Query Engine
Predictive query processing is performed in the processing layer. Kumo processes queries using massively parallel processing (MPP) clusters of deep learning accelerators. After predictive query processing is complete, all data on the predictive query processing engine is purged.
Metadata DB
The Kumo metadata database holds data schema information, machine learning models, aggregate data statistics and other metadata.
Cache
The data cache is neither directly visible nor accessible to users. Data cache objects are accessible only through predictive query operations run using Kumo. Kumo caches transformed and derived data for faster predictive query processing and execution, and manages all aspects of how this data is stored, including its organization, file size, structure, compression, metadata, and statistics. The cache retention policy is defined by the user.
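As an illustration of a user-defined retention policy, the following minimal Python sketch drops cache entries older than a configured retention window. The names `CacheEntry`, `purge_expired`, and the `retention` parameter are hypothetical and are not part of Kumo's actual implementation or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class CacheEntry:
    """Hypothetical cache record: a key plus the time it was written."""
    key: str
    created_at: datetime


def purge_expired(entries, retention, now=None):
    """Return only the entries that are still inside the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - retention
    return [e for e in entries if e.created_at >= cutoff]


# Example: a 7-day retention window (an assumed value) applied to two entries.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
entries = [
    CacheEntry("features/users", now - timedelta(days=3)),
    CacheEntry("features/orders", now - timedelta(days=10)),
]
kept = purge_expired(entries, retention=timedelta(days=7), now=now)
print([e.key for e in kept])
```

With these assumed values, only the 3-day-old entry survives; the 10-day-old entry falls outside the window and is dropped.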
Internal Kumo AI Cloud Infrastructure Security
Data Centers
The Kumo AI cloud infrastructure runs on top of a public cloud provider's services. Cloud provider data centers are compliant with a myriad of physical and information security standards; consequently, the Kumo AI cloud inherits the same best-in-class security controls.
For additional information, please refer to the cloud service provider's compliance page:
Encryption
Encryption in Transit
Encryption using TLS 1.2 is required for all client connections to the Kumo AI cloud.
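On the client side, this requirement can be mirrored by refusing any protocol version below TLS 1.2. The sketch below uses only Python's standard `ssl` module and assumes the requirement is read as "TLS 1.2 or later"; the helper name `make_tls12_context` is hypothetical.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Build a client context that rejects anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Assumption: "TLS 1.2 is required" is treated as a minimum version.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_tls12_context()
print(ctx.minimum_version.name)
```

The resulting context can be passed to standard-library clients, for example through the `context` argument of `urllib.request.urlopen`.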
Encryption at Rest
Data in the cache is encrypted at rest using the default transparent AES-256-based encryption.
Kumo Employee Access Control
Kumo maintains an access management standard, updated annually at minimum, that dictates access control based upon the principles of least privilege, need to know, and segregation of duties. Kumo employee access reviews occur at time of hire, termination, change of role, and periodically through each calendar year. Additionally, employees undergo background checks where permissible by law.
Elevated Access Permissions and Just-in-time Access
Kumo engineers with a business need for access to production resources do not have standing access to the production environment. Instead, elevated access permissions are assigned for a limited period of time through a discrete action performed by the authorized engineer (i.e., just-in-time access). Additionally, this elevation of privilege requires authorization by another authorized Kumo engineer (i.e., multi-party approval). Exceptions to the just-in-time access rule are granted only for the time period when an engineer is actively performing on-call duties.
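The combination of time-limited elevation and multi-party approval can be sketched as follows. This is a simplified illustration, not Kumo's tooling: `AccessGrant`, `grant_access`, the engineer names, and the one-hour duration are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AccessGrant:
    """Hypothetical record of a time-limited elevation."""
    engineer: str
    approver: str
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        # Access lapses automatically once the grant expires.
        return now < self.expires_at


def grant_access(engineer, approver, authorized, duration, now):
    """Issue a grant only if a second authorized engineer approves it."""
    if engineer not in authorized or approver not in authorized:
        raise PermissionError("requester and approver must both be authorized")
    if engineer == approver:
        raise PermissionError("self-approval is not allowed")
    return AccessGrant(engineer, approver, now + duration)


authorized = {"alice", "bob"}  # assumed set of authorized engineers
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
grant = grant_access("alice", "bob", authorized, timedelta(hours=1), now)
print(grant.is_active(now + timedelta(minutes=30)))  # inside the window
print(grant.is_active(now + timedelta(hours=2)))     # after expiry
```

Storing an expiry time on the grant, rather than relying on a separate revocation step, means access ends by default when the window closes.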
Audit Logs
Administrative actions are captured in Kumo’s audit log system and are retained for 12 months. These logs are reviewed for anomalies by Kumo’s information security team, and are maintained in a separate environment with separation of duties preventing unauthorized modification or deletion.
Internal Kumo AI Cloud Service Security
Separation of Production and Non-Production Environments
The Kumo AI cloud maintains strict separation between production and non-production environments. Kumo enforces the principles of least privilege and separation of duties for access to all environments, and access to production environments is limited to authorized personnel only.
Deletion of Cache Data
Upon termination or expiration of their Kumo AI SaaS subscription, users can request the deletion of their data as part of the account closure procedure. User data will be deleted within 30 days of the request.
Available Customer Security Controls
Kumo AI Cloud Authentication and User Management
The Kumo AI cloud web UI supports username/password-based authentication and single sign-on (SSO) with identity providers (IdP) such as Okta. Kumo user passwords are held by identity management solutions provider Auth0. Auth0 stores credentials using bcrypt, an industry-standard one-way salted hashing algorithm.
Business Continuity and Disaster Recovery
The Kumo AI cloud runs on a resilient, highly available IT infrastructure. To maintain an actionable business continuity and disaster recovery plan (BCDRP), Kumo conducts periodic (annual at minimum) resilience testing and BCDRP exercises to review incident management procedures, update plan documentation, and conduct system recovery testing.
Infrastructure Service Recovery
The Kumo AI cloud runs workloads on IT infrastructure resources provided by public cloud providers such as AWS. Hence, data availability is also subject to the business continuity and disaster recovery processes of those infrastructure providers. For more information about the cloud providers' certifications and audit reports, please see the following:
Incident Response
Kumo maintains a formal incident management policy and procedure, and communicates it to and trains the appropriate personnel on a periodic basis. Procedures include liaisons and points of contact with local authorities in accordance with contracts and relevant regulations. Incident response is active during regular business hours to detect, manage, and resolve any security incidents.
Companywide Executive Review
Kumo's security steering committee meets biannually to review reports, identify control deficiencies and material changes in the threat environment, and make recommendations to executive management for new or improved controls and threat mitigation strategies.
Compliance
Kumo maintains the following compliance certifications. For additional information, please contact [email protected]
SOC 2
Kumo's SOC 2 report includes a description of its service organization’s system and a test of the design of the service organization’s relevant controls as they relate to security, availability, confidentiality, processing integrity, and privacy. For additional information, please contact [email protected]