Connect Your Tables
Updated 7 months ago
What’s Next
Create Your Graph