Commands and Operators

PREDICT and FOR EACH

A Predictive Query starts with a PREDICT clause that defines the value you want to predict. After specifying the prediction target in PREDICT, you must include a FOR EACH clause that specifies the entity you want to make predictions for. The FOR EACH entity must be a primary key column.

Boolean Operators

You can use any number of boolean operators in your predictive queries. For example, the condition LAST(LOAN.AMOUNT, 0, 30) = 2 evaluates to true or false for each entity.

ASSUMING

ASSUMING allows you to investigate hypothetical scenarios and evaluate the impact of your actions or decisions. The ASSUMING keyword is followed by one or more future-looking temporal aggregations that are assumed to be true during predictions.

For example, you may want to investigate how much a user will spend in the next 30 days if you give them more than two coupons or notifications in the next 7 days:

PREDICT SUM(TRANSACTIONS.PRICE, 0, 30)
FOR EACH CUSTOMERS.CUSTOMER_ID
ASSUMING COUNT(NOTIFICATIONS.*, 0, 7) > 2

📘
When generating training data, the ASSUMING condition acts as a forward-looking entity filter. When generating Batch Predictions, the model assumes that the condition will be met.
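To make the "forward-looking entity filter" idea concrete, here is a minimal Python sketch of how a condition like COUNT(NOTIFICATIONS.*, 0, 7) > 2 could select training entities. It is only an illustration, not Kumo's implementation: the function names, the (entity_id, timestamp) event layout, and the window convention (start exclusive, end inclusive) are all assumptions.

```python
from datetime import datetime, timedelta

def count_in_window(events, entity_id, anchor, start_days, end_days):
    """Count an entity's events with anchor+start_days < t <= anchor+end_days.

    The boundary convention here is an assumption made for illustration.
    """
    lo = anchor + timedelta(days=start_days)
    hi = anchor + timedelta(days=end_days)
    return sum(1 for eid, t in events if eid == entity_id and lo < t <= hi)

def filter_training_entities(entity_ids, notifications, anchor, threshold=2):
    """Keep only entities whose future 7-day notification count exceeds threshold."""
    return [
        e for e in entity_ids
        if count_in_window(notifications, e, anchor, 0, 7) > threshold
    ]

# Hypothetical data: (customer_id, notification timestamp)
anchor = datetime(2024, 1, 1)
notifications = [
    ("c1", datetime(2024, 1, 2)),
    ("c1", datetime(2024, 1, 3)),
    ("c1", datetime(2024, 1, 5)),
    ("c2", datetime(2024, 1, 2)),
    ("c2", datetime(2024, 1, 20)),  # outside the 7-day window
]

kept = filter_training_entities(["c1", "c2", "c3"], notifications, anchor)
print(kept)  # ['c1']
```

Only c1 received more than two notifications in the seven days after the anchor time, so only c1 would contribute a training example under this sketch. At batch prediction time no such filtering is possible, because the future has not happened yet; the condition is simply assumed to hold.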

CLASSIFY and RANK TOP K

When creating Predictive Queries with a LIST_DISTINCT aggregation target, you can use CLASSIFY and RANK TOP K to classify or retrieve only the top ranked values for a prediction, respectively.

CLASSIFY

For example, the following predictive query predicts customer purchases over the next 30 days, resulting in a separate binary classification per article ID:

PREDICT LIST_DISTINCT(TRANSACTIONS.ARTICLE_ID, 0, 30) CLASSIFY
FOR EACH CUSTOMERS.CUSTOMER_ID

If you don't specify a TOP K clause, predictions are generated for all available classes, up to the limit.
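As a rough illustration of what "a separate binary classification per article ID" means, the following Python sketch turns each customer's future purchases into one 0/1 label per article. The function name and the sample data are hypothetical, and this shows only the shape of the targets, not how Kumo builds them.

```python
def binary_labels_per_class(purchased, all_classes):
    """Return {class_id: 0 or 1}: one independent binary target per class."""
    bought = set(purchased)  # duplicates collapse, as in a distinct list
    return {c: int(c in bought) for c in all_classes}

# Hypothetical data: articles each customer buys in the next 30 days
all_articles = ["a1", "a2", "a3"]
future_purchases = {
    "c1": ["a1", "a3", "a3"],
    "c2": [],
}

labels = {
    customer: binary_labels_per_class(items, all_articles)
    for customer, items in future_purchases.items()
}
print(labels["c1"])  # {'a1': 1, 'a2': 0, 'a3': 1}
print(labels["c2"])  # {'a1': 0, 'a2': 0, 'a3': 0}
```

Each article is treated as its own yes/no question per customer, which is why the number of classes matters when no TOP K clause is given.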

RANK

In contrast, the following predictive query uses RANK TOP K at the end of the target definition (where K is the number of items of interest) to predict likely customer purchases. In this case, it ranks the top 12 products the customer is most likely to buy in the next 30 days:

PREDICT LIST_DISTINCT(TRANSACTIONS.ARTICLE_ID, 0, 30) RANK TOP 12
FOR EACH CUSTOMERS.CUSTOMER_ID

Using RANK on a foreign key is the only scenario where Kumo allows adding new targets at batch prediction time. In this case, you add them as new rows in the article table.
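The selection step behind RANK TOP K can be sketched in a few lines of Python. This is an illustration only: the per-article scores are made up, and rank_top_k is a hypothetical helper, not part of any Kumo API.

```python
import heapq

def rank_top_k(scores, k):
    """Return the k item IDs with the highest scores, best first."""
    top = heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])
    return [item for item, _ in top]

# Hypothetical purchase-likelihood scores for one customer
scores = {"a1": 0.10, "a2": 0.75, "a3": 0.40, "a4": 0.90}
top2 = rank_top_k(scores, 2)
print(top2)  # ['a4', 'a2']

# A new article row added before batch prediction simply becomes
# another candidate in the ranking (its score is assumed here).
scores_with_new = {**scores, "a5": 0.95}
top2_with_new = rank_top_k(scores_with_new, 2)
print(top2_with_new)  # ['a5', 'a4']
```

Because ranking only needs a score per candidate, a newly added candidate can enter the ranked list without changing the target definition, which is consistent with the foreign-key case described above.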
