Commands and Operators
PREDICT and FOR EACH
A Predictive Query starts with a PREDICT clause that defines the value you want to predict. After specifying the prediction target in PREDICT, you must include a FOR EACH clause to specify the entity you want to make predictions for. Only a primary key column can be selected for this field.
Boolean Operators
You can use any number of boolean operators in your predictive queries. For example, LAST(LOAN.AMOUNT, 0, 30) = 2.
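To make the effect of such a condition concrete, the Python sketch below mimics how LAST(LOAN.AMOUNT, 0, 30) = 2 could produce a true/false value per entity. It is an illustration only, not Kumo's implementation: the toy rows, the helper name last_in_window, and the window convention (days after the anchor time, start exclusive, end inclusive) are all assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical toy LOAN rows: (entity id, timestamp, amount).
LOANS = [
    ("u1", datetime(2024, 1, 5), 1),
    ("u1", datetime(2024, 1, 20), 2),
    ("u2", datetime(2024, 1, 10), 3),
]


def last_in_window(rows, entity_id, anchor, start_days, end_days):
    """Return the most recent amount in (anchor+start, anchor+end], or None.

    The exclusive-start / inclusive-end convention is an assumption made
    for this sketch.
    """
    lo = anchor + timedelta(days=start_days)
    hi = anchor + timedelta(days=end_days)
    hits = [(ts, amt) for eid, ts, amt in rows
            if eid == entity_id and lo < ts <= hi]
    if not hits:
        return None
    # Pick the row with the latest timestamp and return its amount.
    return max(hits, key=lambda hit: hit[0])[1]


anchor = datetime(2024, 1, 1)
# The comparison turns the aggregated value into a boolean per entity.
labels = {eid: last_in_window(LOANS, eid, anchor, 0, 30) == 2
          for eid in ("u1", "u2")}
print(labels)  # {'u1': True, 'u2': False}
```

Here u1's most recent loan amount in the window is 2, so the condition holds; u2's is 3, so it does not.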
ASSUMING
ASSUMING allows you to investigate hypothetical scenarios and evaluate the impact of your actions or decisions. The ASSUMING keyword is followed by one or more future-looking temporal aggregations that are assumed to be true during prediction.
For example, you may want to investigate how much a user will spend in the next 30 days if you give them more than two coupons or notifications in the next 7 days:
PREDICT SUM(TRANSACTIONS.PRICE, 0, 30)
FOR EACH CUSTOMERS.CUSTOMER_ID
ASSUMING COUNT(NOTIFICATIONS.*, 0, 7) > 2
When generating training data, this condition acts as a forward-looking entity filter. When generating batch predictions, the model assumes that the condition will be met.
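The training-time behavior can be pictured with a small Python sketch. It is a simplified illustration under stated assumptions, not Kumo's pipeline: it uses a single anchor time, hypothetical toy tables, hypothetical helper names (in_window, build_training_rows), and assumes windows are measured in days with an exclusive start and inclusive end. A customer contributes a training row only if the ASSUMING condition actually held in the 7 days after the anchor time.

```python
from datetime import datetime, timedelta


def in_window(ts, anchor, start_days, end_days):
    """True if ts falls in (anchor+start, anchor+end] (assumed convention)."""
    lo = anchor + timedelta(days=start_days)
    hi = anchor + timedelta(days=end_days)
    return lo < ts <= hi


def build_training_rows(customers, transactions, notifications, anchor):
    """Return (customer_id, target) pairs for customers passing the filter."""
    rows = []
    for cid in customers:
        # ASSUMING COUNT(NOTIFICATIONS.*, 0, 7) > 2, used as a filter.
        n_notifs = sum(1 for c, ts in notifications
                       if c == cid and in_window(ts, anchor, 0, 7))
        if n_notifs <= 2:
            continue  # condition not met, so no training row
        # PREDICT SUM(TRANSACTIONS.PRICE, 0, 30) as the target value.
        target = sum(price for c, ts, price in transactions
                     if c == cid and in_window(ts, anchor, 0, 30))
        rows.append((cid, target))
    return rows


anchor = datetime(2024, 1, 1)
customers = ["c1", "c2"]
notifications = [
    ("c1", datetime(2024, 1, 2)),
    ("c1", datetime(2024, 1, 3)),
    ("c1", datetime(2024, 1, 4)),
    ("c2", datetime(2024, 1, 2)),
]
transactions = [
    ("c1", datetime(2024, 1, 10), 20.0),
    ("c1", datetime(2024, 1, 25), 5.0),
    ("c1", datetime(2024, 2, 15), 100.0),  # outside the 30-day window
    ("c2", datetime(2024, 1, 5), 50.0),
]

rows = build_training_rows(customers, transactions, notifications, anchor)
print(rows)  # [('c1', 25.0)]
```

Only c1 received more than two notifications in the window, so c2 is filtered out even though it has transactions.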
CLASSIFY and RANK TOP K
When creating Predictive Queries with a LIST_DISTINCT aggregation target, you can use CLASSIFY and RANK TOP K to classify or retrieve only the top ranked values for a prediction, respectively.
CLASSIFY
For example, the following predictive query predicts customer purchases over the next 30 days, resulting in a separate binary classification per article ID:
PREDICT LIST_DISTINCT(TRANSACTIONS.ARTICLE_ID, 0, 30) CLASSIFY
FOR EACH CUSTOMERS.CUSTOMER_ID
If you don't specify a TOP K clause, we will generate predictions for all available classes, up to the limit.
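One way to picture "a separate binary classification per article ID" is as a set of yes/no labels, one per known article. The Python sketch below shows that shape for a single customer. The function names and toy article IDs are hypothetical, and this is an illustration of the label layout rather than a description of how Kumo builds its targets.

```python
def list_distinct(values):
    """Distinct values in first-seen order."""
    return list(dict.fromkeys(values))


def per_class_labels(purchased_ids, all_article_ids):
    """Map every known article ID to True/False for one customer."""
    purchased = set(purchased_ids)
    return {article_id: article_id in purchased
            for article_id in all_article_ids}


# Hypothetical article IDs bought by one customer in the 30-day window.
window_purchases = ["a1", "a3", "a1"]
all_articles = ["a1", "a2", "a3"]

distinct = list_distinct(window_purchases)
labels = per_class_labels(distinct, all_articles)
print(distinct)  # ['a1', 'a3']
print(labels)    # {'a1': True, 'a2': False, 'a3': True}
```

Each entry in the resulting mapping corresponds to one binary classification target for that customer.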
RANK
In contrast, the following predictive query uses RANK TOP K at the end of the target definition (where K is the number of items of interest) to predict likely customer purchases, in this case ranking the top 12 products the customer is most likely to buy in the next 30 days:
PREDICT LIST_DISTINCT(TRANSACTIONS.ARTICLE_ID, 0, 30) RANK TOP 12
FOR EACH CUSTOMERS.CUSTOMER_ID
Using RANK on a foreign key is the only scenario where Kumo will allow adding new targets at batch prediction time, in this case by adding new rows to the article table.
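As a rough mental model, ranking can be thought of as scoring every candidate article for a customer and keeping the K highest. The Python sketch below illustrates that idea with hypothetical scores and a hypothetical helper rank_top_k. How Kumo actually scores articles, including ones added at batch prediction time, is not described here, so the numbers are assumptions for illustration only.

```python
def rank_top_k(scores, k):
    """Return the k item IDs with the highest score, best first.

    Ties are broken by item ID so the result is deterministic.
    """
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item_id for item_id, _ in ranked[:k]]


# Hypothetical per-article scores for one customer.
scores = {"a1": 0.2, "a2": 0.9, "a3": 0.5, "a4": 0.7}
top_before = rank_top_k(scores, 2)
print(top_before)  # ['a2', 'a4']

# A new row in the article table adds a new candidate to rank.
scores_with_new = dict(scores, a5=0.95)
top_after = rank_top_k(scores_with_new, 2)
print(top_after)  # ['a5', 'a2']
```

Because ranking works over a candidate set rather than a fixed list of classes, a newly added article can simply join the candidates, which is consistent with the exception noted above.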