HomeDocsAPI Reference
Kumo.ai
Docs

Predictive Query Structure

At the bare minimum, your predictive query will have an entity and target, as well as an optional filter to scale down your prediction results.

Target

The target is essentially the question you are trying to answer, preceded by the PREDICT command. For example, if you want to predict the total amount of money spent by each user over the next seven days, then the "sum of purchases over the next seven days" is your target.

Entity

The entity is who you are making predictions for. For example, if you want to predict the total amount of purchases per user, then the user is your entity.

Aggregation Operators

If you want to predict some aggregation of values over time (e.g., sum of purchase values over the next 30 days), you can specify an aggregation operator and the column representing the value you want to aggregate.

Some common aggregation functions include SUM() and COUNT(), to name a few.

📘

NOTE: Please refer to the Predictive Query Reference for a complete list of Kumo aggregation operators.

Revisiting the previous example, the following usage of the SUM() aggregation operator allows you to predict the total number of sales each customer will make in the next 30 days:

PREDICT SUM(TRANSACTIONS.PRICE,0,30,days)
FOR EACH CUSTOMERS.CUSTOMER_ID

Within the aggregation function inputs, the start and end parameters refer to the time period you want to aggregate across, calculated in days. For example, 10 for start and 30 for end implies that you want to aggregate from 10 days later (excluding the 10th day) to 30 days later (including the 30th day).

If you're making the prediction on 2020-01-01 00:00:00, Kumo will aggregate all rows with timestamps t where 2020-01-11 00:00:00 < t <= 2020-01-31 00:00:00.

When using aggregation with targets, both start and end values should be non-negative integers, and end values should be greater than start values.

Filters

Filtering allows you to further contextualize your predictive query by dropping rows that do not meet a specified condition. By using a WHERE clause, you can drop entities that are irrelevant to your prediction or targets to exclude from your aggregation (i.e., when you are only interested in a subset of the table).

A condition following the WHERE clause can also contain an aggregation, for cases where you may only want predictions on a certain subset of your entities. For example, the following predictive query predicts the lifetime value (LTV) for customers that were recently active in the last 30 days:

PREDICT SUM(TRANSACTIONS.PRICE, 0, 30)
FOR EACH CUSTOMERS.CUSTOMER_ID
WHERE COUNT(TRANSACTIONS.*,-30, 0) > 0

Using WHERE as an Inline Operation

The WHERE operator can be used several times in different locations of the same predictive query. For example, the following is an example of using theWHERE operator within an aggregation for target filters:

PREDICT COUNT(TRANSACTION.* 
WHERE TRANSACTION.VALUE > 10, 0, 7)
FOR EACH USER.USER_ID
WHERE COUNT(TRANSACTION.*, -7, 0) > 0

This predictive query will predict the number of transactions over $10 each active user (i.e., a user that has made at least one purchase in the past 7 days) will make over the next 7 days.

You can use the same aggregation functions and start and end parameter definitions as targets; however, in these cases both start and end parameters should be non-positive.

Using Nested WHERE Filters

Kumo also supports nested temporal filters, for example:

PREDICT COUNT(transaction.*, 0, 7)  
FOR EACH user.user_id  
WHERE COUNT(transaction.*  
  	   WHERE transaction.value > 10,  
                 -7, 0) > 0

This predictive query will predict the number of transactions over the next 7 days that each user who has made a purchase over $10 in the past 7 days will make.

Using Multiple Target Tables

Kumo also supports the use of multiple target tables, for example:

PREDICT COUNT(session.*, 0, 7) > 10 OR SUM (transaction.value, 0, 5) > 100
FOR EACH user.user_id 

This predictive query will predict whether each user will either have more than 10 sessions in the next 7 days or make purchases exceeding $100 over the next 5 days.

Using Specific Date/Time Values with WHERE

Specific date and time values can also be included in WHERE clauses. For example, the following predictive query excludes customers that have joined after January 1st, 2022 12:45:30:

PREDICT COUNT(TRANSACTIONS.*, 0, 10)
FOR EACH CUSTOMERS.CUSTOMER_ID
WHERE CUSTOMERS.DATE_JOINED < 2022-01-01 12:45:30

Dates should not be enclosed in quotes, and hours/minutes/seconds are optional and can be left out if only (in which case they are assumed to be 0).

A static filter is a filter that does not contain any aggregations. You can combine multiple conditions together with AND/OR/NOT operators; however, unlike target definitions (i.e., PREDICT clause), mixing static and temporal conditions is allowed when used inside a filter. A temporal filter can only be used when the target definition itself contains a temporal aggregation.


What’s Next