CLASSIFY/RANK TOP K

CLASSIFY | RANK TOP <K> (Required for LIST_DISTINCT)

Description

A query with a LIST_DISTINCT aggregation target can serve two different purposes: ranking or classification. The same applies to queries with multicategorical or multilabel target columns.

When predicting which products a user will buy, you are usually interested only in a ranking of the few products the user is most likely to buy. To get that, add RANK TOP K at the end of your target definition, where K is the number of items you are interested in.

Alternatively, you might want a separate binary prediction for each item type. To get that, add CLASSIFY at the end of your target definition.
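
The difference between the two modes can be illustrated with a minimal Python sketch. It is not Kumo's API or output format: the per-item scores, the `rank_top_k` and `classify` helpers, and the 0.5 threshold are all hypothetical, chosen only to show that ranking yields an ordered list of K items while classification yields one yes/no answer per item.

```python
from typing import Dict, List


def rank_top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring items, best first (hypothetical helper)."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _ in ordered[:k]]


def classify(scores: Dict[str, float], threshold: float = 0.5) -> Dict[str, bool]:
    """Return one binary decision per item (hypothetical helper and threshold)."""
    return {item: score >= threshold for item, score in scores.items()}


# Made-up scores for a single user; real model output will look different.
scores = {"shirt": 0.91, "socks": 0.15, "jeans": 0.62, "hat": 0.40}

top2 = rank_top_k(scores, 2)   # ordered list of the 2 most likely items
labels = classify(scores)      # one True/False per item
```

Here `top2` holds only the two best items in order, whereas `labels` has an entry for every item, which is one way to see why classification is practical only when the number of distinct items is small.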

Example

The following queries show how this part of the query is used:

PREDICT LIST_DISTINCT(transaction.article_id, 0, 30) RANK TOP 12
PREDICT LIST_DISTINCT(transaction.article_id, 0, 30) CLASSIFY
PREDICT target.multicategorical_column RANK TOP 20

The two operations are subject to different limits: ranking supports up to 10,000,000 different entities, while classification supports only up to 1000 different entities. In addition, at most 1000 targets can be ranked.
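
As a rough illustration of these limits, the sketch below checks a planned query before it is submitted. The `check_limits` function and its parameters are hypothetical and not part of Kumo. It also assumes that "entities" refers to the number of distinct target values and that the 1000-target ranking limit is a cap on K; both are interpretations of the text above, not confirmed behavior.

```python
from typing import List, Optional

# Limits as stated in the text above.
MAX_RANK_ENTITIES = 10_000_000
MAX_CLASSIFY_ENTITIES = 1_000
MAX_RANK_K = 1_000  # assumption: the 1000-target ranking limit applies to K


def check_limits(mode: str, num_entities: int, k: Optional[int] = None) -> List[str]:
    """Return a list of limit violations; an empty list means none were found."""
    problems: List[str] = []
    if mode == "RANK":
        if num_entities > MAX_RANK_ENTITIES:
            problems.append(f"ranking supports at most {MAX_RANK_ENTITIES} entities")
        if k is None:
            problems.append("RANK requires TOP K")
        elif k > MAX_RANK_K:
            problems.append(f"K must be at most {MAX_RANK_K}")
    elif mode == "CLASSIFY":
        # TOP K is ignored with CLASSIFY, so k is not checked here.
        if num_entities > MAX_CLASSIFY_ENTITIES:
            problems.append(f"classification supports at most {MAX_CLASSIFY_ENTITIES} entities")
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return problems
```

For example, `check_limits("RANK", 50_000, k=12)` reports no problems, while `check_limits("CLASSIFY", 50_000)` reports one, because 50,000 distinct values exceed the classification limit.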

TOP K will be ignored if used with CLASSIFY. Adding CLASSIFY/RANK is required if the target output is LIST_DISTINCT or a multicategorical column.

CLASSIFY/RANK is not required and has no effect if LIST_DISTINCT appears as part of a condition, such as in the following pQuery:

PREDICT LIST_DISTINCT(transaction.category, 0, 30) CONTAINS "online"

Note: If the number of predictions per entity requested at batch prediction time differs from the RANK TOP K value in the original pQuery, Kumo uses the same trained model but produces the number of results specified at batch prediction time.