Batch Predictions

Start making real predictions on unseen data with your predictive queries.

Once you are satisfied with how your predictive query performs on historical data, you can generate batch predictions in two steps: create a batch prediction workflow, then launch a batch prediction job.

Batch prediction workflows can be created for any predictive query that has completed training.

New Batch Prediction Workflows

You can access the "Create a Batch Prediction Workflow" page by clicking on Batch Predictions (left-hand column) → New Batch Prediction Workflow.

Existing Batch Prediction Workflows

To view the details for an existing batch prediction workflow, click on the batch prediction name in the "Name" column of the "Batch Prediction Workflows" page.

You can also click on the pQuery name in the "Predictive Query" column to view the details of the related predictive query, or click on the garbage can icon in the "Actions" column to delete the workflow.

Existing Batch Prediction Jobs

Click on the Jobs tab to view existing batch prediction jobs. You can view the details of each job by clicking on the job ID in the "Batch Prediction Job Id" column, or view the related predictive query by clicking on the predictive query name in the "Predictive Query" column.

Applying Filters At Batch Prediction Time

Accurate training depends on comprehensive data, so you should train a predictive query on all pertinent data. Once the predictive query is trained, however, you may want predictions for only a specific subset of that data.

For these cases, Kumo lets you apply filters both when you create a batch prediction workflow and when you run a batch prediction job.

Adding filters at batch prediction time provides the following benefits:

  • Efficiency in Batch Predictions: Generating predictions for only a subset of the data can substantially shorten batch prediction jobs, because a job's execution time depends on the filters you specify.

  • Streamlined Data Handling: Without filters, tasks such as recommendations can produce billions of rows of predictions. Restricting predictions to the subset that matches your business logic significantly reduces the output size, which makes the output easier to consume or integrate into your pipelines.
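
To give a rough sense of how a filter shrinks the output, the sketch below counts the rows a top-k recommendation task would write with and without a filter. It is an illustration only and does not use the Kumo API: the `User` record, the `estimate_output_rows` helper, and the "active US users" filter are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class User:
    # Hypothetical entity record; not a Kumo type.
    user_id: int
    country: str
    active: bool


def estimate_output_rows(
    users: Iterable[User],
    top_k: int,
    keep: Callable[[User], bool] = lambda user: True,
) -> int:
    """Count output rows for a top-k recommendation task.

    Each user that passes the `keep` filter contributes `top_k` rows
    (one row per recommended item).
    """
    return sum(top_k for user in users if keep(user))


# Synthetic population: every 4th user is in the US, every 2nd user is active.
users = [
    User(user_id=i, country="US" if i % 4 == 0 else "DE", active=(i % 2 == 0))
    for i in range(1000)
]

# No filter: every user receives top_k predictions.
unfiltered_rows = estimate_output_rows(users, top_k=10)

# Hypothetical business rule: only score active users in the US.
filtered_rows = estimate_output_rows(
    users, top_k=10, keep=lambda user: user.country == "US" and user.active
)

print(f"unfiltered: {unfiltered_rows} rows, filtered: {filtered_rows} rows")
```

In this toy setup the filter keeps a quarter of the users, so the output shrinks by the same factor. The same proportional reduction is what keeps recommendation outputs manageable at real scale.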