Description
Defines the numerical offset, counted back from the most recent entry, within which data is not used to generate training labels. Unless a custom time unit is specified in the aggregation, this value is in days. This can be used to make sure the query does not generate labels on data from the last train_end_offset days. Regardless of this value, all data is still used as input to the model; the offset only limits which labels are generated.
- train_end_offset must be >0.
- If used at the same time as train_start_offset, train_end_offset should be strictly smaller.
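The two constraints above can be checked before submitting a configuration. The following is a minimal sketch, not part of any official client: the function name `validate_offsets` is hypothetical, and it assumes that `None` means the option was left unset (so the default applies and nothing is checked).

```python
from typing import Optional


def validate_offsets(
    train_end_offset: Optional[int] = None,
    train_start_offset: Optional[int] = None,
) -> None:
    """Raise ValueError if the offsets break the rules stated on this page.

    Hypothetical helper: None is assumed to mean "option not set".
    """
    if train_end_offset is None:
        # Nothing to check; the default value applies.
        return
    if train_end_offset <= 0:
        raise ValueError("train_end_offset must be > 0")
    if train_start_offset is not None and train_end_offset >= train_start_offset:
        raise ValueError(
            "train_end_offset must be strictly smaller than train_start_offset"
        )


validate_offsets(train_end_offset=10, train_start_offset=365)  # passes silently
```

Calling `validate_offsets(365, 10)` raises a `ValueError`, because the end offset is not strictly smaller than the start offset.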
Supported Task Types
- Temporal
Example 1
For example, you may want to train only on customers who churned more than a year ago, even though those customers may also have some data in the last year (e.g. because they returned significantly later).
train_end_offset: <integer>
train_end_offset: 10 # Do not train on data from the last 10 days
train_end_offset: 365 # Do not train on data from the last year
This only applies to temporal queries (queries that include a temporal aggregation such as SUM(TRANSACTIONS.AMOUNT, 0, 2, days)). The unit of this offset is the same as the unit in the aggregation.
Example 2
For example, for the query
PREDICT SUM(transactions.price, 0, 30, days)
FOR EACH customers.customer_id
The value of train_end_offset will be in days.
For example, if set to 10, the training table will exclude entries from the last 10 days.
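To make the arithmetic concrete, here is a small sketch of how a label cutoff could be derived from the most recent entry, following the description at the top of this page. It is an illustration only, not the platform's implementation: the helper names, the `hours` unit entry, and the inclusive boundary (`<=`) are all assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical unit table; only "days" is confirmed by this page.
UNIT_LENGTH = {
    "days": timedelta(days=1),
    "hours": timedelta(hours=1),
}


def label_cutoff(latest_entry: datetime, train_end_offset: int, unit: str = "days") -> datetime:
    """Return the latest timestamp still eligible for label generation."""
    return latest_entry - train_end_offset * UNIT_LENGTH[unit]


def eligible_for_labels(timestamps, latest_entry, train_end_offset, unit="days"):
    """Keep only timestamps at or before the cutoff (boundary handling is assumed)."""
    cutoff = label_cutoff(latest_entry, train_end_offset, unit)
    return [t for t in timestamps if t <= cutoff]


latest = datetime(2024, 1, 31)
candidates = [
    datetime(2024, 1, 5),
    datetime(2024, 1, 20),
    datetime(2024, 1, 25),
    datetime(2024, 1, 31),
]

# With train_end_offset = 10 and days as the unit, the cutoff is 2024-01-21,
# so only the first two candidates remain eligible for labels.
kept = eligible_for_labels(candidates, latest, train_end_offset=10)
```

Note that this filter affects labels only; as stated in the description, all data remains available as model input.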
Default Values
run_mode | Default Value |
---|---|
FAST | 0 |
NORMAL | 0 |
BEST | 0 |