duration_field() function

Create a duration column specification for use in a schema.

USAGE

duration_field(
    min_duration=None,
    max_duration=None,
    nullable=False,
    null_probability=0.0,
    unique=False,
    generator=None,
)

The duration_field() function defines the constraints and behavior for a duration (timedelta) column when generating synthetic data with generate_dataset(). You can control the duration range with min_duration= and max_duration=, enforce uniqueness with unique=True, and introduce null values with nullable=True and null_probability=.

Duration values are generated uniformly (at second-level resolution) within the specified range. If no range is provided, the default range is 0 seconds to 30 days. Both min_duration= and max_duration= accept datetime.timedelta objects or colon-separated strings in "HH:MM:SS" or "MM:SS" format.
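To make the accepted string formats concrete, here is a minimal sketch of how a colon-separated string could map to a datetime.timedelta. The parse_duration() helper is hypothetical and written only for illustration; it is not the parser that duration_field() uses internally, and the real one may accept or reject inputs differently.

```python
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Hypothetical helper: convert "HH:MM:SS" or "MM:SS" into a timedelta."""
    parts = text.strip().split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"cannot parse duration string: {text!r}")
    try:
        numbers = [int(part) for part in parts]
    except ValueError:
        raise ValueError(f"cannot parse duration string: {text!r}") from None
    if len(numbers) == 2:
        numbers = [0] + numbers  # "MM:SS" has no hours component
    hours, minutes, seconds = numbers
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


print(parse_duration("1:30:00"))  # 1:30:00
print(parse_duration("05:30"))    # 0:05:30
```

Under this reading, "1:30:00" and timedelta(hours=1, minutes=30) describe the same bound, so either form could be passed to min_duration= or max_duration=.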

Parameters

min_duration : str | timedelta | None = None

Minimum duration (inclusive). Can be a "HH:MM:SS" or "MM:SS" string, or a datetime.timedelta object. If None (the default), the minimum is 0 seconds.

max_duration : str | timedelta | None = None

Maximum duration (inclusive). Can be a "HH:MM:SS" or "MM:SS" string, or a datetime.timedelta object. If None (the default), the maximum is 30 days.

nullable : bool = False

Whether the column can contain null values. Default is False.

null_probability : float = 0.0

Probability of generating a null value for each row when nullable=True. Must be between 0.0 and 1.0. Default is 0.0.

unique : bool = False

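Whether all values must be unique. Default is False. Because values have second-level resolution, a range offers one distinct value per whole second between min_duration= and max_duration= (inclusive), so uniqueness is only achievable when the requested number of rows does not exceed that count.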

generator : Callable[[], Any] | None = None

Custom callable that generates values. When provided, this overrides all other constraints. The callable should take no arguments and return a single datetime.timedelta value.
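When combining unique=True with a narrow range, it can help to check how many distinct values the range can supply. The sketch below assumes whole-second values with both bounds inclusive, as described above. The distinct_values() helper is hypothetical, and this page does not say whether pointblank performs a similar check itself.

```python
from datetime import timedelta


def distinct_values(min_duration: timedelta, max_duration: timedelta) -> int:
    """Hypothetical helper: count whole-second durations in an inclusive range."""
    if min_duration > max_duration:
        raise ValueError("min_duration must not exceed max_duration")
    # Dividing two timedeltas with // yields an integer number of seconds.
    return (max_duration - min_duration) // timedelta(seconds=1) + 1


capacity = distinct_values(timedelta(minutes=5), timedelta(hours=2))
print(capacity)         # 6901
print(capacity >= 100)  # True: 100 unique rows fit in this range
```

By the same arithmetic, the default range of 0 seconds to 30 days holds 2,592,001 distinct values.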

Returns

DurationField

A duration field specification that can be passed to Schema().

Raises

ValueError

If min_duration is greater than max_duration, or if a duration string cannot be parsed.

Examples


The min_duration= and max_duration= parameters accept timedelta objects for defining duration ranges:

import pointblank as pb
from datetime import timedelta

schema = pb.Schema(
    session_length=pb.duration_field(
        min_duration=timedelta(minutes=5),
        max_duration=timedelta(hours=2),
    ),
    wait_time=pb.duration_field(
        min_duration=timedelta(seconds=30),
        max_duration=timedelta(minutes=15),
    ),
)

pb.generate_dataset(schema, n=100, seed=23)
shape: (100, 2)

session_length  wait_time
duration[μs]    duration[μs]
1h 51m 24s      13m 48s
44m 34s         5m 26s
1h 58m 16s      14m 39s
16m 24s         1m 55s
7m 19s          47s
34m 48s         4m 13s
40m 16s         4m 54s
25m 24s         3m 3s
19m 37s         2m 19s
1h 29m 36s      11m 4s

Colon-separated strings can also be used for quick duration definitions:

schema = pb.Schema(
    call_duration=pb.duration_field(min_duration="0:01:00", max_duration="1:30:00"),
    break_time=pb.duration_field(min_duration="0:05:00", max_duration="0:30:00"),
)

pb.generate_dataset(schema, n=30, seed=42)
shape: (30, 2)

call_duration  break_time
duration[μs]   duration[μs]
1h 28m 18s     26m 49s
16m 12s        8m 48s
4m 24s         5m 51s
38m 33s        14m 23s
34m 26s        13m 21s
31m 5s         28m 56s
1h 2m 19s      23m 36s
1h 21m 27s     19m 19s
38m 58s        12m 31s
1m 53s         20m 19s

Optional durations can be created with nullable=True, and duration fields work well alongside other field types:

schema = pb.Schema(
    task_id=pb.int_field(min_val=1, max_val=500, unique=True),
    time_spent=pb.duration_field(
        min_duration=timedelta(minutes=1),
        max_duration=timedelta(hours=8),
    ),
    overtime=pb.duration_field(
        min_duration=timedelta(0),
        max_duration=timedelta(hours=4),
        nullable=True, null_probability=0.6,
    ),
)

pb.generate_dataset(schema, n=30, seed=7)
shape: (30, 3)

task_id  time_spent    overtime
i64      duration[μs]  duration[μs]
166      2h 57m 51s    null
486      1h 23m 23s    41m 11s
78       3h 36m 37s    null
203      5h 56m 29s    2h 57m 44s
334      27m 22s       null
31       5h 9m 48s     2h 34m 24s
424      1h 8m 36s     null
290      2h 2m 55s     1h 57s
64       5h 45m 24s    null
115      5h 43m 39s    2h 51m 19s
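The examples above all rely on the built-in uniform sampling. For durations that should follow a different pattern, the generator= parameter takes a zero-argument callable that returns a datetime.timedelta. The sketch below shows one such callable. The name quarter_hour_duration and the 15-minute step are illustrative assumptions, and the commented Schema usage follows the documented signature but has not been run here.

```python
import random
from datetime import timedelta

rng = random.Random(23)  # seeded here only to make the sketch repeatable


def quarter_hour_duration() -> timedelta:
    """Return a duration between 15 minutes and 2 hours in 15-minute steps."""
    return timedelta(minutes=15 * rng.randint(1, 8))


sample = [quarter_hour_duration() for _ in range(5)]
print(sample)

# Assumed usage, following the signature documented above:
# schema = pb.Schema(
#     meeting_length=pb.duration_field(generator=quarter_hour_duration),
# )
```

Since a custom generator overrides all other constraints, the callable itself is responsible for keeping values in the intended range. Whether the seed= passed to generate_dataset() affects a custom callable is not stated here, so this sketch seeds its own random.Random instance.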