The int_field() function defines the constraints and behavior for an integer column when generating synthetic data with generate_dataset(). You can control the range of values with min_val= and max_val=, restrict values to a specific set with allowed=, enforce uniqueness with unique=True, and introduce null values with nullable=True and null_probability=. The dtype= parameter lets you choose the specific integer type (e.g., "Int8", "UInt16", "Int64"), which also determines the valid range of values.
When no constraints are specified, values are drawn uniformly from the full range of the chosen integer dtype. If min_val=, max_val=, or both are provided, values are drawn uniformly from the resulting range, with any missing bound falling back to the dtype limit. If allowed= is provided, values are sampled from that list instead.
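The sampling rules above can be modeled in a few lines of plain Python. This is only an illustrative sketch built on the standard library: `DTYPE_BOUNDS` and `sample_int()` are hypothetical names that are not part of the library, the `"Int64"` default is an assumption, and the real generator may differ in detail.

```python
import random

# Hypothetical lookup of value ranges for a few integer dtypes.
DTYPE_BOUNDS = {
    "Int8": (-2**7, 2**7 - 1),
    "UInt16": (0, 2**16 - 1),
    "Int64": (-2**63, 2**63 - 1),
}


def sample_int(rng, dtype="Int64", min_val=None, max_val=None, allowed=None):
    """Draw one integer following the rules described above (illustrative only)."""
    if allowed is not None:
        # allowed= acts as a categorical constraint: sample from the list.
        return rng.choice(allowed)
    lo, hi = DTYPE_BOUNDS[dtype]
    # A missing bound falls back to the dtype's own limit.
    lo = lo if min_val is None else min_val
    hi = hi if max_val is None else max_val
    return rng.randint(lo, hi)  # both ends inclusive


rng = random.Random(23)
ages = [sample_int(rng, min_val=0, max_val=120) for _ in range(5)]
ratings = [sample_int(rng, allowed=[1, 2, 3, 4, 5]) for _ in range(5)]
small = [sample_int(rng, dtype="Int8") for _ in range(5)]
```

Here `ages` stays within 0 to 120, `ratings` only contains values from the allowed list, and `small` is bounded by the Int8 range because no other constraint was given.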
Parameters
min_val:int | None=None
Minimum value (inclusive). Default is None (no minimum, uses dtype lower bound).
max_val:int | None=None
Maximum value (inclusive). Default is None (no maximum, uses dtype upper bound).
allowed:list[int] | None=None
List of allowed values (categorical constraint). When provided, values are sampled from this list. Cannot be combined with min_val=/max_val=.
nullable:bool=False
Whether the column can contain null values. Default is False.
null_probability:float=0.0
Probability of generating a null value for each row when nullable=True. Must be between 0.0 and 1.0. Default is 0.0.
unique:bool=False
Whether all values must be unique. Default is False. When True, the generator retries until it has produced as many distinct values as rows requested (subject to retry limits).
generator:Callable[[], Any] | None=None
Custom callable that generates values. When provided, this overrides all other constraints (min_val=, max_val=, allowed=, etc.). The callable should take no arguments and return a single integer value.
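One way to picture how unique=, nullable=, null_probability=, and a zero-argument generator= callable could fit together is the sketch below. It is a stand-alone model, not the library's implementation: `generate_column()`, its `max_retries` argument, the `ValueError` raised when retries run out, and the choice to exempt nulls from the uniqueness check are all assumptions made for illustration.

```python
import random


def generate_column(n, draw, unique=False, nullable=False,
                    null_probability=0.0, max_retries=1000, seed=None):
    """Build n values from a zero-argument draw() callable (illustrative only)."""
    rng = random.Random(seed)
    values, seen, retries = [], set(), 0
    while len(values) < n:
        if nullable and rng.random() < null_probability:
            # Assumption: nulls are not subject to the uniqueness check.
            values.append(None)
            continue
        value = draw()
        if unique and value in seen:
            # Duplicate drawn: retry, but give up after max_retries attempts.
            retries += 1
            if retries > max_retries:
                raise ValueError("could not produce enough distinct values")
            continue
        seen.add(value)
        values.append(value)
    return values


source = random.Random(7)
# A custom zero-argument callable plays the role of generator=.
ids = generate_column(50, lambda: source.randint(1, 10_000), unique=True)
scores = generate_column(200, lambda: source.randint(0, 100),
                         nullable=True, null_probability=0.25, seed=1)
```

The retry cap matters when the requested number of rows exceeds the number of distinct values the constraints allow; without it, a request such as five unique values from a single-value range would never terminate.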
Returns
An integer field specification that can be passed to Schema().
Raises
ValueError
If min_val is greater than max_val, if allowed is an empty list, if null_probability is not between 0.0 and 1.0, or if dtype is not a valid integer type.
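The checks listed above could look roughly like the following. This is a hedged sketch rather than the library's code: `check_int_field_args()` and `VALID_DTYPES` are hypothetical names, the full set of accepted dtype strings is assumed from the three examples given earlier, and treating 0.0 and 1.0 as valid probabilities is also an assumption.

```python
# Assumed set of accepted dtype names; only "Int8", "UInt16", and "Int64"
# are mentioned in the text.
VALID_DTYPES = {
    "Int8", "Int16", "Int32", "Int64",
    "UInt8", "UInt16", "UInt32", "UInt64",
}


def check_int_field_args(min_val=None, max_val=None, allowed=None,
                         null_probability=0.0, dtype="Int64"):
    """Raise ValueError for the invalid inputs listed above (illustrative only)."""
    if min_val is not None and max_val is not None and min_val > max_val:
        raise ValueError(f"min_val ({min_val}) is greater than max_val ({max_val})")
    if allowed is not None and len(allowed) == 0:
        raise ValueError("allowed must not be an empty list")
    if not 0.0 <= null_probability <= 1.0:
        raise ValueError("null_probability must be between 0.0 and 1.0")
    if dtype not in VALID_DTYPES:
        raise ValueError(f"{dtype!r} is not a valid integer dtype")
```

Validating at definition time means a bad specification fails when the field is declared, before any rows are generated.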
Examples
The min_val= and max_val= parameters constrain the range of generated values, while allowed= restricts values to a specific set: