Schema.validate
- classmethod Schema.validate( ) → DataFrame[Self] | LazyFrame[Self]
Validate that a data frame satisfies the schema.
If an eager data frame is passed as input, validation is performed within this function. If a lazy frame is passed, the lazy frame is simply extended with the validation logic. The logic will only be executed (and potentially raise an error) once collect() is called on it.
- Parameters:
df – The data frame to validate.
cast – Whether columns with a wrong data type in the input data frame are cast to the schema’s defined data type if possible.
eager – Whether the validation should be performed eagerly and this method should raise upon failure. If False, the returned lazy frame will fail to collect if the validation does not pass.
- Returns:
The input eager or lazy frame, wrapped in a generic version of the input’s data frame type to reflect schema adherence. This operation is guaranteed to maintain input ordering of rows.
- Raises:
SchemaError – If eager=True and the input data frame is missing columns, or if cast=False and any data type does not match the definition in this schema. Only raised upon collection if eager=False.
ValidationError – If eager=True and any rule in the schema is violated, i.e. the data does not pass the validation. When eager=False, a ComputeError is raised upon collecting.
InvalidOperationError – If eager=True, cast=True, and the cast fails for any value in the data. Only raised upon collection if eager=False.