Schema.scan_delta

classmethod Schema.scan_delta(
source: str | Path | deltalake.DeltaTable,
*,
validation: Validation = 'warn',
**kwargs: Any,
) → LazyFrame[Self]

Lazily read a Delta Lake table into a typed data frame with this schema.

Unlike polars.scan_delta(), this method checks the table’s metadata and, if necessary, runs validation to ensure that the data matches this schema.

Parameters:
  • source – Path or DeltaTable object from which to read the data.

  • validation

    The strategy for running validation when reading the data:

    • "allow": The method tries to read the schema stored in the table’s metadata. If the stored schema matches this schema, the data frame is read without validation. If the stored schema does not match this schema, or no schema information can be found in the metadata, this method automatically runs validate() with cast=True.

    • "warn": The method behaves like "allow" but emits a warning if validation is necessary.

    • "forbid": The method never runs validation automatically and only returns if the schema stored in the table’s metadata matches this schema.

    • "skip": The method never runs validation and simply reads the table, trusting the user that the data matches this schema. Use this option carefully and consider replacing it with polars.scan_delta() to convey the purpose better.

  • kwargs – Additional keyword arguments passed directly to polars.scan_delta().

Returns:

The lazy data frame with this schema.

Raises:

ValidationRequiredError – If no schema information can be read from the source and validation is set to "forbid".
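The four validation strategies can be summarized as a small decision function. This is only a reading aid in plain Python, not dataframely's implementation: plan_read, its return strings, and the local ValidationRequiredError stand-in are all hypothetical, and it assumes that "forbid" raises both when the stored schema is missing and when it does not match.

```python
from typing import Literal

Validation = Literal["allow", "warn", "forbid", "skip"]


class ValidationRequiredError(Exception):
    """Local stand-in for the library's exception of the same name."""


def plan_read(validation: Validation, stored_schema_matches: bool) -> str:
    """Describe what the documented strategy implies for a read.

    `stored_schema_matches` is False both when the stored schema differs
    from this schema and when no schema information is found at all.
    """
    if validation == "skip":
        # The metadata is irrelevant: the data is read as-is.
        return "read without validation"
    if stored_schema_matches:
        # "allow", "warn" and "forbid" all accept a matching stored schema.
        return "read without validation"
    if validation == "forbid":
        # Assumption: a mismatch is treated like missing schema information.
        raise ValidationRequiredError("stored schema is missing or does not match")
    if validation == "warn":
        return "warn, then validate with cast=True"
    return "validate with cast=True"  # validation == "allow"


for mode in ("allow", "warn", "skip"):
    print(mode, "->", plan_read(mode, stored_schema_matches=False))
```

Under this reading, "skip" is the only mode that can return data which was never checked against the schema, while "forbid" is the only mode that can fail without attempting validation.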

Attention

Schema metadata is stored as custom commit metadata. Only the schema information from the last commit is used, so any modification of the table that is not made through dataframely causes the metadata to be lost.

This method suffers from the same limitations as serialize().