Schema.scan_parquet
- classmethod Schema.scan_parquet(
- source: str | Path | IO[bytes] | bytes | list[str] | list[Path] | list[IO[bytes]] | list[bytes],
- *,
- validation: Literal['allow', 'forbid', 'warn', 'skip'] = 'warn',
- **kwargs: Any,
Lazily read a parquet file into a typed data frame with this schema.
Compared to polars.scan_parquet(), this method checks the parquet file’s metadata and, if necessary, runs validation to ensure that the data matches this schema.
- Parameters:
source – Path, directory, or file-like object from which to read the data.
validation –
The strategy for running validation when reading the data:
"allow": The method tries to read the parquet file’s metadata. If the stored schema matches this schema, the data frame is read without validation. If the stored schema does not match this schema, or no schema information can be found in the metadata, this method automatically runs validate() with cast=True.
"warn": The method behaves like "allow" but emits a warning if validation is necessary.
"forbid": The method never runs validation automatically and only returns if the schema stored in the parquet file’s metadata matches this schema.
"skip": The method never runs validation and simply reads the parquet file, trusting the user that the data matches the schema. Use this option carefully and consider replacing it with polars.scan_parquet() to convey the intent more clearly.
kwargs – Additional keyword arguments passed directly to
polars.scan_parquet().
- Returns:
The data frame with this schema.
- Raises:
ValidationRequiredError – If no schema information can be read from the source and validation is set to "forbid".
Attention
Be aware that this method suffers from the same limitations as
serialize().
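Example
The four validation strategies described above can be summarized as a small decision function. This is only an illustrative sketch in plain Python, not the library's implementation: the function decide_action, its stored_schema_matches argument, and the local ValidationRequiredError class are hypothetical stand-ins. It also assumes that "forbid" raises on a schema mismatch as well as on missing schema information, which is how the description of "forbid" reads.

```python
import warnings
from typing import Literal, Optional

Validation = Literal["allow", "forbid", "warn", "skip"]


class ValidationRequiredError(Exception):
    """Local stand-in for the error named above (not the library's class)."""


def decide_action(
    validation: Validation, stored_schema_matches: Optional[bool]
) -> str:
    """Return "read" (no validation) or "validate" (run validation with casting).

    stored_schema_matches is True if the schema in the file metadata matches,
    False if it differs, and None if the metadata holds no schema information.
    """
    if validation == "skip":
        # Never validate; the caller vouches for the data.
        return "read"
    if stored_schema_matches is True:
        # The stored schema matches, so validation is unnecessary.
        return "read"
    if validation == "forbid":
        # Validation would be required but must not run automatically.
        raise ValidationRequiredError("validation is required but forbidden")
    if validation == "warn":
        warnings.warn("validation is required when reading this file", stacklevel=2)
    # "allow" and "warn" both fall back to validating the data.
    return "validate"
```

Under these assumptions, decide_action("allow", None) returns "validate", while decide_action("forbid", None) raises the stand-in error.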