I/O

Writing Data

Collection.write_parquet(directory, **kwargs)

Write the members of this collection to parquet files in a directory.

Collection.sink_parquet(directory, **kwargs)

Stream the members of this collection into parquet files in a directory.

Collection.write_delta(target, **kwargs)

Write the members of this collection to Delta Lake tables.

Reading Data

Collection.read_parquet(directory, *[, ...])

Read all collection members from parquet files in a directory.

Collection.scan_parquet(directory, *[, ...])

Lazily read all collection members from parquet files in a directory.

Collection.read_delta(source, *[, validation])

Read all collection members from Delta Lake tables.

Collection.scan_delta(source, *[, validation])

Lazily read all collection members from Delta Lake tables.

Collection Serialization

Collection.serialize()

Serialize the metadata for this collection to a JSON string.

deserialize_collection(data)

Deserialize a collection from a JSON string.

read_parquet_metadata_collection(source)

Read a dataframely Collection type from the metadata of a parquet file.