Dataframely

Data Generation

Collection.create_empty()

Create an empty collection without any data.

Collection.sample([num_rows, overrides, ...])

Create a random sample from the members of this collection.

© Copyright 2025, QuantCo, Inc.
