core.wrds_sql_to_pq

core.wrds_sql_to_pq(
    sql,
    table_name,
    schema,
    *,
    wrds_id=None,
    data_dir=None,
    row_group_size=1048576,
    modified=None,
    alt_table_name=None,
    threads=3,
    tz='UTC',
    engine=None,
    adbc_batch_size_hint_bytes=None,
    adbc_use_copy=None,
    archive=False,
    archive_dir=None,
)

Run a SQL query against WRDS PostgreSQL and write the result to Parquet.
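As a rough illustration of the signature above, the sketch below assembles keyword arguments for one call. The query, table, and alternative file name are illustrative values, and `run_export` is a hypothetical helper. This reference does not name the enclosing package, so the function is passed in rather than imported.

```python
# Keyword arguments mirroring the documented signature; all values are illustrative.
call_args = {
    "sql": "SELECT gvkey, datadate, at FROM comp.funda WHERE fyear >= 2020",
    "table_name": "funda",
    "schema": "comp",
    "alt_table_name": "funda_recent",  # basename to use instead of table_name
    "row_group_size": 1048576,         # the documented default, shown explicitly
    "tz": "UTC",
}


def run_export(wrds_sql_to_pq):
    """Call the function documented here with the arguments above.

    Pass in wrds_sql_to_pq as imported from the `core` module of your
    installation; the package name is not stated in this reference.
    """
    # wrds_id is omitted, so it must come from the WRDS_ID environment variable.
    return wrds_sql_to_pq(**call_args)
```

Everything after `schema` is keyword-only in the documented signature, so passing all arguments by name keeps the call valid whichever optional arguments are used.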

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| sql | | SQL query to execute against the WRDS PostgreSQL database. | required |
| table_name | | Logical source table name, used as the output parquet basename unless alt_table_name is supplied. | required |
| schema | | Schema name, used for the output parquet directory layout. | required |
| wrds_id | str | WRDS user ID used to access WRDS services. Must be supplied either explicitly or via the WRDS_ID environment variable. | None |
| data_dir | str | Root directory of the parquet data repository. | None |
| row_group_size | int | Maximum number of rows in each written row group. | 1048576 |
| modified | str | Last-modified string to embed in the parquet metadata. | None |
| alt_table_name | str | Basename of the parquet file. Use when the file should have a different name than table_name. | None |
| threads | int | Maximum number of DuckDB worker threads to use when engine="duckdb". | 3 |
| tz | str | Time zone assumed for naive PostgreSQL timestamps before the parquet output is normalized to UTC. | 'UTC' |
| engine | "duckdb" or "adbc" | Query execution engine used to run the SQL against WRDS PostgreSQL. The signature default is None; the engine used by default is "duckdb". | None |
| adbc_batch_size_hint_bytes | int | ADBC batch size hint in bytes when engine="adbc". | None |
| adbc_use_copy | bool | Explicitly enable or disable the PostgreSQL ADBC driver’s COPY optimization when engine="adbc". | None |
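The tz parameter describes a two-step treatment of naive timestamps: interpret them in the given zone, then normalize to UTC. The sketch below illustrates that behavior with the standard library only. `naive_to_utc` is a hypothetical helper written for this example, not the library's implementation, which may do the conversion inside the query engine.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def naive_to_utc(ts: datetime, tz: str = "UTC") -> datetime:
    """Interpret a naive timestamp as local time in `tz`, then convert to UTC."""
    if ts.tzinfo is not None:
        raise ValueError("expected a naive timestamp")
    # Attach the assumed zone, then shift the wall-clock time to UTC.
    return ts.replace(tzinfo=ZoneInfo(tz)).astimezone(timezone.utc)


naive = datetime(2024, 1, 15, 9, 30)
print(naive_to_utc(naive))                      # 2024-01-15 09:30:00+00:00
print(naive_to_utc(naive, "America/New_York"))  # 2024-01-15 14:30:00+00:00
```

With the default tz='UTC' the stored wall-clock value is unchanged. Any other zone shifts it by that zone's offset on the date in question.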

Returns

| Name | Type | Description |
|---|---|---|
| pq_file | str | Name of parquet file created. |
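The parameter descriptions say that schema determines the output directory and that table_name (or alt_table_name, when supplied) provides the file's basename. The sketch below shows one layout consistent with that description. The `<data_dir>/<schema>/<basename>.parquet` pattern and the `expected_parquet_path` helper are assumptions made for illustration, so check the value returned in pq_file for the actual location.

```python
from pathlib import Path
from typing import Optional


def expected_parquet_path(
    data_dir: str,
    schema: str,
    table_name: str,
    alt_table_name: Optional[str] = None,
) -> Path:
    """Guess the output location under an assumed directory layout."""
    # alt_table_name, when given, replaces table_name as the file basename.
    basename = alt_table_name or table_name
    # Assumed layout: <data_dir>/<schema>/<basename>.parquet
    return Path(data_dir) / schema / f"{basename}.parquet"


print(expected_parquet_path("data", "comp", "funda").as_posix())
print(expected_parquet_path("data", "comp", "funda", "funda_recent").as_posix())
```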