core.wrds_sql_to_pq
core.wrds_sql_to_pq(
sql,
table_name,
schema,
*,
wrds_id=None,
data_dir=None,
row_group_size=1048576,
modified=None,
alt_table_name=None,
threads=3,
tz='UTC',
engine=None,
adbc_batch_size_hint_bytes=None,
adbc_use_copy=None,
archive=False,
archive_dir=None,
)
Run a SQL query against WRDS PostgreSQL and write the result to Parquet.
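A minimal usage sketch, based only on the signature above. This page does not name the package that exposes `core`, so the function is passed in as an argument rather than imported. The `export_query` helper, the SQL text, and the `crsp` / `dsf_recent` names are illustrative assumptions, not part of the documented API.

```python
from typing import Callable, Optional


def export_query(wrds_sql_to_pq: Callable[..., str],
                 data_dir: Optional[str] = None) -> str:
    """Call the documented function with an illustrative query.

    `wrds_sql_to_pq` is the callable imported from the library's `core`
    module; it is injected here so the sketch does not guess the import path.
    """
    # Illustrative SQL: the table and column names are assumptions.
    sql = "SELECT permno, date, ret FROM crsp.dsf WHERE date >= '2020-01-01'"
    return wrds_sql_to_pq(
        sql,            # sql (positional)
        "dsf_recent",   # table_name (positional)
        "crsp",         # schema (positional)
        # Everything after the bare `*` in the signature is keyword-only.
        data_dir=data_dir,
        engine="duckdb",
        threads=3,
        tz="UTC",
    )
```

Leaving out `wrds_id` relies on the documented fallback to the `WRDS_ID` environment variable.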
Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| sql | | SQL query to execute against the WRDS PostgreSQL database. | required |
| table_name | | Logical source table name, used as the basename of the output Parquet file unless alt_table_name is supplied. | required |
| schema | | Schema name, used in the directory layout of the output Parquet file. | required |
| wrds_id | str | WRDS user ID used to access WRDS services. Required: supply it either explicitly or via the WRDS_ID environment variable. | None |
| data_dir | str | Root directory of the Parquet data repository. | None |
| row_group_size | int | Maximum number of rows in each written row group. | 1048576 |
| modified | str | Last-modified string to embed in the Parquet metadata. | None |
| alt_table_name | str | Basename of the Parquet file. Use when the file should have a different name than table_name. | None |
| threads | int | Maximum number of DuckDB worker threads to use when engine="duckdb". | 3 |
| tz | str | Time zone assumed for naive PostgreSQL timestamps before the Parquet output is normalized to UTC. | 'UTC' |
| engine | (duckdb, adbc) | Query execution engine used to run the SQL against WRDS PostgreSQL. | "duckdb" |
| adbc_batch_size_hint_bytes | int | ADBC batch size hint in bytes when engine="adbc". | None |
| adbc_use_copy | bool | Explicitly enable or disable the PostgreSQL ADBC driver’s COPY optimization when engine="adbc". | None |
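The `tz` parameter describes a two-step treatment of naive timestamps: interpret them in the given zone, then normalize to UTC. The sketch below illustrates that semantics with the standard library only. It is not the library's implementation, and the `naive_to_utc` helper is a hypothetical name introduced for this example.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def naive_to_utc(ts: datetime, tz: str = "UTC") -> datetime:
    """Interpret a naive timestamp as wall-clock time in `tz`, then convert to UTC."""
    if ts.tzinfo is not None:
        raise ValueError("expected a naive timestamp")
    # Attach the assumed zone, then shift to UTC.
    return ts.replace(tzinfo=ZoneInfo(tz)).astimezone(timezone.utc)


# 09:30 on a January day in New York is 14:30 UTC (EST is UTC-5).
local = datetime(2024, 1, 16, 9, 30)
print(naive_to_utc(local, tz="America/New_York"))
```

With the default `tz='UTC'`, the wall-clock value is unchanged and only the UTC marker is attached.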
Returns
| Name | Type | Description |
| --- | --- | --- |
| pq_file | str | Name of parquet file created. |
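The parameter descriptions suggest that the output location is built from `data_dir`, `schema`, and either `table_name` or `alt_table_name`. The sketch below shows one plausible reading of that rule. The `<data_dir>/<schema>/<basename>.parquet` layout, the `.parquet` extension, and the `expected_parquet_path` helper are all assumptions made for illustration; check the actual return value of `wrds_sql_to_pq` before relying on them.

```python
from pathlib import Path
from typing import Optional


def expected_parquet_path(data_dir: str, schema: str, table_name: str,
                          alt_table_name: Optional[str] = None) -> Path:
    """Guess the output path under an assumed directory layout."""
    # alt_table_name, when given, replaces table_name as the file basename.
    basename = alt_table_name or table_name
    # Assumed layout: <data_dir>/<schema>/<basename>.parquet
    return Path(data_dir) / schema / f"{basename}.parquet"


print(expected_parquet_path("/data/pq", "crsp", "dsf"))
print(expected_parquet_path("/data/pq", "crsp", "dsf", alt_table_name="dsf_recent"))
```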