ibis.ibis_to_pq
ibis.ibis_to_pq(
table,
out_file,
*,
engine=None,
row_group_size=1024 * 1024,
threads=None,
tz='UTC',
adbc_batch_size_hint_bytes=None,
adbc_use_copy=None,
**writer_kwargs,
)
Write an Ibis PostgreSQL table expression to a Parquet file.
This helper compiles an Ibis PostgreSQL expression to SQL and runs it through the same PostgreSQL export engines used elsewhere in db2pq. The resulting Arrow stream is written directly to the destination Parquet file.
ibis_to_pq() currently supports Ibis expressions backed by a PostgreSQL connection. To use it, install the optional dependency:
pip install "db2pq[ibis]"
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| table | | Ibis table expression backed by PostgreSQL. This may be a base table or a derived expression such as a filtered, selected, or mutated query. | required |
| out_file | str or path-like | Destination Parquet file path. | required |
| engine | (duckdb, adbc) | Query execution engine used to run the compiled PostgreSQL SQL. If omitted, uses the configured default engine from set_default_engine() / DB2PQ_ENGINE. | "duckdb" |
| row_group_size | int | Maximum number of rows in each written Parquet row group. | 1024 * 1024 |
| threads | int | Maximum DuckDB worker threads to use when engine="duckdb". | None |
| tz | str | Time zone assumption for naive PostgreSQL timestamps before normalizing Parquet output to UTC. | 'UTC' |
| adbc_batch_size_hint_bytes | int | ADBC batch size hint in bytes when engine="adbc". | None |
| adbc_use_copy | bool | Explicitly enable or disable the PostgreSQL ADBC driver’s COPY optimization when engine="adbc". | None |
| **writer_kwargs | | Additional keyword arguments passed to pyarrow.parquet.ParquetWriter. This can be used to set options such as compression="zstd". | {} |
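As a rough illustration of what row_group_size means, the sketch below computes the row-group lengths a result would have if rows were packed as tightly as the limit allows. The helper name row_group_lengths is hypothetical and is not part of db2pq; the real writer may emit smaller groups depending on how the Arrow stream is batched, so treat this only as an upper bound on group size.

```python
def row_group_lengths(total_rows, row_group_size=1024 * 1024):
    """Hypothetical helper: lengths of row groups under tight packing.

    Illustrates row_group_size as a maximum; it does not call db2pq.
    """
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if row_group_size <= 0:
        raise ValueError("row_group_size must be positive")
    # Full groups hold exactly row_group_size rows; a final partial
    # group holds whatever is left over.
    full_groups, remainder = divmod(total_rows, row_group_size)
    lengths = [row_group_size] * full_groups
    if remainder:
        lengths.append(remainder)
    return lengths
```

Under this assumption, a 2,500,000-row result with the default setting would need at least three row groups, and no group would exceed 1,048,576 rows.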
Returns
| Name | Type | Description |
|---|---|---|
| pq_file | str | Name of the Parquet file created. |
Raises
| Name | Type | Description |
|---|---|---|
| | TypeError | If the supplied Ibis expression is not backed by PostgreSQL, or if PostgreSQL connection information cannot be determined from the backend. |
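One possible way to handle the TypeError described above is a small wrapper that reports the problem and returns None instead of propagating the exception. The name export_or_none is hypothetical and not part of db2pq. Note that Python also raises TypeError for other mistakes, such as an unexpected keyword argument, so this sketch cannot tell those cases apart.

```python
def export_or_none(expr, out_file, **kwargs):
    """Hypothetical wrapper: return the written file name, or None on TypeError."""
    # Imported lazily so this module loads even when the optional
    # dependency (pip install "db2pq[ibis]") is not installed.
    from db2pq import ibis_to_pq

    try:
        return ibis_to_pq(expr, out_file, **kwargs)
    except TypeError as exc:
        # Documented case: expr is not backed by PostgreSQL, or the
        # connection details could not be determined from the backend.
        print(f"skipping {out_file}: {exc}")
        return None
```

This may be convenient when looping over expressions from mixed backends, where a non-PostgreSQL expression should be skipped rather than abort the whole run.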
Examples
>>> from db2pq import ibis_to_pq
>>> expr = con.table("my_table").filter(lambda t: t.id > 100)
>>> ibis_to_pq(expr, "my_table.parquet")
'my_table.parquet'
>>> expr = con.table("my_table").select("id", "value")
>>> ibis_to_pq(expr, "my_table.parquet", compression="zstd")
'my_table.parquet'
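The examples above can also be combined with the optional parameters. The sketch below assumes con is an Ibis PostgreSQL connection (for instance one created with ibis.postgres.connect) and that my_table has id and value columns. The function name export_filtered and the specific option values are illustrative only.

```python
def export_filtered(con, out_file="my_table.parquet", min_id=100):
    """Illustrative only: export a filtered, projected query to Parquet."""
    # Imported lazily; requires: pip install "db2pq[ibis]"
    from db2pq import ibis_to_pq

    # Derived expression: filter rows, then keep two columns.
    expr = (
        con.table("my_table")
        .filter(lambda t: t.id > min_id)
        .select("id", "value")
    )
    return ibis_to_pq(
        expr,
        out_file,
        engine="duckdb",            # run the compiled SQL through DuckDB
        threads=4,                  # only consulted when engine="duckdb"
        row_group_size=256 * 1024,  # smaller row groups than the default
        compression="zstd",         # forwarded via **writer_kwargs
    )
```

Passing engine explicitly keeps the call independent of whatever default has been configured through set_default_engine() or DB2PQ_ENGINE.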