This page groups together examples for the Parquet utility helpers: pq_list_files(), pq_last_modified(), pq_archive(), pq_restore(), and pq_remove().
The goal is different from the API reference. The API pages document arguments and return values. This page shows how the utilities fit together in the practical workflow of maintaining a local Parquet repository.
When to use this page
Use these helpers when you already have a Parquet repository and want to:
inspect what is in it
check which vintage of a table you currently have
archive a current active file before replacing it
restore an older archived vintage
remove active or archived files explicitly
The related API pages are those for pq_list_files(), pq_last_modified(), pq_archive(), pq_restore(), and pq_remove().
Setup
The executable examples below assume access to a local Parquet repository with at least one known schema and table. Adjust these values to match your local setup before rendering if needed.
Inspect a schema directory
Start by listing the Parquet tables available in a schema.
pq_list_files("comp")
['g_chars',
'names',
'g_idx_mth',
'io_qbuysell',
'r_giccd',
'adsprate',
'g_names_ix_cst',
'spind_mth',
'funda_adbc_decimal_33554432',
'g_company',
'co_adesind',
'funda_adbc_decimal_16777216',
'co_ifndq',
'secm',
'r_auditors',
'security',
'aco_pnfnda',
'g_funda',
'co_afnd2',
'company.parquet.bak2',
'fundq_fncd',
'g_idx_index',
'secd',
'sec_divid',
'funda_adbc_decimal_1048576',
'names_ix',
'funda_adbc_float64_1048576',
'seg_annfund',
'fundq',
'wrds_seg_customer',
'co_filedate',
'g_security',
'wrds_segmerged',
'funda_adbc_float64_16777216',
'funda_adbc_float64_33554432',
'g_secnamesd',
'g_names_ix',
'g_idxcst_his',
'g_exrt_dly',
'funda_adbc_decimal_4194304',
'funda_adbc_float64_4194304',
'r_datacode',
'r_fndfntcd',
'idx_daily',
'idxcst_his',
'g_secm',
'co_hgic',
'r_ex_codes',
'g_secd',
'g_names',
'funda_fncd',
'g_namesq',
'company',
'seg_customer',
'funda_adbc_float64_8388608',
'funda_adbc_decimal_8388608',
'sec_history',
'g_sec_divid',
'idx_ann',
'idx_index']
If you want to inspect the archive directory instead:
pq_list_files("comp", archive=True)
['fundq_20260330T060000Z',
'funda_fncd_20260330T060000Z',
'g_secd_20260322T060000Z',
'idx_daily_20260330T060000Z',
'r_auditors_20260330T060000Z',
'company_20260331T060000Z',
'company_20260315T060000Z',
'funda_20260330T060000Z',
'g_secd_20260217T070000Z',
'company_20260303T070000Z',
'funda_20250315T064012Z',
'company_20260402T060000Z',
'company_20260105T070000Z',
'g_secd_20260216T070000Z',
'funda_20240614T064046Z',
'company_20260226T070000Z',
'g_secd_20250907T161453Z',
'funda_20260116T070000Z',
'company_20260323T060000Z',
'funda_20260407T060000Z',
'company_20260224T070000Z',
'funda_20260107T070000Z',
'g_secd_20260120T070000Z',
'company_20260225T000000Z',
'company_20260225T070000Z',
'company_20260218T070000Z',
'funda_20251109T064119Z',
'aco_pnfnda_20260330T060000Z',
'company_20260209T070000Z',
'company_20260322T060000Z',
'company_20260107T070000Z',
'funda_20250907T064233Z']
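The archive basenames above appear to follow a `<table>_<YYYYMMDD>T<HHMMSS>Z` pattern. As a minimal sketch, assuming that pattern holds for every archived file, the names can be split into a table name and a UTC timestamp and then grouped by table. The helpers `split_archive_name()` and `group_by_table()` are hypothetical and are not part of db2pq.

```python
import re
from collections import defaultdict
from datetime import datetime, timezone

# Pattern assumed from the listing above: <table>_<YYYYMMDD>T<HHMMSS>Z
ARCHIVE_RE = re.compile(r"^(?P<table>.+)_(?P<stamp>\d{8}T\d{6}Z)$")


def split_archive_name(name):
    """Split an archive basename into (table, UTC datetime). Hypothetical helper."""
    # Accept names with or without the .parquet suffix.
    stem = name[: -len(".parquet")] if name.endswith(".parquet") else name
    match = ARCHIVE_RE.match(stem)
    if match is None:
        raise ValueError(f"not an archive basename: {name!r}")
    stamp = datetime.strptime(match["stamp"], "%Y%m%dT%H%M%SZ")
    return match["table"], stamp.replace(tzinfo=timezone.utc)


def group_by_table(names):
    """Map each table name to its archived timestamps, oldest first."""
    grouped = defaultdict(list)
    for name in names:
        table, stamp = split_archive_name(name)
        grouped[table].append(stamp)
    return {table: sorted(stamps) for table, stamps in grouped.items()}


# A few names copied from the listing above.
sample = [
    "funda_fncd_20260330T060000Z",
    "g_secd_20260322T060000Z",
    "g_secd_20260217T070000Z",
]
vintages = group_by_table(sample)
```

Because the table part is matched greedily, a table whose own name contains underscores (such as funda_fncd) is still separated correctly from its timestamp.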
If your project uses a repository outside the default DATA_DIR, pass data_dir explicitly.
pq_list_files("ff", data_dir="~/Dropbox/pq_data")
['factors_daily', 'industry48', 'factors_monthly']
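One way to picture how a schema name and data_dir combine into a directory is the sketch below. It is not db2pq's code: `resolve_schema_dir()` is a hypothetical helper, and treating DATA_DIR as an environment variable that is used when data_dir is omitted is an assumption. The archive subdirectory name follows the archive paths shown elsewhere on this page.

```python
import os
from pathlib import Path


def resolve_schema_dir(schema, data_dir=None, archive=False):
    """Return the directory holding a schema's Parquet files. Hypothetical helper."""
    # Assumption: an explicit data_dir wins; otherwise fall back to DATA_DIR.
    base = data_dir if data_dir is not None else os.environ.get("DATA_DIR")
    if base is None:
        raise ValueError("pass data_dir or set the DATA_DIR environment variable")
    # expanduser() turns a leading "~" into the user's home directory.
    path = Path(base).expanduser() / schema
    return path / "archive" if archive else path


active_dir = resolve_schema_dir("ff", data_dir="/tmp/pq_data")
archive_dir = resolve_schema_dir("ff", data_dir="/tmp/pq_data", archive=True)
```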
Check the current active vintage
Use pq_last_modified() to inspect the embedded metadata on the active file.
pq_last_modified(table_name="dsf", schema="crsp")
'Daily Stock - Securities (Updated 2025-02-08)'
This is often the fastest way to confirm what vintage a local Parquet file represents before starting analysis.
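If you want to turn that check into a guard at the top of an analysis script, the returned string can be parsed. The sketch below assumes the description ends with an "(Updated YYYY-MM-DD)" tag, as in the output above; `vintage_date()` and `is_at_least()` are hypothetical helpers, not db2pq functions, and strings without that tag simply yield None.

```python
import re
from datetime import date

# Assumed tag format, taken from the example output: "(Updated 2025-02-08)"
UPDATED_RE = re.compile(r"\(Updated (\d{4})-(\d{2})-(\d{2})\)")


def vintage_date(description):
    """Extract the date from an '(Updated YYYY-MM-DD)' tag, or None if absent."""
    match = UPDATED_RE.search(description)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def is_at_least(description, minimum):
    """True when the description carries a vintage date on or after `minimum`."""
    found = vintage_date(description)
    return found is not None and found >= minimum


current = vintage_date("Daily Stock - Securities (Updated 2025-02-08)")
```

In practice the argument would be the value returned by pq_last_modified() for the table you are about to use.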
You can also inspect a specific file path directly:
from pathlib import Path
dsf_v2 = Path.home() / "Dropbox/pq_data/crsp/dsf_v2.parquet"
pq_last_modified(file_name=dsf_v2)
'Daily Stock File created by WRDS (Updated 2026-02-06)'
Inspect archived vintages
If you archive replaced files, you can ask for the archived versions of a table:
pq_last_modified(table_name="funda", schema="comp", archive=True)
0  funda_20240614T064046Z  funda  comp  2024-06-14 02:40:46-04:00  Last modified: 06/14/2024 02:40:46                 local
1  funda_20250315T064012Z  funda  comp  2025-03-15 02:40:12-04:00  Last modified: 03/15/2025 02:40:12                 local
2  funda_20250907T064233Z  funda  comp  2025-09-07 02:42:33-04:00  Last modified: 09/07/2025 02:42:33                 local
3  funda_20251109T064119Z  funda  comp  2025-11-09 01:41:19-05:00  Last modified: 11/09/2025 01:41:19                 local
4  funda_20260107T070000Z  funda  comp  2026-01-07 02:00:00-05:00  Merged Fundamental Annual File (Updated 2026-0...  local
5  funda_20260116T070000Z  funda  comp  2026-01-16 02:00:00-05:00  Merged Fundamental Annual File (Updated 2026-0...  local
6  funda_20260330T060000Z  funda  comp  2026-03-30 02:00:00-04:00  Merged Fundamental Annual File (Updated 2026-0...  local
7  funda_20260407T060000Z  funda  comp  2026-04-07 02:00:00-04:00  Merged Fundamental Annual File (Updated 2026-0...  local
That returns a table-like summary of the archived vintages for the requested dataset.
To inspect all archived files for a schema:
pq_last_modified(schema="comp", archive=True)
 0  aco_pnfnda_20260330T060000Z  aco_pnfnda  comp  2026-03-30 02:00:00-04:00  Pension Annual Item (Updated 2026-03-30)           local
 1  company_20260105T070000Z     company     comp  2026-01-05 02:00:00-05:00  Company (Updated 2026-01-05)                       local
 2  company_20260107T070000Z     company     comp  2026-01-07 02:00:00-05:00  Company (Updated 2026-01-07)                       local
 3  company_20260209T070000Z     company     comp  2026-02-09 02:00:00-05:00  Company (Updated 2026-02-09)                       local
 4  company_20260218T070000Z     company     comp  2026-02-18 02:00:00-05:00  Company (Updated 2026-02-18)                       local
 5  company_20260224T070000Z     company     comp  2026-02-24 02:00:00-05:00  Company (Updated 2026-02-24)                       local
 6  company_20260225T000000Z     company     comp  2026-02-24 02:00:00-05:00  Company (Updated 2026-02-24)                       local
 7  company_20260225T070000Z     company     comp  2026-02-25 02:00:00-05:00  Company (Updated 2026-02-25)                       local
 8  company_20260226T070000Z     company     comp  2026-02-26 02:00:00-05:00  Company (Updated 2026-02-26)                       local
 9  company_20260303T070000Z     company     comp  2026-03-03 02:00:00-05:00  Company (Updated 2026-03-03)                       local
10  company_20260315T060000Z     company     comp  2026-03-15 02:00:00-04:00  Company (Updated 2026-03-15)                       local
11  company_20260322T060000Z     company     comp  2026-03-22 02:00:00-04:00  Company (Updated 2026-03-22)                       local
12  company_20260323T060000Z     company     comp  2026-03-23 02:00:00-04:00  Company (Updated 2026-03-23)                       local
13  company_20260331T060000Z     company     comp  2026-03-31 02:00:00-04:00  Company (Updated 2026-03-31)                       local
14  company_20260402T060000Z     company     comp  2026-04-02 02:00:00-04:00  Company (Updated 2026-04-02)                       local
15  funda_20240614T064046Z       funda       comp  2024-06-14 02:40:46-04:00  Last modified: 06/14/2024 02:40:46                 local
16  funda_20250315T064012Z       funda       comp  2025-03-15 02:40:12-04:00  Last modified: 03/15/2025 02:40:12                 local
17  funda_20250907T064233Z       funda       comp  2025-09-07 02:42:33-04:00  Last modified: 09/07/2025 02:42:33                 local
18  funda_20251109T064119Z       funda       comp  2025-11-09 01:41:19-05:00  Last modified: 11/09/2025 01:41:19                 local
19  funda_20260107T070000Z       funda       comp  2026-01-07 02:00:00-05:00  Merged Fundamental Annual File (Updated 2026-0...  local
20  funda_20260116T070000Z       funda       comp  2026-01-16 02:00:00-05:00  Merged Fundamental Annual File (Updated 2026-0...  local
21  funda_20260330T060000Z       funda       comp  2026-03-30 02:00:00-04:00  Merged Fundamental Annual File (Updated 2026-0...  local
22  funda_20260407T060000Z       funda       comp  2026-04-07 02:00:00-04:00  Merged Fundamental Annual File (Updated 2026-0...  local
23  funda_fncd_20260330T060000Z  funda_fncd  comp  2026-03-30 02:00:00-04:00  Fundamental Annual Footnote and Data Code File...  local
24  fundq_20260330T060000Z       fundq       comp  2026-03-30 02:00:00-04:00  Merged Fundamental Quarterly File (Updated 202...  local
25  g_secd_20250907T161453Z      g_secd      comp  2025-09-07 12:14:53-04:00  Last modified: 09/07/2025 12:14:53                 local
26  g_secd_20260120T070000Z      g_secd      comp  2026-01-20 02:00:00-05:00  Merged Global Security Daily File (Updated 202...  local
27  g_secd_20260216T070000Z      g_secd      comp  2026-02-16 02:00:00-05:00  Merged Global Security Daily File (Updated 202...  local
28  g_secd_20260217T070000Z      g_secd      comp  2026-02-17 02:00:00-05:00  Merged Global Security Daily File (Updated 202...  local
29  g_secd_20260322T060000Z      g_secd      comp  2026-03-22 02:00:00-04:00  Merged Global Security Daily File (Updated 202...  local
30  idx_daily_20260330T060000Z   idx_daily   comp  2026-03-30 02:00:00-04:00  Index Daily (Updated 2026-03-30)                   local
31  r_auditors_20260330T060000Z  r_auditors  comp  2026-03-30 02:00:00-04:00  Auditors Reference Data (Updated 2026-03-30)       local
Archive the currently active file
You can archive a file manually even outside an update workflow.
from db2pq import pq_archive
pq_archive(table_name="funda", schema="comp")
Or archive an exact file path:
company = Path.home() / "Dropbox/pq_data/comp/company.parquet"
company_archive = pq_archive(file_name=company)
company_archive
'/Users/igow/Dropbox/pq_data/comp/archive/company_20260407T060000Z.parquet'
This is useful when you want to preserve the current active vintage before running an experimental refresh or downstream transformation.
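The returned path suggests a naming scheme: the archived file sits in an archive subdirectory next to the active file, with a UTC timestamp appended to the table name. The sketch below reproduces that scheme under those assumptions. `archive_destination()` is a hypothetical helper, and how db2pq actually derives the timestamp is not shown on this page, so the sketch takes it as an argument.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def archive_destination(active_file, vintage):
    """Build <schema_dir>/archive/<table>_<UTC stamp>.parquet. Hypothetical helper."""
    active_file = Path(active_file)
    # Convert the timezone-aware timestamp to UTC before formatting it.
    stamp = vintage.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    name = f"{active_file.stem}_{stamp}{active_file.suffix}"
    return active_file.parent / "archive" / name


# 02:00 at UTC-04:00 is 06:00 UTC, matching the ...T060000Z names on this page.
eastern_daylight = timezone(timedelta(hours=-4))
dest = archive_destination(
    "/data/comp/company.parquet",
    datetime(2026, 4, 7, 2, 0, 0, tzinfo=eastern_daylight),
)
```

This only computes a path; it does not touch the file system.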
Restore an archived vintage
To promote an archived file back into the active schema directory:
from db2pq import pq_restore
archive_files = pq_list_files("comp", archive=True)
if archive_files:
    pq_restore(archive_files[0], "comp")
The archived basename may include or omit the .parquet suffix.
If an active destination file already exists, pq_restore() archives that file first by default before restoring the archived vintage.
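The archive-first behaviour described above can be illustrated with plain file operations in a temporary directory. This is only a sketch of the described semantics, not db2pq's implementation: `restore()` is a hypothetical function, it assumes archived files are moved rather than copied, and the timestamp used for the displaced active file is passed in by hand.

```python
import shutil
import tempfile
from pathlib import Path


def restore(archive_name, schema_dir, stamp_for_current, archive_existing=True):
    """Promote an archived file to the active directory. Hypothetical sketch."""
    schema_dir = Path(schema_dir)
    archive_dir = schema_dir / "archive"
    # Accept the basename with or without the .parquet suffix.
    suffix = ".parquet"
    stem = archive_name[: -len(suffix)] if archive_name.endswith(suffix) else archive_name
    source = archive_dir / f"{stem}{suffix}"
    # Everything before the final "_<timestamp>" is taken to be the table name.
    table = stem.rsplit("_", 1)[0]
    active = schema_dir / f"{table}{suffix}"
    if active.exists() and archive_existing:
        # Preserve the current active file before it is replaced.
        kept = archive_dir / f"{table}_{stamp_for_current}{suffix}"
        shutil.move(str(active), str(kept))
    shutil.move(str(source), str(active))
    return active


with tempfile.TemporaryDirectory() as tmp:
    schema_dir = Path(tmp) / "comp"
    (schema_dir / "archive").mkdir(parents=True)
    # Text files stand in for real Parquet files in this illustration.
    (schema_dir / "company.parquet").write_text("current")
    (schema_dir / "archive" / "company_20260105T070000Z.parquet").write_text("old")

    restored = restore("company_20260105T070000Z", schema_dir, "20260407T060000Z")
    restored_text = restored.read_text()
    archived_names = sorted(p.name for p in (schema_dir / "archive").iterdir())
```

After the call, the active file holds the older vintage and the previously active file sits in the archive directory under its own timestamped name.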
Remove a file explicitly
Use pq_remove() when you want to delete an active or archived file rather than archive it.
from db2pq import pq_remove
pq_remove(table_name="dsi", schema="crsp")
'/Users/igow/Dropbox/pq_data/crsp/dsi.parquet'
Of course, I probably want that file, so let me use wrds_update_pq() to recover it!
from db2pq import wrds_update_pq
wrds_update_pq(table_name="dsi", schema="crsp")
Updated crsp.dsi is available.
Beginning file download at 2026-04-07 21:26:29 UTC.
Completed file download at 2026-04-07 21:26:32 UTC.
'/Users/igow/Dropbox/pq_data/crsp/dsi.parquet'
To remove an archived file:
pq_remove(
    table_name="funda_20260331T060000Z",
    schema="comp",
    archive=True,
)
Or remove a file by exact path:
pq_remove(file_name=company_archive)
'/Users/igow/Dropbox/pq_data/comp/archive/company_20260407T060000Z.parquet'