Notes
This site publishes a curated set of notes. Use the category filters to find notes by topic.
An additional note generated from a separate subproject is available here: Converting lazy data frames into Parquet files (Python version).
| Title | Date | Categories |
|---|---|---|
| Converting lazy data frames into Parquet files | Mar 22, 2026 | R, Parquet, db2pq, WRDS, Tidy Finance |
| Ball and Brown (1968): A reading guide and replication | Mar 15, 2026 | Accounting research, Replication, Python |
| Data management ideas for researchers | Feb 25, 2026 | Python, Parquet, db2pq, WRDS |
| Data management ideas for researchers (R version) | Feb 25, 2026 | R, Parquet, db2pq, WRDS |
| Data curation: The case of Call Reports | Feb 18, 2026 | Data curation, Polars, DuckDB |
| Data collection (with spreadsheets) | Feb 5, 2026 | Data curation, Spreadsheets |
| Data curation and the data science workflow | Jan 29, 2026 | Data curation, Australia, ASX, SIRCA |
| Some benchmarks with comp.g_secd | Jan 21, 2026 | SAS, WRDS, CRSP, Parquet, Python |
| The best of both worlds: Using modern data frame libraries to create pandas data | Jan 20, 2026 | WRDS, Polars, Ibis, pandas |
| Using SAS to create pandas data | Jan 20, 2026 | SAS, pandas, wrds2pg |
| Shared code | Jan 15, 2026 | research, web data |
| Reproducible data collection | Jan 5, 2026 | Reproducibility, Research methods |
| Writing better SQL without writing SQL | Dec 17, 2025 | SQL, dbplyr |
| Responsive open-source software: Two examples from dbplyr | Dec 17, 2025 | Data curation, dbplyr, SQL, DuckDB |
| Responsive open-source software: Two examples from dbplyr | Dec 17, 2025 | dbplyr, SQL |
| Analysis of IPOs on the ASX | Sep 12, 2025 | Australia, IPOs |
| SIRCA ASX End of Day (EOD) collection | Sep 11, 2025 | Australia, SIRCA, ASX, CSV, Parquet, dbplyr |
| SIRCA Mergers and Acquisitions collection | Sep 11, 2025 | Australia, M&A, ASX, SIRCA, Parquet |
| Defining winter and summer in Oxford | Mar 10, 2025 | Weather, Oxford, Python |
| Stock returns on Yahoo Finance | Feb 26, 2025 | Yahoo, finance |
| Retail sales | Feb 14, 2025 | |
| Getting SEC EDGAR XBRL data | Dec 2, 2024 | |
| ACNC Registry data: Arrow version | Oct 1, 2024 | Australia, Arrow |
| Does @Beardsley_2021 show anything? | Oct 1, 2024 | Research methods |
| A quick look at City of Melbourne bike data | Sep 19, 2024 | |
| Working with date and times | Aug 13, 2024 | Datetimes, DuckDB |
| Defining winter and summer in Boston | Apr 20, 2024 | Weather, Boston |
| Sunrise and sunset times | Apr 20, 2024 | Datetimes, Weather, Australia, Melbourne, Boston |
| Defining winter and summer in Sydney | Apr 20, 2024 | Weather, Australia, Sydney |
| Defining winter and summer in Melbourne | Apr 20, 2024 | Weather, Australia, Melbourne |
| Trading days per year (crsp.dsf) | Apr 10, 2024 | CRSP, WRDS |
| Data visualization challenge | Apr 5, 2024 | Data visualization, ggplot2 |
| Improving performance of SQLite data | Dec 29, 2023 | Tidy Finance, SQLite |
| Calculating betas using DuckDB | Dec 23, 2023 | Tidy Finance, DuckDB, WRDS, Finance |
| The Gino-Colada Affair | Oct 1, 2023 | Reproducibility, Research methods |
| The elephant in the room: p-hacking and accounting research | Aug 8, 2023 | Research methods, p-hacking |
| Adding delisting returns to monthly data | Apr 7, 2023 | SAS, Stock returns, CRSP |
| Should Bao et al. (2020) be retracted? | Oct 13, 2022 | Research methods, Machine learning |