1 Introduction
1.1 A taxonomy of data
1.1.1 Project-specific versus general data
1.2 Why a relational database?
1.2.1 Alternative 1: Data files in statistical package of choice
1.3 Why PostgreSQL?
2 Bad practices in research
2.1 Manual steps in analysis
2.2 Manual modification of data
2.3 Bad (or no) documentation
2.4 Poor version control
2.5 Limited sharing of code and data
2.6 No data exploration
2.7 Casual approach to merging data sets
3 The backbone: The relational database
3.1 Handling large-ish data sets
4 Getting data into PostgreSQL
4.1 Data from WRDS
5 Identifiers
5.1 Firm identifiers: A quiz
5.2 CRSP’s link tables
6 Application: Firm performance over time
7 Application: Event returns
8 Application: Processing textual data
9 Application: Hand-collection of data
9.1 Google Sheets
9.2 Text annotation
Modern Research Computing
Chapter 6
Application: Firm performance over time