In May 2024, I posted a brief note on LinkedIn about working with data from Form APs filed with the PCAOB. In a comment on that post, Olga Usvyatsky suggested that “the variation is not limited to firms’ names - for instance, I find errors in reporting CIK codes of the clients intriguing.” I thought it would be interesting to investigate the issue Olga raised, but doing so would be much easier with an alternative source of data on auditor-client relationships.
Recently, I discovered that it is relatively straightforward to process XBRL data filed on SEC EDGAR using data sets prepared by the SEC and posted on its website. There are two such data sets, Financial Statements and Financial Statement and Notes, with the latter being roughly ten times as large as the former. The task considered here requires the Financial Statement and Notes data set.
In essence, I compare the auditors listed in firms’ 10-K filings with data on Form APs, focusing on how often the two can be matched. Table 5 shows that the auditor with the greatest number of non-matches is B F Borgers CPA PC, the firm featured in a tongue-in-cheek Financial Times article by George Steer, which showed that the auditor Ben Borgers had used 14 different names (including the name “Ben F orgers”) on Form AP filings. While more research would be needed to establish the reasons for the “missing” data, having Borgers emerge as the “winner” yet again suggests that missing Form AP filings might be another red flag worth pursuing.
This note was written using Quarto and compiled with RStudio, an integrated development environment (IDE) for working with R. The source code for this note is available here and the latest version of this PDF is here.
XBRL tags
A key concept in XBRL is the tag. Each value will be associated with, inter alia, a tag that indicates what the value represents. For example, a value might be tagged as AssetsCurrent to indicate that the value represents the total of current assets.
To understand tags related to auditors, I begin by examining the tags that begin with the text Auditor. From Table 1, it can be seen that there are three such tags in common use: AuditorName, AuditorLocation, and AuditorFirmId.
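The tag search described above can be sketched in a few lines of Python. This is only an illustration, not the code behind Table 1 (the note itself is written in R). It assumes the XBRL data have been reduced to rows of accession number, tag, and value; the `rows` list and the helper `count_tags_with_prefix` are hypothetical, and all values are invented.

```python
from collections import Counter

# Hypothetical (accession number, tag, value) rows. The values are invented
# for illustration and are not taken from the SEC data sets.
rows = [
    ("0000000001-24-000001", "AuditorName", "Example Audit LLP"),
    ("0000000001-24-000001", "AuditorLocation", "Denver, Colorado"),
    ("0000000001-24-000001", "AuditorFirmId", "9001"),
    ("0000000002-24-000001", "AuditorName", "Another Auditor PC"),
    ("0000000002-24-000001", "AssetsCurrent", "1250000"),
]

def count_tags_with_prefix(rows, prefix):
    """Count how many rows carry each tag whose name starts with `prefix`."""
    return Counter(tag for _, tag, _ in rows if tag.startswith(prefix))

auditor_tag_counts = count_tags_with_prefix(rows, "Auditor")

# Print the tags from most to least frequent.
for tag, n in auditor_tag_counts.most_common():
    print(tag, n)
```

On real data, the same count would show which auditor-related tags are in common use and which appear only rarely.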
From Table 1, it appears that not every filing with non-missing AuditorName has non-missing AuditorFirmId. Table 2 provides more data on the distribution of missing values for these two fields, where has_name and has_id indicate non-missing values of AuditorName and AuditorFirmId, respectively.
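A cross-tabulation of this kind can be sketched as follows. Again, this is a hypothetical Python illustration rather than the code used for Table 2: the row layout (accession number, tag, value), the helper `tag_presence`, and the sample values are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical (accession number, tag, value) rows with invented values.
rows = [
    ("filing-A", "AuditorName", "Example Audit LLP"),
    ("filing-A", "AuditorFirmId", "9001"),
    ("filing-B", "AuditorName", "Another Auditor PC"),
    ("filing-C", "AuditorName", "Third Auditor LLC"),
    ("filing-C", "AuditorFirmId", ""),  # tag present, but value is empty
]

def tag_presence(rows):
    """Count filings by (has_name, has_id), treating blank values as missing."""
    tags_by_filing = defaultdict(set)
    for adsh, tag, value in rows:
        tags_by_filing[adsh]  # register the filing even if the value is blank
        if value is not None and value.strip():
            tags_by_filing[adsh].add(tag)
    return Counter(
        ("AuditorName" in tags, "AuditorFirmId" in tags)
        for tags in tags_by_filing.values()
    )

presence = tag_presence(rows)
for (has_name, has_id), n in sorted(presence.items()):
    print(f"has_name={has_name} has_id={has_id} filings={n}")
```

Treating an empty string as missing is a design choice made for this sketch; whether blank values occur in the actual data is not something the note establishes.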
Table 3 lists some cases where AuditorName is present, but AuditorFirmId is not. While there’s no clear pattern to these data, they do suggest that firms are not always diligent in including AuditorFirmId in XBRL filings.
From Table 4, it can be seen that most filings containing information in AuditorFirmId are on variants of Form 10-K. So I focus on Form 10-K filings (and variants) in the analysis in this note.
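One way to tabulate form types and keep only the 10-K family is sketched below in Python. The prefix rule in `is_10k_variant` (any form type starting with "10-K", such as 10-K/A or 10-KT) is an assumption made for this example and is not necessarily the rule used for Table 4; the sample filings are invented.

```python
from collections import Counter

# Hypothetical (accession number, form type) pairs for filings that report
# AuditorFirmId. The data are invented for illustration.
filings = [
    ("filing-A", "10-K"),
    ("filing-B", "10-K/A"),
    ("filing-C", "10-KT"),
    ("filing-D", "20-F"),
    ("filing-E", "10-K"),
]

def is_10k_variant(form):
    """Assumed rule: a form belongs to the 10-K family if it starts with '10-K'."""
    return form.startswith("10-K")

# Distribution of form types, analogous in spirit to a frequency table.
form_counts = Counter(form for _, form in filings)

# Keep only the 10-K family for the later comparison with Form APs.
filings_10k = [(adsh, form) for adsh, form in filings if is_10k_variant(form)]

print(form_counts.most_common())
print(len(filings_10k), "of", len(filings), "filings are 10-K variants")
```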
From the Form APs, I collect data on audit firm IDs, the CIKs of issuers and fiscal period-end dates.
I then merge data from these 10-K filings with data on Form APs using audit firm IDs, issuer CIKs, and period-end dates. Table 5 shows that roughly 1–2% of Form 10-K filings involving Big Four auditors (firm_id values of 34, 42, 238, and 185), Grant Thornton (248) or BDO (243) appear not to have corresponding matches in the Form APs data.
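The matching step can be illustrated with a small Python sketch. It assumes that a 10-K filing counts as matched when a Form AP exists with the same audit firm ID, issuer CIK, and period-end date, which are the fields collected from Form APs above. The function `unmatched_rates`, the firm IDs 9001 and 9002, and all records are hypothetical; this is not the code behind Table 5.

```python
from collections import defaultdict

# Hypothetical 10-K records as (firm_id, cik, period_end). Values are invented.
filings_10k = [
    (9001, 1001, "2023-12-31"),
    (9001, 1002, "2023-12-31"),
    (9002, 2001, "2023-06-30"),
    (9002, 2002, "2023-06-30"),
]

# Hypothetical Form AP records, stored as a set of the same key tuples.
form_aps = {
    (9001, 1001, "2023-12-31"),
    (9001, 1002, "2023-12-31"),
    (9002, 2001, "2023-06-30"),
}

def unmatched_rates(filings, form_ap_keys):
    """Return, for each firm_id, the share of filings with no Form AP match."""
    totals = defaultdict(int)
    unmatched = defaultdict(int)
    for key in filings:
        firm_id = key[0]
        totals[firm_id] += 1
        if key not in form_ap_keys:
            unmatched[firm_id] += 1
    return {firm_id: unmatched[firm_id] / totals[firm_id] for firm_id in totals}

rates = unmatched_rates(filings_10k, form_aps)
for firm_id, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(firm_id, f"{rate:.0%}")
```

An exact-key match like this is strict: a filing reported under a slightly different CIK or period-end date would count as unmatched, which is precisely the kind of discrepancy such a comparison is meant to surface.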
While there might be a perfectly innocent explanation, it might be worth digging deeper to understand why (say) Gries & Associates, Yusufali & Associates, and Heaton & Company have such high rates of unmatched filings.