Spending Data Has Improved—Probably

(Illustration: CJ Ostrosky / POGO)

It is that time of year again, when offices of inspectors general across the government issue audit results for federal spending data. The spending data covers the trillions of dollars agencies spend each year on contracts, grants, loans, direct assistance, and other awards. Congress and the public need to be able to rely on agency spending data to track the money and hold the agencies accountable. While the audits indicate improvements in data quality among many agencies, significant changes to the methodology make it difficult to gauge exactly how much progress has been made and how many problems remain unaddressed.

The Digital Accountability and Transparency Act (DATA Act) of 2014 sought to improve the quality of spending data that agencies post on USASpending.gov, a government website that provides the public detailed access to federal agency spending information. The law required that government-wide standards be established to ensure agencies are reporting reliable spending information that would match up across government. The standards are an important step to address quality problems that have long plagued public spending data and that make it difficult to hold agencies accountable for their spending decisions.

Since consistent implementation is critical to the effort, Congress required each agency’s office of inspector general to audit its agency’s data to assess compliance with the law. The DATA Act called for three audits over a six-year period, one every other year. The 2019 audits are the second set, with the final round scheduled for 2021. Importantly, the law also required a detailed review of a statistically valid sample of each agency’s spending transactions (individual contract purchases or grant payments) that evaluates the data’s timeliness, completeness, and accuracy.

The Council of the Inspectors General on Integrity and Efficiency was tasked under the law with establishing guidelines and methodology for the audits. These instructions are meant to ensure that each agency audit, though conducted independently, is consistent in approach with the other audits. However, the instructions changed significantly between the 2017 and 2019 audits, which means that the reported results aren’t consistent over time and cannot be compared to assess progress accurately.

In the 2017 audit instructions, inspector general offices were directed to focus on the data quality for transactions—considering the transaction to be an error if it contained any individual data point that was incomplete, late, or inaccurate. Inspectors general then calculated the number of transactions that had any data quality errors. The council’s guidelines specifically told agencies that “Accuracy is measured as the percentage of transactions that are complete and agree with the systems of record or other authoritative sources.”

In the 2019 guidance, the inspectors general council changed course significantly and instructed inspectors general to focus on “data elements,” the individual cells of data within a record that give all the details about a contract or grant, including who made the payment, how much it was for, who received it, and when it was made. Inspectors general were directed to calculate the error rate of individual data fields in each record tested and then calculate the average of those rates. There are up to 57 standardized data elements that could be tested in individual records, which turns a sample of a few hundred transactions into a review of thousands of data elements.

As a pure statistical approach, the error rate of data elements is valid and useful. The transaction error rate was a pass/fail test: One error and the whole record was marked as wrong. Two errors, or even ten, in the same record didn’t change the result, because the record couldn’t be marked as more wrong. So the transaction error rate approach essentially discounts multiple errors in a single record. Calculating based on data elements means that every mistake counts toward a higher error rate.

However, examining each individual cell in a data record means the error rate will be calculated with a much larger base. Remember that error rates are essentially the number of errors divided by the number of items checked. So making the base about 50 times larger will necessarily produce a lower error rate in almost every agency.
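
To make the effect of the larger base concrete, here is a minimal Python sketch that scores one small sample both ways. The sample is hypothetical, as are the function names and the simplification that every record has exactly 50 tested elements; none of it is drawn from an actual audit or from the council's guidance.

```python
from statistics import mean

def transaction_error_rate(records):
    """2017-style pass/fail: share of records with at least one bad element."""
    return sum(1 for record in records if any(record)) / len(records)

def element_error_rate(records):
    """2019-style: average of each record's share of bad elements."""
    return mean(sum(record) / len(record) for record in records)

def make_record(n_errors, n_elements=50):
    """Hypothetical record: True marks an element that failed the check."""
    return [True] * n_errors + [False] * (n_elements - n_errors)

# Hypothetical sample of four transactions with 1, 1, 10, and 0 bad elements.
sample = [make_record(1), make_record(1), make_record(10), make_record(0)]

print(f"transaction error rate: {transaction_error_rate(sample):.0%}")  # 75%
print(f"data element error rate: {element_error_rate(sample):.0%}")     # 6%
```

In this made-up sample, three of the four transactions contain an error, yet the data element error rate is only 6 percent, because the 12 bad elements are divided across 200 elements checked.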

And these lower numbers could easily result in a perception problem for those unaware of the specific methodology. People could be misled by those low numbers into thinking that an agency has a small or even insignificant data quality problem. But that may not reflect the reality.

Consider a newspaper that has errors in almost every article. Then the newspaper conducts a review and finds that, although each article did have an error, only 1 word out of every 100 words in an article was wrong or inaccurate. The newspaper would be able to report that it only had a 1% error rate. Most people would consider a 1% error rate to be minor and acceptable. But most people would also consider a paper with an error in every article to be of poor quality and unreliable. This is the tension between a data element error rate and a transaction error rate.

Many users of the federal spending data would see each transaction as a whole article and would feel that an error, any error, in the transaction detracts from the overall reliability of the record. It doesn’t matter that it is just one error across dozens of fields. That’s because each transaction is a story in itself, a story about which agency paid which recipient a certain amount of money for specific work in a particular place during a set time period. Get one of those facts wrong and the story is wrong.

This potential perception problem could be exacerbated by the data quality scale the inspectors general council established: The council defined error rates of 20 percent or lower as higher quality data, between 21 and 40 percent as moderate quality data, and 41 percent or above as lower quality. But with a high number of data elements being checked, often thousands, it would take an extreme number of errors to reach 41 percent. For instance, if every transaction had 10 data elements wrong out of 50 being checked, that would count as a 20 percent data element error rate and would qualify as “higher quality” data. It is hard to defend 10 errors per transaction as higher quality.
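
The arithmetic behind that example can be checked with a short Python sketch. The function name is hypothetical, and the cutoffs are only this article's summary of the council's scale; how the council treats a rate that falls between 20 and 21 percent or between 40 and 41 percent is an assumption made here for simplicity.

```python
def quality_label(error_rate_pct):
    """Map an error rate in percent to the quality label summarized above.

    Assumption: rates between the stated bands (for example 20.5) are
    assigned to the next band up; the article does not say how they are handled.
    """
    if error_rate_pct <= 20:
        return "higher quality"
    if error_rate_pct <= 40:
        return "moderate quality"
    return "lower quality"

# The example from the text: 10 wrong elements out of 50 checked per transaction.
errors_per_transaction = 10
elements_checked = 50
rate = 100 * errors_per_transaction / elements_checked

print(rate, quality_label(rate))  # 20.0 higher quality
```

Under these assumptions, every transaction in the sample would fail a pass/fail test, yet the data set as a whole would still land in the top band.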

The 2017 audits uncovered widespread and severe data quality problems. The Project On Government Oversight reviewed 41 audit reports (all the audits that we could locate at the time since not all agencies published them at the same time) and found high error rates in timeliness, completeness, and accuracy of spending transactions across many of the agencies. While all three aspects of data quality had problems, accuracy had the highest error rates. Several agencies discovered that 90 percent or more of their transactions had accuracy errors. The Government Accountability Office also did a government-wide review of data quality and estimated that “only between 0 and 1 percent of award records were fully consistent” with agency sources.

The 2019 audits are producing what appear to be significantly better results. POGO sought to review audits for the same 41 agencies and was able to locate reports for 25 of them. Once again, publication of the audits seems sporadic. Under the new methodology, almost all of the agencies (22 of the 25 we reviewed) achieved the “higher quality data” standard with error rates below 20%. The three agencies with moderate or lower data quality results were the Departments of the Treasury and Defense and the National Science Foundation. The audit for the Department of Defense, which reported separate error rates for contracts and grants, notably had a 59% timeliness error rate and a 33.9% accuracy error rate for data elements in its grant awards.

Several inspector general offices acknowledged up front the shift in methodology and that the error rates in 2017 and 2019 could not be compared. The Department of Energy’s (DOE) audit, for instance, clearly noted “the results of our current review may not be fully comparable to our prior audit due to a change in the methodologies.” The DOE audit also included numbers on both transaction errors and data element errors, making the difference in scale between the two approaches very clear. The audit reported data element error rates for accuracy, completeness, and timeliness of 3%, 1.5%, and 1.7%, respectively. But those same errors produce transaction error rates of 49%, 28%, and 28% on the same three measures.

Other audits, however, seemed to imply the 2017 and 2019 error rates could be compared or that the new lower error rates were based on transactions. The inspector general audit of the Railroad Retirement Board touted the error rates, saying, “The RRB improved its accuracy error rate for the sample from 91 percent to 0.43 percent in two years.” The report noted that the prescribed sampling approach contributed to the significant decline in error rate but strongly implied that the two widely different numbers represented a fair measure of improvements by the agency.

Similarly, the Department of the Interior reported that “Based on DOI's FY 2017 error rate of 38 percent; an estimated average error rate of 38 percent was expected for FY 2018. However, we were very pleased to learn that DOI's highest error rate was only 11 percent during this evaluation period. This represents a significant improvement.”

The Department of the Treasury’s audit included confusing language, stating, “We tested the 148 non-IRS records and determined that 10 percent are incomplete, 14 percent are inaccurate, and 28 percent are untimely.” This wording would indicate that the reported error rates apply to the full transactions. But other data in the audit make it clear that these were the data element error rates. The Treasury audit doesn’t include the actual transaction error rate, but does note high accuracy error rates for data elements. For instance, more than half of the contract transactions did not have the correct information on where the work was being done, and half of the records had inaccurate data on the name of the parent company getting the contract. These errors would almost certainly translate into transaction error rates of 50% or higher.

Congress required three sets of audits over six years to gauge progress on data quality. But the change in methodology undermines our ability to track progress accurately over the first four years. Ideally, the 2021 audits will include both data element error rates and transaction error rates, allowing policymakers and the public to compare that year’s performance with where agencies were in both 2017 and 2019.