Note: this summary is a joint posting by Doug Koplow (Earth Track) and Ron Steenblik (OECD).
As of the end of June, the Subsidyscope website has been discontinued. Subsidyscope was an initiative of the Pew Charitable Trusts to examine federal subsidies across multiple sectors of federal activity. The project, which included significant involvement by the Sunlight Foundation, ran from 2008 to 2012.
During 2013, there was no new work other than updating the project's tax expenditure database. The various datasets developed by the project have been archived offline. Reports and commentary commissioned by Pew remain accessible via Pew's main website.
Both of us were on Subsidyscope's Advisory Board, along with a number of other people focused on subsidy and budget transparency. Collectively, the two of us have been working on improving subsidy transparency for several decades at this point. We thought it might be useful to share some of our takeaways from what Subsidyscope tried to do, the common challenges that these types of initiatives often face, and ways to move subsidy transparency mainstream.
1) Engage the wonks, make data variance visible
Specialized knowledge of arcane data systems and data sources was extremely important to bring in early. Pew did a nice job in this area, engaging people (as staff and advisors) with a broad array of experience in subsidy measurement, data sources, and the politics of subsidy reform. This group was important not only in identifying potential data sources, but also in understanding the limitations of those data sets and finding people who could help solve the problems that inevitably arose.
Trying to pull systematic data from the government was a useful exercise for a foundation to take on. The process highlighted the types of problems that would need to be solved were this type of reporting to become routine, and created an initial map of how standardizing more detailed subsidy reporting might proceed.
The challenges can be fairly sobering. At one advisory board meeting, participants listed four or five different federal budgeting systems in use, each generating somewhat different spending figures. It is noteworthy that this disparity applied to the easy subsidies: cash flows being used in a particular budget year. Support provided through credit or insurance markets, tax expenditures, mandates and regulatory exemptions all require more complicated assessments of activity relative to a hypothetical counter-factual marketplace, and therefore are more difficult to measure.
Were there a way to easily compare multiple data sources for the same spending area, we expect that the simplest types of data disagreements would quickly be reconciled and that the variance between systems would decline over time. Achieving this type of transparent comparability was one of the main drivers behind Subsidyscope's efforts to integrate tax expenditure data from the Joint Committee on Taxation (JCT) and the Treasury. Their tables provided a snapshot not only of differences between expected losses from the same provision across estimators, but also of how projected values by the same estimator changed year-to-year. There are a number of common factors that may result in year-to-year changes, including modifications to eligibility, faster market growth, or changing assumptions on how frequently a tax subsidy is being used. Rarely are the numerical shifts accompanied by explanations for the changes, however. Subsidyscope's presentation format made differences in estimated revenue losses easy to see for the first time - making it possible to begin pushing back to root causes.
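To make the two kinds of comparison concrete, here is a minimal sketch in Python. The figures, the dictionary layout, and the function names are hypothetical, invented purely for illustration; they are not Subsidyscope's actual data or code. The sketch also simplifies by treating "year-to-year change" as the difference between one estimator's figures for consecutive years.

```python
# Hypothetical revenue-loss estimates ($ millions) for a single tax
# provision, keyed by (estimator, year). Not real JCT or Treasury data.
estimates = {
    ("JCT", 2010): 500.0,
    ("JCT", 2011): 650.0,
    ("Treasury", 2010): 450.0,
    ("Treasury", 2011): 480.0,
}


def estimator_gap(data, year):
    """Difference between the two estimators for the same year (JCT minus Treasury)."""
    return data[("JCT", year)] - data[("Treasury", year)]


def yearly_change(data, estimator, year):
    """Change in one estimator's figure relative to the prior year."""
    return data[(estimator, year)] - data[(estimator, year - 1)]


if __name__ == "__main__":
    print("JCT vs Treasury, 2011:", estimator_gap(estimates, 2011))
    print("JCT change 2010 to 2011:", yearly_change(estimates, "JCT", 2011))
    print("Treasury change 2010 to 2011:", yearly_change(estimates, "Treasury", 2011))
```

Laying both differences side by side is what makes an unexplained shift stand out: in the made-up numbers above, the gap between estimators widens mainly because one estimator's figure moved far more than the other's.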
In an ideal world, this work would have led to changes in behavior and reporting in both the JCT and the Treasury - creating a more visible data set on tax expenditures going forward. For, as with the budget data variance, unless these comparisons are standardized, routinized, and made accessible to the public, the pressure to identify and correct estimation error (or at least to explain its causes) quickly dissipates.
Unfortunately, we have not seen a shift in best practices by the tax expenditure estimation teams. The end of the Subsidyscope initiative will lessen the pressure for better tax expenditure reporting. At present, no other groups provide the comparative data on tax expenditure that Subsidyscope did. In truth, though, there is really no reason for reporting not to be improved within the government. With about $1 trillion/year in federal tax expenditures, is it really too much to ask that the JCT and the Treasury publish their revenue-loss estimates in machine-readable Excel files rather than difficult-to-work-with PDFs? Or for them to acknowledge, and to routinely report, on how and why their estimates differ from each other, over time, and from actual claims on tax returns?
2) Some of the most important findings of Subsidyscope relate to process rather than data sets
Looking in detail across multiple sources of government data provided a great opportunity for Pew and Sunlight researchers to identify gaps, discrepancies, and errors in data sources; and to compare and contrast the ease and accuracy with which different parts of the federal government made information available. In the process of trying to develop work products and reports, these obstacles were frustrating - indeed infuriating - to overcome. However, when one looks at Subsidyscope as a test of what it takes to bring federal transparency to the next level, looking for patterns in these obstacles and describing how they were (or could be) surpassed becomes vitally important.
As the project was winding down and people involved with the detailed research were beginning to move to other tasks and opportunities, Doug suggested that the staff do an internal summary of these challenges and lessons while the memories were still fresh. Our understanding is that such reviews were conducted, and (we hope) shared with Pew's Board.
These summaries would have been quite interesting for the Advisory Board, and even the public, to see as well. Perhaps Pew will consider making a version public. A core objective of Subsidyscope was for its work to spark further analysis, disclosure, and reform by other organizations and agencies. The learning from Pew's four-year effort would likely be useful to a broad audience, and perhaps enable some of them to avoid pitfalls as well.
3) Limits of foundations
Both Pew and Sunlight are well funded, highly competent organizations. But trying to build detailed, recurring, core data on government operations is likely beyond the scope of what even large foundations can do. The work is immense, and the greatest leverage to adjust data collection in ways that streamline informational availability and improve accuracy exists "up-stream" within government agencies - not among users of existing government datasets.
A useful analogy for the path that subsidy reporting needs to follow is corporate reporting before the creation of the Securities and Exchange Commission. At that time, what data there were on corporations and their activities were spotty, inconsistent, and often held by insiders only. Published reports were unreliable.
Foundations could not have changed that situation. The cost of initially collecting data is high; but so too is the cost of keeping a data set current year-after-year. Yet ongoing information over a long period of time is required if people's expectations and decision-making criteria are to be changed. Subsidyscope is not alone in this regard: many, many NGO data initiatives are funded for a few years, after which their data go stale.
Had Pew funded a test project on corporate reporting in the 1920s for a period of four years, merely demonstrating that better data was logical and welfare-enhancing would not have been sufficient to trigger broad change in reporting. When corporate reporting did evolve - through regular mandatory public reporting of financial data for traded companies, independent audits by third parties, and civil and criminal penalties for willful misrepresentation - it was the result of government requirements, not NGO or foundation case studies. It was a structural change in the way a large group of firms had to operate.
But equally important to remember is that detailed case studies such as those Pew conducted via Subsidyscope can greatly inform the types of regulatory changes needed; and that wide-ranging societal benefits from standardized corporate reporting resulted. Markets became more transparent, the cost of raising capital fell, and public investors faced a much lower risk (not zero, of course) of being swindled in financial markets. As such, the system became at least partially self-reinforcing. Even the firms themselves saw benefits from the new approach; certainly third-party auditors had an interest in the new system surviving. And so it did.
A similar set of changes is needed in the subsidy-reporting arena. And, as with corporate reporting, many of the initial components of these changes will be mandatory, not voluntary. This is a continuation of prior practice, not a departure: though US subsidy reporting is far from perfect or complete, it is salient that some of the biggest steps forward in transparency have been regulatory and legislative in nature. These include annual reporting of tax expenditures, audit requirements on government agencies, inspectors general within agencies to root out fraud and abuse, and the tracking and reporting of credit subsidies.
4) Measuring success
We'd be interested to hear how Subsidyscope's funders measure the successes and failures of their four-year effort. Every project has both elements, of course; and both are important to learn from. But one critical evaluative factor we think needs to be included in projects such as this one is whether the effort directly or indirectly changed the reporting practices of data generators such that transparency would remain improved even after the initiative ended. Assembling existing data in new ways from the outside is important. But moving best practices upstream, as well as establishing a recurring process for internal data evaluation and correction, is equally so.
A second critical success factor is whether the work conducted within a specific geographic and temporal setting is able to be linked with, and integrated into, related work on a similar topic conducted by other parties. The ability to integrate learning and information is what enables a narrow set of information to build over time into something broad enough to shift decision making within an economy; and to extend the usefulness of the work beyond the termination point of a specific project.
Cross-country collaboration on subsidy transparency and reform, including data integration, is ever more achievable. This is largely the combined result of the internet and the growing number of researchers working in the subsidy area. We hope to see this potential increasingly leveraged going forward.
For many years, the main route for convergence and standardization of subsidy reporting was through inter-governmental organizations. In the area of agricultural support, for example, the Organisation for Economic Co-operation and Development (OECD) started the ball rolling in the 1980s with its standardized reporting on government support for major crops and livestock commodities. A similar reporting format was later adopted by the World Trade Organization (WTO) in the mid-1990s. The requirement to notify the WTO about government support to industry led to some convergence among countries in how the data on support were compiled at the national level as well. However, the weak enforcement mechanism for WTO reporting of industry supports has limited the efficacy of this change lever.
Increasingly, non-governmental organizations such as the Global Subsidies Initiative and the International Budget Partnership are demonstrating their capability to collect information on subsidies and compile it in a standardized format. Their value-added is clear, though as down-stream users their ability to alter the way data is collected and reported by the government itself is limited.
Subsidyscope's focus was on the United States. But additional outreach to similar initiatives in other countries could well have leveraged its domestic work - both by providing mutual support on common challenges and by helping to achieve improved coherence between the US results and work conducted abroad.
5) Rescuing data from archival oblivion
Project wind-down normally involves a good deal of filing and archiving. Data records are "preserved" on computer drives, but in terms of policy usefulness, they are highly perishable commodities. Older data become less relevant to current policy decisions, but if integrated into a time series or comparative metrics, can remain important for an extended period of time.
Yet once archived, knowledge about the data sets is rapidly lost. Fixes done to make them workable, sources of raw data, programming routines to integrate or normalize information from multiple sources, even definitions of field names get misplaced. Much of the metadata on these data sets, so critical in extending their use, resides in the heads of the original research team. As these people move on to new positions, this important information disperses with them. Data storage protocols are always changing as well, and electronic data from 15 years ago may be difficult even to read into a computer today; data from 30 years ago are nearly irretrievable.
Again, these are not problems unique to the Subsidyscope effort, by any means. The Open Knowledge Foundation focuses on the issue of "data salvage" and reuse straight out, and across a variety of research fields. But the question we pose is whether there is a way to archive subsidy data differently going forward - so the work by Subsidyscope, or by environmental NGOs, or indeed by analysts within the governments generating the raw data themselves, can be more readily integrated as a module into a bigger data "whole" rather than sent to data oblivion. Ideally, these disparate pieces could be structured in such a way that they can be easily integrated into a larger dataset, and thereby contribute to a broadening picture of government subsidies.
In a 2007 paper Doug prepared for the OECD, he advocated requiring a formal variance report each year by JCT and Treasury any time their estimates were more than $100 million off from what was actually claimed on tax returns. A similar report should be required whenever JCT and Treasury estimates for the same provision differ by more than $50 million. The goal isn't to have perfect accuracy in estimation, but to establish a process by which errors are better understood and the cost of tax breaks can be estimated with greater precision over time.
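The two thresholds in this note lend themselves to a simple mechanical check. The sketch below is one way such a rule might be expressed; the record layout, names, and dollar figures are hypothetical and used only for illustration, not taken from the 2007 paper or from any JCT or Treasury system.

```python
from dataclasses import dataclass
from typing import List

# Thresholds from the proposal above, in $ millions.
ACTUAL_GAP_LIMIT = 100.0     # estimate vs. amount actually claimed
ESTIMATOR_GAP_LIMIT = 50.0   # JCT estimate vs. Treasury estimate


@dataclass
class Provision:
    """One tax provision in one year (hypothetical record layout)."""
    name: str
    jct: float       # JCT revenue-loss estimate, $ millions
    treasury: float  # Treasury revenue-loss estimate, $ millions
    actual: float    # amount claimed on tax returns, $ millions


def variance_flags(p: Provision) -> List[str]:
    """Return the reasons, if any, that a variance report would be required."""
    flags = []
    if abs(p.jct - p.actual) > ACTUAL_GAP_LIMIT:
        flags.append("JCT vs actual")
    if abs(p.treasury - p.actual) > ACTUAL_GAP_LIMIT:
        flags.append("Treasury vs actual")
    if abs(p.jct - p.treasury) > ESTIMATOR_GAP_LIMIT:
        flags.append("JCT vs Treasury")
    return flags


if __name__ == "__main__":
    # Made-up figures for two fictional provisions.
    for p in [Provision("Credit A", 900.0, 820.0, 790.0),
              Provision("Deduction B", 400.0, 380.0, 390.0)]:
        print(p.name, variance_flags(p))
```

A check like this only identifies which provisions need a report; the explanation of why the figures diverged would still have to come from the estimators themselves.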