This article comes from the winter 2008 edition of the Nonprofit Quarterly. It was first published online on December 21, 2008.

As the nonprofit sector comes of age, one of its central challenges is how to use data more effectively. Only by analyzing organizational and sector-wide metrics can nonprofits identify where to improve performance, staffing, compensation, and other areas central to their work.

But simply gathering more data isn’t enough. The sector also suffers from inadequate funding for research and, as a result, can treat sector data as proprietary rather than as an information source open to the public, a model that undercuts the value of this data as a public good. This article explores how the proprietary model has undermined data quality and blocked avenues for making industry data available as a vital public resource.

For 25 years, data on the activities of nonprofit organizations has supported research on the sector to facilitate collaborative activity between organizations, such as public education, advocacy, and clearinghouse functions. But with major technology innovations over the past decade and the burgeoning of the sector, nonprofits’ need to make regular and timely use of information about their line of work has grown. The field has the opportunity to develop real-time access to current economic program performance information so that nonprofit decision makers have the critical data they need, as is the practice in private-sector industries such as steel, cereal, and apparel. To plan, strategize, and benchmark performance in their fields, nonprofit organizations must employ data on input and output, salary trends, and other areas central to their activities rather than relying on guesswork and supposition.

In 1996, these high hopes were captured in the mission statement of the National Center for Charitable Statistics (NCCS), a program at the Urban Institute’s Center on Nonprofits and Philanthropy.

The mission of the National Center for Charitable Statistics is to encourage, collect, publish, house and/or sponsor the longitudinal collections of statistics and other quantitative information to help describe, define, and quantify the independent sector, to serve as a bridge between practitioners and scholars in the development and dissemination of knowledge to the sector, and to inform public policy decision makers.

Getting Better Data

But unlocking the information can be a hurdle, which in some ways is counterintuitive. More information on the nonprofit sector is available in more formats than ever, but these new formats don’t equal broader access: the data infrastructure is hobbled by too little funding and slow adoption of electronic reporting. In contrast to the private sector, nonprofit organizations are largely responsible for data collection and dissemination, without financial support for their efforts. The cost of an army of data-entry operators who can process hundreds of thousands of PDFs of IRS 990 returns is significant, and recovering these costs is a major headache for all data creators.

The basic set of information on nonprofits starts with the annual Internal Revenue Service Transaction files, and the Exempt Organizations IRS Master Data File. Since the early days, nonprofit data collectors have agreed on the importance of matching all organizational data with a single identification number—the IRS-assigned Employer Identification Number (EIN)—to facilitate the building, exchange, and comparison of longitudinal data sets on financially active U.S. nonprofits.
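
The role of the EIN as a shared key can be illustrated with a minimal sketch. The record layout, the EINs, and the dollar figures below are invented for illustration and are not drawn from any IRS, NCCS, or GuideStar file; the function name is likewise hypothetical.

```python
from collections import defaultdict

# Hypothetical yearly extracts. Every record carries the filer's EIN.
# EINs, names, and amounts are made up for illustration only.
returns_2005 = [
    {"ein": "00-0000001", "name": "Example Food Bank", "total_revenue": 410_000},
    {"ein": "00-0000002", "name": "Example Arts Council", "total_revenue": 95_000},
]
returns_2006 = [
    {"ein": "00-0000001", "name": "Example Food Bank Inc.", "total_revenue": 455_000},
    {"ein": "00-0000003", "name": "Example Youth League", "total_revenue": 60_000},
]


def build_longitudinal(extracts_by_year):
    """Group records from several years under each organization's EIN."""
    history = defaultdict(dict)
    for year, records in extracts_by_year.items():
        for record in records:
            # The EIN, not the name, is the join key: names can change
            # from one filing year to the next.
            history[record["ein"]][year] = record["total_revenue"]
    return dict(history)


panel = build_longitudinal({2005: returns_2005, 2006: returns_2006})
for ein, revenue_by_year in sorted(panel.items()):
    print(ein, revenue_by_year)
```

Because the first organization's name differs between the two years, a match on names would split it into two entities, while the EIN keeps its history together.
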

Using IRS Form 990 as the starting point has a clear advantage: it is a national filing requirement enforced by federal authorities and perjury penalties. Nevertheless, researchers have long acknowledged serious flaws in information about nonprofits based on 990 filings, including the following:

  • The data is incomplete. Organizations with less than $25,000 in financial activity have not been required to file, and religious organizations have been exempt.
  • The data isn’t timely. Form 990 returns are due five and a half months after the end of the tax year, and three-month extensions are easy to get and commonly requested. Filing and entry of returns by the receiving bodies can take two to four months, and scanned images of the returns for anything close to a full population of organizations become available about a year after that, before data entry can even begin. Compiled information on a full set of organizations may not be available until two years after the reporting period.
  • The data lacks precision. Reporting sometimes obscures key facts, for example by combining program service fees from private payers with most government payments.
  • Data quality is variable. Despite lengthy IRS instructions, organizations’ and auditors’ reporting can be inconsistent.
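
The timeliness problem can be made concrete by adding up the delays listed above. This is a rough sketch and rests on assumptions not stated in the list: that the stages run strictly one after another, that an organization takes the three-month extension, and that "about a year" means 12 months. The constant and function names are illustrative.

```python
# All durations are in months and come from the timeliness bullet above.
FILING_DEADLINE = 5.5          # return due after the end of the tax year
EXTENSION = 3.0                # commonly requested extension
PROCESSING_RANGE = (2.0, 4.0)  # filing and entry by the receiving bodies
IMAGE_AVAILABILITY = 12.0      # assumption: "about a year" taken as 12 months


def months_until_images(processing_months, with_extension=True):
    """Months from the end of the tax year until scanned images are available."""
    extension = EXTENSION if with_extension else 0.0
    return FILING_DEADLINE + extension + processing_months + IMAGE_AVAILABILITY


fastest = months_until_images(PROCESSING_RANGE[0])
slowest = months_until_images(PROCESSING_RANGE[1])
print(f"Images available {fastest} to {slowest} months after the tax year ends")
```

Under these assumptions the images arrive roughly 22.5 to 24.5 months after the tax year ends, which is consistent with the two-year lag described above, and data entry has still not begun at that point.
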

But several changes on the horizon may address these issues:

  • Removal of defunct organizations. The new 990-N requires nonfilers to file annually online (via an “e-postcard”) to confirm their continued existence and contact information. In addition to an address, contact person, and so forth, the process requires organizations to provide an EIN and a Web address (if the organization has one). Based on the 990-Ns due in 2008, early indications are that the vast majority of nonfilers are defunct.
  • New Form 990. The newly redesigned and expanded Form 990 for 2008 (due to be filed in 2009) requires additional information, such as separating private-service fees from government payments and greater detail on governance practices.
  • Data reduction. Some changes will reduce the information available. The reporting threshold for the full form rises to $500,000 (organizations with less than that amount can complete the shorter 990-EZ), and the salary disclosure threshold rises from $50,000 to $100,000 (which makes comparing salary data among organizations difficult because salaries for many positions will no longer be reported).
  • E-filing. Electronic form filing should reduce the time and expense of entering data in research databases. The IRS made e-filing generally mandatory for exempt organizations with more than $10 million in assets, beginning with returns for tax years ending in December 2006. The IRS anticipates expanding this requirement to other organizations.
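
The dollar thresholds mentioned in this article can be summarized in a short sketch. It encodes only the figures stated here ($25,000, $500,000, and $10 million) and is not a statement of actual IRS rules, which include further tests and exemptions (for example, for religious organizations) that the sketch ignores. The function names are hypothetical.

```python
# Thresholds as stated in the article. This is a simplification and
# should not be read as the IRS filing rules themselves.
POSTCARD_CEILING = 25_000            # below this: 990-N e-postcard
FULL_FORM_THRESHOLD = 500_000        # below this: shorter 990-EZ is allowed
EFILE_ASSET_THRESHOLD = 10_000_000   # above this: e-filing is mandatory


def expected_form(financial_activity):
    """Return the form a filer would use under the article's thresholds."""
    if financial_activity < POSTCARD_CEILING:
        return "990-N"
    if financial_activity < FULL_FORM_THRESHOLD:
        return "990-EZ"
    return "990"


def must_efile(total_assets):
    """True when assets exceed the $10 million e-filing threshold."""
    return total_assets > EFILE_ASSET_THRESHOLD


for activity in (10_000, 120_000, 2_000_000):
    print(activity, expected_form(activity))
```

The sketch shows why the higher full-form threshold reduces available detail: every organization that falls in the middle band can file the shorter form.
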

Data Gathering and Ownership

Outside the IRS, three organizations have taken on the bulk of the work to maintain the databases on nonprofit organizations:

  • First established as a program of Independent Sector, the National Center for Charitable Statistics provides information on 800,000 organizations as well as access to 140 customizable data files through the NCCS Data Web, a national repository, and other NCCS online tools.
  • Formed in 1994, GuideStar USA reports information on 1.7 million nonprofits and offers free, basic searchable profiles of organizations. Its premium service, at $1,000 a year, targets professional users, from financial firms to philanthropic foundations. In 2006, subscriptions and licensing fees provided 64 percent of GuideStar’s total revenue, with the other 36 percent coming from foundation grants.
  • Established in 1956, the Foundation Center maintains data on 90,000 grantmakers and 900,000 grants, which are made available through Foundation Directory Online (an annual subscription is $195 to $1,295). The organization also offers publications and materials at 345 cooperating libraries, categorized grants housed in the Philanthropy Data Factory, and a variety of publications, including benchmarking of foundation practices and studies on newer forms of giving. Of the Foundation Center’s $24 million in 2006 revenue, 48 percent came from products and services, and the remainder from foundation grants.

During an era of funder enthusiasm for the breadth of scope and impact of these organizations’ nascent infrastructure projects, all three organizations launched with substantial foundation support. As information was transformed by increasingly accessible computing and storage capacity, national industry information became universally available, which sparked research, associations, and advocacy activities.

While these three organizations are united by agreements of mutual cooperation, each also has incentives (and faces substantial financial pressure) to create proprietary information to recover its costs. Accessibility of data has also given birth to a small army of vendors competing for attention and resources and has generated additional databases, such as the following:

  • The Economic Research Institute generates salary surveys and data on nonprofit and for-profit wage and compensation levels.
  • Charity Navigator, the American Institute of Philanthropy, and the BBB Wise Giving Alliance compile analyses and ratings of nonprofits based on fundraising efficiency and other measures.
  • The Center for Effective Philanthropy (CEP) conducts surveys of foundation grantees to generate individual-foundation and composite analyses of grant recipients’ perspectives on foundation operations.

The challenge of how to finance these data efforts creates distortions of access and distribution. Which information should be free, what should be collected, and who will pay for it? At the urging of one of its major donors, the Center on Nonprofits and Philanthropy raised a substantial endowment to sustain core NCCS operations. GuideStar has experimented with several earned-income models to sustain the low- and no-cost access to 990s and basic financial analyses that it offers registered users, partly by selling premium services to the financial services industry.

Other entities also generate data on the nonprofit sector. They include the Bridgespan Group, national nonprofit infrastructure organizations (such as the Council on Foundations [COF], which conducts annual management surveys of its members and focuses on larger foundations, and the Association of Small Foundations [ASF], which produces annual information on its membership of foundations with typically few or no employed staff), and for-profit consultants.

The challenge is to find funding for maintaining core databases such as GuideStar and NCCS at nominal cost and to create incentives to make other databases, such as COF’s, ASF’s, and CEP’s, available to those other than members or contract-paying foundations.

Public Funding for Nonprofit Research and Data Collection

The dilemma of long-term funding for resources like sector data is that foundations see themselves less as sustainers and more as pioneers and adventurers that explore new ground and then move on.

For industry after industry, the U.S. departments of Commerce, Energy, Transportation, Labor, and Health and Human Services, along with the Small Business Administration, assume the responsibility of making data available.

Maintaining the core activities for an ongoing, widely accessible base of reliable and timely information on nonprofits might cost $15 million a year, which is a modest investment given the scale, scope, and expectations of the U.S. nonprofit sector.

The rationale for this kind of funding is both complex and simple. The complexity stems from the variegated nature of the nonprofit sector itself. Unlike many industrial sectors, nonprofits are quite diverse. Nonprofits’ 501(c)(3) status is broad based; there is only a tenuous connection, for example, between Harvard University, which has a $34 billion endowment, and a surviving-on-a-shoestring tiny nonprofit that files a 990-N electronic postcard.

The simplicity stems from the emerging strength of the nonprofit sector. When industries succeed in getting the federal government to devote IRS, Labor, or Commerce attention to generating databases, the result is a defined profile and recognized importance. By virtue of being accorded government-subsidized data collection and compilation, an industry, in essence, becomes recognized, important, and analyzed.

As this issue of the Nonprofit Quarterly notes, the nonprofit sector has gained heft and clout as a result of its national growth and its delivery of programs and services. Indeed, but for the sector’s presence, these services might not have reached disadvantaged and disenfranchised populations so effectively. Consider the on-the-ground work of charities in the wake of September 11, Hurricane Katrina, and the Southeast Asian tsunami. Because of the growing legitimacy of the nonprofit sector and by virtue of its expanding role in U.S. society, federal funding for the nation’s nonprofit infrastructure—including nonprofit database generation and use—should be encouraged. Only with new avenues of funding can sector research and data serve their ultimate purpose: providing the information necessary for organizations to succeed at their public-benefit missions.

Informing Practice with Research

In addition to the problems concerning nonprofit data collection and dissemination, the tools and practical applications to make this information useful have lagged.

Nonprofit organizations require several streams of information to carry out their essential functions. While most organizations’ informational needs can be satisfied within their own systems, such as databases, spreadsheets, correspondence, documents, and files, they also need information to understand and make decisions about the larger operating environment in which they work. As a result, nonprofits, as well as foundations and regulators, are consumers of industry-wide data to inform eight functions: resource acquisition, resource allocation, organizational planning, governance and management, human resources, higher education, public policy, and public education.

To that end, we propose the following as a beginning framework for how nonprofit organizations—as well as researchers and policy makers—can conceptualize potential research themes that tap existing nonprofit sector databases.