Making Quantitative Program Measures Useful: An Orientation to Evaluation

We live in a time when quantitative information is increasingly demanded, yet at the same time often questioned. This has particular import for nonprofits. Agency directors are concerned that the standards used to evaluate their programs may be arbitrary, or they worry about the level of effort required to come up with appropriate measures of program success. Yet as an evaluator for the past 30 years, I have found that when project staff and administrators identify ways in which their mission can be expressed in quantitative terms, the benefits can be enormous. These efforts yield useful information about the work being performed, suggest new issues and directions that deserve attention, and, perhaps most important, give you the hard evidence you need to satisfy funders.

There are two keys to making quantitative indicators/measures more acceptable and less risky. The first is thoughtful attention to where these measures belong in an evaluation schema; the second is careful consideration of causal relations and the validity of the indicators themselves. Because issues of causality and validity arise in all evaluations, they will be the focus of this article. However, there is a range of approaches to evaluating programs, each with a different purpose and often a different level of complexity. Before thinking about measures and indicators, you must determine what type of evaluation you want to accomplish, so let me first describe the different types of evaluations.

The Schema of Evaluation

The first and most important step in considering an evaluation approach is to learn what types of evaluations exist. A number of experts have categorized and described these evaluation approaches, often using differing labels; however, regardless of the name, the basis for categorization is usually derived from how the evaluation will be used. The major approaches and their purposes are listed below, roughly in order from less to more rigor and scientific validity.

Type of Evaluation and Purpose[1]
Responsive/Clarifying
To answer questions or solve problems the organization is having in running a particular program–often focuses on specific issues, not whole programs.

Accountable/Reporting
To provide information for the organization and/or its funders as to how it has used the resources allocated; often designed to count services delivered, types of activities accomplished; also cost/expenditure reporting.

Goal/Process Oriented
To provide information on to what extent the organization has met predetermined goals (e.g., hours of service, number of persons served); usually oriented to process indicators, not outcomes.

Program Outcome/Impact
To determine if the original problem which the program was designed to solve has been solved, at least for actual program recipients (e.g., better health, higher reading scores, better self-esteem, less joblessness); can also include assessing the overall impact on the population in need; outcome and impact evaluations often distinguish between short-term goals (to be accomplished by the end of a client’s participation in a program) and long-term goals (what happens one, two or even ten years later).

Experimental
To advance our state of knowledge about causal relationships between interventions and impacts or outcomes; usually an evaluation that combines process and outcome measures and requires a control group or information from other sources beyond the current program in order to generate “proof” about the effectiveness of interventions.

The type of evaluation activity an agency employs will, of course, depend on the purpose, audience, funder requirements, and resources the agency has. Experimental evaluations are the most costly and difficult to conduct. Unless an organization has special research/demonstration funding–usually from the government or a large, national foundation–these evaluations are rarely within the scope of service delivery organizations. Nevertheless, organizations that have been accustomed to providing accountability reports are now finding that goal-oriented or program outcome-oriented evaluation strategies–ones that go beyond simple counting exercises–are becoming required by funders and the competitive environment itself. These usually require a more sophisticated understanding of quantitative indicators.

While quantitative indicators can be used in all five of the evaluation types listed here, it is important to note that qualitative methods (such as observations, interviews, focus groups, and document reviews) can also be used to collect information. My focus in the rest of this article is on how organizations can think through the establishment of valid quantitative indicators for evaluation purposes and design indicators that help them move from simple measures of accountability to real measurement of program achievements.

Understanding Program Logic Models and Setting Goals: What is Related to What?
Once the nonprofit executive has decided where on the evaluation continuum the task falls, the next step is to learn that not all numbers are alike. The major distinction here is between process or input issues and outcome or impact issues. In the traditional research paradigm, these are called the independent variables and the dependent variables, respectively.

The process/input/independent variables are those that are theoretically within the program’s control, in terms of its program activities and staff. Performance goals in this area might focus on how many consumers are served, how many hours of service are delivered, how many referrals are made, and so forth. These are the typical fodder for goal-oriented evaluations. In other words, these types of goals deal with what the program does. Most programs already have reasonably well established ways to set goals in the area of process or input and have typically been operating with such goals in place for many years.

As noted earlier, however, the current climate requires a more sophisticated and risky (from the point of view of the nonprofit organization) set of goals–outcome or impact goals. These goals focus on what the organization is supposed to accomplish with its service delivery activities. Specifying these outcome goals is difficult because they need to be related to the inputs and there needs to be some expectation and understanding that the outcomes can be reached by the stated inputs. This cause-effect relationship is often called the “logic model” for the program.

While it is not usually explicitly the focus of program evaluation to prove the cause-effect relationship (this is what the most sophisticated level of evaluation, the experimental evaluation, does), understanding the logic model concept is vitally important to creating realistic expectations for a program. When a nonprofit organization does not have a well-developed logic model, or when traditional approaches to service delivery fail to be relevant because the community population or problems have changed, it can indeed be difficult to set and meet appropriate outcome or impact goals.

For example, many children’s programs are intended to promote learning of social, academic, and other skills. But these skills generally aren’t acquired by accident–a specific curriculum must be used to ensure that measurable change in skills does occur. In discussing the goals of a program, staff may realize that while they really hope to see better reading scores among their after-school child care participants, the program emphasis is really on arts and recreation–not reading. The lesson of this example: don’t promise to improve reading scores when you really run a “fun and games” program for kids. Instead, relate your outcome measures to how much fun kids have, what types of art they produce, how meaningful that art is for kids who struggle academically, and so on.
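One way to make this check concrete is to write the logic model down as data and confirm that every promised outcome is linked to at least one activity the program actually runs. The sketch below is illustrative only: the activity names, the outcome names, and the function `unsupported_outcomes` are hypothetical and not drawn from any particular program.

```python
# Hypothetical logic model for an after-school program.
# Keys are activities the program actually delivers (inputs);
# values are the outcomes each activity can plausibly produce.
LOGIC_MODEL = {
    "arts workshops": {"art produced", "enjoyment"},
    "recreation": {"enjoyment"},
}


def unsupported_outcomes(logic_model, promised_outcomes):
    """Return promised outcomes that no current activity is expected to produce."""
    supported = set()
    for outcomes in logic_model.values():
        supported |= outcomes
    return sorted(set(promised_outcomes) - supported)


# Outcomes the program is considering promising to a funder (made up).
promised = ["enjoyment", "art produced", "higher reading scores"]
gaps = unsupported_outcomes(LOGIC_MODEL, promised)
print(gaps)
```

Any outcome the function returns is one the program should either stop promising or support with a new activity on the input side.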

Another trap that programs fall into is heightened expectations on the part of the board, staff, or funders. Take day care programs as an example. Where previously it might have been acceptable merely to meet licensing standards and perhaps keep parents satisfied, programs now feel that, in order to compete more successfully for funds, consumers, or state contracts, they must measurably demonstrate how they support parental employment, promote the development of children with special needs, or prevent child health problems. Certainly not all programs can meet these goals, and those that do must emphasize special services on the input side in order to hope to achieve quantitative outcome standards. These services might include special courses for teacher aides who are parents, self-help skills-building curricula for children with special needs, or specific links with community health centers for immunization and screening resources.

Thus, the challenge to nonprofit organizations is to carefully evaluate what types of input they are willing and able to mount, and to understand what types of outcomes and impacts these organizational activities are likely to produce given the population served, the type of activity offered, and the context in which the activity is being delivered. And the definition of success must reflect the reality of the challenge faced. Achieving developmental gains for children in a day care program may be more feasible than obtaining permanent housing for homeless families in an era when low-income housing subsidies are shrinking and rents are escalating. Setting appropriate goals that indicate meaningful accomplishments for the organization not only sustains the organization’s credibility, it also encourages morale and a sense of achievement among staff. Understanding both process and outcome goals, and the link between them, is necessary for an organization to conduct both goal-oriented and outcome/impact evaluation.

Are the Goals Valid?

A second major issue having to do with the quantification of organizational processes and outcomes is whether the definitions or indicators used are valid. In other words, if the organization’s goal is to provide job training for unemployed individuals, what type of indicator provides an accurate assessment of the outcomes of job training and placement that is related to the processes the organization put in place? One indicator might be the number of persons placed, while another might be the number of persons placed in full-time jobs who remained employed at least one year. Still another might be the assessment of skills that participants acquire. Is job placement a valid measure of a job training program that does not have a programmatic component to assist in the job search, or resources to subsidize employers when the economy is falling off? On the other hand, is job placement alone a valid measure of program success if individuals do not maintain the jobs for a reasonable period of time or if the jobs are not full-time jobs? Does placement alone indicate success, or is achievement of some broader job readiness skill (e.g., completing an ESL course; obtaining a GED; obtaining word processing skills) a better indicator?
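To see how much the choice of indicator matters, it can help to compute two of the candidate indicators side by side on the same records. The following is a minimal sketch under stated assumptions: the `Participant` fields, the 12-month threshold standing in for "at least one year," and the sample records are all invented for illustration and do not come from a real program.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    placed: bool          # obtained any job after training
    full_time: bool       # the job was full-time
    months_employed: int  # months the participant stayed employed


def placement_indicators(participants):
    """Compute a simple placement count and a stricter retention-based count."""
    placed = sum(1 for p in participants if p.placed)
    retained = sum(
        1 for p in participants
        if p.placed and p.full_time and p.months_employed >= 12
    )
    total = len(participants)
    return {
        "placed": placed,
        "placed_full_time_12_months": retained,
        "placement_rate": placed / total if total else 0.0,
        "retention_rate": retained / total if total else 0.0,
    }


# Made-up records for illustration only.
sample = [
    Participant(True, True, 14),
    Participant(True, False, 14),
    Participant(True, True, 3),
    Participant(False, False, 0),
]
result = placement_indicators(sample)
print(result)
```

In this invented sample, three of four participants count as placed, but only one meets the stricter full-time, one-year definition. The same program looks quite different depending on which indicator is reported.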

Simple counting–such as number of placements–often does not capture what the desirable outcomes are. An organization that focuses on quality of job placement and longevity in employment will be short-changed by a simplistic quantitative outcome, whether that measure is selected by the organization or imposed by a funder. It simply does not provide a valid measure of the outcome of that organization’s activities. However, it is possible to carefully craft indicators to avoid some of these pitfalls, allowing organizations to clarify their goals and set explicit directions for success.

Developing Questions from Within the Organization

In the previous sections I have described some of the issues that organizations must grapple with in order to identify useful quantitative indicators for program evaluation. Now I turn to the process an organization must use to develop such indicators. Invariably, board and management staff feel they are in the best position to identify and establish quantitative indicators or to negotiate with funders about standards against which the organization will be measured. However, line staff often have theories and insights as to why organizational outcomes are or are not being met, as well as ideas on how to realign activities and outcomes. Therefore, I argue strongly for involving all levels of staff in the process by which program processes and outcomes are discussed and explored.

The process of identifying the logic models of an organization’s programs also helps to improve and clarify for the entire agency what is realistic to achieve, what works best and least, and where a shift in focus or resources may be necessary. In my experience, the introduction of quantitative information about program processes and outcomes can also be an educational and motivational experience for staff. For example, in a recent evaluation of a substance abuse prevention program with which I was involved, data collected by the outside evaluation team about which youth used which drugs generated a two-hour discussion at a staff meeting. Staff were fascinated by why certain age, gender, or racial/ethnic groups were involved with different substances; they asked for additional information and data from the evaluation, wanted to pursue new questions, and discussed how the agency could more effectively accomplish its prevention goals. The result was a request by the staff for more, not less, evaluation once they saw the value of the information already available–information that collectively described their population and showed where they were less successful and where they were more successful.

Why Executive Directors (and Funders) Should Welcome the Use of Quantitative Indicators
The positive side of this exploration of process/input and outcome/impact indicators of program performance is that the staff, board, and other constituencies can use the process to explore what goals the organization wants to support, and identify what it will take to achieve them. As noted above, particularly for staff who work in the trenches, it can be very rewarding and stimulating to see, in concrete terms, the big picture of what the agency as a whole is achieving. And if changes or improvements are necessary, they are less likely to be seen as personal or contentious if the entire agency approaches the establishment of quantitative performance indicators as a way to enhance and build the organization.

The real key to making quantitative indicators useful for overall program management, however, is the feedback loop. There must be regular reports back to staff and managers that summarize the quantitative agency indicators. A process must be in place to analyze why things are working or not working, and specific adjustments or plans must be put in place to maintain, adjust, or improve performance. With an equal emphasis on both process or input indicators, and outcome or impact indicators, the use of quantitative information in organizations can be an effective management tool, as well as an evaluative tool, that serves both internal and external audiences.
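A regular report of this kind can be as simple as comparing each indicator with its goal and flagging the ones that need discussion. The sketch below is one possible shape for such a report, not a prescribed method: the function `feedback_report`, the 10 percent tolerance, the status labels, and the sample figures are all assumptions made for illustration, and it assumes that a higher value is better for every indicator.

```python
def feedback_report(targets, actuals, tolerance=0.10):
    """Label each indicator by how its actual value compares with its target.

    Assumes higher values are better. The tolerance is the fraction below
    target that still counts as "near target" (an illustrative choice).
    """
    report = {}
    for name, target in targets.items():
        actual = actuals.get(name)
        if actual is None:
            report[name] = "no data"
        elif actual >= target:
            report[name] = "met"
        elif actual >= target * (1 - tolerance):
            report[name] = "near target"
        else:
            report[name] = "needs review"
    return report


# Made-up process and outcome indicators for one reporting period.
targets = {
    "hours of service": 1000,
    "persons served": 200,
    "placed full-time 12 months": 40,
}
actuals = {
    "hours of service": 1050,
    "persons served": 185,
    "placed full-time 12 months": 22,
}
report = feedback_report(targets, actuals)
for name, status in report.items():
    print(f"{name}: {status}")
```

The labels are only a starting point for the staff discussion of why things are or are not working. They are not a substitute for that analysis.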

As stated earlier, another essential aspect of making quantitative indicators more acceptable and less risky for nonprofit organizations is thoughtful and careful attention to the questions of causal relations and the validity of the indicators. Furthermore, government and private funders need to accept that rigid adherence to cut-offs or arbitrary compliance evidence will defeat the positive aspects of organizational self-inquiry and self-generation that arise around the use of quantitative data. Punitive action should be taken only after a gross failure to meet stated goals or after multi-year failures, and only with evidence that the feedback and program adjustment cycles were not undertaken. Achieving measurable success with the type of intractable and complex issues that most nonprofit organizations tackle requires clear and honest inquiry, not shallow or meaningless goals. Establishing a climate of organizational improvement based on empirical information serves both organizations and the funding community well.

Endnote

1. The categories and descriptions were derived from F. H. Jacobs, “The Five-Tiered Approach to Evaluation,” in H. B. Weiss and F. H. Jacobs, eds., Evaluating Family Programs (Aldine, 1988); and B. M. Stecher and W. A. Davis, How to Focus an Evaluation (Sage, 1987).

About the Author

Carole Upshur, Ed.D. is a professor in the College of Public and Community Service at the University of Massachusetts Boston and director of the Public Policy Ph.D. Program. She works in policy, research and evaluation roles within community organizations and state government–primarily in community mental health services; early intervention services for children with disabilities; services for adults with developmental disabilities; education policy education, tracking and services for Latino students; and health care access for minorities. She has written several books, book chapters, and numerous articles and reports.