Whether you work at a small operating foundation or at the world’s largest grantmaking one, “scale” and “sustainability” are two words that have likely dominated more than a few staff meetings. That’s not surprising, since both are important indicators of an investment’s impact. But how do we decide whether something we fund is scalable and sustainable?

Five years ago, I ran the Chez Panisse Foundation, an organization in Berkeley, California, that helps young people connect what they eat to the health of their environment. Our goals for The Edible Schoolyard were ambitious: to redesign school lunch programs and create kitchen gardens in every school in America. The program’s founder, author and chef Alice Waters, built a model program that integrated academic curriculum with hands-on learning. Holding to very specific design principles, Waters carefully considered every detail, from the way children worked together in the garden, to how they cleaned up, to what they talked about while chopping vegetables. Today, there are only two official edible schoolyards, and the foundation (now called The Edible Schoolyard Project) continues to fund the original program at a Berkeley middle school.

Was our work a success or a failure?

It depends. If the goal of the Chez Panisse Foundation was to replicate the model “as is,” we failed utterly. But if our goal was to adopt, adapt, or even reinvent the model, our work was a wild success. The Edible Schoolyard created a movement that continues to grow. It has spawned thousands of kitchen gardens and inspired dozens of urban school districts to improve meals for their students. Today, all Berkeley public schools have kitchen-garden programs, and all students get freshly cooked meals with healthy local ingredients. In short, we transformed the system in one school district and created a model for the country.

Cynthia Coburn, a professor at the School of Education and Social Policy at Northwestern University, has studied scale—the spreading of practices to greater numbers of people and organizations—and sustainability—the ability to maintain a change of practice over time. She says measuring a foundation’s success at scale depends on three simple questions whose answers may vary with projects and change over time:

  1. What are you trying to scale? (A program, a framework, or a set of design principles?)
  2. Who is your target audience, and what is the context for implementation and scale?
  3. What are you trying to make happen? (Do you seek adaptation, adoption, replication, or reinvention?)

Common Approaches to Scale

At the Bill & Melinda Gates Foundation, we are five years into a 15-year strategy to improve college readiness in K-12 education, and we are asking ourselves these same questions. We are committed to supporting innovations that work not just once or twice, but persistently—innovations that improve the lives of as many as possible for as long as possible. Since launching our strategy, we’ve become more deliberate about seeking solutions that travel well. It’s not an easy task. There are pockets of excellence in U.S. education, but they don’t spread as quickly as we would hope or survive as long as we would like.

As we work to overcome those problems, we have learned a lot about scale and sustainability. And we now know that if we are going to dramatically accelerate change in public schools, traditional approaches to scale and sustainability will no longer work.

Grantmakers typically take one of three approaches to scaling.

The first is to fund an initiative that works and then phase it into a growing number of sites. This strategy of “piloting to scale” makes sense; it’s usually wise to try something out before you take it on the road. Conditions can change in the process, however, and piloting takes time and limits investments to certain places.

The second strategy is to invest heavily to perfect an initiative in a few sites, then spread the lessons in the hope that others will follow. The Edible Schoolyard is a perfect example of this “proof-point” approach. The challenge is to be smart about replication and to be clear about what exactly is scalable. It’s also worth noting that when you put a site on a pedestal in this way, others can knock it off. (“Of course they were able to do that, given all the money they got!”) Or the initiative can simply fail at that site, sinking its perceived value even if the failure stemmed from the site’s larger problems rather than from the initiative itself.

A third approach to scaling is to direct investments based on national, state, or district policies. We’ve seen again and again—through No Child Left Behind, teacher quality initiatives, and now Common Core State Standards—that the classic “policy play” can move large numbers of states and districts to action. Policy can often give funders their best chance at systems-level change, but if it lacks evidence of success or support for implementation, it doesn’t lead to sustainable scaling. It might even cause backlash.

A Fourth Path: Disciplined Design

What we’ve learned at the Gates Foundation is that achieving scale and sustainability often requires a fourth approach—one that I call “disciplined design.” You start with a conceptual framework, or a set of design principles, informed by practice and research. Then you support grantees as they apply these in a variety of cases, monitoring implementation to see what needs to be changed. Often, you discover a lever that markedly accelerates the work. Notably, this approach gives practitioners—in our case, teachers—a strong voice: the educators themselves recommend changes based on their experience, and you adjust the framework or build out those components that seem to stick.

Disciplined design requires research and evidence, but it also welcomes new ideas and unintended consequences. It allows for messiness, iteration, and deep inquiry into what exactly works and why. When funders take a disciplined design approach, they connect dots (partners, programs, problems) on multiple fronts and view individuals, systems, and networks as partners who are all critical to scale and sustainability. They accept that scale is not a linear process and that sustainability doesn’t happen by chance. Researcher Diana Laurillard, in her book on teaching and technology, says, “Teaching is a design science in a sense that its aim is to keep improving its practice, in a principled way, building on the work of others.” Shouldn’t this kind of science be our approach as funders?

In using disciplined design to scale our work, we are learning several important things:

Design matters.

No matter what the approach, no initiative will scale well without thoughtful design. Some say that school contexts and teachers’ experiences are so distinct that they can’t possibly use the same tools. Others argue that a good tool can work the same way for everyone. The truth lies in the middle. The most successful tools—the ones that work best and scale best—hew to a consistent set of design principles and practices, but are flexible enough so users can adapt them to their own needs. The users must be able to help design the tools, not just be told to use them. Tools (and frameworks) must be tested across a broad swath of users and organizations, and improved along the way.

One of our Gates grantees is the Literacy Design Collaborative (LDC), which offers a tool to help teachers design high-quality lessons. We started with a basic framework to guide development of the tool. Teachers helped build out the framework, co-designing user-friendly tasks and templates based on it and helping to improve them. Their enthusiasm ensured that the project would spread to places we never could have reached without them. In 2012, LDC and a sister project, the Mathematics Design Collaborative (MDC), were taking hold in just four states. A year later, they had reached an additional 130,000 teachers in 23 states.

Language matters.

If you want a network to be able to scale an initiative, everyone needs to speak the same language. Teachers in the LDC make their own decisions about what texts and instructional strategies to use, but their vocabulary is consistent across all sites; a “module” and a “template task” mean the same thing to all of them. Common language helped us scale LDC across a diverse set of networks and districts from Georgia to Colorado to California.

Time matters.

Users need time to incorporate new tools into their practice. In our case, that means teachers need time during the school day to consult with colleagues in their content area or grade level. We have funded a group of districts that are finding creative ways to carve out at least one full day a week for their teachers to learn how to get better at what they do. In at least two of these districts—Fresno, California, and Bridgeport, Connecticut—this time has substantially increased teacher engagement and collaboration. This is significant, because when the players are more engaged, reforms are much more likely to be widespread, successful, and sustainable.

People matter.

People, not programs or institutions, are the agents of scale. And these ambassadors don’t have to be the actual leaders of a grantee organization. A teacher who is deliberate about improving, seeks out resources to do so, and shares what works can be the most powerful person to scale a valuable tool. With the LDC, the educators we called “founding partners” carried the work across their networks and trained their colleagues at their schools. It’s important to identify and cultivate these early adopters.

Networks matter.

It’s more efficient and effective to design and scale an initiative with predictable partners while also considering some adjacent networks. Early partners in education initiatives tend to be state departments of education and school districts. But at Gates, we’ve increasingly relied on different types of networks. Our partners in the MDC and LDC included geographic networks like the Southern Regional Education Board, professional networks like the National Writing Project, and service providers like Scholastic. Rather than investing in just 20 school districts to scale technical assistance and advocacy related to the Common Core State Standards, we invested in networks that reached thousands of districts. And we see teachers as partners who can reach many more.

Stories matter.

“Foundation speak” too often separates us from each other and undermines our efforts to convey important messages. We invent new words to reframe the debate, sometimes to the detriment of our cause. What is the larger narrative? How is the general public framing the debate? What are the values that underlie the issue we care so much about? Holding focus groups to frame “our” issue, we fail to listen to the themes dominating social and traditional media and other outlets. By listening better, and understanding the influencers, we may capture a broader set of constituents and attract some unlikely partners. If we are to scale our work, our value proposition must resonate with the people we want to do that work—in our case, teachers and school leaders.

Markets matter.

Innovation is often constrained by a lack of resources or tools. Often the reason is a dysfunctional market: suppliers don’t understand users’ needs, or bureaucracy prevents suppliers from reaching users. At Gates, we address both supply and demand. Often what suppliers need is better research about demand; they need to know more about what school leaders need to do their best work. With better information, we can create incentives for multiple players to fill these gaps. We can hold design challenges, for instance, or make equity investments. Funders can also hold convenings to connect users with innovators, then support rigorous evaluations of those innovations. With better market dynamics, we believe that the best innovations will gain traction and can scale.

We’re Still Learning

As Coburn says in “Spread and Scale in the Digital Age,” there are four approaches to scale: adoption, when organizations or people embrace a tool; replication, when they use it in a prescribed way; adaptation, when they modify tools for local needs; and reinvention, when they use the tools as a springboard for innovation.

Most foundations have at some point embraced replication; they want to find a promising model and fund other sites that can implement that model faithfully. At first, this was the path of the Chez Panisse Foundation. But relying on lockstep replication can backfire. For starters, tools are never foolproof. What works in one place may not work in another because of differences in local context.

Using disciplined design and considering other factors like time, language, stories, people, markets, and networks might make the work more complicated, but it can make the implementation more scalable and sustainable in the end.