Do Gooder
The Receipts

When the evidence is thin

Rees Calder · 18 April 2026 · 6 min read


A lot of impact writing assumes you get to choose between options with clean randomised controlled trials behind them.

In practice you almost never do. GiveWell lists fewer than a dozen top charities. The global set of interventions with multiple independent RCTs numbers in the low hundreds. The total space of things humans spend money on trying to do good is vast, and most of it has no causal evidence of any quality at all.

That doesn't mean the rest is worthless. It means the honest question is how to decide when the data is thin, and that question deserves a better answer than "only fund the things GiveWell funds."

What we know about what we don't know

Start with an uncomfortable fact. Even the well-evidenced interventions have wider error bars than people pretend.

GiveWell's own 2023 moral weights document acknowledges roughly 2-3x uncertainty in the cost-effectiveness estimates of its top charities. That means the real number for Against Malaria Foundation might be $2,000 per life saved or $15,000, depending on parameter choices. The $5,000 figure is a central estimate in a distribution, not a measurement.
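To see what multiplicative uncertainty means in dollars, here's back-of-envelope arithmetic using the $5,000 central figure and a 3x factor. This is illustration only, not GiveWell's actual model:

```python
# Back-of-envelope only: a central estimate with multiplicative
# ("within a factor of 3") uncertainty. Not GiveWell's model.
central = 5_000            # central cost per life saved, USD
factor = 3                 # truth plausibly within central/3 to central*3

low, high = central / factor, central * factor
print(f"${low:,.0f} to ${high:,.0f} per life saved")  # → $1,667 to $15,000 per life saved
```

Note the range is symmetric in ratios, not dollars: the low end is $3,333 below the central estimate, the high end $10,000 above it. That asymmetry is why quoting a single cost-per-outcome number without the spread misleads.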

The Cochrane Collaboration, which reviews medical evidence more rigorously than any other body, classifies most of its reviews as "low" or "very low" quality evidence. A 2016 meta-analysis of Cochrane reviews found that only 13.5% of interventions had "high quality" evidence behind them, and that was in medicine, which is the best-studied domain in human history.

The rest of the world is messier. Poverty interventions, climate interventions, animal welfare interventions, policy advocacy, research funding. Almost all of it is making decisions on partial information.

The three failure modes

When evidence is thin, people usually go wrong in one of three ways.

Failure mode one: pretend the evidence is thicker than it is. Cherry-pick the one favourable study. Cite the intervention's own impact report as if it were external validation. Treat expert opinion in the funder's network as independent confirmation.

This is how most charity marketing works. It's also how a lot of well-meaning donor advice works. The tell: confident cost-per-outcome numbers with no mention of uncertainty.

Failure mode two: refuse to decide. Conclude that since no intervention has gold-standard evidence for your specific context, nothing can be said, and therefore default to either (a) not giving or (b) giving to whoever you already liked.

This is the more seductive failure. It wears the costume of epistemic humility. In practice it's the cheap exit from a hard problem.

Failure mode three: treat "no RCT" as "no evidence." Discount anything that can't be randomised, which includes most policy work, most movement building, most research. This collapses the field of possible impact to the narrow subset of interventions amenable to controlled trials, which is not the same as the subset of interventions that matter.

A rougher, more honest framework

Here's a better approach, sketched by Holden Karnofsky in the Open Philanthropy Project's "hits-based giving" essay and refined over the last decade by several EA research orgs.

Use the tiers GiveWell itself uses internally. Adapt the language.

Tier 1: Robust causal evidence. Multiple RCTs, replicated by independent teams, in the context you care about. Roughly: AMF, Malaria Consortium, Deworm the World, some cash-transfer programmes. Small set. Trust the number to within 2-3x.

Tier 2: Plausible-but-thin evidence. One or two RCTs, or strong observational data with clear mechanism. Many global health interventions, some education programmes, cash-plus programmes. Update from the Tier 1 baseline, don't replace it.

Tier 3: Theory-of-change plus track record. No causal evidence available and may never be. Policy advocacy, research funding, capacity building, movement infrastructure. Judge on: credibility of mechanism, quality of implementation, track record of the specific team. Expect wide error bars. Give weight to the quality of the people, because at this tier the mechanism is "these people will make good calls under uncertainty."

Tier 4: Pure bets. Early-stage ideas, new fields, moonshots. Think of your portfolio here as a venture fund. Most will fail. The ones that work pay for the ones that don't.

The mistake most donors make is refusing to fund Tiers 3 and 4, because the evidence is thinner, and then watching the actual biggest leverage (policy wins, field-building, research breakthroughs) happen on someone else's dime.
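To make the tier-and-portfolio idea concrete, here's a toy sketch. Every grant name and number below is invented for illustration, not sourced from any evaluator:

```python
# Toy sketch of tier-weighted giving. All values invented; the
# structure, not the numbers, is the point.

portfolio = [
    # (name, tier, grant in USD, best-guess impact per dollar,
    #  multiplicative uncertainty: truth plausibly within this factor)
    ("bednets",         1, 40_000, 1.0,  2),
    ("cash-plus pilot", 2, 25_000, 1.2,  4),
    ("policy advocacy", 3, 20_000, 3.0, 10),
    ("field moonshot",  4, 15_000, 8.0, 30),
]

for name, tier, grant, impact, spread in portfolio:
    best = grant * impact
    lo, hi = best / spread, best * spread
    print(f"Tier {tier} {name:15s} best guess {best:>9,.0f}  "
          f"plausible {lo:>7,.0f} to {hi:>9,.0f}")
```

The thin-evidence tiers carry both the biggest best-guess upside and ranges wide enough to include near-zero, which is exactly the hits-based-giving argument: you fund them as a portfolio, not as individual sure things.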

How to weigh it in practice

Three heuristics that hold up reasonably well when the data is thin.

Prefer interventions with natural feedback loops. If a programme gets rapid data on whether it's working, even imperfect data, it will improve. Programmes that don't measure anything and can't measure anything are betting on theory alone.

Prefer teams with skin in the game. People running the intervention who depend on its success for reputation, funding, or livelihood are more likely to spot when it isn't working than detached advocates.

Prefer reversible allocations. A one-year grant you can decline to renew is a better bet than a multi-year commitment in thin-evidence space. Optionality is worth real money when you're operating on partial information.
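"Optionality is worth real money" can be made literal with a toy model. All numbers here are invented: a grantee turns out effective (worth 2.0x per dollar) or not (0.2x) with equal odds, and one year of funding reveals which:

```python
# Toy option-value sketch; every number is invented for illustration.
p_good = 0.5           # chance the grantee is effective
v_good, v_bad = 2.0, 0.2   # impact per dollar in each case
v_alt = 1.0            # an average alternative you could fund instead
grant = 10_000         # per year; horizon is three years

year1 = grant * (p_good * v_good + (1 - p_good) * v_bad)

# Multi-year commitment: years 2-3 locked in regardless of what you learn.
committed = year1 + 2 * grant * (p_good * v_good + (1 - p_good) * v_bad)

# Renewable: continue if effective, else redeploy to the alternative.
renewable = year1 + 2 * grant * (p_good * v_good + (1 - p_good) * v_alt)

print(committed, renewable)
```

In this toy setup the committed path is worth about 33,000 impact-units and the renewable one about 41,000. The gap is the option value: the probability of discovering a dud, times the money you get back, times the difference between the alternative and the dud.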

The uncomfortable part

None of this gets you to certainty. The whole framework is an explicit admission that most giving will happen under material uncertainty, and some of it will be wrong.

That's the honest frame. Expected-value thinking with calibrated uncertainty, not false precision. If you only give when the evidence is gold-standard, you are de facto giving less than you could, or you are letting your definition of "evidence-based" do a lot of convenient work.

The middle road is clearer-eyed and more demanding. Weight what you fund by how robustly you know it works, keep a portfolio that spans tiers, and expect some of it to fail. That's closer to what smart philanthropy actually looks like.

Sources used: GiveWell Moral Weights and CEA Framework (2023), Cochrane Collaboration review quality data (Ioannidis et al, 2016), Open Philanthropy Hits-Based Giving essay (Karnofsky, 2016), Rethink Priorities methodological notes (2022-2024).


Keep reading

Get the next one in your inbox. Every Tuesday. Free.