Behind the CMO

The Performance Illusion

We measured everything and learned nothing. Here's how that happened.

Everyone at the company has this dashboard bookmarked. The CMO. The CEO. The CFO. It updates every four hours. Pretty charts. The most important one is top-left: daily cost-per-sale by channel, rolling seven-day average, YoY comparison. Labeled “Marketing Efficiency”.

Everyone has an opinion about what it says. Nobody, in private, actually trusts it.

The CMO suspects channel attribution drifted after the last website migration. The CRO suspects the lead quality mix is shifting in ways the chart doesn't show. The CFO suspects the conversion definitions got misaligned two quarters ago. The CEO suspects all of the above, which is why the meetings open with four minutes of dashboard review and end with a decision that has nothing to do with what the dashboard said.

The dashboard, which was built, funded, and maintained to drive decisions, doesn't drive decisions.

It justifies them.

A decade, working backward

I wrote about the current state of attribution in The Deal You Lost Was Never Real. The scoreboard is fiction. What I didn't cover there is how we got here. The field has been running a confidence game on itself for fifteen years. Each generation of attribution methodology made us feel more rigorous without making us any better at deciding.

Last-click (2010-2015). The last ad clicked gets 100% of the credit. Obviously bad. Everyone knew it was bad and defended it anyway, because last-click had one virtue every subsequent method has lacked. It was legible. Any marketer could explain it to any CFO in ten minutes. Then App Tracking Transparency shipped with iOS 14.5 in 2021 and broke the tracking chain, and last-click became structurally impossible.

MTA (2015-2020). Multi-touch fractionalized credit across the funnel. U-shaped, time-decay, position-based, algorithmic. Either rule-based or a black box. Rule-based MTA encoded the biases of whoever chose the rules. Algorithmic MTA found correlations and called them causal. Nobody seriously believed the numbers. The dashboards still rendered them to two decimals. Companies started accepting the two-decimal precision as meaningful.

Clean rooms (2020-2023). Privacy-safe, walled-garden-compliant. Google's clean room told you paid search worked. Amazon's told you Amazon ads worked. Meta's told you Meta ads worked. Each query was constrained, each sample was controlled, and each output flattered the partner. I watched a Fortune 500 team try to reconcile three different clean rooms for six months. The recommendation: "Each channel appears to be the most important channel, and we can't tell which one is actually true."

MMM (2024-present). Aggregate regression. Privacy-safe. Confidence intervals. On paper, rigorous. In practice, retrofitted storytelling. The modeler gets paid by the CFO's office. Any output the CFO doesn't like gets sent back as a "data quality issue." The model that comes back on round two is a different model, built to produce a different answer. I've sat in those rework meetings.

Whoever controls the model controls the budget. The CFO controls the model.

The rigor is real in the lab. In the field, the rigor is theater.

The category error

Attribution is a budget-justification tool, not a decision tool.

We've been pretending it was the second while optimizing for the first. The two are in direct tension. A tool that justifies last quarter's decisions can't also reveal those decisions were wrong. A tool that reveals decisions were wrong gets sent back for rework. So the tool justifies, or the tool doesn't survive.

Once you see this, the decade makes sense. The dashboards weren't for deciding. They were for justifying. The field's refusal to admit this has cost us the capacity to do the actual measurement that would tell us whether anything we were doing worked.

What rigor actually is

Three methodologies can tell you whether marketing is working. Holdout tests, geo-experiments, and kill tests. Each produces a specific answer. Each requires something the dashboard doesn't. A willingness to learn the truth.

Holdout tests. Random sample, withhold marketing spend, compare to treated. The only attribution method that actually proves causality. Almost nobody runs them. When I ask why, the answers are "we don't have the tech," "the media partners won't allow it," or "too expensive." The first two are usually solvable. The third is misdirection. Holdouts aren't financially expensive. They're politically expensive. If you run one on paid search and discover the holdout converts at 93% of the treated rate, you've just proven that only seven percent of the conversions paid search gets credit for are actually incremental. Then you have to cut it or explain to the CFO why you won't. Most CMOs prefer not to know.
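
To make the holdout arithmetic concrete, here is a minimal sketch in Python. The function names, the 2.0% treated conversion rate, the $100,000 spend, and the 1,000 conversions are all hypothetical numbers chosen for illustration. Only the 93% ratio comes from the example above.

```python
def incremental_share(treated_rate: float, holdout_rate: float) -> float:
    """Fraction of the treated group's conversions that the spend caused.

    The holdout rate is the counterfactual: what converts with no spend at all.
    """
    if treated_rate <= 0:
        raise ValueError("treated_rate must be positive")
    return (treated_rate - holdout_rate) / treated_rate


def cost_per_incremental_sale(spend: float, conversions: float, share: float) -> float:
    """Spend divided by only the conversions the channel actually caused."""
    incremental = conversions * share
    if incremental <= 0:
        raise ValueError("no incremental conversions to divide by")
    return spend / incremental


# Hypothetical inputs: treated converts at 2.0%, holdout at 93% of that.
treated_rate = 0.02
holdout_rate = 0.93 * treated_rate

share = incremental_share(treated_rate, holdout_rate)          # about 0.07
dashboard_cps = 100_000 / 1_000                                # what the chart shows
true_cps = cost_per_incremental_sale(100_000, 1_000, share)    # what the test shows

print(f"incremental share: {share:.1%}")
print(f"dashboard cost per sale: ${dashboard_cps:,.2f}")
print(f"cost per incremental sale: ${true_cps:,.2f}")
```

Under these assumed numbers, the dashboard and the holdout disagree by more than a factor of ten about what a sale costs. The dashboard divides spend by every conversion it can claim. The holdout divides it by the ones that wouldn't have happened anyway.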

Geo-experiments. Pick a geography. Reduce spend. Compare to a matched control. Meta publishes its own open-source geo-lift tooling. The best measurement practitioners I know (Jon Evans at Uncommon, the Haus team, Nielsen's newer practice) use geos as the backbone. Geos don't render well in a dashboard. They produce one answer per quarter, not seventeen answers every four hours. The people who use them trust the answers. The people who build dashboards mostly don't know how to read them.

Kill tests. Pause a channel entirely for a defined period. Watch what happens. Revenue doesn't fall? You've proven the channel doesn't work. Revenue does fall? You've measured the actual contribution, which no model has ever done. Marc Pritchard's P&G cut in 2017 was, in effect, a kill test. They pulled $200M from digital over brand safety and transparency concerns, and reach went up 10% (Adweek). Pritchard has been publicly skeptical of attribution ever since. Byron Sharp and the Ehrenberg-Bass team have been pointing at the same thing for a decade in quieter language.

Why we don't run them

The tests aren't hard. They're trivial. Pick a channel. Tell the agency to stop spending for a period. Watch the business.

The tests are politically dangerous.

Run a kill test and revenue doesn't fall, you've proven your channel doesn't work. Now you have to cut it, which gives the CFO ammunition, which makes your next budget cycle worse. Run a kill test and revenue does fall, you've lost revenue to prove a point, and now you have to explain to the CEO why you chose to lose revenue for research. Both outcomes are career-expensive.

The third outcome, where you don't run the test and everyone nods along to the MMM deck, protects your job.

So we build another dashboard.

This is the structural reason attribution has gotten worse for a decade. The incentive is to not know. Every methodology that lets us not know has been adopted enthusiastically. Every methodology that forces us to know has been quietly declined.

A few CMOs run holdouts and kill tests anyway. They're uniformly the CMOs with the longest tenures. Some of them you've heard of. Mark Ritson, perpetually. Les Binet. Jon Evans. They aren't better at dashboards. They chose to know.

The one question

The question isn't "what does the model say?"

The question is: if I turn this off, what breaks?

Every other question is a variation on "what has the model been trained to tell me?" The counterfactual is the whole game. A tool that doesn't address the counterfactual is a narrative tool, not a measurement tool.

Narrative tools aren't useless. They let you report to the board. Justify your budget. Tell a coherent story at the QBR. They don't help you decide where the next dollar goes. Structurally can't. They were built for a different job.

Pick a channel you've been spending on for more than a year. Ask yourself what you know about its contribution. Not what the dashboard says. What you know, with enough confidence that you'd defend it in an argument with someone smarter than you. If the answer is "the MMM says elasticity is 0.12," you don't know. You know what the model says. If the answer is "we ran a 90-day holdout last Q3 and the treated group converted 14% higher at p<0.05," you know.
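
For anyone who wants to check a claim like "14% higher at p<0.05" themselves, a rough sketch follows. It uses a standard two-proportion z-test, which is one common way to test a holdout, not necessarily the one your analytics team uses. The group sizes and conversion counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_treated: int, n_treated: int,
                          conv_holdout: int, n_holdout: int):
    """Return (relative lift, z statistic, two-sided p-value)."""
    p_t = conv_treated / n_treated
    p_h = conv_holdout / n_holdout
    # Pooled rate under the null hypothesis that the spend changes nothing.
    pooled = (conv_treated + conv_holdout) / (n_treated + n_holdout)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_treated + 1 / n_holdout))
    z = (p_t - p_h) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    lift = p_t / p_h - 1
    return lift, z, p_value


# Hypothetical: 50,000 people per group, 2.28% vs 2.00% conversion.
lift, z, p = two_proportion_z_test(1_140, 50_000, 1_000, 50_000)
print(f"lift: {lift:.1%}, z: {z:.2f}, p: {p:.4f}")

# Same 14% lift on a tenth of the sample.
small_lift, small_z, small_p = two_proportion_z_test(114, 5_000, 100, 5_000)
print(f"lift: {small_lift:.1%}, z: {small_z:.2f}, p: {small_p:.4f}")
```

With these assumed counts, the first test clears p<0.05 and the second doesn't, even though the lift is identical. That's the difference between a result and a number. The lift alone tells you nothing until you know how many people were in the holdout.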

Most CMOs can answer only with a model output. The ones who answer with a test result are the ones the CFO has learned to trust.

What to do this year

Stop pretending the dashboard is a decision tool.

Pick one channel. Run one real kill test. Not "we'll spend 30% less for a month." Zero spend, 60-90 days, a matched control. Pick the channel you're least confident in so the downside is smallest. Tell the CFO before you do it. Tell them you'll act on the result.
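
The matched control is what turns "we paused it and revenue dipped" into a measurement. One simple way to read the result, sketched below, is a ratio-based difference-in-differences: use the control's change to estimate what the test market would have done with spend left on, then compare. The function name and the weekly revenue figures are hypothetical, and a real analysis would want more weeks and a check that the two markets tracked each other before the test.

```python
from statistics import mean


def kill_test_effect(test_pre, test_post, control_pre, control_post) -> float:
    """Estimated revenue change in the test market caused by going dark.

    Returned as a fraction of the counterfactual, so -0.05 means revenue
    came in 5% below where the control says it should have been.
    """
    # How much the control market moved over the same period.
    control_change = mean(control_post) / mean(control_pre)
    # Counterfactual: the test market, had it moved like the control.
    expected_test_post = mean(test_pre) * control_change
    return (mean(test_post) - expected_test_post) / expected_test_post


# Hypothetical weekly revenue, in thousands of dollars.
test_pre = [100, 102, 98, 100]               # test market, spend on
test_post = [99.75, 99.75, 99.75, 99.75]     # test market, spend off
control_pre = [200, 204, 196, 200]           # matched control, before
control_post = [210, 210, 210, 210]          # matched control, same weeks

effect = kill_test_effect(test_pre, test_post, control_pre, control_post)
print(f"estimated effect of going dark: {effect:.1%}")
```

In this made-up case the test market looks almost flat on its own, which would read as "the channel did nothing." The control grew 5% over the same weeks, so flat actually means the channel was worth about 5% of revenue. Without the control you'd have drawn the wrong conclusion.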

Next year, run two. The year after, three.

Over five years, you'll know things about your marketing that no dashboard has ever told you. The $400K you would have spent on an MMM rebuild will still be in your budget. You'll be using it on tests.

Most CMOs will read this and do nothing. That's fine. The ones who do will, in five years, be the ones nobody can argue with.

The dashboard with the seventeen charts isn't wrong. It's just not answering the question. Nobody ever asked it to.

If you don't want the truth, build a dashboard.

If you do, turn something off and see what breaks.
