I hate bad modeling. I consider it an abomination.
By bad modeling, I mean models that depict what people want to be true, vs. what is actually true. Or models that focus on potential and then get all hand-wavy about probability.
“There are a billion people buying widgets every year, with just $1 from each we will have a billion dollar business!”
Yeah, impressive potential, but how probable is it?
I would assume not very, until proven otherwise.
Most detailed models are expressed in a spreadsheet, such as estimates for a new product, partnership, campaign, etc. It could have multiple rows and columns representing calculations across multiple months or years. It might even be color-coded for readability.
The story generally starts with real data, i.e. a billion people buying widgets, but the problem is that somewhere along the line it often turns into science fiction.
How does science differ from science fiction?
There’s no fiction in science.
Most models contain key variables that the whole thing hinges on. Those variables become fiction when you input assumptions and accept them as true without validating how likely they are to actually be true.
You are better off accepting that your input assumptions are unlikely to be true, and then spending the time to prove what they actually are.
To illustrate this, consider one of my favorite jokes.
A physicist, a chemist and an economist are stranded on a deserted island with no food.
A can of beans washes ashore, and the three starving experts debate how they can open it.
Physicist: “I know, we’ll use gravity! I can climb a tree and drop the can on a boulder to crack it open.”
Chemist: “No way, the beans could splatter everywhere! Just give me a few days, and I’ll use seawater to weaken the can so we can more easily open it.”
Economist: “You are both making this much too complicated! I’ve got the answer. First, assume we have a can opener…”
I mean, it’s funny because “assuming a can opener” is obviously ridiculous. But what if it were less obvious?
What if the three spotted a crate on top of a mountain and had to decide whether it would be better to try to open the can of beans now, or to send someone up the mountain to discover if the crate contains anything that could help?
The crate represents an unknown opportunity, but even then I would argue that they are better off assuming a low probability that the crate has what they need vs. relying on it, especially if the mountain is high and the path is risky.
In short, unvalidated assumptions in models are dangerous because they can skew the whole model and lead you to invest too much in the wrong thing.
But also because they distract you from the reality you actually have to work with.
Over the years I have seen so many models with unvalidated assumptions stacked on top of each other, like an endgame of Jenga, but instead of wooden blocks you have a tower of stacked can openers… which may or may not be there.
An abomination.
So how should we use models?
I have found models to be most powerful when you use them as part of a discovery process. Treat them as an exercise to identify killer assumptions, i.e. ones that will make it all work if they are true, but that would upend your efforts if they are not.
Then focus on validating and determining your ability to manipulate the underlying probabilities of those assumptions.
As an example, below is a very simplified model for a proposed distribution partnership, where your product is offered as a free trial to new customers of the partner.
Number of new customers × Offer opt-in rate × Activation rate × Renewal rate (post-trial)
If this is all brand new (unproven), you should assume a heavy discount rate—a low probability—for each of those variables until you can 1) validate what they actually are, and/or 2) convincingly manipulate them to improve.
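To make the discounting concrete, here is a minimal sketch of that model in Python. The volumes and rates are purely hypothetical, and "retained customers" is just my shorthand for the product of the four variables above.

```python
# Minimal sketch of the partnership model above, with hypothetical inputs.
# The "optimistic" rates are the kind of unvalidated assumptions to distrust;
# the "discounted" rates apply a heavy haircut until each one is validated.

def retained_customers(new_customers, opt_in_rate, activation_rate, renewal_rate):
    """Customers still paying after the free trial, per the model above."""
    return new_customers * opt_in_rate * activation_rate * renewal_rate

optimistic = retained_customers(100_000, 0.20, 0.50, 0.60)  # what the pitch deck assumes
discounted = retained_customers(100_000, 0.05, 0.20, 0.30)  # heavily discounted until proven

print(f"Optimistic: {optimistic:,.0f} retained customers")  # 6,000
print(f"Discounted: {discounted:,.0f} retained customers")  # 300
```

The point is not the specific numbers, it is how quickly a few optimistic rates compound: a plausible-looking model can overstate the outcome by an order of magnitude or more.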
For example, to validate or improve those variables, you could:
Get data from your partner on the performance of similar offers, to establish a ballpark
Conduct quick test campaigns and user research with the partner's customers
Ask the partner to compensate you for the free trial, to gauge how certain they are of success
Explore cost-effective ways to remove friction from the experience, e.g. bundling the free trial automatically, making the offer opt-out instead of opt-in, etc.
Using what you learn, either update the model, or focus on efficient investments to improve your probability of success.
Or walk away, that’s an option, too.
What you shouldn’t do is start a heavy engineering investment to support the partnership, use the model to determine your quarterly goals, etc.
To borrow an analogy from my old colleague Erik Simanis, your goal is to determine if your efforts can “take a punch in the mouth” and keep standing. And to identify if there are key elements you need to strengthen or obstacles you need to remove.
The job of a model is to help you discover the truth, not to stand in for the truth.
“All truths are easy to understand once they are discovered; the point is to discover them.” - Galileo
Addendum: What about forecasting?
I am of the camp that forecasting should only use data that is known to be true or likely to be true.
For new products or initiatives, that’s often not the case, and while I’ve seen teams use some variant of high, medium, and low demand estimation to model this, most of the time they are not nearly conservative enough.
For initiatives with unvalidated assumptions, your worst-case scenario is not achieving 30% of your forecast; it's achieving 0%. And in my experience, 0% is closer to the truth than you initially think.
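As a rough sketch of that point, with entirely hypothetical numbers: the typical "low" scenario still assumes the core mechanism works, while the real floor for an unproven initiative is zero.

```python
# Hypothetical illustration: typical scenario planning vs. the real floor.
# The "low" case haircuts the forecast but still assumes the model's
# killer assumptions hold; the true worst case is that they don't.

forecast = 1_000_000  # revenue forecast built on unvalidated assumptions

scenarios = {
    "high": 1.00,    # everything works as assumed
    "medium": 0.60,  # a typical "realistic" haircut
    "low": 0.30,     # the usual modeled worst case
    "floor": 0.00,   # the actual worst case when a killer assumption fails
}

for name, factor in scenarios.items():
    print(f"{name:>6}: ${forecast * factor:,.0f}")
```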
So don’t include assumptions that are unproven and material in your forecast models, especially if you intend to measure performance based on them.
"Assumptions are made, and most assumptions are wrong." - Albert Einstein