CYS A/B Testing: Variables, Baseline, Try, Timeline, Trends


Digital marketing optimization is hard. Setup is easy(ier). Results usually come quickly with Adwords. But what do you do next? How much is possible? What's the best keyword? Bidding strategy? Time of day?

Setup is easy. Set it up, make sure you have conversion tracking dialed in, and let it run. Making it better (or fixing it when something goes sideways) is much, much harder.

The problem is the lack of a system--a framework for answering the very simple question: is it working? The options in Adwords alone are staggering, much less when you're faced with a more complex marketing mix across multiple channels. It's like this:

In our agency I've found it pretty straightforward to teach newcomers the research and setup process. This concept of "what to do next" really developed organically as our agency grew and I needed to teach more of the folks we were hiring how to think about Adwords--beyond just the initial setup.

I also have to admit that, at times over the years, I've been dismayed by a nagging feeling that I didn't always understand which levers were working, and why.

So we started using a spreadsheet--as we search marketers so often do. The idea is pretty simple and goes like this:


  • VARIABLES: Before you can hope to impact something, you have to define what that something is. In most cases we're focused on cost-per-lead as our main variable, but we've used this same system to track CPCs, CTR, and engagement stats like bounce rate or time on site.


  • BASELINE: Establish a baseline for how things have performed historically for the variables you just decided on. The more data the better. We typically like to start with at least two out of three of: 1) 45 days, 2) $500 in ad spend, and 3) 10 conversions.


  • TRY: Now the fun part. Try something. Think that shiny new automated bidding strategy in Adwords is just the ticket? Well fire it up!


  • TIMELINE: This connects back to your baseline. If you have a 6-month baseline with $20k in ad spend, testing for 2 days with $200 is not an acceptable comparison. The sample size can change, but the idea of comparing apples to apples remains constant. And marking down the date of each change makes everything much clearer.


  • TRENDS: Well...what's happening? Even if you think you need to wait longer for your timeline to be closer to your baseline, how do things look? Way better? Way worse? Trending better or worse every time you check? These are clues. Listen to them. 
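For the programmatically inclined, the steps above can be sketched as a tiny script. All numbers here are hypothetical, and the 10%-of-baseline-spend comparability cutoff is my own illustrative assumption, not a rule from the spreadsheet:

```python
from dataclasses import dataclass

@dataclass
class Period:
    days: int
    spend: float        # ad spend in dollars
    conversions: int

    @property
    def cost_per_lead(self) -> float:
        return self.spend / self.conversions

def meets_baseline_threshold(p: Period) -> bool:
    # BASELINE: at least two out of three of 45 days, $500 spend, 10 conversions.
    checks = [p.days >= 45, p.spend >= 500, p.conversions >= 10]
    return sum(checks) >= 2

def trend(baseline: Period, test: Period) -> str:
    # TIMELINE: apples to apples -- flag tests far smaller than the baseline.
    # (10% of baseline spend is a hypothetical cutoff for illustration.)
    if test.spend < 0.1 * baseline.spend:
        return "too early to call"
    # TRENDS: is cost-per-lead moving the right way?
    if test.cost_per_lead < baseline.cost_per_lead:
        return "trending better"
    return "trending worse"

# Hypothetical campaign: 6-month baseline vs. a 30-day test of some change.
baseline = Period(days=180, spend=20_000, conversions=400)  # CPL = $50
test = Period(days=30, spend=2_500, conversions=60)         # CPL ~= $41.67

print(meets_baseline_threshold(baseline))  # True
print(trend(baseline, test))               # trending better
```

A spreadsheet does the same job, of course; the point is only that each step in the process reduces to a comparison you can write down and check.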


This is what it ends up looking like:


I can tell you, after using this process on a few hundred thousand dollars in ad spend over about a year, that we're still not always sure upfront which changes will drive the outcomes we want--but we do have a system in place to tell us whether we're on the right track.

And I think the beauty goes even deeper, because this process of tracking changes and outcomes builds a narrative around a particular campaign that we can understand in reverse.

It's like a historical record of everything tried: paths walked, then abandoned for some reason. And in this way, we truly do know something about these complex systems and how we can impact them.