Multivariate Testing: Promises and Pitfalls for High-Traffic Websites

Multivariate testing offers high-traffic websites the ability to find the right combination of features and creative ideas to maximize conversion rates. However, it is not sufficient to simply throw a bunch of ideas into a pot and start testing. This article answers the question “What is a multivariate test?”, explains the advantages and pitfalls of multivariate testing, and offers some new ideas for the future.
If you run a relatively high-traffic site, consider this question: Will I profit from running multivariate tests?
Before we dive into the question, let’s be sure to define the terms. I’ll talk about the dangers of running a multivariate test (MVT) and when you should consider using one.

What Is Multivariate Testing?

Multivariate testing is a technique for testing a hypothesis in which multiple variables are modified.
Multivariate testing is distinct from A/B testing in that it involves the simultaneous variation and analysis of more than one element. Instead of measuring A against B, you are measuring A, B, C, D and E all at once.
Whereas A/B testing is typically used to measure the effect of more substantial changes, multivariate testing is often used to measure the incremental effect of numerous changes at once.
This process can be further subdivided in a number of ways, which we’ll discuss in the next section.

Multivariate, Multi-variant or Multi-variable

For this article, we are focusing on a specific way of testing in which elements are changed on a webpage. Before we dive into our discussion of multivariate testing, we should identify what we are talking about and what we are not talking about.
One of the most frequently tested items is the landing page headline. Getting the headline right can significantly increase conversion rates for your landing pages. When testing a headline, we often come up with several variants of the wording, testing them individually to see which generates the best results.

A multi-variant test with multiple variants of one variable or element.

This is a multi-variant test. It changes one thing–one variable–but provides a number of different variants of that element.
Now suppose we thought we could improve one of our landing pages by changing the “hero image” as well as the headline. We would test our original version against a new page that changed both the image and the headline.

An example of a multi-variable test. Here we are testing the control against a variation with two changes, or two variables.

This is a multi-variable test. The image is one variable and the headline is a second variable. Technically, this is an AB test with two variables changing. If Variation (B) generated more leads, we wouldn’t know whether the image or the headline was the bigger contributor to the increase in conversions.
To thoroughly test all combinations, we would want to produce multiple variations, each with a different combination of variants.

Two variables with two variants each yield four page variations in this multivariate testing example.

In the image above, we have four variations of the page, based on two variables (image and headline) each having two variants. Two variables times two variants each equals four variations.
Confused yet?
A multivariate test, then, is a test of multiple variants of multiple variables found on a page or website.
To expand on our example, we might want to find the right hero image and headline on our landing page. Here’s the control:

The Control in our multivariate test is the current page design.

We will propose two additional variants of the hero image, for a total of three variants including the control, and two additional variants of the headline, three including the control.
Here are the three images:

We want to vary the hero image on our page. This variable in our test has three variants.

Here are three headlines, including the existing one.

1. Debt Relief that Works
2. Free Yourself from the Burden of Debt
3. Get Relief from Debt

A true multivariate test will test all combinations. Given two variables with three variants each, we would expect nine possible combinations: three images x three headlines.
Here’s another example that will help you understand how variables, variants and variations relate. An ecommerce company believes that visitors are not completing their checkout process for any of three reasons:

1. The return policy is not visible
2. They are required to register for an account
3. They don’t have security trust symbols on the pages

While these all seem like reasonable things to place in a shopping cart, sometimes they can work against you. Providing this kind of information may make a page cluttered and increase abandonment of the checkout process.
The only way to know is to test.
How many variables do we have here? We have three: return policy, registration and security symbols.
How many variants do we have? We have two of each variable, one variant in which the item is shown and one variant in which it is not shown.
This is 2 x 2 x 2, or eight combinations. If we had three different security trust symbols to choose from, we would have four variants, three choices and none. That is 2 x 2 x 4, or sixteen combinations.
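This combination arithmetic is easy to check by enumerating every variant of every variable. Here is a minimal Python sketch; the variable and variant names simply label the example above:

```python
from itertools import product

# Each variable maps to its list of variants.
variables = {
    "return_policy": ["shown", "hidden"],
    "registration": ["required", "not required"],
    "trust_symbols": ["none", "symbol A", "symbol B", "symbol C"],
}

# A full-factorial multivariate test covers every combination of variants.
combinations = list(product(*variables.values()))
print(len(combinations))  # 2 x 2 x 4 = 16 page variations
```

Swap the trust-symbol variable back to a simple shown/hidden pair and the same code reports eight combinations.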
We’ll continue to use this example as we explore multivariate testing.

Why Multivariate Testing Isn’t Valuable In Most Scenarios

A multivariate test seeks to test every possible combination of variants for a website given one or more variables.
If we ran an MVT for our ecommerce checkout example above, it would look something like this:

Variations multiply with multivariate tests requiring more traffic and conversions.

There are many reasons that multivariate testing is often the wrong choice for a given business, but today, I’m going to focus on five. These are the five reasons MVTs are often not worth running compared to A/B/n tests:

1. A lack of time or traffic
2. Crazy (and crappy) combinations
3. Burning up precious resources
4. Missing out on the learning process
5. Failing to use MVT as a part of a system

Let’s take a closer look at each reason.

1. Multivariate Tests Take a Long Time or a Whole Lot of Traffic

Traffic to each variation is a small percentage of the overall traffic. This means that it takes longer to run an MVT. Lower traffic means it takes longer to reach statistical significance, and we can’t believe the data until we reach this magical place.
Statistical significance is the point at which we are confident that the results reported in a test will hold in the future: that winning variations will deliver more conversions and losing variations will deliver fewer conversions over time. Read 2 Questions That Will Make You A Statistically Significant Marketer or hear the audio.
Furthermore, statistical significance depends less on raw traffic than on the number of successful transactions you process.
For example, MXToolbox offers free tools for IT people who are managing email servers, DNS servers and more. They also offer paid plans with more advanced features. MXToolbox gets millions of visitors every month, and many of them purchase the paid plans. Even with millions of visits, they don’t have enough transactions to justify multivariate testing.
This is why MVTs can be done only on sites with a great deal of traffic and transactions. If not, the tests take a long time to run.

2. Variations Multiply Like Rabbits

As we saw, just three variables with two variants resulted in eight variations, and adding two more security trust symbols to the mix brought this to sixteen combinations. Traffic to each variation would be reduced to just 6.25%.
Multivariate testing tools, like VWO and Optimizely offer options to test a sample of combinations — called Partial, or Fractional Factorial testing — instead of testing them all, which is called Full Factorial testing. We won’t dive into the mathematics of Full Factorial and Partial Factorial tests. It gets a little messy. It’s sufficient to know that partial factorial (fractional factorial) testing may introduce inaccuracies that foil your tests.
What’s important is that more variations mean larger errors… because statistics.
Every time you add another variation to an AB test, you increase the margin of error for the test slightly. As a rule, Conversion Sciences allows no more than six variations for any AB test because the margin of error becomes a problem.
In an AB test with two variations, we may be able to reach statistical significance in two weeks and bank a 10% increase in conversions. However, in a test with six variations, we may have to run for four weeks before we can believe that the 10% lift is real. The margin of error is larger with six variations, requiring more time to reach statistical significance.
Now think about a multivariate test with dozens of variations. Larger and larger margins of error mean the need for even more traffic and some special calculations to ensure we can believe our results aren’t just random.
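To make that traffic cost concrete, here is a rough per-variation sample-size sketch using the standard normal approximation for comparing two conversion rates, with a Bonferroni correction standing in for the multiple-comparison penalty. The function, the correction choice, and the numbers are illustrative assumptions, not Conversion Sciences’ actual method:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variation(base_rate, lift, alpha=0.05, power=0.8, variations=2):
    """Rough visitors needed per variation to detect a relative lift.

    Uses the normal approximation for two proportions, with a Bonferroni
    correction for the number of comparisons against the control
    (an illustrative simplification, not any vendor's exact formula)."""
    comparisons = max(variations - 1, 1)
    z_alpha = NormalDist().inv_cdf(1 - (alpha / comparisons) / 2)
    z_beta = NormalDist().inv_cdf(power)
    p1 = base_rate
    p2 = base_rate * (1 + lift)
    p_bar = (p1 + p2) / 2
    delta = p2 - p1
    n = ((z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
          + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2) / delta ** 2
    return ceil(n)

# Detecting a 10% relative lift on a 3% conversion rate:
print(sample_size_per_variation(0.03, 0.10, variations=2))   # a simple AB test
print(sample_size_per_variation(0.03, 0.10, variations=16))  # a 16-arm MVT
```

Each of the sixteen MVT arms needs substantially more visitors than an AB arm, and that is before accounting for each arm receiving only one sixteenth of the traffic.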
Ultimately, most of these variations aren’t worth testing.
All eight variations in our example make sense together. As you add variations, however, you can end up with some crazy combinations.
Picture this:
It’s pouring down rain. You are camping with your son.
While huddled in your tent, you fire up your phone’s browser to find a place to stay. While flipping through your search results on Google, your son proclaims over your shoulder, “That one has a buffet! Let’s go there, Dad!”
Ugh.
The last time he ate at an all-you-can-eat buffet, he was stuck in the restroom for an hour. Not a pretty picture.
Then again, neither is staying out in the wretched weather. So you click to check out the site.
Something is off.
The website’s headline says, “All you can eat buffet.” But nothing else seems to match. The main picture shows two smiling people at the front desk, ready to check you in.

As you scroll to the bottom, the button reads “Book Your Massage Today”.
Is this some kind of joke?
As strange as this scenario sounds, one problem with MVTs is that you will get combinations like this example that simply don’t make sense.
This leaves you with two possibilities:

1. Start losing your customers to variations you should not even test (not recommended).
2. Spend some of your time making sure each variation makes sense together.

The second option will take more time and restrict your creativity. But even worse, now you need more traffic in order for your test to reach significance.
With an A/B/n test, you pick and choose which variations to include and which to exclude.
Some may argue it can be time-consuming to create each A/B/n variation while a multivariate test is an easy way to test all variations at once.
Think of a multivariate test as a system that automatically creates all possible combinations to help you find the best outcome. So on the surface, it sounds appealing.
But as you dig into what’s really going on, you may think twice before using an MVT.

3. Losing Variations are Expensive

Optimization testing can be fun. The chance of a breakthrough discovery that could make you thousands of dollars is quite appealing. Unfortunately, variations that underperform the Control reduce the number of completed transactions, and fewer transactions mean less revenue.
Every test — AB or Multivariate — has a built-in cost.
Ideally, we would let losing variations run their course. Statistically, there is a chance they will turn around and be big winners by the time we reach statistical significance. At Conversion Sciences, we monitor tests to see if any variations turn south. If a losing variation is costing us too many conversions, we’ll stop it before it reaches statistical significance. Stopping losers early has two benefits:

1. We can control the “cost” of an AB test.
2. We can direct more traffic to the other variations, meaning the test will take less time to reach significance.

When tests run faster, we can test more frequently.
On the other hand, multivariate tests run through all variations, or a large sample of variations. Losers run to statistical significance and this can be very expensive.
Lars Lofgren, former Director of Growth at KISSmetrics, mentioned that if a test drops below a 10% lift, you should kill it. Here’s why:
What would you rather have?

• A confirmed 5% winner that took 6 months to reach
• A 20% winner after cycling through 6-12 tests in that same 6 month period

Forget that 5% win, give me the 20%!
So the longer we let a test run, the higher our opportunity costs start to stack up. If we wait too long, we’re forgoing serious wins that we could have found by launching other tests.
If a test drops below a 10% lift, it’s now too small to matter. Kill it. Shut it down and move on to your next test.
Keeping track of all the MVT variations isn’t easy (and is time-consuming). But time spent on sub-par tests is not the only resource you lose.

4. It’s Harder to Learn from Multivariate Tests

Optimization works best when you learn why your customers behave the way they do. With an MVT, you may find the best-performing combination, but what have you learned?
When you run your tests all at one time, you miss out on understanding your audience.
Let’s take the example from the beginning of this article. Suppose our multivariate test reported that this was the winning combination:

If this combination wins, can we know why?

What can we deduce from this? Which element was most important to our visitors? The return policy? Removing the registration? Adding trust symbols?
And why does it matter?
For starters, it makes it easier to come up with good test hypotheses later on. If we knew that adding trust symbols was the biggest influence, we might decide to add even more trust symbols to the page. Unfortunately, we don’t know.
When you learn something from an experiment, you can apply that concept to other elements of your website. If we knew that the return policy was a major factor, we might try adding the return policy to all pages. We might even test adding the return policy to our promotional emails.
Testing is not just about finding more revenue. It is about understanding your visitors. This is a problem for multivariate tests.

5. Seeing What Sticks Is Not An Effective Testing System

Multivariate tests are seductive. They can tempt you into testing lots of things, just because you can. This isn’t really testing. It’s fishing. Throwing a bunch of ideas into a multivariate test means you’re testing a lot of unnecessary hypotheses.
Testing follows the Scientific Method:

1. Research the problem.
2. Develop hypotheses.
3. Select the most likely hypotheses.
4. Design experiments to test your hypotheses.
5. Run the experiment in a controlled environment.
6. Analyze the results.
7. Develop new hypotheses based on your learnings.

The danger of a multivariate test is that you skip steps 3, 4 and 7, that you:

1. Research the problem
2. Develop hypotheses.
3. Throw them into the MVT blender
4. See what happens.

The question is never what can you do, but what SHOULD you do.
Just because I can test a massive number of permutations does not mean that I am being efficient or getting the return on my efforts that I should. You can’t ignore the context of the output just to feel better about your results.

You will get a result no matter what you do, the trick is constantly getting better results for fewer resources.

When used with the scientific method, an A/B/n test can give you the direction you need to continually optimize your website.

Machine Learning and Multivariate Testing

Multivariate testing is now getting a hand from artificial intelligence. For decades, programs called neural networks have allowed computers to learn as they collect data, in some cases making decisions more accurately than humans while using less data. Until recently, these networks have only been practical for solving very specific kinds of problems.
Now, software maker Sentient has brought this kind of learning into the world of multivariate testing with its Ascend product. The approach uses an evolutionary, or genetic, algorithm: machine learning sorts through possible variations, selecting what to test so that we don’t have to test all combinations.
These evolutionary algorithms follow branches of patterns through the fabric of possible variations, learning which are most likely to lead to the highest converting combination. Poor performing branches are pruned in favor of more likely winners. Over time, the highest performer emerges and can be captured as the new control.
These algorithms also introduce mutations. Variants that were pruned away earlier are reintroduced into the combinations to see if they might be successful in better-performing combinations.
This organic approach promises results faster and with less traffic.
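Sentient’s exact algorithm isn’t public, but the pruning-and-mutation idea described above can be sketched as a minimal genetic algorithm. The fitness function below is a made-up stand-in for a measured conversion rate; in a real tool that signal would come from live traffic:

```python
import random

random.seed(7)

# Three variables, three variants each: 27 possible pages in total.
variants = {
    "image": ["A", "B", "C"],
    "headline": ["A", "B", "C"],
    "button": ["A", "B", "C"],
}

def fitness(page):
    # Stand-in for a noisy measured conversion rate; here variant "B"
    # is secretly best for every variable.
    return sum(1.0 if choice == "B" else random.uniform(0.0, 0.5)
               for choice in page.values())

def random_page():
    return {var: random.choice(options) for var, options in variants.items()}

def crossover(mom, dad):
    # A child page inherits each element from one of two parent pages.
    return {var: random.choice([mom[var], dad[var]]) for var in variants}

def mutate(page, rate=0.1):
    # Mutation occasionally reintroduces variants pruned in earlier rounds.
    return {var: random.choice(variants[var]) if random.random() < rate else choice
            for var, choice in page.items()}

population = [random_page() for _ in range(8)]
for generation in range(20):
    population.sort(key=fitness, reverse=True)
    survivors = population[:4]  # prune the poor performers
    children = [mutate(crossover(random.choice(survivors), random.choice(survivors)))
                for _ in range(4)]
    population = survivors + children

print(max(population, key=fitness))
```

With this stand-in fitness function, the population drifts toward the all-“B” combination over a few generations without ever enumerating all 27 pages, which is the traffic saving the approach promises.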

Evolutionary neural networks allow testing tools to learn what combinations will work without testing all multivariate combinations.

With machine learning, websites that had too little traffic for pure multivariate testing can seriously consider it as an option.

Final Thoughts: Is There Ever a Case For doing MVTs?

There are instances when introducing many variables is difficult to avoid, or is even the better approach.
Chris Goward of WiderFunnel gives four advantages to doing MVTs over A/B/n tests:

1. Easily isolate many small page elements and measure their individual effects on conversion rate
2. Measure interaction effects between independent elements to find compound effects
3. Follow a more conservative path of incremental conversion rate improvement
4. Facilitate interesting statistical analysis of interaction effects

He later admits, “At WiderFunnel, we run one Multivariate Test for every 8-10 A/B/n Test Rounds.”
Both methods are valuable learning tools.

It is a bit of a heated subject among optimization experts. I’d be curious to hear your ideas and experience on what matters most.

The Ultimate A/B Testing Guide: Everything You Need, All In One Place

Welcome to the ultimate A/B testing guide!

In this post, I’m going to cover everything you need to know about A/B testing (also referred to as “split” testing), from start to finish.

By the end of this guide, you’ll have a thorough understanding of the entire AB testing process and a framework for diving deeper into any topic you wish to further explore.

In addition to this guide, we’ve put together an intuitive 9-part course taking you through the fundamentals of conversion rate optimization. Complete the course, and we’ll review your website for free!

1. The Basic Components Of A/B Testing

AB testing, also referred to as “split” or “A/B/n” testing, is the process of testing multiple variations of a web page in order to identify higher-performing variations and improve the page’s conversion rate.

Over the last few years, AB testing has become “kind of a big deal”.

Online marketing tools have become more sophisticated and less expensive, making split testing a more accessible pursuit for small and mid-sized businesses. And with traffic becoming more expensive, the rate at which online businesses are able to convert incoming visitors is becoming more and more important.

The basic A/B testing process looks like this:

1. Make a hypothesis about one or two changes you think will improve the page’s conversion rate.
2. Create a variation or variations of that page with one change per variation.
3. Divide incoming traffic equally between each variation and the original page.
4. Run the test as long as it takes to acquire statistically significant findings.
5. If a page variation produces a statistically significant increase in page conversions, use it to replace the original page.
6. Repeat
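Step 3’s even split is usually implemented by deterministically bucketing each visitor, so a returning visitor keeps seeing the same version. A minimal sketch of that idea, assuming a string visitor ID; real testing tools layer cookies and traffic weighting on top of something like this:

```python
import hashlib

variations = ["control", "variation_b"]

def assign(visitor_id: str) -> str:
    # Hash-based bucketing: the same visitor always lands in the same
    # bucket, and traffic splits roughly evenly across variations.
    bucket = int(hashlib.md5(visitor_id.encode()).hexdigest(), 16)
    return variations[bucket % len(variations)]

counts = {v: 0 for v in variations}
for i in range(10000):
    counts[assign(f"visitor-{i}")] += 1
print(counts)  # roughly a 50/50 split
```

Adding a "variation_c" to the list turns the same sketch into an A/B/n split with no other changes.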

Have you ever heard the story of someone changing their button color from red to green and receiving a $5 million increase in sales that year?

As cool as that sounds, let’s be honest: it is not likely that either you or I will see this kind of win anytime soon. That said, one button tweak did result in $300 million in new revenue for one business, so it is possible.

AB testing is a scientific way of finding out whether the tweak that leads to a boost in conversions is actually significant, or just random fluctuation.

AB testing (AKA “split testing”) is the process of directing your traffic to two or more variations of a web page.

AB testing is pretty simple to understand:

A typical AB test uses AB testing software to divide traffic.

Our testing software is the “Moses” that splits our traffic for us. Additionally, you can choose to experiment with more variations than a simple AB test. These tests are called A/B/n tests, where “n” represents any number of new variations.

The goal of AB testing is to measure whether a variation results in more conversions.

So that could be an “A/B/C” test, an “A/B/C/D” test, and so on.

Here’s what an A/B/C test would look like:

The more variations we have in an AB test, the more we have to divide the traffic.

Even though the same traffic is sent to the Control and each Variation, a different number of visitors will typically complete their task — buy, signup, subscribe, etc. This is because many leave your site first.

We research our visitors to find out what might be making them leave before converting. These are our test hypotheses.

The primary point of an AB test is to discover what issues cause visitors to leave. The issues above are common to ecommerce websites. In this case we might create additional variations:

1. One that adds a return policy to the page.
2. One that removes the registration requirement.
3. One that adds trust symbols to the site.

By split testing these changes, we see if we can get more of these visitors to finish their purchase, to convert.

How do we know which issues might be causing visitors to leave? We research our visitors, look at analytics data, and make educated guesses, which we at Conversion Sciences call “hypotheses”.

In this example, adding a return policy performed best. Removing the registration requirement performed worse than the Control.

In the image above, the number of visitors that complete a transaction is shown. Based on this data, we would learn that adding a return policy and trust symbols would increase success over the Control or removing registration.

The page that added the return policy is our new Control. Our next test would very likely be to see what happens when we add trust symbols to this new Control. It is entirely possible that combining the two could actually reduce the conversion rate. So we test it.

Likewise, it is possible that removing the registration requirement would work well on the page with the return policy, our new Control. However, we may not test this combination.

With an AB test, we try each change in its own variation to isolate the specific issues, and we decide which combinations to test based on what we learn.

The goal of AB testing is to identify and verify changes that will increase a page’s overall conversion rate, whether those changes are minor or more involved.

I’m fond of saying that AB testing, or split testing, is the “Supreme Court” of data collection. An AB test gives us the most reliable information about a change to our site. It controls for a number of variables that can taint our data.

2. The Proven AB Testing Framework

Now that we have a feel for the tests themselves, we need to understand how these tests fit into the grand scheme of things.

There’s a reason we are able to get consistent results for our clients here at Conversion Sciences. It’s because we have a proven framework in place: a system that allows us to approach any website and methodically derive revenue-boosting insights.

Different businesses and agencies will have their own unique processes within this system, but any CRO agency worth its name will follow some variation of the following framework when conducting A/B testing.

For a closer look at each of these nine steps, check out our in-depth breakdown here: The Proven AB Testing Framework Used By CRO Professionals

3. The Critical Statistics Behind Split Testing

You don’t need to be a mathematician to run effective AB tests, but you do need a solid understanding of the statistics behind split testing.

An AB test is an example of statistical hypothesis testing, a process whereby a hypothesis is made about the relationship between two data sets and those data sets are then compared against each other to determine if there is a statistically significant relationship or not.

To put this in more practical terms, a prediction is made that Page Variation #B will perform better than Page Variation #A, and then data sets from both pages are observed and compared to determine if Page Variation #B is a statistically significant improvement over Page Variation #A.

That seems fairly straightforward, so where does it get complicated?

The complexities arise from all the ways a given “sample” can inaccurately represent the overall “population”, and all the things we have to do to ensure that our sample accurately represents the population.

Let’s define some terminology real quick.

Population and Variance

While it appears that one version is doing better than the other, the results overlap too much.

The “population” is the group we want information about. It’s the next 100,000 visitors in my previous example. When we’re testing a webpage, the true population is every future individual who will visit that page.

The “sample” is a small portion of the larger population. It’s the first 1,000 visitors we observe in my previous example.

In a perfect world, the sample would be 100% representative of the overall population.

For example:

Let’s say 10,000 out of those 100,000 visitors are going to ultimately convert into sales. Our true conversion rate would then be 10%.

In a tester’s perfect world, the mean (average) conversion rate of any sample we select from the population would always be identical to the population’s true conversion rate. In other words, if you selected a sample of 10 visitors, 1 of them (10%) would buy, and if you selected a sample of 100 visitors, then 10 would buy.

But that’s not how things work in real life.

In real life, you might have only 2 out of the first 100 buy or you might have 20… or even zero. You could have a single purchase from Monday through Friday and then 30 on Saturday.

This variability across samples is expressed by a measure called the “variance”, which describes how far a random sample is likely to differ from the true mean (average).

This variance across samples can derail our findings, which is why we have to employ statistically sound hypothesis testing in order to get accurate results.
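You can watch this sampling variance directly by simulating visitors drawn from a population whose true conversion rate is 10%. This is only a demonstration of the statistics, not data from any real site:

```python
import random

random.seed(42)
TRUE_RATE = 0.10  # the population's true conversion rate

def sample_conversion_rate(n_visitors):
    # Each simulated visitor converts with probability TRUE_RATE.
    conversions = sum(random.random() < TRUE_RATE for _ in range(n_visitors))
    return conversions / n_visitors

for n in (10, 100, 10000):
    rates = [sample_conversion_rate(n) for _ in range(5)]
    print(n, [round(r, 3) for r in rates])
```

Small samples swing wildly around the true 10%, while the 10,000-visitor samples cluster tightly near it. That shrinking spread is exactly what hypothesis testing has to account for.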


How AB Testing Eliminates Timing Issues

One alternative to AB testing is “serial” testing, or change-something-and-see-what-happens testing. I am a fan of serial testing, and you should make it a point to go and see how changes are affecting your revenue, subscriptions and leads.

There is a problem, however. If you make your change at the same time that a competitor starts an awesome promotion, you may see a drop in your conversion rates. You might blame your change when, in fact, the change in performance was an external market force.

AB testing controls for this.

In an AB test, the first visitor sees the original page, which we call the Control. This is the “A” in the term “AB test”. The next visitor sees a version of the page with the change that’s being tested. We call this a Treatment, or Variation. This is the “B” in the term AB test. We can also have a “C” and a “D” if we have enough traffic.

The next visitor sees the control, and the next the treatment. This goes on until enough people have seen each version to tell us which they like best. We call this statistical significance. Our software tracks these visitors across multiple visits and tells us which version of the page generated the most revenue or leads.

Since visitors come over the same time period, changes in the marketplace — like our competitor’s promotion — won’t affect our results. Both pages are served during the promotion, so there is no before-and-after error in the data.

Another way variance can express itself is in the way different types of traffic behave differently. Fortunately, you can eliminate this type of variance simply by segmenting traffic.

How Visitor Segmentation Controls For Variability

An AB test gathers data from real visitors and customers who are “voting” on our changes using their dollars, their contact information and their commitment to our offerings. If done correctly, the makeup of visitors should be the same for the control and each treatment.

This is important. Visitors that come to the site from an email may be more likely to convert to a customer. Visitors coming from organic search, however, may be early in their research, with not as many ready to buy.

If you sent email traffic to your control and search traffic to the treatment, it may appear that the control is a better implementation. In truth, it was the kind of traffic or traffic segment that resulted in the different performance.

By segmenting types of traffic and testing them separately, you can easily control for this variation and get a much better understanding of visitor behavior.

Why Statistical Significance Is Important

One of the most important concepts to understand when discussing AB testing is statistical significance, which is ultimately all about using large enough sample sizes when testing. There are many places where you can acquire a more technical understanding of this concept, so I’m going to attempt to illustrate it instead in layman’s terms.

Imagine flipping a coin 50 times. While from a probability perspective, we know there is a 50% chance of any given flip landing on heads, that doesn’t mean we will get 25 heads and 25 tails after 50 flips. In reality, we will probably see something like 23 heads and 27 tails or 28 heads and 22 tails.

Our results won’t match the probability because there is an element of chance to any test – an element of randomness that must be accounted for. As we flip more times, we decrease the effect this chance will have on our end results. The point at which we have decreased this element of chance to a satisfactory level is our point of statistical significance.

In the same way, when running an AB test on a web page, there is an element of chance involved. One variation might happen to receive more primed buyers than the other, or perhaps an isolated group of visitors happens to have a negative association with an image used on one page. These chance factors will skew your results if your sample size isn’t large enough.

While it appears that one version is doing better than the other, the results overlap too much.

It’s important not to conclude an AB test until you have reached statistically significant results. Here’s a handy tool to check if your sample sizes are large enough.
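Under the hood, such calculators typically run a two-proportion test. Here is a hedged sketch of the standard z-test; the traffic and conversion numbers are invented for illustration:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates,
    using the pooled normal approximation."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Control converted 500 of 10,000 visitors; the variation 580 of 10,000:
p = two_proportion_p_value(500, 10_000, 580, 10_000)
print(round(p, 4), "significant at 95%" if p < 0.05 else "not yet significant")
```

A p-value below 0.05 corresponds to the common 95% confidence threshold; smaller differences or smaller samples push the p-value up, which is why underpowered tests should keep running.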

For a closer look at the statistics behind A/B testing, check out this in-depth post: AB Testing Statistics: An Intuitive Guide For Non-Mathematicians

4. How To Conduct Pre-Test Research

Optimization boils down to understanding your visitors.

In order to succeed at A/B testing, we need to create variations that perform better for our visitors. In order to create those variations, we need to understand what visitors don’t like about our existing site and what they want instead.

In other words: we need research.

For a close look at each of these sections, check out our full writeup here: AB Testing Research: Do Your Conversion Homework

5. How To Create An A/B Testing Strategy

Once we’ve done our homework and identified both problem areas and opportunities for improvement on our site, it’s time to develop a core testing strategy.

An A/B testing strategy is essentially a lens through which we will approach test creation. It helps us prioritize and focus our efforts in the most productive direction possible.

There are 7 primary testing strategies that we use here at Conversion Sciences.

1. Gum Trampoline
2. Completion Optimization
3. Flow Optimization
4. Minesweeper
5. Big Rocks
6. Big Swings
7. Nuclear Option

Since there is little point in summarizing these, click here to read our breakdown of each strategy: The 7 Core Testing Strategies Essential To Optimization

6. “AB” & “Split” Testing Versus “Multivariate” Testing

While most marketers tend to use these terms interchangeably, there are a few differences to be aware of. While AB testing and split testing are the exact same thing, multivariate testing is slightly different.

AB and Split tests refer to tests that measure larger changes on a given page. For example, a company with a long-form landing page might AB test the page against a new short version to see how visitors respond. In another example, a business seeking to find the optimal squeeze page might design two pages around different lead magnets and compare them to see which converts best.

Multivariate testing, on the other hand, focuses on optimizing small, important elements of a webpage, like CTA copy, image placement, or button colors. Often, a multivariate test will test more than two options at a time to quickly identify outlying winners. For example, a company might run a multivariate test cycling 6 different button colors on its most important sales page. With high enough traffic, even a 0.5% increase in conversions can result in a significant revenue boost.

Multivariate testing works through all possible combinations.

While most websites can run meaningful split tests, multivariate tests are typically reserved for bigger sites, as they require a large amount of traffic to produce statistically significant results.
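To see why the traffic requirement grows so quickly, consider a sketch with hypothetical page elements. A full multivariate test serves every combination, and each combination needs its own statistically significant sample:

```python
from itertools import product

# Hypothetical elements under test; the names are illustrative only.
headlines = ["Save Time", "Save Money", "Free Trial"]
images = ["hero_photo", "product_shot"]
button_colors = ["orange", "green", "blue"]

# A full-factorial multivariate test serves every combination.
combinations = list(product(headlines, images, button_colors))
print(len(combinations))  # 3 x 2 x 3 = 18 distinct pages to test
```

If each page needs, say, 5,000 visitors for significance, those 18 combinations already demand 90,000 visitors, which is why multivariate testing is usually a high-traffic game.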

For a more in-depth look at multivariate testing, click here: Multivariate Testing: Promises and Pitfalls for High-Traffic Websites

7. How To Analyze Testing Results

After we’ve run our tests, it’s time to collect and analyze the results. My co-founder Joel Harvey explains how Conversion Sciences approaches post-test analysis below:

When you look at the results of an AB testing round, the first thing you need to look at is whether the test was a loser, a winner, or inconclusive.

Verify that the winners were indeed winners. Look at all the core criteria: statistical significance, p-value, test length, delta size, etc. If it checks out, then the next step is to show it to 100% of traffic and look for that real-world conversion lift.

In a perfect world you could just roll it out for 2 weeks and wait, but usually, you are jumping right into creating new hypotheses and running new tests, so you have to find a balance.

Once we’ve identified the winners, it’s important to dive into segments.

• Mobile versus non-mobile
• Paid versus unpaid
• Different browsers and devices
• Different traffic channels
• New versus returning visitors (important to set up and integrate this beforehand)

This is fairly easy to do with enterprise tools, but might require some more effort with less robust testing tools. It’s important to have a deep understanding of how tested pages performed with each segment. What’s the bounce rate? What’s the exit rate? Did we fundamentally change the way this segment is flowing through the funnel?

We want to look at this data in full, but it’s also good to remove outliers falling outside two standard deviations of the mean and re-evaluate the data.
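A minimal sketch of that two-standard-deviation trim, assuming a simple list of per-visitor order values, might look like this:

```python
from statistics import mean, stdev

def trim_outliers(values, num_sd=2):
    """Drop observations more than num_sd sample standard deviations
    from the mean, so the data can be re-evaluated without extremes."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= num_sd * s]

# Toy per-visitor order values with one extreme order skewing the mean.
order_values = [42, 38, 45, 40, 41, 39, 43, 400]
trimmed = trim_outliers(order_values)
print(mean(order_values), mean(trimmed))
```

The untrimmed mean is pulled far above the typical order by a single outlier; the trimmed mean better reflects how most visitors actually behaved.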

It’s also important to pay attention to lead quality. The longer the lead cycle, the more difficult this is. In a perfect world, you can integrate the CRM, but in reality, this often doesn’t work seamlessly.

For a more in-depth look at post test analysis, including insights from the CRO industry’s foremost experts, click here: 10 CRO Experts Explain How To Profitably Analyze AB Test Results

8. How AB Testing Tools Work

The tools that make AB testing possible provide an incredible amount of power. If we wanted, we could use them to make your website different for every single visitor. They can do this because they change your site in the visitors’ browsers.

When these tools are installed on your website, they send JavaScript code along with the HTML that defines a page. As the page is rendered, this JavaScript changes it. It can do almost anything:

• Change the headlines and text on the page.
• Hide images or copy.
• Move elements above the fold.

Primary Functions of AB Testing Tools

AB testing software has the following primary functions.

Serve Different Webpages to Visitors

The first job of AB testing tools is to show different webpages to certain visitors. The person who designed your test determines what gets shown.

An AB test will have a “control”, or the current page, and at least one “treatment”, or the page with some change. The design and development team will work together to create a different treatment. The JavaScript must be written to transform the control into the treatment.

It is important that the JavaScript work on all devices and in all browsers used by a site’s visitors. This requires a committed QA effort.

Conversion Sciences maintains a library of devices of varying ages that allows us to test our JavaScript for all visitors.

Split Traffic Evenly

Once we have JavaScript to display one or more treatments, our AB testing software must determine which visitors see the control and which see the treatments.

Typically, visitors are rotated through the versions: the first sees the control, the next sees the first treatment, the next sees the second treatment, and the fourth sees the control again. Around it goes until enough visitors have been tested to achieve statistical significance.

It is important that the number of visitors seeing each version be about the same. The software tries to enforce this.
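The round-robin rotation described above can be sketched as a simple cyclic assigner. Real tools typically use randomized or hash-based bucketing instead, but the goal of keeping the groups balanced is the same:

```python
from itertools import cycle

# Rotate each arriving visitor through the versions in order,
# keeping group sizes as equal as possible.
variations = ["control", "treatment_1", "treatment_2"]
assigner = cycle(variations)

visitors = [f"visitor_{i}" for i in range(7)]
assignments = {v: next(assigner) for v in visitors}
print(assignments)
```

With seven visitors and three versions, the counts come out 3/2/2, about as even as possible, which is what the splitting logic is meant to guarantee.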

Measure Results

The AB testing software tracks results by monitoring goals. Goals can be any of a number of measurable things:

1. Products bought by each visitor and the amount paid
2. Subscriptions and signups completed by visitors
3. Forms completed by visitors

Almost anything can be measured, but the most important are business-building metrics such as purchases, subscriptions and leads generated.

The software remembers which test page was seen. It calculates the amount of revenue generated by those who saw the control, by those who saw treatment one, and so on.

At the end of the test, we can answer one very important question: which page generated the most revenue, subscriptions or leads? If one of the treatments wins, it becomes the new control.

And the process starts over.

Do Statistical Analysis

The tools are always calculating the confidence that a result will predict the future. We don’t trust any test that doesn’t have at least a 95% confidence level. This means that we are 95% confident that a new change will generate more revenue, subscriptions or leads.
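One common way to compute that confidence is a two-proportion z-test. The sketch below shows the textbook pooled version; actual tools may use different methods (Bayesian or sequential approaches, for example):

```python
from math import erf, sqrt

def confidence(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-sided confidence that B's conversion rate truly differs
    from A's, via a pooled two-proportion z-test."""
    pa = conversions_a / visitors_a
    pb = conversions_b / visitors_b
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = abs(pb - pa) / se
    return erf(z / sqrt(2))  # probability mass within +/- z

# 3.0% vs 3.9% conversion over 5,000 visitors per variation:
print(round(confidence(150, 5000, 195, 5000), 3))
```

A result above 0.95 corresponds to the 95% confidence threshold mentioned above; the same lift measured over fewer visitors would fall short of it.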

Sometimes it’s hard to wait for statistical significance, but waiting is important, lest we make the wrong decision and reduce the website’s conversion rate.

Report Results

Finally, the software communicates results to us. These come as graphs and statistics.

AB Testing Tools deliver data in the form of graphs and statistics.

It’s easy to see that the treatment won this test, giving us an estimated 90.9% lift in revenue per visitor with a 98% confidence.

This is a rather large win for this client.

Selecting The Right Tools

Of course, there are a lot of A/B testing tools out there, with new versions hitting the market every year. While there are certainly some industry favorites, the tools you select should come down to what your specific business requires.

In order to help make the selection process easier, we reached out to our network of CRO specialists and put together a list of the top-rated tools in the industry. We rely on these tools to perform for multi-million dollar clients and campaigns, and we are confident they will perform for you as well.

Check out the full list of tools here: The 20 Most Recommended AB Testing Tools By Leading CRO Experts

9. How To Build An A/B Testing Team

The members of a CRO team.

Conversion Sciences offers a complete turnkey team for testing. Every team that will use these tools must have competent people in the following roles, and we recommend you follow suit in building your own teams.

Data Analyst

The data analyst looks at the data being collected by analytics tools, user experience tools, and information collected by the website owners. From this she begins developing ideas, or hypotheses, for why a site doesn’t have a higher conversion rate.

The data analyst is responsible for designing tests that prove or disprove a hypothesis. Once the test is designed, she hands it off to the designer and developer for implementation.

Designer

The designer is responsible for designing new components for the site. These may range from a button with a different call to action to a landing page completely redesigned for conversion.

The designer must be experienced enough to carefully design the changes we are testing. We want to change the element we are testing and nothing else.

Developer

Our developers are very good at creating JavaScript that manipulates a page without breaking anything. They are experienced enough to write JavaScript that will run successfully on a variety of devices, operating systems and browsers.

QA Tech

The last thing we want to do is break a commercial website. This can result in lost revenue and invalidate our tests. A good quality assurance person checks the JavaScript and design work to ensure it works on all relevant devices, operating systems and browsers.

Getting Started on AB Testing

Conversion Sciences invites all businesses to work AB testing into their marketing mix. You can start by working with us and then move the effort in-house.

Get started with our 180-day Conversion Catalyst program, a process designed to get you started AND pay for itself with newly discovered revenue.

AB Testing Research: Do Your Conversion Homework

If we had to pick one thing that has made Conversion Sciences a successful AB testing agency, it would be this: We are very good at picking what to test.
This isn’t because the team is made up of geniuses (except you Brian, we all know you’re a genius). It’s because we have a consistent methodology for conducting AB testing research. In other words, we do our homework.
Like we talked about in our rundown of the best AB testing tools, “your AB tests are only as good as the hypotheses you are testing.”
With the proper research, we can consistently make better hypotheses, leading to more profitable testing results and a better experience for our visitors.

100 Million Neurons in Our Guts

Would you believe there are 100 million neurons in the human gut? This concentration is second only to our brains, even prompting scientists to refer to it as our “second brain”. While we don’t use our gut to make conscious decisions, it can greatly influence our mental state and is likely the reason we have “gut reactions” or “gut feelings”.
There are times when “going with your gut” makes sense. That time is when you don’t have any other options. If there is no information available to you, your gut may be a good second opinion for your brain.
On the web, there is rarely a need to go with your gut. Let’s redefine these terms.
Whenever someone says “My gut reaction is…” you should hear, “I don’t really know. Let’s do some more research.”
Whenever someone says, “I have a gut feeling that…” you should hear, “I don’t have enough information. How can we better inform ourselves before making this decision?”
We are living in a golden age of digital marketing information. With such easy access to research methods, there is no good reason to ever go from the gut on web design, copywriting, value proposition, etc. You don’t need your intestines to design your website.
After all, the primary output of a healthy gut is… well… crap.
What research skills can keep you from resorting to your colon for inspiration? To answer that question, we worked with KlientBoost to capture many of the key AB testing research methods and enjoy the satisfying feeling of winning AB tests.

Research + Framework = Growth

AB testing research feeds an AB testing framework for test results that are consistently positive and repeatable. Feed it well, and it will poop out revenue growth month after month. This is the only resemblance to your gut I could think of.
It doesn’t make sense to test an idea without some evidence that it will make a difference. Good research is full of nutrients, vitamins and fiber. And that is the last time I’ll refer to the digestive system in this article.

The Heart of AB Testing Research

Before we get into the details, it’s important to understand the core of testing research, and ultimately, the core of conversion optimization itself:

“The definition of optimization boils down to understanding your visitors.” – Brian Massey

Optimization is just a fancy word for bettering our understanding of our customers and giving them more of what they want.
Behavioral data is the best, most reliable research you can get. With it, we can eliminate tripping points and optimize the experience. We find this in our analytics databases. But much of our AB testing research will not be behavioral, and that is fine.
In general, there are two types of research:

1. Quantitative Research
2. Qualitative Research

Understanding Quantitative Research

Quantitative data is generated from large sample sizes. Quantitative data tells us how large numbers of visitors and potential visitors behave. It’s generated from analytics databases (like Google Analytics), trials, and AB tests.
The primary goal of evaluating quantitative data is to find where the weak points are in our funnel. The data gives us objective specifics to research further.
There are a few different types of quantitative data we’ll want to collect and review:

• Backend analytics
• Transactional data
• User intelligence

Understanding Qualitative Research

Qualitative data is generated from individuals or small groups. It is collected through heuristic analysis, surveys, focus groups, phone or chat transcripts, and customer reviews.
Qualitative data can uncover the feelings your users experience as they view a landing page and the motivations behind how they interact with your website.
Qualitative data is often self-reported data, and is thus suspect. Humans are good at making up rationalizations for how they behave in a situation. However, it is a great source of test hypotheses that can’t be discerned from quantitative behavioral data.
While quantitative data tells us what is happening in our funnel, qualitative data can tell us why visitors are behaving a certain way, giving us a better understanding of what we should test.
There are a number of tools we can use to obtain this information:

• Surveys and other direct feedback
• Customer service transcripts
• Interviews with sales and customer service reps
• Session Recording

Usability and User Experience

Two of our key objectives in going through all this data are to evaluate our website’s Usability and User Experience.
Usability deals with how easy it is for someone to learn and use the functions of our site. If we can make any part of that customer journey easier or more intuitive, we are increasing Usability.
User Experience deals with the emotions and attitudes users experience as they use our site. If we can make the customer journey more enjoyable, we are improving User Experience.
While these two concepts often go hand in hand, they are not the same, and both need to be kept in mind when collecting data.

The Importance of Segmentation

It’s not enough to simply know that “visitors” are doing ____ when they visit a given webpage or flow through a given funnel.
Which visitors?

• Are they on mobile or desktop?
• Are they here via paid ads or organic search?
• Are they using Chrome or Firefox?
• Did they click-through via a Facebook post or a Tweet?
• Are they a new or returning visitor?

In order to properly understand and evaluate our visitors and customers, we divide them into strategic segments and examine the differences across each one.
It’s especially important to know what these key segments are before we run our AB tests, because otherwise, our tests won’t tell us anything about them.
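As a toy illustration (the field names are made up), segment-level conversion rates can be computed from raw visit records like this:

```python
from collections import defaultdict

# Toy visit records; in practice these come from your analytics tool.
visits = [
    {"device": "mobile", "source": "paid", "converted": True},
    {"device": "mobile", "source": "organic", "converted": False},
    {"device": "desktop", "source": "paid", "converted": True},
    {"device": "desktop", "source": "organic", "converted": True},
    {"device": "mobile", "source": "paid", "converted": False},
]

def conversion_by_segment(visits, key):
    """Group visits by the given segment key and return each
    segment's conversion rate."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [conversions, visits]
    for v in visits:
        totals[v[key]][1] += 1
        totals[v[key]][0] += v["converted"]
    return {seg: conv / n for seg, (conv, n) in totals.items()}

print(conversion_by_segment(visits, "device"))
```

A site-wide rate would hide the fact that, in this toy data, desktop converts far better than mobile, exactly the kind of difference segmentation is meant to surface.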

Follow the Proven System A/B Testing Agencies Use

AB testing research is a fundamental part of any proven CRO framework, and it’s an important part of what separates an ROI-generating A/B testing agency from a waste of money.
As you finish up the year and move into 2017, it’s time to take things up a notch. In 2016, hundreds of businesses drastically improved their bottom lines via a proven, systematic CRO process.
Why not join the party? Take our free gift and click here to schedule a call with one of our CRO professionals.

References

Think Twice: How the Gut’s “Second Brain” Influences Mood and Well-Being

8 Ecommerce Testing Examples You Should Have Tried Already

Preston Pilgrim presents 8 successful ecommerce testing examples, highlighting some fairly easy-to-implement wins. This is the type of stuff you should probably have tested already on your site. If you haven’t tried these, there’s no time like the present.

Ecommerce and CRO are the ultimate match: Lots of moving parts. High traffic volume. When it comes to ROI, the sky’s the limit.
The following examples provide a great overview of what success can look like when you are executing on a proven AB testing framework.

1. Intuit Increases Conversions With Proactive Live Chat

Intuit Enterprise introduced proactive chat in various spots on their website. A small pop-up window with a call to action stating, “Chat now with a QuickBooks Sales Consultant” allowed potential buyers to instantly gain answers to their questions, clearing away roadblocks on the path to a sale.
Adding chat to the checkout process resulted in a 20% increase in conversions and a 43% increase in average order value.

Intuit proactively offers live chat.

Most impressively, when Intuit added proactive chat to the product comparison page, sales increased by 211%.

Proactively offering live chat increased sales for Intuit by 211%

THE LESSON: When you’re thinking about adding chat, think about the areas where your customers are most likely to have questions. This may be when comparing products or when choosing payment options. Put some thought into the right placement and watch the conversions skyrocket.
On a cautionary note, displaying a “live chat” that isn’t actually live can sometimes alienate visitors. While it might work for certain audiences, for others it can come across as misleading. If you are going to display a “live chat” without actually offering live chat services, first verify that it isn’t alienating your visitors.

2. SmileyCookie Increases Revenue Per Visitor By Testing Value Propositions

SmileyCookie is a niche ecommerce store that sells cookies and gourmet gift baskets. While the site was getting a fairly solid conversion rate, they decided to spend some time optimizing the header bar that was normally used for seasonal promotions.
This is a great place to test your value propositions.

They tested a number of different value propositions, including the following:

• Order Today -> Ships Next Business Day
• \$6.99 Ground Shipping For Your Entire Order
• FREE SHIPPING on any order over \$40

Tests included focusing on sales, pricing and value.

It turns out that immediacy was an important value to the largest segment of their visitors. The winning variation “Order Today -> Ships Next Business Day” delivered an impressive 12.61% conversion rate at 95% confidence. Even more importantly, the success of this test brought in an additional \$1.01 per visitor for SmileyCookie.
THE LESSON: The winning variation gives us some insight into what many ecommerce shoppers are looking for: fast shipping & handling. Any time you can lower uncertainty and solidify expectations, conversions tend to improve.
But an even more important takeaway is SmileyCookie’s investment in testing 5 different options for a large impact area like the site promotional header. No matter how good your predictions are, you will almost always get better results when you are able to test multiple options as opposed to just two.

3. Express Watches Boosts Conversions With Trust Building

Express is a UK company that sells Seiko watches. It’s an industry where buyers have legitimate concerns about purchasing online. What if the watch I buy is a fake?
In answer to this, Express Watches A/B tested placing a “price guarantee” stamp or an “authorized dealer” stamp on the product page.

If you’re buying a high-end watch, is price really the most important issue? ExpressWatches questioned “Never Beaten on Price”

Replacing “Never Beaten on Price” with a Seiko authorized dealer badge “borrows” brand trust from Seiko.

This stamp of “authenticity” garnered a sizable 107% jump in sales.
THE LESSON: If you’re operating in an industry with fraud risk, proving your authenticity can go a long way. In this case, Express Watches “borrowed” trust by tapping into Seiko’s brand authority. What association logos, media logos, and consumer review logos do you qualify to display?
While this split test worked, it’s not a guarantee it will work in all industries. You will need to run your own split tests, try different variations of how you display authenticity and see how it affects your conversions.
This most likely wouldn’t have a significant effect on conversions when dealing with lower priced “commodity” items, but with larger purchases, this type of split test can go a long way!

4. SAP BusinessObjects Increases Conversions With Prominent CTA Button

SAP BusinessObjects replaced its original small blue text “add to cart” link with a large, orange button.
Original:

Where would you click to get a Crystal Reports trial?

Alternate version:

BusinessObjects made the call to action the most visually prominent element on the page.

And conversions increased by 32%!
THE LESSON: Make your call to action count by making it visually prominent. Don’t make customers hunt for your “buy” button. Make it front and center and easy to click through and the sales will naturally increase.
I also want to note that just because you make something bigger and more prominent, it doesn’t mean it will convert better. In some cases, decreasing the size of the button or CTA has had a positive impact on conversions. The main takeaway from this case study is to always split test your main CTAs and buttons. Good designers know how to make something visually prominent using several levers:

1. Button size: bigger doesn’t always mean more visible.
2. Button placement: place it so that the eye is drawn to it.
3. White space: can make a call to action stand out.
4. Button color: should be different from the page template.
5. Button text: should echo the value proposition.

5. Horloges Increases Average Order Value With Guarantee

Horloges.nl is a watch dealer in the Netherlands. Their banner originally had information about overnight shipping (“Morgen al in huis”), free shipping (“Gratis verzending”) and their status as an “official” G-Shock dealer.

Horloges.nl changed their banner to make it smaller, and include a 2-year guarantee on watches.

This change caused the average order value to increase by 6%, and total conversions improved by 41%.
THE LESSON: Adding a guarantee is an excellent way to increase new customers’ comfort level in purchasing from a new vendor. Consider changing banner ads to make them simpler and easier to read in order to increase their efficacy.
This test changed several variables: the offers, the layout, and the font styling. When you test, start with guarantees and once you find one that works, test out how to present that guarantee on your page. Where can you place it? Where does it stand out? Should you only put it on products pages? These are all things you might want to split test.

6. MALL.CZ Increases Conversions With Larger Images

MALL.CZ is the second largest ecommerce retailer in the Czech Republic. Much like Amazon, they sell a wide variety of products, including kitchen supplies and electronics. Product images are an important component in their purchase process, and the company was curious how increasing image size might influence sales.
MALL.CZ’s original product descriptions emphasized lengthy copy and had a smaller product photo.

The category page for Mall.cz may be difficult to scan with competing images, buttons and badges.

MALL.CZ then tried altering the descriptions to emphasize a large product image above the text:

Increasing image size made the category page visually easier to scan.

This variation resulted in a 9.46% increase in sales.
THE LESSON: It can pay off to play around with image sizes and layout. There is some evidence to show that in other circumstances, larger images can actually deter sales, so clearly the issue is product/industry dependent. But if you’re looking for a factor to experiment with, image size has the potential for big rewards.
Aside from image size, playing around with different images in general is another great idea for running split tests.
Option 1: Split test image size, find out what converts at a higher rate.
Option 2: Split test different product images and angles and see what converts higher.
Option 3: Test a 3-D, click-to-rotate type image display, where viewers can get a 360 degree look at the product.

7. Express Watches Further Improves Conversions With Social Proof

Few things reassure a buyer like a positive review from a fellow customer. Knowing that other people have purchased the product and had a good experience with both the product and the retailer is a large factor in influencing customers to click “add to cart”.
Express Watches conducted a customer survey and discovered that their customers wanted to know information about price comparison, if the company was trustworthy, and whether the watch would be genuine and not “fake”.

Is a five-star rating enough without the actual reviews on this product page?

Express Watches tested this more aggressive presentation of reviews on their product page.

Express decided to answer those questions by adding a Customer Reviews widget on their product pages. After adding the reviews widget, Express Watches saw a 58.39% increase in sales.
THE LESSON: Social proof is a proven way to overcome visitor objections and move the sale closer to checkout. Clear roadblocks by decreasing customers’ worries and adding symbols of trustworthiness.
When you think about the logic of the buyer, it makes sense. When you’re injured or sick and need to see a doctor or therapist, you’ll most likely ask close relatives or friends whether they can recommend one they’ve had success with in the past. If they give you a recommendation with positive feedback, you’ll most likely choose that doctor over any other in the area. The takeaway: positive feedback can increase conversions tremendously.
Just realize that if you’re showing your reviews to the public, too many negative reviews can lower your conversion rate. At the same time, having a 100% positive review rate will also lower conversions, as this tends to be viewed as being manipulated in some way. Negative reviews are an important part of visitor research.
The best review profile looks like what you’d expect to see if a bunch of people tried out a great product. Most would love it. Some would be unimpressed. And one or two would trash it out of spite.

8. Corkscrew Wines Increases Sales With Prominent Discounts

What good is a sale if you’re not advertising it enough? The whole point of launching a sale is to move product, and in order to do that, customers need to be aware that there’s a deal out there that can’t be missed.
Corkscrew Wines experimented with adding sale information in a red circle, front and center over the product image.

The price is discounted on this product page, but the visitor has to do the math.

If you’re offering a discount, don’t hide it.

The result? A whopping 148.3% increase in sales. Both product description pages showed the same price, but the second simply highlighted the savings.
THE LESSON: Let people know when they’re saving money. They’ll want to buy more!
Again, there are many different ways to split test this. In this particular case, they display the discount on the bottle image and in the title. Play around with this: try different locations, larger images, and colors that stand out.

Conclusion

Hopefully, today’s ecommerce testing examples have provided you with some pointers for your next batch of split tests.
Remember that your customers ultimately hold the key to increased sales, and any worthwhile AB testing framework starts with getting a thorough understanding of how they are engaging with your site and what they are feeling in the process.
For further reading, check out Conversion Sciences’ rundown of The 7 Core AB Testing Strategies Fundamental to CRO Success.

Preston Pilgrim is a marketer at Acro Media, a digital agency focused on optimizing Drupal point of sale product pages, contact pages, homepages, and more. Learn more from Preston’s expertise via the Acro Media blog.

The 7 Core A/B Testing Strategies That Are Fundamentally Essential To CRO Success

One of these A/B testing strategies is right for your website, and will lead to bigger wins faster.

We have used analysis and testing to find significant increases in revenue and leads for hundreds of companies. For each one, we fashion a unique AB testing strategy defining where to start and what to test.

However, we virtually always build that unique testing strategy on one of seven core strategies that I consider fundamental to CRO success.

If you are beginning your own split testing or conversion optimization process, this is your guide to AB testing strategies. For each of these seven strategies, I’m going to show you:

1. When to use it
2. Where on the site to test it
3. What to test
4. Pitfalls to avoid
5. A real-life example

Let’s get started.

1. Gum Trampoline

We employ the gum trampoline approach when bounce rates are high, especially from new visitors. The bounce rate is the percentage of visitors who land on a site and leave after only a few seconds, typically seeing just one page.

As the name implies, we want to use these AB testing strategies to slow the bouncing behavior, like putting gum on a trampoline.

We want more visitors to stick to our site and not bounce.

When to Use It

You have a high bounce rate on your entry pages. This approach is especially important if your paid traffic (PPC or display ads) is not converting.

You have run out of paid traffic for a given set of keywords.

Where to Test

Most of your attention will be focused on landing pages. For lead generation, these may be dedicated landing pages. For ecommerce sites, these may be category pages or product pages.

What to Test

The key components of any landing page include:

1. The headline that sets the visitor’s expectation and delivers your value proposition.
2. The form that allows the visitor to take action. This may be just a button.
3. The proof you use on the page that it’s a good decision.
4. The trust you build, especially from third-party sources.
5. The images you use to show the product or service. Always have relevant images.

Be Careful

Reducing bounce rate can increase leads and revenue. However, it can also increase the number of unqualified visitors entering the site or becoming prospects.

Example

In the following example, there is a disconnect between the expectation set by the advertisement (left side) and the landing page visitors see when they click on the ad (right side).

Paid ads are often a fantastic tool for bringing in qualified traffic, but if the landing page isn’t matched to the ad, visitors are likely to immediately bounce from the page rather than attempting to hunt for the treasure promised in the ad.

In order to apply gum to this trampoline, Zumba would need to take ad click-throughs to a page featuring “The New Wonderland Collection”, preferably with the same model used in the ads. The landing page needs to be designed specifically for the type of user who would be intrigued by the ad.

2. Completion Optimization

The Completion strategy begins testing at the call to action. For a lead-generation site, the Completion strategy will begin with the action page or signup process. For an ecommerce site, we start with the checkout process.

When to Use It

The Completion strategy is used for sites that have a high number of transactions and want to decrease the abandonment rate. The abandonment rate is the percentage of visitors who start a checkout or registration process, but don’t complete it. They abandon the process before they’re done.
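As a quick illustration, the abandonment rate is simply the share of started checkouts that never finish. The funnel counts below are invented for the sketch:

```python
# Hypothetical funnel counts; abandonment rate = 1 - completions / starts.
checkout_starts = 1_800    # visitors who began the checkout process
completed_orders = 1_150   # visitors who finished it

abandonment_rate = 1 - completed_orders / checkout_starts
print(f"abandonment rate: {abandonment_rate:.1%}")
```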

Where to Test

This strategy starts at the end of the funnel, in the shopping cart or registration process.

What to Test

There are lots of things that could be impacting your abandonment rate.

• Do you need to build trust with credit logos, security logos, testimonials or proof points?
• Are you showing the cart contents on every step?
• Do you require the visitor to create an account to purchase?
• Do your visitors prefer a one-step checkout or a multi-step checkout?
• Are you asking for unnecessary information?

Once you have reduced the abandonment rates, you can begin testing further upstream, to get more visitors into your optimized purchase or signup process.

Be Careful

Testing in the cart can be very expensive. Any test treatments that underperform the control are costing you real leads and sales. Also, cart abandonment often has its roots further upstream. Pages on your site that make false promises or leave out key information may be causing your abandonment rates to rise.

For example, if you don’t talk about shipping fees before checkout, you may have lots of people starting the purchase process just to find out what your shipping fees are.

Example

As we’ve talked about before, best practices are essentially guesses in CRO. We know, as a general rule, that lowering checkout friction tends to improve conversion rates and lower abandonment. But sometimes, it’s actually perceived friction that impacts the checkout experience above and beyond the real level of friction.

For example, one of our clients upgraded their website and checkout experience in accordance with best practices.

• The process was reduced from multiple steps to a single step.
• The order is shown, including the product images.
• The “Risk-free Guarantee” at the top and “Doctor Trusted” bug on the right reinforce the purchase.
• Trust symbols are placed near the call-to action button.
• All costs have been addressed, including shipping and taxes.

The new checkout process should have performed better, yet it ended up having a significantly higher bounce rate than the previous checkout process.

Why?

After looking at the previous checkout experience, we realized that despite it actually requiring more steps (and clicks) on the part of the user, the process was broken up in such a way that the user perceived less friction along the way. Information was hidden behind each step, so the user never ultimately felt the friction.

Step #1:

Paypal payment method step 1

Step #2:

Paypal billing information

This is just one of many reasons running AB tests is mandatory, and it’s also a good example of how beneficial it can be for certain businesses to start with the checkout process, as dictated by the Completion strategy.

3. Flow Optimization

The Flow approach is essentially the opposite of the Completion strategy. With this strategy, you’re trying to get more visitors into the purchase process before you start optimizing the checkout or registration process.

When to Use It

This strategy is typically best for sites with fewer transactions. The goal is to increase visits to the cart or registration process before we start Completion testing at the bottom of the funnel.

Where to Test

Testing starts on entry pages, the pages on which visitors enter the site. This will typically include the home page and landing pages for lead-generating sites. For ecommerce sites, category pages and product pages get intense scrutiny to increase Add to Cart actions.

What to Test

With this strategy, we are most often trying to understand what is missing from the product or service presentation.

• What questions are going unanswered?
• What objections aren’t being addressed?
• What information isn’t presented that visitors need?
• Is the pricing wrong for the value presented?

We will test headlines, copy, images and calls to action when we begin the Flow strategy.

Be Careful

Even though we aren’t optimizing the checkout or registration process, avoid testing clicks or engagement metrics. Always use purchases or leads generated as the primary metric in your tests. It’s too easy to get unqualified visitors to add something to cart only to see abandonment rates skyrocket.

Example

Businesses that benefit from the Flow strategy typically need to take a fresh look at their central value proposition on poorly converting landing pages.

For example, when Groove decided its 2.3% homepage conversion rate wasn’t going to cut it anymore, it began the optimization process by revamping its value proposition. The existing page was very bland, with a stock photo and a weak headline that didn’t do anything to address the benefits of the service.

Groove SaaS and eCommerce Customer Support Value Proposition

The new page included a benefits-driven headline and a well-produced video of a real customer describing his positive experience with Groove. As a result, the page revamp more than doubled homepage conversions.

Groove created a ‘copy first’ landing page based on feedback from customers

The point here is that fixing your checkout process isn’t going to do you a ton of good if you aren’t getting a whole lot of people there in the first place. If initial conversions are low, it’s better to start with optimizing your core value proposition than to go fishing for problems on the back end of your funnel.

4. Minesweeper

Minesweeper optimization strategies use clues from several tests to determine where additional revenue might be hiding.

Some sites are like the Minesweeper game that has shipped with Windows operating systems for decades. In the game you hunt for open squares and avoid mines. The location of mines is hinted at by numbered squares.

In this game, you don’t know where to look until you start playing. But it’s not random. This is like an exploratory testing strategy.

When to Use It

This testing strategy is for sites that seem to be working against the visitor at every turn. We see this when visit lengths are low or people leave products in the cart at high rates. Use it when things are broken all over the site, then dive into one of the other strategies.

As testing progresses, we get clues about what is really keeping visitors from completing a transaction. The picture slowly resolves as we collect data from around the site.

Where to Test

This strategy starts on the pages where the data says the problems lie.

What to Test

By its nature, it is hard to generalize about this testing strategy. As an example, we may believe that people are having trouble finding the solution or product they are looking for. Issues related to findability, or “discoverability” may include navigation tests, site search fixes, and changes to categories or category names.

Be Careful

This is our least-often used strategy. It is too scattershot to be used frequently. We prefer the data to lead us down tunnels where we mine veins of gold.

However, this is the most common optimization strategy among inexperienced marketers. It is one of the reasons that conversion projects get abandoned. The random nature of this approach means that there will be many tests that don’t help much and few big wins.

Example

You wouldn’t expect a company pulling in $2.1 billion in annual revenue to have major breaks in its website, yet that’s exactly what I discovered a few years back while attempting to make a purchase from Fry’s Electronics. Whenever I selected the “In-store Pickup” option, I was taken to the following error screen.

This is one of the most important buttons on the site, doubly so near Christmas when shipping gifts becomes an iffy proposition. Even worse, errors like this often aren’t isolated.

While finding a major error like this doesn’t necessarily mean you need to begin the Minesweeper optimization strategy, it’s always important to fix broken pieces of a site before you even begin to look at optimization strategies.

5. Big Rocks

Adding new features — “big rocks” — to a site can fundamentally change its effectiveness.

Almost every site has a primary issue. After analysis, you will see that there are questions about authority and credibility that go unanswered. You might find that issues with the layout are keeping many visitors from taking action.

The Big Rocks testing strategy adds fundamental components to the site in an effort to give visitors what they are looking for.

When to Use It

This strategy is used for sites that have a long history of optimization and ample evidence that an important component is missing.

Where to Test

These tests are usually site-wide. They involve adding fundamental features to the site.

What to Test

Some examples of big rocks include:

• Ratings and Reviews for ecommerce sites
• Live Chat
• Product Demo Videos
• Faceted Site Search
• Recommendation Engines
• Progressive Forms
• Exit-intent Popovers

Be Careful

These tools are difficult to test. Once implemented, they cannot be easily removed from the site. Be sure you have evidence from your visitors that they want the rock. Don’t believe the claims of higher conversions made by the Big Rock company salespeople. Your audience is different.

Example

A good example of the Big Rocks strategy in action comes from Adore Me, a millennial-targeted lingerie retailer that catapulted its sales by installing Yotpo’s social-based review system. The company was relying primarily on email and phone for customer feedback and identified ratings and user reviews as its “big rock” to target.

The revamped customer engagement system helped spawn tens of thousands of new reviews and also facilitated a flood of user-generated content on sites like Instagram without Adore Me even having to purchase Instagram ads. Type in #AdoreMe and you’ll find thousands of unsponsored user-generated posts like these:

This is a great example of how certain AB testing strategies can help facilitate different business models. The key is identifying the big opportunities and then focusing on creating real, engaging solutions in those areas.

6. Big Swings

Taking big swings can lead to home runs, but can also obscure the reasons for wins.

A “Big Swing” is any test that changes more than one variable and often changes several. It’s called a big swing because it’s when we swing for the fences with a redesigned page.

When to Use It

Like the Big Rock strategy, this strategy is most often used on a site that has a mature conversion optimization program. When we begin to find the local maxima for a site, it gets harder to find winning hypotheses. If evidence suggests that a fundamental change is needed, we’ll take a big swing and completely redesign a page or set of pages based on what we’ve learned.

Sometimes we start with a Big Swing if we feel that the value proposition for a site is fundamentally broken.

Where to Test

We often take big swings on key entry pages such as the home page or landing pages. For ecommerce sites, you may want to try redesigning the product page template for your site.

What to Test

Big Swings are often related to layout and messaging. All at once, several things may change on a page:

• Copy
• Images
• Layout
• Design Style
• Calls to Action

Be Careful

Big swings don’t tell you much about your audience. When you change more than one thing, the changes can offset each other. Perhaps making the headline bigger increased the conversion rate on a page, but the new image decreased it. When you change both, you may see no net change.

Example

Neil Patel is one of those marketers who likes to use the Big Swings strategy on a regular basis. For example, he has tried complete homepage redesigns for Crazy Egg on several occasions.

The first big redesign changed things from a short-form landing page to a very long-form page and resulted in a 30% increase in conversions.

The next big redesign scrapped the long page for another short page, but this time with concise, targeted copy and a video-driven value proposition. This new page improved conversions by another 13%.

And of course, Neil didn’t stop there. Crazy Egg’s homepage has changed yet again, with the current iteration simply inviting users to enter their website’s URL and see Crazy Egg’s user testing tools in action on their own site. How well is it converting? No clue, but if I know Neil, I can promise you the current page is Crazy Egg’s highest performer to date.

Sometimes the only way to improve conversions is to swing for the fences and try something new.

7. Nuclear Option

I’ll mention the nuclear option here, which is a full site redesign. There are only two good reasons to do an entire site redesign:

1. You’re changing to a new backend platform.
2. You’re redoing your company or product branding.

All other redesign efforts should be done through conversion optimization tests, as Wasp Barcode did.

We even recommend creating a separate mobile site rather than using responsive web design.

You should speak to a Conversion Scientist before you embark on a redesign project.

Which A/B Testing Strategy Is Right For You?

Every website is different. The approach you take when testing a site should ultimately be determined by the data you have. Once you settle on a direction, it can help you find bigger wins sooner.


The Proven AB Testing Framework Used By CRO Professionals

There is no shortage of AB testing tips, tricks, and references to statistical significance. Here is a proven AB testing framework that guides you to consistent, repeatable results.

How do conversion optimization professionals get consistent performance from their AB testing programs?
If you are looking for a proven framework you can use to approach any website and methodically derive revenue-boosting insights, then you will love today’s infographic.
This is the AB testing framework industry professionals use to increase revenue for multi-million dollar clients:

The Purpose of an AB Testing Framework

It’s easy to make mistakes when AB testing. Testing requires discipline, and discipline requires guiding processes that enforce some level of rigor.
This framework ensures that you, the marketer-experimenter, keep some key principles in mind as you explore your website for increased revenue, leads and subscriptions.

• Don’t base decisions on bad data.
• Create valid hypotheses.
• Design tests that will make a difference.
• Design tests that deliver good data.
• Interpret the test data accurately.

This is the framework CRO professionals use to stay on their best game.

1. Evaluate Existing Data

Here are the first two questions you need to ask when approaching a new site.

1. What data is currently available?
2. How reliable is this data?

In some cases, you will have a lot to work with in evaluating a new site. Your efforts will be primarily focused on going through existing data and pulling out actionable insights for your test hypotheses.
In other cases, you might not have much to work with, or the existing data may be inaccurate, so you’ll need to spend some time setting up new tools for targeted data collection.

Data Audit

The data audit identifies data that is available to the data scientist. It typically includes:

1. Behavioral analytics package
2. Existing customer data, such as sales
3. Marketing studies completed
4. UX Studies completed
5. Product Reviews
6. Live Chat Transcripts
7. Customer surveys completed

All of these data sources are helpful in developing a rich list of hypotheses for testing.

Analytics Audit

Since our analytics database is the all-important central clearinghouse for our website, we want to be sure that it is accurately recording everything we need.
Often, we forget to track some very important things.

• Popover windows are invisible to most analytics packages without some special code.
• Links away from the site are not tracked. It’s important to know where your leaks are.
• Tabbed content lets the visitor get specifics about products and is often not tracked.
• Third-party websites, such as shopping carts, can break session tracking without special attention.
• Interactions with off-site content are often masked through the use of iframes.

These issues must be addressed in our audit.

Integrations

It is important that as much data as possible is collected in our analytics database. We never know what questions we will have.
For post-test analysis (see below), we want to be sure our AB testing tool is writing information to the analytics database so we can recreate the test results there. This allows us to drill into the data and learn more about test subjects’ behaviors. This data is typically not available in our testing tools.

Data Correlations

Finally, we want to be sure that the data we’re collecting is accurate. For example, if our site is an ecommerce site, we want to be sure the revenue reported by our testing tool and analytics database is right. We will do a correlation calculation of the revenue reported by analytics with the actual sales of our company.
The same kind of correlation can be done for lead generation and phone calls.
We can also use multiple sources of data to validate our digital laboratory. Does the data in analytics match that reported by our testing tool? Is the number of ad clicks reported by our advertising company the same as the number seen in analytics?
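As a sketch of that correlation check, a simple Pearson correlation between analytics-reported and actual revenue works; the daily revenue figures below are invented for illustration:

```python
# Sketch: validating analytics revenue against actual sales (hypothetical daily figures).
from math import sqrt

analytics = [1210.0, 980.5, 1430.2, 1105.0, 1290.8, 875.4, 1500.1]  # reported by analytics
actual    = [1225.0, 991.0, 1455.9, 1120.3, 1302.2, 880.0, 1525.6]  # from sales records

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(analytics, actual)
print(f"correlation: {r:.3f}")  # values near 1.0 suggest the setup is trustworthy
```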
Once we have confidence in our setup, we can start collecting more data.

2. Collect Additional Quantitative & Qualitative Data

Once we understand the data already available to us, we’ll need to set up and calibrate tools that can acquire any additional data needed to run effective split tests. For our testing tool, we may choose to run an AA test, splitting traffic between two identical pages to confirm the tool reports no difference.
There are two important types of data that give us insight into optimizing a site.

1. Quantitative Data
2. Qualitative Data

Quantitative data is generated from large sample sizes. Quantitative data tells us how large numbers of visitors and potential visitors behave. It’s generated from analytics databases (like Google Analytics), trials, and AB tests.
The primary goal of evaluating quantitative data is to find where the weak points are in our funnel. The data gives us objective specifics to research further.
There are a few different types of quantitative data we’ll want to collect and review:

• Backend analytics
• Transactional data
• User intelligence

Qualitative data is generated from individuals or small groups. It is collected through heuristic analysis, surveys, focus groups, phone or chat transcripts, and user reviews.
Qualitative data can uncover the feelings your users experience as they view a landing page and the motivations behind how they interact with your website.
Qualitative data is often self-reported data, and is thus suspect. Humans are good at making up rationalizations for how they behave in a situation. However, it is a great source of test hypotheses that can’t be discerned from quantitative behavioral data.
While quantitative data tells us what is happening in our funnel, qualitative data can tell us why visitors are behaving a certain way, giving us a better understanding of what we should test.
There are a number of tools we can use to obtain this information:

• Session recording
• Customer service transcripts
• Interviews with sales and customer service reps
• User testing, such as the 5 second test

3. Review All Website Baselines

The goal of our data collection and review process is to acquire key intelligence on each of our website “baselines”.

1. Sitewide Performance
2. Funnel Performance
3. Technical Errors
4. Customer Segments
5. Channel Performance

Sitewide Performance is your overall website user experience. It includes general navigation and performance across devices and browsers.
Funnel Performance deals specifically with the chain of conversion events that turns visitors into leads and then customers. It will include landing pages, optin forms, autoresponders, cart checkouts, etc.
Technical Errors are the broken parts on your website or elsewhere in the user experience. These don’t need to be optimized. They need to be fixed.
Customer Segments deals with how different key customer segments are experiencing your site. It’s important to understand the differences in how long-time users, new visitors, small ticket buyers, and big ticket purchasers are engaging with your site.
Channel Performance deals with how various traffic acquisition channels are converting on your site. It’s important to understand the differences between how a Facebook-driven visit costing you $0.05 and an AdWords-driven visit costing $3.48 convert when they reach your site.
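To make that concrete, here is a small sketch comparing channels by conversion rate and cost per conversion. The visit and conversion counts are hypothetical; only the $0.05 and $3.48 costs per visit come from the example above:

```python
# Sketch: comparing acquisition channels by cost per conversion (hypothetical counts).
channels = {
    # channel: (cost_per_visit, visits, conversions)
    "Facebook": (0.05, 40_000, 320),
    "AdWords":  (3.48,  2_000, 140),
}

for name, (cpv, visits, conversions) in channels.items():
    conv_rate = conversions / visits
    cost_per_conversion = cpv * visits / conversions
    print(f"{name}: {conv_rate:.2%} conversion, ${cost_per_conversion:.2f} per conversion")
```

A cheap channel can still be the expensive one per conversion, which is why the comparison matters.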

4. Turn Data Into Optimization Hypotheses

Once you have a thorough, data-backed understanding of the target website, the next step is to design improvements that you hypothesize will outperform the current setup.
As you evaluate these changes for potential testing, run them through the following flowchart:

You’ll quickly build a list of potential changes to test, and then you’ll need to prioritize them based on your overall testing strategy.

5. Develop A Testing Strategy

AB testing is a time-consuming process that consumes limited resources. You can’t test everything, so where do you focus?
That will depend on your testing strategy.
Ultimately, you will need to develop a tailored strategy for the specific website you are working with and that website/business’ unique goals, but here are a few options to choose from.

Flow vs. Completions

One of the first questions you’ll have to ask is where to start. There are two broad strategies here:

1. Increase the flow of visits to conversion points (shopping cart, registration form, etc.)
2. Increase the completions, the number of visitors who finish your conversion process by buying or registering.

If you find people falling out of the top of your funnel, you may want to optimize there to get more visitors flowing into your cart or registration page. This is a flow strategy.
For a catalog ecommerce site, flow testing may occur on category or product pages. Then tests in the shopping cart and checkout process will move faster due to the higher traffic.

Gum Trampoline Strategy

Employ the gum trampoline approach when bounce rates are high, especially from new visitors. The bounce rate is the percentage of visitors who arrive at a site and leave after only a few seconds, typically seeing just one page.
With this strategy, you focus testing on landing pages for specific channels.

Minesweeper Strategy

This strategy is for sites that seem to be working against the visitor at every turn. We see this when visit lengths are low or people leave products in the cart at high rates.
For example, we might try to drive more visitors to the pricing page for an online product to see if that gets more of them to complete their purchase.

Big Rocks Strategy

This strategy is used for sites that have a long history of optimization and ample evidence that an important component is missing. Add fundamental components to the site in an effort to give visitors what they are looking for.
Examples of “big rocks” include ratings and reviews modules, faceted search features, recommendation engines, and live demos.

Nuclear Strategy

This strategy includes a full site redesign and might be viable if the business is either changing their backend platform or completely redoing branding for the entire company or the company’s core product.
The nuclear strategy is as destructive as it sounds and should be a last resort.
For additional strategies and a more in-depth look at this topic, check out 7 Conversion Optimization Strategies You Should Consider by Brian Massey.

6. Design Your AB Tests

Once our hypotheses are created and our goals are clearly defined, it’s time to actually run the AB tests.
Having the right tools will make this process infinitely easier. If you aren’t quite sure what the “right tool” is for your business, check out this article:
The Most Recommended AB Testing Tools By Leading CRO Experts
But even with the right tools, designing an AB test requires a decent amount of work on the user’s end. Tests need to be designed correctly if you want to derive any meaningful insights from the results.
One piece of this that most people are familiar with is statistical significance. Unfortunately, very few people actually understand statistical significance at the level needed to set up split tests. If you suspect that might be you, check out AB Testing Statistics: An Intuitive Guide For Non-Mathematicians.
But there’s a lot more to designing a test than just statistical significance. A well-designed AB test will include the following elements:

• Duration: How long should the test run?
• Goal: What are we trying to increase?
• Percentage of traffic: What percentage of our traffic will see the test?
• Targeting: Who will be entered into the test?
• Treatment Design: The creative for the test treatments.
• Test Code: Moves things around on the page for each treatment.
• Approval: Internal approval of the test and approach.

Tests should be set up to run for a predetermined length of time that incorporates the full cycle of visitor behavior. A runtime of one calendar month is a good rule of thumb.
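One way to choose that predetermined length is a rough sample-size estimate. This is only a sketch: the baseline rate, minimum lift, and daily traffic below are assumptions, and the formula is a standard two-proportion approximation rather than a substitute for a proper calculator:

```python
# Sketch: estimating test duration from hypothetical traffic and effect size.
from math import ceil

baseline_rate = 0.03      # current conversion rate (assumed)
min_lift = 0.10           # smallest relative lift worth detecting (assumed)
daily_visitors = 2_000    # traffic entering the test per day (assumed)

# Rough per-variation sample size for ~95% confidence / 80% power:
# n ~= 16 * p * (1 - p) / delta^2, where delta is the absolute difference sought.
delta = baseline_rate * min_lift
n_per_variation = ceil(16 * baseline_rate * (1 - baseline_rate) / delta ** 2)
days = ceil(2 * n_per_variation / daily_visitors)
print(n_per_variation, "visitors per variation, ~", days, "days")
```

Note that small lifts on low-traffic pages can require runtimes well beyond a month, which is itself useful to know before committing to a test.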
Test goals, targeting, and display percentages should all be accounted for.
Once the test is designed properly, it’s finally time to actually run it.

7. Run & Monitor Your AB Tests

Running an AB test isn’t as simple as clicking “Run” on your split testing software. There are two critical things that need to happen once the test begins displaying page variations to new visitors.

1. Monitor initial data to make sure everything is running correctly
2. Run quality assurance throughout the testing period

Once the test begins, it’s important to monitor conversion data throughout the funnel, watch for anomalies, and make sure nothing is set up incorrectly. You are running your tests on live traffic, after all, and any mistake that isn’t quickly caught could result in massive revenue loss for the website being tested.
As the tests run, there are a number of things we need to look at:

1. Statistical significance
2. Progression throughout the test
3. Tendency for inflated testing results
4. Conversion rate vs. revenue

Statistical significance is the first thing we have to look at. A statistically insignificant lift is not a lift. It’s nothing.
But even if our results are significant, we still have to look at the progression of data throughout the testing process. Did the variant’s conversion rate stay consistently higher than the control? Or did it oscillate above and below the control?
If the data is still oscillating at the end of the test period, we might need to continue testing, even if our software is telling us the results are statistically significant.
It’s also important to understand that any lift experienced in testing will almost always be overstated. On average, if a change creates a 30% lift in testing, the actual lift is closer to 10%.
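For those who want to sanity-check their tool’s verdict, significance for a conversion-rate difference can be approximated with a two-proportion z-test. This is one common approach, not necessarily what your testing software uses, and the visitor and conversion counts below are invented:

```python
# Sketch: two-proportion z-test for an AB test (hypothetical counts).
from math import sqrt, erf

def z_test(conv_a, n_a, conv_b, n_b):
    """Return (z statistic, two-sided p-value) for control A vs. variant B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

z, p = z_test(conv_a=200, n_a=10_000, conv_b=245, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")  # p < 0.05 => treat the lift as significant
```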
Finally, it’s helpful to run quality assurance throughout the test period, ensuring that split tests are displaying properly across various devices and browsers. Try to break the site again, like you did during the initial site audit, and make sure everything is working.
Once the tests have run through the predetermined ending point, it’s time to review the results.

8. Assess Test Results

Remember that an AB test is just a data collection activity. Now that we’ve collected some data, let’s put that information to work for us.
The first question that will be on our lips is, “Did any of our variations win?” We all love to win.
There are two possible outcomes when we examine the results of an AB test.

1. The test was inconclusive. None of the alternatives beat the control, and the null hypothesis was not rejected.
2. One or more of the treatments beat the control in a statistically significant way.

In the case of an inconclusive test, we want to look at individual segments of traffic. How are specific segments of users engaging with the control versus the variant? Some of the most profitable insights can come from failed tests.
Segments to compare and contrast include:

1. Return visitors vs. New visitors
2. Chrome browsers vs. Safari browsers vs. Internet Explorer vs. …
3. Organic traffic vs. paid traffic vs. referral traffic
4. Email traffic vs. social media traffic

These segments will be different for each business, but provide insights that spawn new hypotheses, or even provide ways to personalize the experience.
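A minimal way to run that segment comparison is to tabulate control vs. variant rates and lift per segment. The per-segment counts below are invented for illustration:

```python
# Sketch: comparing conversion rates by segment for an inconclusive test (hypothetical counts).
results = {
    # segment: (control_conversions, control_visitors, variant_conversions, variant_visitors)
    "new visitors":    (120, 6_000, 160, 6_100),
    "return visitors": (210, 4_000, 195, 3_900),
    "paid traffic":    ( 80, 2_500, 110, 2_600),
}

for segment, (c_conv, c_n, v_conv, v_n) in results.items():
    c_rate, v_rate = c_conv / c_n, v_conv / v_n
    lift = (v_rate - c_rate) / c_rate
    print(f"{segment:16s} control {c_rate:.2%}  variant {v_rate:.2%}  lift {lift:+.1%}")
```

An overall flat result can hide a strong win in one segment offset by a loss in another, which is exactly the insight this breakdown surfaces.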
In the case of a statistically significant increase in conversion rate, it’s very important to analyze the quality of new conversions. It’s easy to increase conversions, but are these new conversions buying as much as those who saw the control?
Ultimately, we want to answer the question, “Why?” Why did one variation win and what does it tell us about our visitors?
This is a collaborative process and speculative in nature. Asking why has two primary effects:

1. It develops new hypotheses for testing
2. It causes us to rearrange the hypothesis list based on new information

Our goal is to learn as we test, and asking “Why?” is the best way to cement our learnings.

9. Implement Results: Harvesting

This is the step in which we harvest our winning increases in conversion, and we want to get these changes rolled out onto the site as quickly as possible. The strategy for this is typically as follows:

1. Document the changes to be made and give them to IT.
2. IT will schedule the changes for a future sprint or release.
3. Drive 100% of traffic to the winning variation using the AB testing tool. We call this a “routing test.”
4. When the change is released to the site by IT, turn off the routing test.

It is not unusual for us to create a new routing test so that we can archive the results of the AB test for future reference.
As another consideration, beware of having too many routing tests running on your site. Conversion Sciences reports that some smaller businesses rely on routing tests to modify their sites, and have dozens of routing tests running. This can cause a myriad of problems.
In one case, a client made a change to the site header and forgot to include the code that enabled the AB testing tool. All routing tests were immediately turned off because the testing tool wasn’t integrated.
Conversion rates plummeted until the code was added to the site. In one sense, this is a validation of the testing process. Conversion Sciences dubbed it a “Light Switch” test.

Conclusion

This is the framework CRO professionals use to consistently generate conversion lifts for their clients using AB testing.

6 Highly Productive Ways To AB Test Content Marketing

Here are six different ways to AB test content elements and the things you should be measuring.
There is a critical part of your sales funnel that probably isn’t optimized.
When you think about CRO, you think about optimizing your online funnel – your emails, landing pages, checkout process, etc. – in order to acquire more customers.
Yet when it comes to content-driven marketing, we rarely see the same commitment to testing, tracking, and optimizing that occurs elsewhere in marketing. Considering that content sits at the top of your sales funnel, the wrong content could be hurting your conversion rates.
Content can be tested in the same way anything else can be tested, and some elements definitely deserve a more CRO-esque approach.

Goals for AB Testing Content

One of the reasons that content is less frequently tested is that the goals are often unclear.
Content is great for SEO.
Content is great for educating clients.
Content is great for sharing on social media.
Content is also great for getting prospects into the sales funnel. This is typically done by collecting an email address to begin the conversation.
Here are the 6 different elements you should definitely consider testing. You can run any of these tests using these recommended AB testing tools, but I’ve also included some simple WordPress plugins as a viable alternative if you want to try this on a small budget.

1. Split Test Your Headlines

Your headline is arguably the most important piece of your content. It’s the thing that determines whether people click through from your email or social media post.
On average, only 2 in every 10 of your readers actually read past the headline. Even fewer make it all the way through the article.
Funnily enough, it’s also one of the simplest things to test.
You already know how to run an AB test. Applying that practice to your headlines is a simple 4-step process.
1. Write a list of possible headlines.
Stephanie Flaxman of CopyBlogger says you should ask yourself three questions to make your headline the best it can be:

1. Who will benefit from this content?
2. How do I help them?
3. What makes this content special?

But don’t get too excited – The first headline you come up with will probably suck. Or maybe it will just be mediocre.
The whole point of AB testing is that you don’t have to come up with THE perfect headline. You simply need to come up with a variety of solid options, and then you can see what performs best.
This is why I recommend creating a list of 5-10 possible headlines.
Next, pick your two favorites and move on to step #2.
2. Send both versions to a live audience.
Now it’s time to test the headlines. You want to show one headline to 50% of your traffic and the other headline to the other 50%.
How you accomplish this will depend on how you acquire traffic.
For example, if you primarily drive traffic to your new blog posts via an email list, create your email and then send half of your subscribers the email using one headline and the other half the same email but using the alternate headline.
If you promote via social media, try posting at different times or across different channels using the alternate headlines and see what happens.
If you promote via paid channels, simply create two ads, using a different headline for each, and set up a normal AB test using proper statistical analysis.
Once you’ve run your tests, it’s time to review the data.
3. Analyze the results.
If your traffic is too low to get statistically significant results, it’s still worth running the tests. Your initial readers typically come from your email list or your most active social followers – aka the people most likely to share your content. Getting a feel for what they respond to is always worthwhile, and you might notice certain trends over time.
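When you do have enough traffic, the standard way to run this analysis is a two-proportion z-test on the click-through rates of the two headlines. Here is a minimal sketch in Python; the send and click counts below are hypothetical, purely for illustration:

```python
from math import sqrt, erf

def two_proportion_z_test(clicks_a, sends_a, clicks_b, sends_b):
    """Compare click-through rates of two headline variants.
    Returns the z-score and the two-tailed p-value."""
    p_a = clicks_a / sends_a
    p_b = clicks_b / sends_b
    # pooled proportion under the null hypothesis (no real difference)
    p_pool = (clicks_a + clicks_b) / (sends_a + sends_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / sends_a + 1 / sends_b))
    z = (p_b - p_a) / se
    # two-tailed p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical email split: 5,000 sends per headline
z, p = two_proportion_z_test(200, 5000, 260, 5000)
print(f"z = {z:.2f}, p = {p:.4f}")  # significant at the 5% level if p < 0.05
```

If p is below 0.05, the difference between the two headlines is unlikely to be random noise; with smaller lists, treat the result as directional rather than conclusive.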
4. Implement the one with the most clicks.
Once you have your winner, simply set it as your permanent headline. That’s all there is to it.
But your headline isn’t the only thing that gets people to click.

2. Split Test Your Featured Images

Content Marketing Institute, the industry leader in content marketing, found that “ad engagement increased by 65% when a new image was tested versus simply testing new copy.”
Brian Massey summarizes it well: “Spend as much time on your images as your copy.”
Whether you’re using paid ads in your content marketing strategy or not, the image matters almost as much as the headline (maybe more).
So, how does one select the right featured image?
There is some science behind choosing a featured image. If you think about it, picking one image is harder than picking several. So, pick a couple and let your test results decide for you.
Here are three keys that will help guide your selection.
1. Pick something compelling
Your image should relate to whatever your article is about. That said, being relevant is pretty ambiguous.
The featured image on this article from Inc. is not directly relevant to the content, but our brains are designed to make connections.

As long as you can relate it in some way, you’re probably OK, but you want your image to be compelling. Not any relevant image will do. Roy H. Williams, director of the business school The Wizard Academy, outlines a number of techniques that make images compelling.

• Silhouettes: We tend to fill in silhouettes with ourselves or our aspirational dreams.
• Portals: Our attention is drawn into doorways, tunnels, windows and openings.
• Cropped Images: When we are only given a piece of the image, we fill in the missing parts.
• Faces: We stare at the human face. This can work against our headlines.
Pro tip: If you use a human face, have them looking at your headline for best results.

The above image may not be highly relevant, but its use of a silhouette is compelling.
2. Make sure it is relevant to the post
Your headline and featured image should work together to be both relevant and compelling.
Let’s look at some other examples from Inc.

Do you see how they combine relevant images with compelling headlines? It makes it hard not to click on the article.
Finally, the third important factor to consider when choosing an image is…
3. Always use high-quality images
I know you already know this, but I wanted to remind you. Nothing grinds my gears more than a blog post with a terrible, grainy image.
Now you know how to optimize individual posts for conversions, but what about a more general approach to your overall content marketing strategy?
The next element you should be testing is content length.

3. Find Your Ideal Content Length

Now we’re getting into the overall content creation process. Testing your ideal content length will give you an idea to help you create a content template for all your articles going forward.
According to a study done by Medium, the current ideal content length is about 1,600 words, or around a 7-minute read.

However, this may not be the case for you.
Yes, on average, the posts that get the most shares are long, in-depth posts. But that doesn’t mean shorter posts don’t get shares as well. And more importantly, it doesn’t mean shorter posts won’t do a better job of driving qualified leads to your business.
The only way to know the optimum length of posts for your audience is to test it. In order to test the ideal length, you can take two different approaches.
The first and simplest option is to try a variety of content lengths over time and look for trends. You could publish a 1,500-word post one week, a 400-word post the next week, a 4,000-word guide the following week, and an infographic the 4th week. Rinse and repeat. You should be testing out different content types anyway, and incorporating varying lengths of content within that schedule won’t require much more effort on your part.
The data you want to measure — time on page — is found easily in Google Analytics. This is a free analytics tool that any content marketer should become familiar with.
The second option is to split test a single post by sending segments of users to different length versions of the same post.
In similar fashion, test video length for views and watch times to see how long your videos should be.

4. Take Your Opt-in Forms Seriously

Opt-in or signup forms are a critical part of content marketing and sales funnels. It’s important that they are converting at the highest rate possible.
So what parts of your opt-in form can you test?
First, test the form length.
I’ve seen forms that ask for the whole shebang: everything from your full name to your phone number and more.
Believe it or not, this can work. Just take HubSpot for example. They have a ridiculous amount of free downloads, from templates to eBooks, and every one of them comes with a form like this:

I put three pages into one image because it was too big to fit in one screenshot!
Here’s the kicker: They see tremendous success with this behemoth of a form. I’ve personally filled out at least a half dozen of their forms like this for free downloads.
So, what’s the ideal form length?
Well, take a look at this chart by Eloqua.

It seems the optimal number of fields is 7 because you’re getting the most information with the least drop off in conversions.
That said, you can potentially get close to a 60% conversion rate when asking for only a single piece of information.
Oddly enough, the data above suggest that having 7 form fields is better than having only 2. While this is just one study, it could mean that you’ve been asking for too little information and might want to revisit your opt-in forms.

• In general, the more form fields you have, the lower your conversion rate will be, but the higher the quality of your list will be.

Once you’ve determined the optimal number of form fields, it’s time to test placement on the page.
Typically, forms are placed:

• At the top of the page, to clearly indicate that there is a form to complete.
• At the bottom, so readers can take action after consuming your content.
• In the sidebar, which is where readers look when they want to subscribe.
• Within the content, so scanners see it.
• In a popup triggered by exit intent.

Where you place your offers is as important as the length of your forms.

Try multiple locations. Personally, I like to include one in the sidebar, one on exit intent, and one either in the middle of or at the end of my content.
Don’t overwhelm your visitors with too many choices. If you have four different opt-ins, some call-to-actions, related posts, and other things to click on, they may just leave your page altogether.

5. Test Your Calls To Action

Whenever you create a piece of content on your website, be it a blog post, a landing page, or even an about page, you should always ask yourself this question:

What do we want our readers to do after reading this content?

In other words, “Where are we sending them next?”
A lot of people have no idea how to answer that question. I mean, it’s not obvious – especially when you have a lot of content you could send them to.
You might have any one (or more) of these CTAs in your content:

• Related blog posts
• A “start here” page
• A sales pitch/landing page
• An initial consultation call
• An email subscription

How do you know where to send them?
Depending on your marketing strategy, this might mean immediately collecting a lead, or it could be something else.
Let me give you an example. ChannelApe provides integrations between the systems ecommerce websites use to run their business. ChannelApe offers a free trial for their automatic supplier integration as the next step for anyone reading their list of dropshippers.

This makes sense because anyone interested in a list of dropshippers is probably also interested in integrating those dropshippers’ products with their store.
Notice how ChannelApe uses a bright orange background to help their CTA stand out from the rest of their content. Color is only one of the variables you should test on your CTAs.
In addition to CTA colors, you can also test:

• Copy
• Images
• Offers
• Positions

OK, let’s say you want to test the position of your related posts.
I know what you’re thinking.

“Bill, wouldn’t I just put related posts at the end of a blog post?”

Maybe. But what if your readers aren’t getting to the end? You don’t want them to leave, do you?
For that matter… what’s “related”? Are you tagging your posts and pages properly?
And what about the posts getting the most interaction? Don’t you think your readers would like to see those?
Or do you want to drive traffic to certain pages over others, like a “start here” page or a new blog series?
Do you see where I’m going with this?
Simply repeat this process of asking questions for every variable you may want to include, then put your answers to the test.

Let’s recap:

• You want your headlines and featured images to be relevant and compelling.
• The “ideal” content length is 1,600 words, but you shouldn’t blindly follow that number.
• The position and length of opt-in forms matters.
• Always know where you want your visitors to go next in order to effectively use CTAs.

If there’s one thing you should take away from this post, it’s this: your content can and should be tested just like every other part of your funnel.
Have you ever tried to split test elements of your content before? I’d love to hear about it. Leave a comment below and let me know!

Bill Widmer is a freelance writer and content marketer. With over two years of experience, Bill can help you get the most out of your content marketing and blog.

10 Value Proposition Upgrades That Increased Conversions By At Least 100%

10 successful value proposition examples proven by AB testing.

Conversion Sciences has completed thousands of tests on websites of all kinds for businesses of all sizes. At times, we’ve been under pressure to show results quickly. When we want to place a bet on what to test, where do we turn?

Copy and images. These are the primary components of a website’s value proposition.

It’s the #1 factor determining your conversion rate. If you deliver a poor value proposition, there is little we can do to optimize. If you nail it, we can optimize a site to new heights.

So, I have to ask: have you ever taken the time to split test your value proposition?

This article shows you how to identify a poor value proposition, hypothesize a series of better alternatives, and split test them to identify the winning combination of copy, video and images.

Essential Qualities Of A Worthwhile Value Proposition

Your value proposition is a statement that can be made up of the following elements:

• Copy
• Bullet points
• Images or Graphics
• Video

Words carry tremendous power, but they aren’t the only element you can employ in promising defined value to potential customers. A value proposition can be made up of any of the above elements, as well as others I’ve no doubt failed to mention.

To be effective, your value proposition should include the following characteristics:

1. Conveys a clear, easily understood message
2. Explicitly targets a specific audience segment
3. Makes a clear promise regarding the benefits being delivered

Hopefully, these criteria are making you aware of what your value proposition is not. It is not a slogan, creative phrase, or teaser.

The best way to demonstrate this is to show you some real examples of businesses that improved their conversion rates by upgrading their value propositions.

Let’s get started.

Example #1: Groove Increases Conversions By 104%

Groove is simple help desk software. It’s a streamlined product designed to help smaller teams provide personalized customer support without learning and configuring something more complicated like Zendesk.

Groove’s original homepage was converting at only 2.3%.

Groove SaaS and eCommerce Customer Support Value Proposition

After reaching out to several experts for help, they received the following advice:

“You’re talking to your customers the way you think marketers are supposed to talk. It’s marketing-speak, and people hate that… you should talk like your customers do”

With this in mind, the Groove team spent some time talking to various customers over the phone in order to get a feel for how those customers were talking about Groove and the actual words they were using.

They also changed their opening autoresponder email to the following, which ended up generating an astounding 41% response rate and becoming a prime, continuous source of qualitative data for the business:

Groove welcome email established their value proposition.

As a result of this feedback, they created a new “copy first” landing page, with a completely revamped value proposition.

Groove created a ‘copy first’ landing page based on feedback from customers

After testing the new page against the original, Groove found that it converted at 4.3% for an 87% improvement. After running additional tests with more minor tweaks over the next two weeks, the conversion rate ultimately settled at 4.7%, bringing the total improvement to 104%.
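As a quick sanity check on numbers like these, relative lift is just the change in conversion rate divided by the control rate. A small sketch using Groove’s reported rates:

```python
def relative_lift(control_rate, variant_rate):
    """Relative conversion lift, expressed as a percentage of the control rate."""
    return (variant_rate - control_rate) / control_rate * 100

# Groove's reported rates: 2.3% control, 4.3% first variation, 4.7% final
print(round(relative_lift(2.3, 4.3)))  # 87
print(round(relative_lift(2.3, 4.7)))  # 104
```

Note that lift is always relative to the control: going from 2.3% to 4.7% is a 2.4-point absolute gain, but a 104% relative improvement.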

Key Takeaways

So what can we learn from Groove’s big win?

• Benefit-driven headlines perform better than headlines simply stating the product category.
• The subheading is not a good place for a testimonial. You need to explain your value before you bring in proof to verify your claims.
• Notice how the new headline explains a bit of the “how and what” while still keeping the customer in focus.
• While Groove doesn’t explicitly define the target audience within the headline and subheading, they do accomplish this via the above-the-fold bullet point and video testimonial.

Example #2: Comnio Increases Signups By 408%

Comnio is a reputation management company that helps both consumers and businesses resolve customer service issues.

After transitioning away from a moderately NSFW branding strategy, the company needed a new way to communicate its value and attract users. After the page below failed to convert, they contacted Conversion Sciences’ Brian Massey for a CRO strategy consultation.

Comnio’s landing page failed to convert well

Brian helped the team come up with a new version:

“My recommendations were to focus on the company less and on what will happen more and to use a hero image that is more relevant. By September 2015, the homepage was taking a different approach, focusing on the service value and defining the steps that make it work.”

Comnio’s new landing page performed at a high rate

This new page was a definite improvement over the previous version, and over the next 30 days, it converted a respectable 3.6% of site visits.

That said, there were still some clear problems, the most obvious being that the opening headline and subheadline were failing to make a clear promise. In order to optimize this page, Comnio implemented the following changes:

1. Changed the headline to explain what they do (as a benefit, not a feature)
2. Changed the subheadline to explain the pains/problems Comnio solves for users
3. Swapped out the position of company logos with the position of user testimonials
4. Added a gradient line below the hero shot to separate it from the rest of the page

The new page looked like this:

Comnio further refined the landing page with a significantly higher conversion rate

Thanks in large part to a strong headline, this new page converted at an incredible 18.3% over its 30-day test, a 408% increase over the previous version.

It’s also worth noting that 49% of new signups used one of the social signup options available on the new page.

Key Takeaways

So what can we learn from Comnio’s huge conversion spike? Whenever this many changes are implemented in one test, it hurts our ability to make specific conclusions, but here’s what I’m seeing:

• The new headline isn’t cute, catchy, or cool. It’s a simple, definitive statement, and that’s exactly why it works so well.
• Directly addressing emotional customer pain points (no waiting, no repeating yourself) within your value proposition can have a MASSIVE impact on your conversion rate.
• Signup friction can significantly decrease your conversion rate. Considering half the signups on the new page occurred via the social buttons, it would make sense to assume this feature was a big part of the conversion boost.
• Brian also noted that the social signup buttons themselves could have served as social proof, borrowing trust from Facebook and Twitter.

Example #3: Udemy Increases Clicks By 246%

Udemy is a massive marketplace for online courses on everything you can imagine.

And while the company’s meteoric growth is certainly a testament to their product-market fit and understanding of their own value proposition, until somewhat recently, the individual course pages were very poorly optimized.

Until this last year, Udemy course pages looked like this:

Udemy landing page that needed higher conversion rates

If I’m trying to sell my course via this page, there are a number of major problems diminishing my conversion rate.

• Udemy is essentially stealing the headline of the page with its bold “You can learn anything…” banner. If I’m on this page, I either clicked here through a direct link or through Udemy’s course browser, and in neither case does it make sense to tell me about Udemy’s 10,000 courses.
• With 3 columns, I have no clue where to look first. Where is the value proposition?
• I can barely even tell the green rectangle on the right is supposed to be a CTA button.

While Vanessa’s course does have a value proposition, it certainly isn’t laid out in a way that makes it defined or obvious.

Eventually, Udemy caught on to this and tested a special layout:

Udemy redesigned its landing page based on user testing

Unlike the old page, this version has a very clear value proposition, with the headline, subheadline, video and CTA all clearly displayed without distraction.

Most importantly, this new landing page receives 246% more click-throughs than the old course landing page.

Udemy also altered their normal course landing pages to incorporate some of these elements, putting the headline, subheadline and promo video front and center, with a much more obvious CTA button and all additional information below the fold.

Udemy used the same techniques to update their course page.

Key Takeaways

So what can we learn from Udemy’s landing page improvements?

• Layout is extremely important.
• Limiting your hero shot to only the core elements of your value proposition will virtually always serve you better than throwing up a bunch of info and letting the reader decide what to read first.
• Unless you are working with some sort of advanced interactive technology, it’s important that you take visitors through a linear journey, where you control the narrative they follow through your page.

Example #4: 160 Driving Academy Increases Conversions By 161%

160 Driving Academy is an Illinois-based firm that offers truck-driving classes and guarantees a job upon graduation.

In order to improve the conversion rate on their truck-driving classes page, the company reached out to Spectrum, a lead-generation marketing company. Spectrum’s team quickly noted that the page’s stock photo was sub-optimal.

160 Driving Academy original landing page with stock photo.

The team had a real image of an actual student available to test, but almost didn’t test it out.

“… in this case we had a branded photo of an actual 160 Driving Academy student standing in front of a truck available, but we originally opted not to use it for the page out of concern that the student’s ‘University of Florida’ sweatshirt would send the wrong message to consumers trying to obtain an Illinois, Missouri, or Iowa license. (These states are about 2,000 kilometers from the University of Florida).”

Ultimately, they decided to go ahead and test the real student photo anyway and simply photoshopped the logo off the sweatshirt:

Revised landing page with picture of actual student.

The primary goal of this test was to increase the number of visitors who converted into leads via the contact form to the right of the page, and this simple change resulted in an incredible 161% conversion lift with 98% confidence.

The change also resulted in a 38.4% increase (also 98% confidence) in actual class registrations via this page!

Not bad for a simple photo change.

Key Takeaways

So what can we learn from this case study? Yes, stock photos tend to be poor performers, but why?

The answer lies in how our brains respond to images. Essentially, our brains are far more likely to notice and remember images versus words, but these advantages tend not to apply to stock photos, as our brains have learned to automatically ignore them.

For a more in-depth breakdown of this subject, check out this writeup from VWO.

Example #5: The HOTH Increases Leads By 844%

The HOTH is a white label SEO service company, providing link building services to agencies and SEO resellers.

Despite having what most would consider a solid homepage, their conversion rate was sitting at a very poor 1.34%. It started with the following value proposition and then followed a fairly standard landing page flow:

The Hoth homepage had a low conversion rate.

While their landing page wasn’t bad as a whole, you may be noticing that their value proposition was a bit vague and hinged primarily on the assumption that incoming users would click and watch the video.

The HOTH team decided to make a big change, and completely scrapped the entire landing page, replacing it with a new headline, a subheadline and… that’s it.

The Hoth made a big change to their landing page.

Behold, the brilliant new landing page!

And while you might be tempted to laugh, this new variation converted at 13.13%, an incredible 844% increase from the original!

Key Takeaways

So what can we learn from this?

• For certain audiences, saying less and creating a curiosity gap might encourage them to give you their contact info.

Example #6: Conversioner Client Increases Revenue By 65%

So yes, I know that 65% is not quite the 100% I promised you in the headline, but let’s be honest, you aren’t scoffing at a 65% increase in actual revenue.

This case study comes from Conversioner and features an unnamed client whose product enables customers to design & personalize their own invitations, greeting cards, slideshows, etc.

The client’s original homepage looked like this:

The original homepage of an invitation service

At first glance, this value proposition really isn’t that bad.

Sure, they are focused primarily on themselves in the headline, but they sort of make up for it in the subheadline by discussing the direct customer benefits, right?

“Delight guests with a unique invite they won’t forget.”

There’s just one really big problem here. These customer benefits have nothing to do with the customer or the benefits.

“Delight your guests”… who talks like that? Nobody. Nobody talks like that. When you are thinking about sending out invites, you aren’t thinking, “Hmmm how can I delight my guests?”

But we aren’t done: “… a unique invite they won’t forget.”

This copy is completely disconnected from the target consumer. Why do people send invites? Is it so their guests will never forget the invites?

No. The best possible invite is one that gets people to your event. That’s it. Your goal is a great party. Your goal is a bunch of fun people at your great party. That’s the primary metric, and it isn’t even addressed in this value proposition.

Which is why the Conversioner team made a change:

The revised homepage of the invitations service

Notice that this new variation doesn’t completely abandon the “what we do” portion of the value proposition. It is still communicating exactly what is being offered from the get-go.

“Create Free Invitations”

But then it speaks to the benefits. It’s free AND it is the start of a great party.

The proof is in the pudding, and this change resulted in a 65% increase in total revenue.

Key Takeaways

So what can we learn from Conversioner’s successful experiment?

• Don’t let “benefits” become another buzzword. Focusing on benefits only matters if those benefits are relevant and important to the target audience.
• Think through what is motivating your customers outside of the immediate conversion funnel. They aren’t just signing up for your email builder. They are planning an event. Speak to that.

Example #7: The Sims 3 Increases Game Registrations By 128%

I’m guessing you’ve heard of The Sims franchise, but in case you haven’t, it’s one of the best selling computer game franchises in history.

While the third installment was sold as a standalone game, the business model relied heavily on in-game micro-transactions. But in order to begin making these in-game purchases, users needed to first register the game.

The Sims’ marketing team found that once players had registered, they were significantly easier to convert into repeat buyers. Registrations were primarily solicited via the game’s launch screen, but the conversion rate was unsatisfactory.

The launch screen of the Sims 3 game.

As you can see, it’s fairly obvious why nobody was registering.

Why would they? How could they?

“Join the fun!” … what does that mean? If I’m a gamer pulling up the launch screen, I already know how to join the fun. I just click the giant Play button on the left side of the screen. And there is nothing on this screen that would cause me to pause that line of action and consider registering.

Unsurprisingly, this is exactly what WiderFunnel thought when they were hired to improve this page. They quickly realized the need to incentivize users to register and make it very clear what was being requested of them.

The team came up with 6 different variations to test. Here’s their direct commentary:

1. Variations A1 & A2: ‘simple’: These two test Variations emphasized the overall benefits of game registration and online play. Much of the control page’s content was removed in order to improve eyeflow, a new headline with a game tips & content offer was added, a credibility indicator was included and the call-to-action was made clear and prominent. Both A1 and A2 Variations were identical except for background color which was white on one Variation and blue on the other.
2. Variation B: ‘shop’: This Variation was similar to Variations A1 and A2 in that it was focused on the overall benefits of registering and emphasized free content in its offer. In addition, this Variation included links to The Sims 3 Store where players can buy game content and to the Exchange where players can download free content.
3. Variation C: ‘free stuff’: In this Variation, the headline was changed to emphasize a free content offer and the subhead highlighted a more specific offer to receive free points and a free town upon registering. Links to The Sims 3 Store and the Exchange were also included in this variation but benefit-oriented bullet points were removed to keep copy to a minimum.
4. Variation D: ‘free town’: This test Variation was focused on a specific offer to receive a free Sims town upon registering. The offer was prominent in the headline and echoed in the background image. General benefits of game registration were listed in the form of bullet points.
5. Variation E: ‘free points’: As with Variation D, this Variation put the emphasis on a specific offer for 1,000 free SimPoints and the imagery depicted content that could be downloaded by redeeming points.

Variation D, the ‘free town’ offer, converted best, bringing in 128% more registrations than the original.

This version of the Sims 3 launch page performed best.

While this isn’t surprising, it serves to illustrate how simple conversion optimization can be. It’s really just a matter of giving people what they want. Sometimes, identifying what that is will be challenging. And sometimes, it will take a bit of digging.

Key Takeaways

So what should we learn from this?

• Give the people what they want! What do your users want and how can you give it to them?
• Be specific with the benefits you are promising. “Join the fun” promises nothing concrete. “Get Riverview FREE” is specific.
• Make your CTA obvious. If your #1 goal is to make someone take _______ action, everything about your landing page should make that obvious.

Alpha Software is a software company with a number of product offerings, the most recent of which deals with mobile app development.

The company wanted to improve results for one of its product landing pages, pictured below:

The Alpha landing page for mobile app development.

They tested it against the following simplified page:

An alternate design for the Alpha landing page.

This new streamlined version resulted in 98% more trial signups than the original. That’s a pretty drastic improvement considering the changes can be summed up in two bullet points:

• Bullets expanded and tidied up
• Navigation removed

And this isn’t the only case study where the removal of navigation resulted in an uptick in conversions. It’s actually pretty common.

In a similar test by Infusionsoft, a page with secondary navigation between the headline and the rest of the value proposition…

… was tested against the same page, minus the nav bar, with different CTA text:

This version of the Infusionsoft page has no menu below the headline

The simplified page with no extra navigation bar had 40.6% more conversions at a 99.3% confidence level.
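Numbers like these come straight out of the testing tool, but it’s worth knowing roughly where they come from. Below is a minimal sketch, using invented visitor counts rather than Infusionsoft’s actual data, of the standard two-proportion z-test that most platforms use to turn raw conversion counts into a relative lift and a confidence level:

```python
# Hypothetical sketch: how a testing tool arrives at numbers like
# "40.6% more conversions at 99.3% confidence". The visitor and
# conversion counts below are invented for illustration only.
from math import sqrt, erf

def ab_significance(conv_a, n_a, conv_b, n_b):
    """Return (relative lift of B over A, one-sided confidence level)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the null hypothesis (no difference)
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # One-sided normal CDF via the error function
    confidence = 0.5 * (1 + erf(z / sqrt(2)))
    lift = (p_b - p_a) / p_a
    return lift, confidence

# Made-up counts: 10,000 visitors per variation
lift, conf = ab_significance(conv_a=200, n_a=10_000, conv_b=281, n_b=10_000)
print(f"lift: {lift:.1%}, confidence: {conf:.1%}")
```

With these made-up counts the test reports roughly a 40% lift at well over 99% confidence; the point is that the confidence level depends on both the size of the lift and the number of visitors behind it, which is why identical-looking lifts can carry very different levels of certainty.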

While I think the CTA change definitely played a role in these results, they also reinforce how important it is for marketers to streamline the navigation of their landing pages (and their websites as a whole).

Key Takeaways

So why did I include this in our list?

• Distraction is a big deal when it comes to framing your value proposition. Remove distractions, even if that means eliminating basic site navigation options.
• Don’t be afraid of bullet points. They tend to be used in hero shots nowadays, but they can be a great option when you can’t fit everything you need in the headline and subheadline.

Example #9: HubSpot Client Improves Conversions By 106%

For our next to last example, I want to look at a client case study released by HubSpot a while back. This unnamed client had a homepage converting poorly at less than 2% and had decided it was time to take optimization seriously.

The client looked through several landing page best practices and decided to make some critical adjustments to their page.

The 1st change was to replace the original vague headline with a clear new headline and benefit-driven subheadline:

Two versions of a landing page with different headline designs.

The 2nd change was to add a single, obvious CTA instead of offering a buffet of product options for visitors to select from.

Two versions of a landing page with the call to action higher on the page.

The 3rd change was to move individual product selections down below the hero shot. The new page started with a single value proposition and then allowed users to navigate to specific products.

The result of these three changes was a 106% lift in page conversions.

The results of this landing page AB test.

The main issue I want to address with this study is the question of “Should we try to convert first or segment first?”

In my professional experience, combined with the many studies I’ve reviewed, it’s usually better for every page to begin with a clear, singular direction and only then offer multiple navigation or segmentation options.

Another test that speaks to this comes from Behave.com (formerly WhichTestWon). The marketing team from fashion retailer Express had an exciting idea to test a new homepage that immediately segmented users based on whether they were looking for women’s clothing or men’s clothing.

This Express homepage tries to segment men and women right away.

They tested this against their original page that pitched the current discount in circulation and offered a singular value proposition:

This Express homepage converted better than the segmented one.

The segmented test page converted poorly compared to the original, with the following results at a 98% confidence level:

• 2.01% decline in product views, per visit
• 4.43% drop in cart additions, per visit
• 10.59% plummet in overall orders, per visit

Key Takeaways

So what can we learn from these two case studies?

• Give people a reason to stay before you give them multiple navigation options to select from.
• In a similar vein, the fewer options you give people, the more likely they are to convert in the way you are looking for. Offering a single CTA is always worth testing.
• The more of the Who, What, Where and Why you can explain in your value proposition, the better chance you have of resonating with new visitors.

Example #10: TruckersReport Increases Leads By 79.3%

TruckersReport is a network of professional truck drivers, connected by a trucking industry forum that brings in over 1 million visitors per month.

One of the services they provide is assistance in helping truck drivers find better jobs. The conversion funnel for this service began with a simple online form that was followed by a 4-step resume submission process.

The initial landing page was converting at 12.1%:

Truckers report landing page.

ConversionXL was brought in to optimize this funnel, and after analyzing site data and running several qualitative tests with a few of the most recommended AB testing tools, they came up with the following insights:

• Mobile visits (smartphones + tablets) formed about 50% of the total traffic. Truck drivers were using the site while on the road! –> Need responsive design
• Weak headline, no benefit –> Need a better headline that includes a benefit, addresses main pain-points or wants
• Cheesy stock photo, the good old handshake –> Need a better photo that people would relate to
• Simple, but boring design that might just look too basic and amateur –> Improve the design to create better first impressions
• Lack of proof, credibility –> Add some
• Drivers wanted 3 things the most: better pay, more benefits and more home time. Other things in the list were better working hours, well-maintained equipment, respect from the employer. Many were jaded by empty promises and had negative associations with recruiters.

Using these insights, they created and tested 6 different variations, ultimately landing on the following page:

Three views of the redesigned Truckers Report homepage.

This new page saw a conversion lift of 79.3% (yes, I know I fudged on the 100% thing again… sorry not sorry). Instead of trying to explain why, I’ll simply quote Peep Laja:

• Prominent headline that would be #1 in visual hierarchy
• Explanatory paragraph right underneath to explain what the page is about
• Large background images tend to work well as attention-grabbers
• Warm, smiling people that look you in the eye also help with attention
• Left side of the screen gets more attention, so we kept copy on the left
• As per the Gutenberg diagram, bottom right is the terminal area, so that explains the form and call-to-action placement.

The team also optimized the entire funnel, but since our focus is on value propositions today, I’ll simply direct you to Peep’s writeup for the full story.

Key Takeaways

So what are our value proposition takeaways?

• Start with the benefits. I can’t say this enough. What does your target audience want most? Tell them about that right off the bat.
• Eliminate uncertainty. When you tell people exactly what to expect, it builds trust. Notice the “1. 2. 3.” on the new page. If you are going to require something from the user, tell them exactly what to expect from the beginning.
• If you aren’t mindful of how your value proposition is displaying to mobile users, change that now. You can’t afford to ignore mobile traffic, and you should be split testing mobile users separately from desktop users.

10 Value Proposition Examples With 28 Takeaways

Optimizing your value proposition is low-hanging fruit that can have a tremendous impact on your website. It’s also a core consideration in a good AB testing framework.

We’ve covered 28 different takeaways in this article, and for your convenience, I’ve gone ahead and put them into an easy cheat sheet you can download via the form below.

The 20 Most Recommended AB Testing Tools By Leading CRO Experts

There are a ton of AB testing tools on the market right now, and that number is only going to increase.

When evaluating these tools for use in your own business, it can be difficult to wade through the marketing rhetoric and identify exactly which tools are a good fit. That’s why we reached out to our network of CRO specialists in order to bring you a comprehensive look at the best AB testing tools on the market.

Our goal here isn’t necessarily to give you a complete review of each tool, but rather, to show you which split testing tools are preferred by full-time CRO experts – people whose businesses depend completely on the results they are able to deliver to their clients.

We’ll cover two primary categories of tools:

1. Tools for running the actual AB tests
2. Tools for collecting data in order to make good hypotheses

At the end of the day, the “right” tool is going to vary depending on the business. As Paul Rouke explains:

We see it time and time again: companies sign up to multi-year contracts for feature rich, enterprise level tools which have a fantastic looking client list, and it ends up burning through their entire CRO budget. Companies invest without considering the need for resource and skills, or they are simply sold on the tool’s ‘ease of use’.

Many companies don’t have the internal skills in place yet to actually utilize this tool, and so the all-singing, all-dancing tool hardly gets used. Also, people using the tool don’t understand the need for or cost of customer research, data, psychology, design, UX principles etc., meaning they’re ultimately testing the wrong things.

The tools that in my experience deliver the most long-term value are those which are reasonably priced, allowing companies to spend more of their budget on making sure they are testing intelligently and developing an effective testing process.

No tool on this list will be the right fit for every business. That said, without breaking up our list into tiers, we would like to note 4 tools that came up very consistently from the experts we queried.

The two most popular AB testing tools by a wide margin were Optimizely and VWO. These are the most common AB testing tools used by Conversion Sciences clients, and virtually every single expert we chatted with is using both of these tools on a regular basis.

Another two tools that came up frequently (in about a third of responses), were Convert Experiences and UsabilityHub. Both of these tools received consistently strong reviews from the experts who used them and fill key needs in the CRO space, which we’ll discuss in their respective entries.

Without further ado, let’s take a look at our list of recommended AB testing tools.

AB Testing Tools

The following tools are our experts’ recommended options for running AB tests. We’ve listed them in order of how frequently they were mentioned by our experts. This should not be confused with a ranking by quality.

1. Optimizely
2. VWO
3. Convert Experiences
4. SiteSpect
5. AB Tasty
6. Sentient Ascend
7. Google Optimize
8. Qubit
9. Adobe Target
10. Marketing Tools With Built-In Testing

Optimizely

Optimizely is basically the big kid on campus. It’s our experts’ go-to choice for working with enterprise level clients, and despite the significant price increases over the years, it remains the king.

It’s also reasonably user friendly for such a complex tool, as Shanelle Mullin summarizes:

Optimizely is the leading A/B testing tool by a fairly large margin. It’s easy to use – you don’t need to be technical to launch small tests – and the Stats Engine makes testing easier for beginners.

Since the Conversion Sciences team uses this tool every single day, I asked them to give me a few thoughts on what they like and dislike about it.

According to the team, Optimizely offers some of the following benefits.

• Easy editing access through the dashboard
• Retroactive filtering (i.e. IP addresses)
• Intuitive data display and goal comparison
• Saved Audiences (not available in VWO)
• Great integration with 3rd party tools

AB testing software Optimizely dashboard

On the flip side, Optimizely is a bit lacking in these ways:

• Test setup is not as intuitive compared to other tools
• Slow updates for saved changes to the CDN
• Doesn’t carry through query params/cookies within a certain test
• Targeting is more difficult

Optimizely’s multivariate testing setup is simple and intuitive, and it’s the leading split testing tool for a reason. For businesses with the budget and team to utilize Optimizely to its fullest potential, it is clearly a must-own.

VWO

AB Testing Tool VWO Dashboard

Coming in just behind Optimizely in the AB testing pantheon is Visual Website Optimizer (VWO). VWO is incredibly popular in the marketing space, and in addition to serving as a top choice for businesses with smaller budgets, it is also frequently used in conjunction with Optimizely by businesses who run complex testing campaigns.

According to the Conversion Sciences team, VWO offers some of the following benefits as compared to Optimizely:

• More intuitive interface with color coding
• Easier goal setup
• Better customer support

On the flip side, VWO is lacking in the following areas:

• Can’t view goal reports all at once, which makes them harder to compare
• No saved targeting, so you must start fresh with each test unless you clone
• No cumulative CR graph if you have low traffic (or what VWO considers low traffic). Instead it gives CR ranges. You must export the data to get any usable information.

This perspective is mostly shared by the ConversionXL team as well, as explained by Shanelle Mullin:

VWO is very easy to use, especially with its WYSIWYG editor. They have something similar to Optimizely’s Stats Engine called Smart Stats, which is based on Bayesian decisions. VWO also offers heatmaps, clickmaps, personalization tools and on-page surveys.

Overall, VWO is an intriguing solo option for small to midsized businesses and also works very well in conjunction with Optimizely for enterprise clients.

Convert Experiences

AB Testing Tool Convert Experiences Screenshot

While Optimizely and VWO were the tools most commonly mentioned, Convert Experiences received some of the most effusive praise from those who had worked with it.

It seems to have hit a sweet spot for SME/SMBs, combining an exceptional power-to-price ratio with an intuitive interface and highly regarded customer support.

We are platform agnostic, so if our client already has a tool in use, then we try to use that.  But in cases where the client has never done any testing before, we typically look first to use Convert (convert.com).  I like Convert for a number of reasons.  From the very beginning, it has been one of the easiest tools to integrate with Google Analytics.  Also, for tricky variations, I’ve had better luck with Convert than others (Optimizely) at getting the variation to display just the way we want.  And the support at Convert has always been excellent—again, better than most of their competitors.

We focus on small to medium size clients, and Convert is excellent for that segment with flexible pricing.  It’s a great solution for small businesses doing in-house conversion optimization, but it can also work very well for agencies.

– Tom Bowen, Web Site Optimizers

Convert Experiences also stood out as the type of tool that catches new fans wherever it’s discovered, leading me to believe that it will continue to grow and pick up market share.

We have come across convert.com more and more in recent months working on client campaigns.  If you are a true marketer and want actionable data then they are a good choice.  The user interface is actually pretty good and you can actually understand the data they give you on experiments.  They run on the typical drag and drop style experiment setup engine that most others do and can be manipulated even if you aren’t a technical wizard.

The price isn’t too bad either as they fall somewhere in the middle of Optimizely and VWO.  I would recommend them to someone who has a bit of budget constraints but wants a bit more testing power.  We have used them on multi million dollar per month campaigns with much success.

– Justin Christianson, Conversion Fanatics

Convert Experiences is known for having some of the most robust multivariate testing options in its class. At the same time, it is also one of the few tools in its class not to offer any sort of email split testing capabilities.

Overall, it’s a highly recommended AB testing tool that is worth trying out.

Convert has great customer support (via live chat) and is easy to use. We’d recommend it to the same people who are considering using Optimizely and VWO.

– Karl Blanks, Conversion Rate Experts

SiteSpect

AB Testing software SiteSpect Example Report

SiteSpect initially distinguished itself as one of the first server-side testing solutions on the market, and it has remained a top choice for more technically sophisticated users and security-conscious clients.

For a long period, SiteSpect was one of the few platforms offering a server-side solution. This has given them a huge advantage by allowing more complex testing, by adapting to newer JavaScript technologies, and by accommodating security-conscious clients.

– Stephen Pavlovich, Conversion

SiteSpect has the advantage that it works in a different way. It’s tag-free. SiteSpect edits the HTML before it even leaves the server, rather than after it has hit the user’s browser. It tends to be popular with companies that want to self-host and are technically sophisticated.

– Karl Blanks, Conversion Rate Experts

As a server-side testing solution, SiteSpect avoids many of the issues that can arise with the more typical browser-based testing platforms that utilize JavaScript tags.

• Tag-based solutions typically charge by the number of tag calls you make, even if those tags don’t end up being used.
• Tag-based solutions often require third-party cookies, which certain browsers or browser settings might not support, causing you to lose the ability to test a large percentage of traffic.
• Tag-based solutions can have imprecise reporting because the JavaScript doesn’t always fire.

While this value proposition won’t be the deciding factor for many businesses, for those requiring a server-side solution, SiteSpect is one of the best options on the market.

AB Tasty

AB Testing software ABTasty Example

AB Tasty is a solution for testing, re-engagement of users, and content personalisation, designed for marketing teams. Paul Rouke had a good bit to say here, so I’m going to let him take it away.

The tools that in my experience deliver the most long-term value are those which are reasonably priced, allowing companies to spend more of their budget on making sure they are testing intelligently and developing an effective testing process. I talk about this in-depth in my article The Great Divide Between BS and Intelligent Optimization.

On this note, my favorite tool would be something like AB Tasty, which is priced sensibly, yet has a powerful platform that facilitates a wide range of testing, from simple iterative tests through to innovative tests, along with strategic tests which can help evolve a business proposition and market positioning.

I would recommend AB Tasty (and similarly Convert.com) to the following types of companies:

(1) Companies just starting to invest in conversion optimisation – they won’t break the bank, they won’t overwhelm you with add-ons you will never use as you’re starting out, but they have the capability to match your progress as you scale up your testing output

(2) Companies who have been investing in conversion optimisation but who want to start using a higher portion of their budget (75% or more) on people, skills, process and methodology in order to deliver a greater impact and ROI

(3) Companies frustrated at investing significant amounts of money in enterprise testing platforms, which aren’t being used anywhere near their potential and are taking away from the budget for investing in people, skills and developing an intelligent process for strategic optimisation

Sentient Ascend

AB Testing Software Ascend Uses Machine Learning

Sentient Ascend (formerly Digital Certainty) is a new player bringing advanced machine learning algorithms to the CRO space. Conversion Science’s own Brian Massey explains why this is a big deal:

Sentient Ascend is one of the new generation testing tools that utilize machine learning algorithms to speed multivariate testing. Evolutionary, or genetic algorithms do a better job of finding optimum combinations, isolating the richest local maximum for a solution set.

We love being able to assemble our highest rated hypotheses and throw them in the mix to have the machine sort them for us.

In the future, tools like this will let us optimize for multiple segments simultaneously. We believe this is the final step toward full-time personalization solutions.

Google Optimize

AB Testing software Google Optimize Example

Google Optimize is a split testing function of Google Analytics. If you are looking for a reasonably powerful AB testing solution with no monetary cost, look no further.

Although I am not normally a big fan of Google for the sake of split testing, Google Experiments has its advantages.  The first one is that you can’t beat the price.  The second is you can have all your data in one place under your Google analytics account. There are some downfalls in that you are going to have to leverage a bit more technical gumption in setting up your experiments and the overall process might take a little longer, but if you are looking for a low barrier to entry in your testing then this is a good place to start.

– Justin Christianson, Conversion Fanatics

As Justin alluded to, Google Experiments is going to be particularly useful to avid Google Analytics users who have the ability to utilize its more complicated features.

Qubit

Testing Platform Qubit Example Screen Capture

Qubit is a testing platform focused primarily on personalization. Accordingly, it has some of the strongest segmentation capabilities of any tool on this list.

Qubit has a strong focus on granular segmentation – and the suite covering analytics through to testing gives it an advantage. They’ve now broken out of their traditional retail focus to become a strong personalisation platform across sectors.

– Stephen Pavlovich, Conversion

If advanced segmentation or personalization are a priority for your business or clients, Qubit is a tool worth checking out.

Adobe Target

Adobe Target has long been known as the most expensive AB testing tool on the market, but the benefits of using it can be summed up in this one sentence from Alex Harris:

Just for good measure, here’s Stephen Pavlovich to reiterate the point:

I like Adobe Target. The integration of Adobe Analytics and Target is strong – especially being able to push data two-ways. And the fact that Target is normally an inexpensive upsell for Analytics customers is a bonus.

Marketing Tools With Built-In Testing

In addition to dedicated AB testing tools, there are some great marketing tools out there that include built-in split testing capabilities. This is fairly common with tools like landing page builders, email service providers, or lead capture solutions.

As Justin Christianson explains, there are some positives and negatives to relying on these built-in tools:

Most page builders out there such as LeadPages and Instapage have split testing capabilities built into their platforms.  The problem is you don’t have much control over the goals measured and the adaptability to test more complex elements.  The good thing is they are extremely easy to setup and use for those quick and dirty type tests.  I recommend the use of this to just get some tests up and running, as constantly testing is extremely important.  If you are currently using a platform with these native testing capabilities then this is a good place to start.

One particular tool that was highlighted by several of our experts was Unbounce, one of the web’s more popular landing page builders.

I also like Unbounce, and not just because I like Oli Gardner. It seems most everyone there lives and breathes landing pages, so the expertise that comes with the tool is virtually unmatched.  Their support is also excellent.  Unbounce works really well when we’re creating a new landing page from scratch and want to try different variations, since it’s so easy to create brand new pages using the tool.

– Tom Bowen, Web Site Optimizers

Unbounce is an excellent tool for A/B testing your landing pages. While many landing page tools also offer A/B testing, I think Unbounce has the best and most flexible page editor when creating variations of your pages to be tested, and their landing page templates have the most CRO best practices included already.

Unbounce is outstanding for online marketing teams that want the most flexibility when creating and A/B testing their landing pages – many other landing page tools are limited to a fixed grid system which makes it much harder to make changes.

Rich Page

Another popular tool was OptinMonster, which began as a popular popup tool and has evolved into a more fully featured lead generation software.

Optin Monster is an outstanding tool that lets you easily A/B test visitor opt-in incentives to see which converts best – not only headlines, images and CTAs, but also which types perform best (like a discount versus a free guide). In particular it offers great customization options and many popup styles, and exit intent popups.

Optin Monster is particularly useful for the many website marketers who don’t have enough traffic to do formal A/B testing (using tools like Optimizely or VWO) but still want to get a better idea of their best performing content variations. It has great pricing options suitable for online businesses on a low budget.

– Rich Page

Tools For Gathering Data

As every good split tester knows, your AB tests are only as good as the hypotheses you are testing. The following tools represent our experts’ favorite choices for collecting data to fuel effective AB tests.

1. UsabilityHub
2. Google Analytics
3. Crazy Egg
4. UserTesting.com
5. Lucky Orange
6. ClickTale
7. HotJar
8. Inspectlet
9. SessionCam

UsabilityHub

User testing platform UsabilityHub

UsabilityHub was by far the analytics tool most frequently mentioned by our group of CRO experts. It is a collection of 5 usability tests that can be administered to visitors in order to collect key insights.

UsabilityHub is great for clarity testing and getting quick indications of potential improvements. It is also great for uncovering personal biases in the creation of page variations. I would recommend it to anyone doing conversion optimization or even basic usability testing.

– Craig Andrews, allies4me

While many of the tools on this list deal primarily with quantitative data, UsabilityHub offers uniquely efficient ways to collect valuable qualitative data.

Once I’ve identified underperforming pages, the next step is to figure out what’s wrong with those pages by gathering qualitative data. For top landing pages, including the homepage, I like to run one of UsabilityHub’s “5 Second Tests” to gauge whether people understand the product or service offered. The first question I always ask is “what do you think this company sells?”. I’ve gotten some surprisingly bad results, where large numbers of respondents gave the wrong answer. In these cases, running a simple A/B test on a headline and/or hero shot to clarify what the company does is an easy win.

– Theresa Baiocco, Conversion Max

It also can be a cost-effective alternative if your website doesn’t get enough traffic to facilitate use of an actual split testing tool.

UsabilityHub is essential if you want to do A/B testing but your website doesn’t have enough traffic to do so. Instead it enables you to show your proposed page improvements to testers (including your visitors) to get their quick feedback, particularly using the highly useful ‘Question Test’ and ‘Preference Test’ features.

UsabilityHub can be particularly useful for the many website marketers who don’t have enough traffic to do formal A/B testing (using tools like Optimizely or VWO) but still want to get a better idea of their best performing content variations.

– Rich Page

Google Analytics

Analytics platform Google Analytics Screen Capture

To the surprise of exactly no one, Google Analytics was high up on the list of recommended analytics tools. Yet despite its popularity, very few marketers or business owners are using this free tool to its full potential.

Theresa Baiocco makes the following recommendations for getting started:

There’s so much data in Google Analytics that it’s easy to suffer from paralysis by analysis. It helps to have a few reports you use regularly and know what you’re looking for before jumping in. The obvious reports for finding the most problematic pages in your funnel are the funnel visualization and goal flow reports. But I also like to look at top landing pages, and using the “comparison” view, I see which of them have higher bounce rates than average for the site. Those 3 reports together are a good starting point for identifying which pages to work on first.

When it comes to applying Google Analytics to your AB testing efforts, John Ekman of Conversionista offers some advice:

Most of the AB testing tools provide an easy integration with Google Analytics. Do not miss this opportunity in your AB testing setup!

When you integrate your testing tool with GA it means that you will be able to break down your test results and look at A vs. B in all dimensions available in GA. You will be able to see behavior segmented by device, returning vs new visitors, geography etc.

For example: if you are using Enhanced Ecommerce setup for GA you will be able to compare your E-commerce funnel for the A version vs. the B version. Maybe the A version gets more add to carts, but then that effect withers off and the result in the checkout is the same?!

Example of Google Analytics ecommerce report for AB test variation

Word of warning: as soon as you start segmenting your data, you might lose statistical significance in the underlying segments. Even if your AB test results are statistically significant on the overall level, that does not mean that the deviations you see in smaller segments of your test data are significant. The smaller the data sample size, the harder it is to reach significance. What you think is a strong signal may be just data noise.
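John’s warning can be made concrete with a quick sketch. Using the same kind of two-proportion z-test an AB testing tool runs under the hood, the invented numbers below show a ~10% relative lift reaching significance on the full sample but falling well short of it in a segment a tenth the size:

```python
# Sketch of why segment-level results lose significance: the same
# relative lift, backed by fewer visitors, produces a much weaker
# signal. All counts below are invented for illustration.
from math import sqrt, erf

def confidence(conv_a, n_a, conv_b, n_b):
    """One-sided confidence that variation B beats A (two-proportion z-test)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return 0.5 * (1 + erf((p_b - p_a) / se / sqrt(2)))

# Overall test: 20,000 visitors per arm, ~10% relative lift
overall = confidence(1000, 20_000, 1100, 20_000)
# Mobile-only segment: same ~10% lift, but only 2,000 visitors per arm
segment = confidence(100, 2_000, 110, 2_000)
print(f"overall: {overall:.1%}, mobile segment: {segment:.1%}")
```

Same lift, a tenth of the traffic: the overall result clears a 95% confidence bar while the segment does not, which is exactly why apparent “wins” inside small segments deserve skepticism.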

For those interested in tapping into the full potential of Google Analytics, here’s everything you’ll ever need.

Crazy Egg

User intelligence tool Crazy Egg Confetti report

Crazy Egg is one of the more popular heatmap and click-tracking tools online, thanks to an attractive interface, an affordable price point, and a deceptively powerful feature set.

Crazy Egg is a highly recommended budget tool by Brian Massey and the Conversion Sciences team, who had the following to say:

Crazy Egg offers tools to help you visually identify the most popular areas of your page, help you see which parts of your pages are working and which ones are not, and give you greater insight as to what your users are doing on your pages via both mobile and desktop sites.

UserTesting.com

User testing platform UserTesting.com

UserTesting.com is a unique service that provides videos of real users in your target market experiencing your site and talking through what they’re thinking.

This service is recommended by Craig Andrews, who had the following to say:

UserTesting.com is great for hypothesis generation and uncovering personal biases. It is an absolutely fantastic tool for persuading clients on the reality and importance of certain site issues, and I would recommend it to anyone doing conversion optimization or even basic usability testing.

Lucky Orange

Lucky Orange is kind of like Crazy Egg with a bit of UserTesting.com, a bit of The Godfather, and a bit of a hundred other things. It’s a surprisingly diverse package of conversion features that makes you start to believe its claim to be “the original all-in-one conversion optimization suite”, despite the incredibly low price point.

Despite the hundred new tools that have popped up since Lucky Orange hit the market, Theresa Baiocco still swears by the original:

No testing program is complete without analyzing how users behave on the site. Optimizers all have their favorite tools for gathering this data, and while the newest and hottest kid on the block is Hotjar, I still like using my old go-to: Lucky Orange. Starting at just $10/month, Lucky Orange gives you visitor recordings, conversion funnel reports, form analytics, polls, chat, and heat maps of clicks, scroll depth, and mouse movements – all in one place.

ClickTale

Heatmapping and session recording tool ClickTale dashboard

Clicktale is a cloud-based analytic system that allows you to visualize your customer’s experience on your website from their perspective. It’s an enterprise-level tool that combines session recording with click and scroll tracking, and while it comes with an enterprise price tag, it’s made some significant quality strides over the last few years.

As Dieter Davis summarized recently for UX Magazine:

There has been a huge improvement in Clicktale over the past three years, in tracking, reporting and accuracy. If you want “any old session recording JS”, boxed-product application out there, there are a variety of options. If you want accurate rendering that is linked to your existing analytics and a company that will help you tune as your own website evolves, then Clicktale is a good choice. It’s the one I’ve chosen as I wouldn’t want to risk the privacy of my customers or risk degrading the performance of my website. Clicktale also gives me a representative sample that is accurate by resolution and responsive design.

Hotjar

Hotjar offers heatmap reports, session recordings, polls, surveys and more

HotJar is the latest SaaS success story to blaze its way across the web. It’s a jack of all trades type tool: an all-in-one tool that does heatmaps, scroll tracking, recordings, funnel tracking, form analysis, feedback polls, surveys, and more.

And from what a few of our conversion experts have seen so far, it does all of those things about as well as you would expect from a jack of all trades.

On the plus side, Hotjar has prioritized creating an exceptional user experience, so if you are a solo blogger wanting a feature-rich, easy-to-use toolkit in one place with a reasonable price tag, Hotjar might be the perfect choice for you.

Stephen Esketzis had the following to say about his experience with the tool:

So overall, HotJar really is a great tool with a lot of value to offer any online business (or website in general at that). There’s not many businesses that work online I wouldn’t recommend this tool to.

With a no-brainer price point (and even a free plan) it’s pretty hard to go wrong.

Inspectlet

Session recording software Inspectlet

Inspectlet is primarily a session recording tool with additional heatmaps as well. Here’s what Anders Toxboe had to say about it in a recent review:

Inspectlet is simple to use. It gets out of the way in order to let the user do what he or she needs. The simple funnel analysis and filtering options are a breeze to use and covered my basic needs. Inspectlet does what it does well with a few minor glitches. It doesn’t have the newer features that have started appearing lately such as watching live recordings, live chatting, surveys, and polls.

In other words, Inspectlet is an easy-to-use, budget-friendly session recording tool that might be right for you depending on your needs.

SessionCam

Session recording software SessionCam offers a Suffer Score

SessionCam is a session recording tool that has also added heatmaps and form analytics to its offering. It’s a classic example of a tool that combines better-than-average functionality with a more-difficult-than-average user interface.

Peter Hornsby had the following to say in his review for UXmatters:

SessionCam provides a lot of useful functionality, but its user interface isn’t the easiest to learn or use. Getting the most out of it requires a nontrivial investment of time.

And later:

UX designers have long known that, where there is internal resistance to change, showing stakeholders clear evidence of users experiencing problems can be a powerful tool in persuading them to recognize and address issues. SessionCam meets the need for a tool that provides this data in a much more dynamic, cost-effective way than using traditional observation techniques.

SessionCam [also] manages [to protect user data] effectively by masking the data that users enter into form fields, so you can put their concerns to rest.

If you are looking for a more robust session recording and form analytic tool that keeps user data safe, SessionCam is worth checking out.

Adobe Analytics

Analytics platform Adobe Analytics Site Overview

Adobe Analytics is a big data analysis tool that helps CMOs understand the performance of their businesses across all digital channels. It enables real time web, mobile and social analytics across online channels, and data integration with offline and third-party sources.

In other words, Adobe Analytics is a $100k+ per year, enterprise-level analytics tool that has some serious firepower. Here’s what David Williams of ASOS.com had to say about it:

After a thorough review of the market, we chose Adobe Analytics to satisfy our current and future analytics and optimization needs. We needed a solution that could scale globally with our business, improve productivity, and offer out-of-the box integration with our key partners to deliver more value from our existing investments. Adobe’s constant pace of innovation continues to deliver value for our business, and live stream (the event firehose) is the latest capability that opens up exciting opportunities for how we engage with customers.

Conclusion

Well, that’s that: 20 of the most recommended AB testing tools from a diverse collection of the web’s leading CRO experts.

Have you used any of these tools before? Do you have a favorite that wasn’t included? We’d love to hear your thoughts in the comments.

And if you are looking for a quick way to calculate how a conversion lift could increase your bottom line, be sure to check out our Optimization Calculator.

A/B Testing Statistics: An Intuitive Guide For Non-Mathematicians

A/B testing statistics made simple. A guide that will clear up some of the more confusing concepts while providing you with a solid framework to AB test effectively.

Here’s the deal. You simply cannot A/B test effectively without a sound understanding of A/B testing statistics.

And while there has been a lot of exceptional content written on AB testing statistics, I’ve found that most of these articles are either overly simplistic or they get very complex without anchoring each concept to a bigger picture.

Today, I’m going to explain the statistics of AB testing within a linear, easy-to-follow narrative. It will cover everything you need to use AB testing software effectively.

You might have been told that plugging a few numbers into a statistical significance calculator is enough to validate a test. Or perhaps you see the green “test is significant” checkmark popup on your testing dashboard and immediately begin preparing the success reports for your boss.

In other words, you might know just enough about split testing statistics to dupe yourself into making major errors, and that’s exactly what I’m hoping to save you from today.

Here’s my best attempt at making statistics intuitive.

Why Statistics Are So Important To A/B Testing

The first question that has to be asked is “Why are statistics important to AB testing?”

The answer to that question is that AB testing is inherently a statistics-based process. The two are inseparable from each other.

An AB test is an example of statistical hypothesis testing, a process whereby a hypothesis is made about the relationship between two data sets and those data sets are then compared against each other to determine if there is a statistically significant relationship or not.

To put this in more practical terms, a prediction is made that Page Variation #B will perform better than Page Variation #A, and then data sets from both pages are observed and compared to determine if Page Variation #B is a statistically significant improvement over Page Variation #A.

This process is an example of statistical hypothesis testing.

But that’s not the whole story. The point of AB testing has absolutely nothing to do with how variations #A or #B perform. We don’t care about that.

What we care about is how our page will ultimately perform with our entire audience.

And from this birdseye view, the answer to our original question is that statistical analysis is our best tool for predicting outcomes we don’t know using information we do know.

For example, we have no way of knowing with 100% accuracy how the next 100,000 people who visit our website will behave. That is information we cannot know today, and if we were to wait until those 100,000 people had visited our site, it would be too late to optimize their experience.

What we can do is observe the next 1,000 people who visit our site and then use statistical analysis to predict how the following 99,000 will behave.

If we set things up properly, we can make that prediction with incredible accuracy, which allows us to optimize how we interact with those 99,000 visitors. This is why AB testing can be so valuable to businesses.

In short, statistical analysis allows us to use information we know to predict outcomes we don’t know with a reasonable level of accuracy.

The Complexities Of Sampling, Simplified

That seems fairly straightforward, so where does it get complicated?

The complexities arise from all the ways a given “sample” can inaccurately represent the overall “population”, and from all the things we have to do to ensure that our sample accurately represents the population.

Let’s define some terminology real quick.

The “population” is the group we want information about. It’s the next 100,000 visitors in my previous example. When we’re testing a webpage, the true population is every future individual who will visit that page.

The “sample” is a small portion of the larger population. It’s the first 1,000 visitors we observe in my previous example.

In a perfect world, the sample would be 100% representative of the overall population.

For example:

Let’s say 10,000 out of those 100,000 visitors are going to ultimately convert into sales. Our true conversion rate would then be 10%.

In a tester’s perfect world, the mean (average) conversion rate of any sample we select from the population would always be identical to the population’s true conversion rate. In other words, if you selected a sample of 10 visitors, 1 of them (10%) would buy, and if you selected a sample of 100 visitors, then 10 would buy.

But that’s not how things work in real life.

In real life, you might have only 2 out of the first 100 buy or you might have 20… or even zero. You could have a single purchase from Monday through Friday and then 30 on Saturday.
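This sampling variability is easy to see in a quick simulation. The 10% rate, the sample size, and the seed below are arbitrary choices for illustration:

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

TRUE_RATE = 0.10  # the population's "true" conversion rate

def sample_conversions(n_visitors, rate):
    """Count conversions among n_visitors, each converting with probability rate."""
    return sum(random.random() < rate for _ in range(n_visitors))

# Five samples of 100 visitors each: each "should" contain 10 buyers,
# but the observed counts scatter around that expectation.
counts = [sample_conversions(100, TRUE_RATE) for _ in range(5)]
print(counts)
```

Run it a few times with different seeds and you will see samples well above and well below the true 10% rate, which is exactly the problem statistics has to solve for us.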

This variability across samples is expressed as a unit called the “variance”, which measures how far a random sample can differ from the true mean (average).

The Freakonomics podcast makes an excellent point about what “random” really is. If you have one person flip a coin 100 times, you would have a random list of heads or tails with a high variance.

If we write these results down, we would expect to see several examples of long streaks, five or seven or even ten heads in a row. When we think of randomness, we imagine that these streaks would be rare. Statistically, they are quite possible in such a dataset with high variance.

The higher the variance, the more variable the mean will be across samples. Variance is, in some ways, the reason statistical analysis isn’t a simple process. It’s the reason I need to write an article like this in the first place.

So it would not be impossible to take a sample of ten results that contain one of these streaks. This would certainly not be representative of the entire 100 flips of the coin, however.

Fortunately, we have a phenomenon that helps us account for variance called “regression toward the mean”.

Regression toward the mean is “the phenomenon that if a variable is extreme on its first measurement, it will tend to be closer to the average on its second measurement.”

Ultimately, this ensures that as we continue increasing the sample size and the length of observation, the mean of our observations will get closer and closer to the true mean of the population.

In other words, if we test a big enough sample for a sufficient length of time, we will get accurate “enough” results.
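The same simulation idea shows this convergence at work: as the sample grows, the observed rates cluster more and more tightly around the true 10%. (The seed and sample sizes below are arbitrary illustration choices.)

```python
import random

random.seed(0)  # fixed seed for reproducibility

TRUE_RATE = 0.10

def observed_rate(n_visitors):
    """Observed conversion rate for one simulated sample."""
    hits = sum(random.random() < TRUE_RATE for _ in range(n_visitors))
    return hits / n_visitors

# For each sample size, draw 20 samples and report how far apart the
# highest and lowest observed rates are. The spread shrinks as n grows.
for n in (100, 1_000, 10_000):
    rates = [observed_rate(n) for _ in range(20)]
    print(n, round(max(rates) - min(rates), 4))
```

The shrinking spread is the practical meaning of "accurate enough": bigger samples pin the observed mean closer to the true mean.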

So what do I mean by accurate “enough”?

Understanding Confidence Intervals & Margin of Error

In order to compare two pages against each other in an A/B test, we have to first collect data on each page individually.

Typically, whatever AB testing tool you are using will automatically handle this for you, but there are some important details that can affect how you interpret results, and this is the foundation of statistical hypothesis testing, so I want to go ahead and cover this part of the process.

Let’s say you test your original page with 3,662 visitors and get 378 conversions. What is the conversion rate?

You are probably tempted to say 10.3%, but that’s inaccurate. 10.3% is simply the mean of our sample. There’s a lot more to the story.

To understand the full story, we need to understand two key terms:

1. Confidence Interval
2. Margin of Error

You may have seen something like this before in your split testing dashboard.

The original page above has a conversion rate of 10.3% plus or minus 1.0%. The 10.3% conversion rate value is the mean. The ±1.0% is the margin of error, and this gives us a confidence interval spanning from 9.3% to 11.3%.

10.3% ± 1.0% at 95% confidence is our actual conversion rate for this page.

What we are saying here is that we are 95% confident that the true mean of this page is between 9.3% and 11.3%. From another angle, we are saying that if we were to take 20 samples, we would expect the sample conversion rate to fall between 9.3% and 11.3% in roughly 19 of them.

The confidence interval is an observed range in which a given percentage of test outcomes fall. We manually select our desired confidence level at the beginning of our test, and the size of the sample we need is based on our desired confidence level.

The range of our confidence level is then calculated using the mean and the margin of error.

The easiest way to demonstrate this is with a visual.

The confidence level is decided upon ahead of time; the interval itself is then calculated from direct observation. In the above example, we are saying that we expect roughly 19 out of every 20 samples tested to have an observed mean between 9.3% and 11.3%.

The upper bound of the confidence interval is found by adding the margin of error to the mean. The lower bound is found by subtracting the margin of error from the mean.

The margin of error is a function of the standard deviation, which is a function of the variance. Really, all you need to know is that all of these terms are measures of variability across samples.
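Under the usual normal approximation, the mean and margin of error can be computed directly from the sample counts. This short sketch reproduces the 10.3% ± 1.0% figure from the example above:

```python
import math

def conversion_ci(conversions, visitors, z=1.96):
    """Mean and margin of error for a conversion rate.

    Uses the normal approximation; z = 1.96 corresponds to 95% confidence.
    """
    p = conversions / visitors
    margin = z * math.sqrt(p * (1 - p) / visitors)
    return p, margin

# 378 conversions from 3,662 visitors, as in the example above.
p, margin = conversion_ci(378, 3662)
print(f"{p:.1%} ± {margin:.1%}")  # 10.3% ± 1.0%
```

The upper and lower bounds of the interval are then just `p + margin` and `p - margin`.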

Confidence levels are often confused with significance levels (which we’ll discuss in the next section) due to the fact that the significance level is set based on the confidence level, usually at 95%.

You can set the confidence level to be whatever you like. If you want 99% certainty, you can achieve it, BUT it will require a significantly larger sample size. As the chart below demonstrates, diminishing returns make 99% impractical for most marketers, and 95% or even 90% is often used instead for a cost-efficient level of accuracy.

In high-stakes scenarios (life-saving medicine, for example), testers will often use 99% confidence intervals, but for the purposes of the typical CRO specialist, 95% is almost always sufficient.

Advanced testing tools will use this process to measure the sample conversion rate for both the original page AND Variation B, so it’s not something you will ever have to calculate on your own. But this is how our process starts, and as we’ll see in a bit, it can affect how we compare the performance of our pages.

Once we have our conversion rates for both the pages we are testing against each other, we use statistical hypothesis testing to compare these pages and determine whether the difference is statistically significant.

It’s important to understand the confidence levels your AB testing tools are using and to keep an eye on the confidence intervals of your pages’ conversion rates.

If the confidence intervals of your original page and Variation B overlap, you need to keep testing even if your testing tool is saying that one is a statistically significant winner.
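Checking for overlap is a one-liner once you have each page's mean and margin of error. The numbers below are purely illustrative:

```python
def intervals_overlap(mean_a, moe_a, mean_b, moe_b):
    """True if the two confidence intervals share any values."""
    return mean_a - moe_a <= mean_b + moe_b and mean_b - moe_b <= mean_a + moe_a

# Original: 10.3% ± 1.0% (9.3%–11.3%); Variation B: 11.5% ± 1.1% (10.4%–12.6%).
# B looks higher, but the intervals overlap, so keep testing.
print(intervals_overlap(0.103, 0.010, 0.115, 0.011))  # True
```

Only when the two intervals separate cleanly can you be reasonably sure the observed difference is not sampling noise.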

Significance, Errors, & How To Achieve The Former While Avoiding The Latter

Remember, our goal here isn’t to identify the true conversion rate of our population. That’s impossible.

When running an AB test, we are making a hypothesis that Variation B will convert at a higher rate for our overall population than Variation A will. Instead of displaying both pages to all 100,000 visitors, we display them to a sample instead and observe what happens.

• If Variation A (the original) had a better conversion rate with our sample of visitors, then no further actions need to be taken as Variation A is already our permanent page.
• If Variation B had a better conversion rate, then we need to determine whether the improvement was statistically large “enough” for us to conclude that the change would be reflected in the larger population and thus warrant us changing our page to Variation B.

So why can’t we take the results at face value?

The answer is variability across samples. Thanks to the variance, there are a number of things that can happen when we run our AB test.

1. Test says Variation B is better & Variation B is actually better
2. Test says Variation B is better & Variation B is not actually better (type I error)
3. Test says Variation B is not better & Variation B is actually better (type II error)
4. Test says Variation B is not better & Variation B is not actually better

As you can see, there are two different types of errors that can occur. In examining how we avoid these errors, we will simultaneously be examining how we run a successful AB test.

Before we continue, I need to quickly explain a concept called the null hypothesis.

The null hypothesis is a baseline assumption that there is no relationship between two data sets. When a statistical hypothesis test is run, the results either disprove the null hypothesis or they fail to disprove the null hypothesis.

This concept is similar to “innocent until proven guilty”: A defendant’s innocence is legally supposed to be the underlying assumption unless proven otherwise.

For the purposes of our AB test, it means that we automatically assume Variation B is NOT a meaningful improvement over Variation A. That is our null hypothesis. Either we disprove it by showing that Variation B’s conversion rate is a statistically significant improvement over Variation A, or we fail to disprove it.

And speaking of statistical significance…

Type I Errors & Statistical Significance

A type I error occurs when we incorrectly reject the null hypothesis.

To put this in AB testing terms, a type I error would occur if we concluded that Variation B was “better” than Variation A when it actually was not.

Remember that by “better”, we aren’t talking about the sample. The point of testing our samples is to predict how a new page variation will perform with the overall population. Variation B may have a higher conversion rate than Variation A within our sample, but we don’t truly care about the sample results. We care about whether or not those results allow us to predict overall population behavior with a reasonable level of accuracy.

So let’s say that Variation B performs better in our sample. How do we know whether or not that improvement will translate to the overall population? How do we avoid making a type I error?

Statistical significance.

Statistical significance is attained when the p-value is less than the significance level. And that is way too many new words in one sentence, so let’s break down these terms real quick and then we’ll summarize the entire concept in plain English.

The p-value is the probability of obtaining results at least as extreme as those observed, given that the null hypothesis is true.

In other words, the p-value tells us how likely it is that ordinary sample fluctuation alone could produce a result like ours. Imagine running an A/A test, where you displayed your page to 1,000 people and then displayed the exact same page to another 1,000 people.

You wouldn’t expect the sample conversion rates to be identical. We know there will be variability across samples. But you also wouldn’t expect it to be drastically higher or lower. There is a range of variability that you would expect to see across samples, and the p-value, in essence, tells us how far outside that range our observed result falls.

The significance level is the probability of rejecting the null hypothesis given that it is true.

Essentially, the significance level is a value we set based on the level of accuracy we deem acceptable. The industry standard significance level is 5%, which means we are willing to accept a 5% chance of a false positive, i.e., we require 95% confidence in our result.

So, to answer our original question:

We achieve statistical significance in our test when we can say with 95% certainty that the increase in Variation B’s conversion rate falls outside the expected range of sample variability.

Or from another way of looking at it, we are using statistical inference to determine that if we were to display Variation A to 20 different samples, at least 19 of them would convert at lower rates than Variation B.
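The comparison itself is usually a two-proportion z-test. Here is a minimal sketch using the original page's counts from earlier plus a hypothetical, made-up result for Variation B; it needs only the standard library:

```python
import math

def ab_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates.

    Pooled two-proportion z-test under the normal approximation.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = abs(p_b - p_a) / se
    # Normal tail probability via the error function (no SciPy needed).
    return 1 - math.erf(z / math.sqrt(2))

# Original: 378/3662 (10.3%); Variation B (hypothetical): 452/3671 (12.3%).
p = ab_test_p_value(378, 3662, 452, 3671)
print(round(p, 4))  # ≈ 0.007, below the 0.05 significance level
```

Because the p-value (about 0.007) is below the 0.05 significance level, this hypothetical result would count as statistically significant.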

Type II Errors & Statistical Power

A type II error occurs when the null hypothesis is false, but we incorrectly fail to reject it.

To put this in AB testing terms, a type II error would occur if we concluded that Variation B was not “better” than Variation A when it actually was better.

Just as type I errors are related to statistical significance, type II errors are related to statistical power, which is the probability that a test correctly rejects the null hypothesis.

For our purposes as split testers, the main takeaway is that larger sample sizes over longer testing periods equal more accurate tests. Or as Ton Wesseling of Testing.Agency says here:

You want to test as long as possible – at least 1 purchase cycle – the more data, the higher the Statistical Power of your test! More traffic means you have a higher chance of recognizing your winner at the significance level you’re testing on!

Because…small changes can make a big impact, but big impacts don’t happen too often – most of the times, your variation is slightly better – so you need much data to be able to notice a significant winner.

Statistical significance is typically the primary concern for AB testers, but it’s important to understand that tests will oscillate between being significant and not significant over the course of a test. This is why it’s important to have a sufficiently large sample size and to test over a set time period that accounts for the full spectrum of population variability.
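The required sample size per variation can be estimated before the test starts, from the baseline rate, the smallest lift you care about, and your chosen significance and power. This is a rough sketch of the standard normal-approximation formula; the z constants correspond to α = 0.05 (two-sided) and 80% power:

```python
import math

def sample_size_per_variation(base_rate, relative_lift, z_alpha=1.96, z_beta=0.84):
    """Approximate visitors needed per variation to detect a relative lift.

    Normal-approximation formula for two proportions; z_alpha = 1.96 is
    a two-sided 5% significance level, z_beta = 0.84 is 80% power.
    """
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    n = ((z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
          + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
         / (p1 - p2) ** 2)
    return math.ceil(n)

# Detecting a 10% relative lift on a 10% base rate: roughly 14,700 per arm.
print(sample_size_per_variation(0.10, 0.10))
```

Notice how quickly the number grows as the expected lift shrinks, which is exactly why small improvements need large samples to detect.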

For example, if you are testing a business that has noticeable changes in visitor behavior on the 1st and 15th of the month, you need to run your test for at least a full calendar month. This is your best defense against one of the most common mistakes in AB testing… getting seduced by the novelty effect.

Peter Borden explains the novelty effect in this post:

Sometimes there’s a “novelty effect” at work. Any change you make to your website will cause your existing user base to pay more attention. Changing that big call-to-action button on your site from green to orange will make returning visitors more likely to see it, if only because they had tuned it out previously. Any change helps to disrupt the banner blindness they’ve developed and should move the needle, if only temporarily.

More likely is that your results were false positives in the first place. This usually happens because someone runs a one-tailed test that ends up being overpowered. The testing tool eventually flags the results as passing their minimum significance level. A big green button appears: “Ding ding! We have a winner!” And the marketer turns the test off, never realizing that the promised uplift was a mirage.

By testing a large sample size that runs long enough to account for time-based variability, you can avoid falling victim to the novelty effect.

It’s important to note that whether we are talking about the sample size or the length of time a test is run, the parameters for the test MUST be decided on in advance.

Statistical significance cannot be used as a stopping point or, as Evan Miller details, your results will be meaningless.

As Peter alludes to above, many AB testing tools will notify you when a test’s results become statistically significant. Ignore this. Your results will often oscillate between being statistically significant and not being statistically significant.

The only point at which you should evaluate significance is the endpoint that you predetermined for your test.
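You can see why peeking is dangerous with a small A/A simulation: both arms share the identical 10% conversion rate, yet checking significance repeatedly and stopping at the first "significant" reading inflates the false positive rate well above the nominal 5%. (All numbers below are arbitrary simulation settings.)

```python
import math
import random

random.seed(1)  # fixed seed for reproducibility

RATE = 0.10   # identical "true" rate for both arms (an A/A test)
N = 2_000     # visitors per arm
CHECKS = 40   # how often the impatient tester peeks

def significant(conv_a, conv_b, n):
    """Two-sided pooled z-test at alpha = 0.05 for equal samples of size n."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled in (0, 1):
        return False
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    return abs(conv_a - conv_b) / n / se > 1.96

def aa_test(peek):
    """Run one A/A test; return True if it is ever declared 'significant'."""
    conv_a = conv_b = 0
    tripped = False
    for i in range(1, N + 1):
        conv_a += random.random() < RATE
        conv_b += random.random() < RATE
        if peek and i % (N // CHECKS) == 0 and significant(conv_a, conv_b, i):
            tripped = True  # the peeker would have stopped the test here
    return tripped if peek else significant(conv_a, conv_b, N)

trials = 200
peeking = sum(aa_test(peek=True) for _ in range(trials)) / trials
fixed = sum(aa_test(peek=False) for _ in range(trials)) / trials
print(f"false positives while peeking: {peeking:.0%}, at a fixed endpoint: {fixed:.0%}")
```

Since there is no real difference between the arms, every "winner" here is a false positive, and the peeking strategy produces far more of them than evaluating once at the predetermined endpoint.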

Terminology Cheat Sheet

We’ve covered quite a bit today.

For those of you who have just been smiling and nodding whenever statistics are brought up, I hope this guide has cleared up some of the more confusing concepts while providing you with a solid framework from which to pursue deeper understanding.

If you’re anything like me, reading through it once won’t be enough, so I’ve gone ahead and put together a terminology cheat sheet that you can grab. It lists concise definitions for all the statistics terms and concepts we covered in this article.