Q 463. A/B Testing applies the concept of 2 sample hypothesis testing. What is A/B testing and how do companies deploy this to improve the customer experience. Provide some examples Note for website visitors - Two questions are asked every week on this platform. One on Tuesday and the other on Friday. All questions so far can be seen here - https://www.benchmarksixsigma.com/forum/lean-six-sigma-business-excellence-questions/ Please visit the forum home page at https://www.benchmarksixsigma.com/forum/ to respond to the latest question open till the next Tuesday/ Friday evening 5 PM as per Indian Standard Time. Questions launched on Tuesdays are open till Friday and questions launched on Friday are open till Tuesday. When you respond to this question, your answer will not be visible till it is reviewed. Only non-plagiarised (plagiarism below 5-10%) responses will be approved. If you have doubts about plagiarism, please check your answer with a plagiarism checker tool like https://smallseotools.com/plagiarism-checker/ before submitting. The best answer is always shown at the top among responses and the author finds honorable mention in our Business Excellence dictionary at https://www.benchmarksixsigma.com/forum/business-excellence-dictionary-glossary/ along with the related term

A/B Testing

April 15, 20224 yr

Q 463. A/B Testing applies the concept of 2 sample hypothesis testing. What is A/B testing and how do companies deploy this to improve the customer experience. Provide some examples

Note for website visitors - Two questions are asked every week on this platform. One on Tuesday and the other on Friday.

All questions so far can be seen here - https://www.benchmarksixsigma.com/forum/lean-six-sigma-business-excellence-questions/
Please visit the forum home page at https://www.benchmarksixsigma.com/forum/ to respond to the latest question open till the next Tuesday/ Friday evening 5 PM as per Indian Standard Time. Questions launched on Tuesdays are open till Friday and questions launched on Friday are open till Tuesday.
When you respond to this question, your answer will not be visible till it is reviewed. Only non-plagiarised (plagiarism below 5-10%) responses will be approved. If you have doubts about plagiarism, please check your answer with a plagiarism checker tool like https://smallseotools.com/plagiarism-checker/ before submitting.
The best answer is always shown at the top among responses and the author finds honorable mention in our Business Excellence dictionary at https://www.benchmarksixsigma.com/forum/business-excellence-dictionary-glossary/ along with the related term

April 16, 20224 yr

A/B testing refers to a method of comparing two versions of a marketing campaign (website/app) against each other to determine which one performs better.

For example..In an experiment, 2 or more variants of a website/app are shown to users at random, and statistical analysis is used to determine which variation performs better.

InA/B Testing, A refers to 'control' or the original variable whereas B refers to 'variation' or a new version of the original variable.

The idea behind A/B testing is that you show the existing version of the product to the control group and the variated version of the product to the experimental group. The difference in performance in experimental vs control group is tracked, to identify the effect of this new version of the product on the performance. The version that moves business metric in the positive direction is known as the 'winner.

This essentially leads to testing new product variants that will improve the performance of the existing product. It also helps in getting direct feedback from the actual users by presenting them the existing vs variated product options and in this way they can quickly test new ideas.

April 18, 20224 yr

A/B Testing (Split Testing, Bucket Testing):

A/B Testing lets marketers better understand which key formatting of a website or any piece of content makes the customer and clients more engaging. It simply compares two versions of webpage (sometimes more) to identify which variant appeals for more clicks.

Predominantly for Web pages, nevertheless also used for comparing Emails, Application Interface and Advertisements.

772746587_Compare2.jpg.8603ac64ec88a6f37ec4e630e472d16c.jpg

Testing process is simple, below are few of the milestones

Collecting data

Identifying Conversion Goals

Generating Hypothesis

Creating Variations

Running Experiment

Analyzing results

Example 1:

Example 2:

Statistical analysis is performed to identify which variation/version better performs and in-line with the conversion goals. Search analytics tools like Stats Engine, Bloomreach, SiteImprove & Semrush uses build-in advanced statistical models which can throw out results real time.

Is A/B Testing similar to that of Multivariate tests?

A/B compares 2 pages with entirely different headlines, Text and Images.

Multivariate test compares identical pages, however different fonts and sizes are compared.

Below are some of the essential considerations of A/B Testing Variables:

Layout

CTA’s (Calls-to-Action)

Content Offers

Color

Size

Email Subject Line

Headlines

Email Sender

Pricing Scheme

Copy length

Landing page

Tone

Images

Timing

Frequency

Video Vs Text Sales

Forms

Targeting and Personalization

Sales Copy

Data Visualization

Mostly used in the below industries:

Media

Travel

E-commerce

Banking

Fin-Tech

Technology

Benefits of A/B Testing:

Helps in conversion goals/sales

Helps in making data-driven decision

Improved user engagement

Reduces bounce rates

Ease of analysis

It would be wise to use both A/B Analysis and Multivariate analysis together. First A/B can be used to determine which layout and design converts well and then using multivariate to fine-tune formatting the page to attract widespread traffic.

1

April 18, 20224 yr

A/B testing is a direct comparison between two or more different versions of assets or random experimentation process of two or more versions of variable. For example, an airline company can market the same international airfare price through different forms of email and social media marketing post over the course of a week. They be able to use the data and analysis to determine which performed best and generated customer engagement, clicks, comments and actual ticket sales.
Continuous A/B testing additionally helps us to understand customer audience and what they like, respond to or are influenced by to improve your success and increase sales and revenue.
The main aim of A/B testing is to enhance a customer's experience to generate more revenue and profits. The better a customer understands a website and knows how to use it, the more motivated they are to make a purchase, sign up for services or convert into a repeat user. Applying A/B testing allows us to learn what consumers enjoys and uses on our websites/pages/applications.

Example of A/B testing in E-commerce sites:
E-commerce websites and mobile applications depend on on A/B testing to provide the best customer experience, as the smooth purchase funnel is critical to gaining orders and profit. Companies test out where and how to display shipping costs, promotions, sales, shopping cart and checkout page perceptibility and placement throughout. Each change is to encourage the consumer to buy or shop again.
If there is one item often in continuous A/B testing is the shopping cart feature, which almost every e-commerce site has and the shopping cart icon stays visible no matter which page you logged in. It serves as a reminder for products in your cart but also encourages you to:
•   Keep shopping
•   Discover daily deals
•   Check items on your wish list
•   Buying items in cart
•   Enable the one-click payment feature
The one-click ordering button is usual for several well-known online retailers and results from extensive A/B testing.

April 18, 20224 yr

Solution

A/B Testing

Let’s consider a test goal of designing web pages using marketing strategy, to get maximum online audience & placement for marketing ad creative. (i.e., groups of words used in text promotion.)

A/B testing or bucket testing or split-run testing is utilized to test “likeness performance of a text/ ad-message” by control group and test audience, for two finalized designs of promotional content; for the vector under consideration.

At ideation stage, copywriter/marketer may narrow down two final designs to be creatively scaped on landing page of website, email template, webpage advertisement or ad-word ad, pop-up message on end user screen. A set of reliable measurable data on Key Performance Indicators KPIs, “Return on Investment”, “Click Through Rate”, “Average Revenue Per Session”, “Revenue Per Visitor”, is put to scanner through “A/B testing”; to select better design for online promotion: as perceived by Control & Test audience.

ROI: Return-On-Investment, is defined as “ratio of Profit earned by investment divided by total cost of investment.”

CTR: Click-Through-Rate, is defined as ratio of “select number of users clicking on hyperlink against total number of users reaching webpage, email, ad-creative.”

ARPS: Average Revenue Per Session, is defined as, “average amount of earning per web session or earning registered in visits to the website”, and calculated by “dividing the total revenue of in web session by the total number of website visits.”

RPV: Revenue Per Visitor, is defined as, “the amount of money generated each time a customer visits a website”, and calculated by “dividing the total revenue by the total number of visitors to a website”.

The two variants— “Variant A” or “Variant B”, are put to test using one or more method-- “AB Lift Test”, “Z-Score test” & “statistical hypothesis testing” or "two-sample hypothesis testing". The scores are analysed to decide about effective outlook to be shared in production.

One may complete A/B testing by following these steps:

1. Think and decide on objective to be achieved from online promotion, including learning, training-needs to-be-met, & promotion to be shared.

2. Determine Key Performance Indicator of customer success to be utilized Return on Investment, Click Through Rate, Average Revenue Per Session, Revenue Per Visitor.

3. Decide on ad-timeslot and web-locations with highest traffic volume and that most-cost to improve optimization process. It is important to note that spots with lower traffic volume require longer campaign run periods to get standard outcomes.

4. Decide on elements of campaign design to be tested in A/B testing. Usually, a change in single element is recorded in A/B testing to get authentic results. Changes in banner- header or footer, different placement of message with varied selection of words- short precise or simple statements in user comprehendible language, may be put to test.

5. Monitor data generated in the test run and highlight trends resulting in better KPIs higher traffic, higher conversions, more revenues and better ROI%.

Precautions in A/B Testing

1. Sample Size: Sample size used in A/B testing should be sufficiently large, to generate authentic user opinions, on changes in content design; as means to effective communication. A sample size between 100 to 1000 user observation be considered a good fit. Further confidence level, statistical significance may be observed to interpret finding from statistical tests used in A/B testing.

2. Blocking: Blocking is act of accounting biases in sample group to maintain randomness in the sample. The first step is to form homogenous columns (basis gender, age, geographic location, occupation) in groups, followed by a randomized test within each group to check trends within the group. Blocking adjusts randomness biases originating from homogenous similar groups of respondents, in sample- control group (responding to campaign design without altered design variable) and test group (responding to campaign design with altered design variable), with fresh randomized testing. Blocking is effective for 2-3 variable and re-randomization is preferred in A/B Testing, when dependent covariate is higher in number in sample.

3. Adjust Test run duration with traffic volume, to generate authentic trends. A/B test run with lower traffic need to be continued for a longer duration of time to result in authentic customer preferences insights on campaign design.

4. Testing for multiple changes in campaign design, using A/B testing, may lead to spurious correlation in results.

Tests used in A/B Testing

A/B Lift: Lift in A/B Testing is defined as, “the percentage difference in conversion rate of control design and a successful test treatment.”

Application:

1. Incremental Lift in Revenue Per Session in A/B Testing: may be calculated by initially subtracting revenue per session of the control design from the test treatment campaign. The resultant figure may be divided by the revenue per session of test treatment and then multiplied by 100, to get percentage estimate of Incremental Lift in Revenue Per Session in A/B Testing.

2. Cost of Running the Test: Since, Control and Test group form basis to A/B testing, and are implemented on 50%-50% capacity: it can be inferred that control, did not generate revenue in the A/B testing. Cost of Running the test, is calculated by, “multiplying the revenue generated for control treatment by the incremental revenue per session (%).”

3. Identification of Multiplier for Test Distribution: A multiplier is required to be identified, to estimate of value required to get value of test distribution (Test/Control) equal to 100% traffic volume. Basis Test and Control group used in 50%-50% campaign runs during A/B Testing, multiplier of 2.0 is considered for evaluation purpose.

4. Estimate the Value gained if Test Campaign is rolled to a 100% traffic-users: The estimate of value gained on idealizing winning test campaign to 100% traffic user is calculated by, “multiplying cost of test campaign with value of Multiplier realised for Test Distribution in earlier step.”

5. Value Gained from Testing Period: To calculate the value gained by the winning test only during the testing period, “subtract the cost of running the test from the value gained if rolled out to 100% traffic volume.

6. Forecasting Incremental Revenue Gain: It can be safely concluded from above steps that treatment design wins and is rolled out at 100%. Further the winning treatment campaign would be earning an incremental revenue for a set period of time known as forecasted period for successful run decided using market condition, nature of product or service and campaign design. Forecasting Incremental Revenue Gain can be done by, “dividing the value of the selected winning test design at 100% traffic volume, by the number of days of test design run. Further, multiply the daily gain by the forecasted number of days.”

7. Overall Program Value: Overall Program Value need to be calculated by estimating the total value of A/B testing program including cost of all test design campaigns and statistical testing including hypothesis testing implemented for programme. The operational cost is then reduced from program cost achieved above to reach the estimate cost of A/B Testing.

Z-Score in A/B Testing:

Z-Score is standardized score used for data under normal distribution, that describes the number of standard deviations a sample element is from its mean. In case of A/B testing website or online respondent is considered as observation parameter. The Control group are represented through bell-curve in plot and Test group & sample set of all website visitors represent the second bell curve or the shift in the bell curve from sample mean.

Hypothesis Testing

Two alternative hypotheses are tested for statistical significance and impact of A/B Testing, assuming:

i. Null Hypothesis: H₀: Default campaign design generates a conversion rate equal to x% (default).

ii. Alternate Hypothesis: H₁: New variable in campaign design generates a conversion rate greater than x%.

Here, critical observation is expressed using

p-value= statistical significance of test=probability of negating Null hypothesis when Null Hypothesis is true.

Alpha(α)= false positive rate; or significance level; or type I error.

Beta(β)= false negative rate; or type II error.

power: true positive rate; (1 – beta). 1 - alpha: true negative rate.0

Likely outcome of Hypothesis testing:

No error

Type I error: In-correctly rejecting Null Hypothesis, when H₀is true and accepting alternative hypothesis as likely outcome of Hypothesis testing.

Type II error: In-correctly accepting Null Hypothesis, when H₁ is true and rejecting alternative hypothesis as likely outcome of Hypothesis testing.

Here, key consideration is to determine, whether the difference between two sample distribution; i.e., the original default control design & the new test design with change in one variable; has occurred as random chance: or, there is a presence of bias, accounting for the difference. Statistical reasoning using three critical parameters Confidence, Statistical significance and Significance may be utilized to remove speculation about bias in observation of control and test design.

Confidence intervals measure the degree of uncertainty or certainty in a sampling method. Or a confidence level refers to the percentage of all possible samples that can be expected to include the true population parameter. Or, a confidence interval is how much uncertainty there is with any particular statistic. Confidence intervals are often used with a margin of error. Normally confidence level of 95% to 99% is desirable for test statistics to reach meaningful interpretation.

Statistical significance is validation of a result to be attributable to a specific cause, after testing or experimentation, is achieved on sample data. In statistical hypothesis testing, a result has statistical significance when result is an unlikely outcome, given the null hypothesis. Statistical Significance is measured in p-value and normally p-value for valid statistical significance is estimate as p-value=0.05. The result hold statistical significance, when p ≤ α.

The significance for a hypothesis test is a value, for which a P-value, less than or equal to, is considered statistically significant. The significance value may be shared as 0.1, 0.05, 0.01 depending on nature of sample universe and study type. The significance values correspond to the probability of observing an extreme value in sample by chance.

1

April 18, 20224 yr

A/B Testing or Bucket Testing or Split-Run Testing.

It is a simple randomized controlled experiment of two samples hypothesis testing for two alternatives namely A and B. It tests the user’s response to Product A against Product B to see which is more attractive to the end-user. The products are similar in every way except for one attribute such as color or size.

Version A could be the current product that would form the control group whereas version B is the treatment group which is a modification of version A.

A/B testing helps to obtain major improvements through minor changes and obtain high rewards by taking small risks.

Use Cases. Some of the use cases of A/B testing could be in newsletters, email campaigns, digital marketing, internet advertisement, website design, etc.

Variant Options could be with respect to layout, theme, color scheme, images, headings, subheadings, font, font size, pricing, offers, call to action design, etc.

Examples

A/B testing can be used to design the purchase funnel on an e-commerce website. The drop-off rates of the two designs of the purchase funnel can be tested to see if the difference in the drop-off rates is statistically significant.

A/B testing was first used by Google in 2000 to assess the ideal number of results to display on its search engine page since the larger number of results displayed on its page increased the loading time.

A/B testing was used in Microsoft to assess the different ways to display advertising headlines on their search engine, Microsoft Bing. This experiment led to a revenue increase of 12% for the company within four hours of its implementation, without affecting the other user’s experience metrics.

Email Marketing/Social Media Marketing with variations of

“Offer ends soon” Vs “Offer ends by this weekend”

Discount Rs.500 Vs 5% Discount

Arial Font Vs Times New Romans Font

Blue color Vs Red color

To Determine the price point that offers the maximum total revenue.

Political Campaigns. To determine what voters, want from a candidate.

Testing of APIs by HTTP Routing. New features of APIs are tested through A/B Testing. In this case, a certain percentage of the end-users will get the new version through HTTP routing. This also ensures that in case of a bug, only a small percentage of users are affected.

Segmentation and targeting. A/B testing can be used to determine the optimal variant for a particular segment of the market and specifically target that segment.

References

https://en.wikipedia.org/wiki/A/B_testing

https://mailchimp.com/marketing-glossary/ab-tests/

April 19, 20224 yr

Anshul Vaidya has provided the best answer to the question. Examples quoted by Mohamed are worth reading.

A/B Testing

Featured Replies

Solved by AnshulVaidya

Create an account or sign in to comment

Who's Online (See full list)

Lead AI Transformation without coding

Most Solved

Forum Statistics

Member Statistics

Account

Navigation

Search

Configure browser push notifications

Chrome (Android)

Chrome (Desktop)

Safari (iOS 16.4+)

Safari (macOS)

Edge (Android)

Edge (Desktop)

Firefox (Android)

Firefox (Desktop)