Campaign holdout reporting

The Holdout report for a campaign helps you to measure the impact of running a campaign against a baseline of doing nothing. It compares the performance of the test group to the performance of the campaign control group.

To learn more about holdout testing, or to set up holdout testing for a campaign, see Campaign holdout testing.

Note: For information about data limits, see Data retention for reporting.

Note

Holdout reports are available only for campaigns where campaign holdout is enabled (Campaign Control Group > 0%). When global holdouts are enabled for an account, users are held out for every campaign, but holdout reports are not available for every campaign.

Holdout report metrics

You need a goal event to measure the performance of a campaign test group against the holdout group. You can use a standard goal event (visits or purchases) or a custom goal event to measure the impact of running a campaign.

The campaign-level holdout group is considered the baseline. You compare the performance of the other user groups against this baseline to assess the effectiveness of your campaign.

User Groups	Effectiveness of campaign
Campaign level holdout	Baseline against which performance of other user groups is compared.
Test group - users who received campaign messages.	If the performance of the test group is better than the performance of the campaign control group (baseline), it means that your campaign was effective.
Global holdout - users who did not receive any messages from any campaign or syndication.	If the performance of the campaign control group (baseline) is better than the global control group, it means that your overall campaign strategy is working.

Holdout report details

The following details are available in the campaign holdout report.

Field	Description
Total Users	The total number of users in the group at the end of the selected time period. Example 1: If the time window for the campaign is from March 1 to May 1, the count includes all unique users who were sent the campaign or held out from the start of the campaign until May 1. Example 2: If the time window for the campaign is from January 1 to May 1, the count includes all unique users who were sent the campaign or held out from the start of the campaign until May 1. Hence the number of total users will be the same in both examples.
Total Completed	The number of unique users in the group who completed the goal event during the time period. Example 1: If the time window for the campaign is from March 1 to May 1, the count includes all unique users who completed the goal event from March 1 to May 1. Example 2: If the time window for the campaign is from January 1 to May 1, the count includes all unique users who completed the goal event from January 1 to May 1. Hence the Total Completed value for example 1 will most likely be lower than that of example 2.
Conversion	The number of users in the group that converted, i.e. completed the goal event, as a percentage of the total number of users in the group. Conversion = Total Completed / Total Users. Example 1: If the time window for the campaign is from March 1 to May 1, the Total Completed count (x) includes all unique users who completed the goal event from March 1 to May 1. Conversion for this time window is x / Total Users. Example 2: If the time window for the campaign is from January 1 to May 1, the Total Completed count (y) includes all unique users who completed the goal event from January 1 to May 1. Conversion for this time window is y / Total Users. The Conversion in example 1 will most likely be lower than the Conversion in example 2. However, this affects the control and test groups equally, so the relative performance of the test group over the control group is still meaningful for both time windows.
Lift %	The Lift % for a group (test or global control) is calculated by comparing the conversion for that group with the conversion for the baseline. Lift %_{Test Group} = (Conversion_{Test Group} / Conversion_{Baseline}) - 1. If the Lift % > 0, it is an improvement. If the Lift % < 0, it is a degradation.
Lift	The estimated number of incremental users who completed the goal event when compared to the baseline. Estimated Lift = Total Completed_{Test Group} x (Conversion_{Test Group} - Conversion_{Baseline})
Confidence Level	The statistical likelihood or probability (p’%) that the improvement (or degradation) observed from the experiment is correct. So if you were to infinitely repeat this experiment, you would observe the same improvement p’% of the time. The χ² test (chi-squared test) is used to calculate the confidence level. p’ = 1 - p(χ²)
Statistical Significance	When the Confidence Level is higher than 90%, this indicates that the results are statistically significant. However, you can use a higher or lower confidence level based on your experimentation objectives.
Confidence Interval	The range of values within which the conversion for the group will lie p’% (i.e. confidence level percentage) of the time.
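
As a rough illustration of how the Conversion, Lift %, and Confidence Level fields relate, the following sketch computes them for two groups. The `Group` class, the function names, and the sample counts are hypothetical. The confidence calculation assumes a Pearson χ² test on a 2x2 table (completed vs. not completed, group vs. baseline) with one degree of freedom and no continuity correction; the exact test variant used in the report is not stated here, so values in the product may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class Group:
    """Hypothetical container for one row of the holdout report."""
    total_users: int
    total_completed: int

    @property
    def conversion(self) -> float:
        # Conversion = Total Completed / Total Users
        return self.total_completed / self.total_users


def lift_pct(group: Group, baseline: Group) -> float:
    # Lift % = (Conversion_group / Conversion_baseline) - 1, as a fraction
    return group.conversion / baseline.conversion - 1


def confidence_level(group: Group, baseline: Group) -> float:
    """p' = 1 - p(chi-squared), assuming a Pearson test on a 2x2 table."""
    a = group.total_completed
    b = group.total_users - a
    c = baseline.total_completed
    d = baseline.total_users - c
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence either way
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For one degree of freedom, the chi-squared tail probability is
    # erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return 1 - p_value


# Made-up counts for illustration only.
test_group = Group(total_users=10_000, total_completed=600)
baseline = Group(total_users=1_000, total_completed=50)

print(f"Lift %: {lift_pct(test_group, baseline):.1%}")
print(f"Confidence level: {confidence_level(test_group, baseline):.1%}")
```

With these sample counts the lift is positive, but the confidence level falls below the 90% threshold described above, so the result would not be treated as statistically significant.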

Plot a goal event

You can also plot the goal event as a time series for each user group. This chart shows the performance of each user group over a period of time.

Viewing the holdout report

You can view all the campaigns for which holdout testing is enabled from the Campaign index page by selecting the Holdout Enabled option for the Holdout Status filter.

To view the Holdout report for a particular campaign, click Holdout Report in the actions menu for that campaign.

You can also view the holdout report for a campaign from the Reports tab for the campaign.

Holdout Analysis vs Attribution

You may observe differences in the counts of goal events when comparing holdout reports to campaign detail/summary reports or Insights reports.

Attribution: The important thing to keep in mind is that campaign statistics are based on the attribution logic you’ve selected for your account. This means that we will attribute a goal event to a campaign based on the user’s campaign interaction.

Holdout reporting shows goal event statistics independent of campaign attribution. In other words, we will show how many goal events were completed by the users during the period of time used for reporting. The user does not have to interact with the campaign for us to count the goal event. This is the entire purpose of holdout reporting. We are trying to compare the results of the test group to the control group of users who didn’t receive the campaign. Just like users in the control group complete the goal event without interacting with the campaign, users in the test group do the same. What we are trying to determine is whether more users in the test group complete the goal event than those in the control group.
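
The difference between the two counting approaches can be sketched with toy data. Everything in this example is made up for illustration: the event tuples, the user IDs, the campaign name, and both helper functions. It is not the logic used by the reports, only a way to see why the counts diverge.

```python
# Toy goal events as (user_id, attributed_campaign). None means the goal
# event was not attributed to any campaign.
goal_events = [
    ("u1", "spring_sale"),  # test user who clicked the campaign, then converted
    ("u2", None),           # test user who converted without interacting
    ("u3", None),           # test user who converted without interacting
    ("u4", None),           # control user who converted
    ("u2", None),           # repeat conversion by the same test user
]

test_group = {"u1", "u2", "u3"}
control_group = {"u4", "u5"}


def attributed_goal_count(events, campaign):
    """Attribution-style count: only goal events credited to the campaign."""
    return sum(1 for _, attributed in events if attributed == campaign)


def holdout_total_completed(events, group):
    """Holdout-style count: unique users in the group who completed the
    goal event, regardless of campaign interaction."""
    return len({user for user, _ in events if user in group})


print(attributed_goal_count(goal_events, "spring_sale"))
print(holdout_total_completed(goal_events, test_group))
print(holdout_total_completed(goal_events, control_group))
```

In this toy data, only one goal event is attributed to the campaign, while three unique users in the test group completed the goal event.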

How visits are tracked and attributed

What counts as a visit?

A visit is created when Blueshift receives a pageload event or a custom event that is not marked as a goal. For reference, see our recommended events guide.

What makes a visit count in campaign reports?

The event must include campaign identifiers like bsft_eid, bsft_uid, or experiment_uuid.
Only attributed visits are shown in campaign performance reports.

How is attribution handled?

If you're using Blueshift’s JavaScript tracking code, campaign identifiers such as bsft_eid are typically captured automatically from the campaign click URL. See the event tracking guide for setup details.
If events are sent from a backend system, the identifiers must be manually extracted from the URL and included in the event payload.
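
For the backend case, a minimal sketch of pulling the identifiers out of a click URL might look like the following. The parameter names come from the list above, but whether each one appears as a query parameter in your click URLs is an assumption to verify. The payload field names (`event`, `customer_id`) and the example URL are hypothetical; see the event tracking guide for the actual event format.

```python
from urllib.parse import parse_qs, urlparse

# Identifier names mentioned in this article. Assumption: they arrive as
# query parameters on the campaign click URL.
CAMPAIGN_PARAMS = ("bsft_eid", "bsft_uid", "experiment_uuid")


def extract_campaign_identifiers(click_url: str) -> dict:
    """Return the campaign identifiers present in the URL's query string."""
    query = parse_qs(urlparse(click_url).query)
    return {name: query[name][0] for name in CAMPAIGN_PARAMS if name in query}


def build_event_payload(event_name: str, customer_id: str, click_url: str) -> dict:
    """Build a hypothetical event payload that carries the identifiers."""
    payload = {"event": event_name, "customer_id": customer_id}
    payload.update(extract_campaign_identifiers(click_url))
    return payload


url = "https://shop.example.com/item?sku=1&bsft_eid=abc123&bsft_uid=user-42"
print(build_event_payload("pageload", "customer-1", url))
```

Parameters that are not campaign identifiers (such as `sku` in the example URL) are left out of the payload.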

Why the numbers may not match

The campaign holdout report shows visits from users in both the test and control groups, allowing you to compare performance.
If the campaign performance report shows zero visits, the test group in the holdout report will also show zero visits. This typically happens when the test group recorded no visits or when its visits were not attributed.
Use audience insights reports to view all visits, including those not attributed to a campaign.

Things to know

Confidence: Your observations from a holdout experiment can be random unless you have collected enough data to average out the randomness.

For example, if you toss a coin, there is an equal probability of heads or tails showing up. If you toss the coin only twice, there is no guarantee that it will be heads once and tails the other time. However, if you toss the coin 1000 times, it is more likely that it will be heads close to 500 times.

This is called the law of large numbers. The same theory applies to your holdout experiments. Hence you should be careful about any conclusions you arrive at from the holdout reports for a campaign. The holdout report includes a confidence level and confidence interval so that you can ensure that the results of your experiment are statistically significant.
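
The coin-toss example can be simulated in a few lines. This is only an illustrative sketch: the function name and the seed are arbitrary, and the exact fractions printed depend on the random number generator.

```python
import random


def heads_fraction(tosses: int, rng: random.Random) -> float:
    """Toss a fair coin `tosses` times and return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(tosses))
    return heads / tosses


rng = random.Random(42)  # fixed seed so repeated runs give the same sequence
for n in (2, 10, 1_000, 100_000):
    print(f"{n:>7} tosses: fraction of heads = {heads_fraction(n, rng):.3f}")
```

With only a few tosses the fraction can be far from 0.5. As the number of tosses grows, it tends to settle close to 0.5, which is why small holdout groups or short reporting windows can produce misleading results.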
