I. Paper Information

1. Paper Title

Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework

2. Publication Year

3. Journal/Conference

4. Paper Link

https://www.tandfonline.com/doi/full/10.1080/01621459.2022.2027776

5. Authors

II. Paper Content

Introduction

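A/B testing, or online experimentation, is a standard business strategy for comparing a new product with an old one in the pharmaceutical, technology, and traditional industries. The main challenges arise in online experiments on two-sided marketplace platforms (e.g., Uber), where a single unit receives a sequence of treatments over time. In these experiments, the treatment at a given time affects both the current outcome and future outcomes. The paper introduces a reinforcement learning framework for carrying out A/B testing in such experiments while characterizing long-term treatment effects. The proposed testing procedure allows sequential monitoring and online updating, and it is generally applicable to a variety of treatment designs across industries. The paper also systematically studies the theoretical properties of the testing procedure (e.g., its size and power). Finally, the framework is applied to simulated data and to a real-world data example obtained from a technology company to illustrate its advantages over current practice.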

(Source: Zhiyuan Community, BAAI)

Abstract

A/B testing, or online experiment is a standard business strategy to compare a new product with an old one in pharmaceutical, technological, and traditional industries. [In short: A/B testing, also called online experimentation, is the standard way to compare a new product with an old one across these industries.]

Major challenges arise in online experiments of two-sided marketplace platforms (e.g., Uber) where there is only one unit that receives a sequence of treatments over time. [Key term: two-sided marketplace platforms. On a platform such as Uber there is only one experimental unit, and it receives a sequence of treatments over time.]

In those experiments, the treatment at a given time impacts current outcome as well as future outcomes. [Key phrase: the treatment at a given time. It affects the current outcome and also carries over to future outcomes.]
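
To make the carryover idea concrete, here is a minimal sketch using hypothetical toy dynamics that are not taken from the paper: the treatment raises the current outcome by `delta` and also shifts a market state that feeds into later outcomes. All names and parameter values are assumptions chosen for illustration.

```python
def cumulative_outcome(action, horizon=50, rho=0.5, beta=0.4, delta=1.0):
    """Total outcome over `horizon` steps when the same action is applied at every step.

    Hypothetical toy dynamics (an assumption, not the paper's model):
      outcome_t   = state_t + delta * action
      state_{t+1} = rho * state_t + beta * action
    """
    state, total = 0.0, 0.0
    for _ in range(horizon):
        total += state + delta * action      # current outcome
        state = rho * state + beta * action  # treatment carries over via the state
    return total


horizon = 50
# Effect of always treating versus never treating, including carryover.
long_run_effect = cumulative_outcome(1, horizon) - cumulative_outcome(0, horizon)
# What one would report if only the immediate effect `delta` were counted.
immediate_effect_only = horizon * 1.0
print(long_run_effect, immediate_effect_only)
```

In this toy setting the long-run effect exceeds the sum of the immediate effects, which is the gap a comparison of current outcomes alone would miss.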

The aim of this article is to introduce a reinforcement learning framework for carrying A/B testing in these experiments, while characterizing the long-term treatment effects. [Goal of the article: a reinforcement learning framework for A/B testing in these experiments that also characterizes long-term treatment effects.]
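
One way to read "long-term treatment effect" in reinforcement learning terms is as a difference between the discounted values of two policies: always use the new product versus always use the old one. The sketch below illustrates this on a hypothetical two-state Markov decision process. The transition matrices, rewards, discount factor, and reference distribution are all assumptions made for illustration, and the abstract does not state that the paper uses exactly this definition.

```python
def policy_value(transition, reward, gamma=0.9, sweeps=2000):
    """Discounted value of always taking one action, by iterating the Bellman equation.

    transition[s][s2] is the probability of moving from state s to state s2,
    reward[s] is the expected outcome in state s under that action.
    """
    n = len(reward)
    value = [0.0] * n
    for _ in range(sweeps):
        value = [
            reward[s] + gamma * sum(transition[s][s2] * value[s2] for s2 in range(n))
            for s in range(n)
        ]
    return value


# Hypothetical market: state 0 = low demand, state 1 = high demand.
P_old = [[0.9, 0.1], [0.5, 0.5]]  # old product: market tends to stay in low demand
P_new = [[0.6, 0.4], [0.2, 0.8]]  # new product: market drifts toward high demand
R_old = [1.0, 2.0]
R_new = [1.0, 2.0]                # same immediate outcomes as the old product

v_old = policy_value(P_old, R_old)
v_new = policy_value(P_new, R_new)

# Average the value difference over a reference distribution of initial states.
reference = [0.5, 0.5]
long_term_effect = sum(g * (a - b) for g, a, b in zip(reference, v_new, v_old))
print(long_term_effect)
```

Here the two products yield identical immediate outcomes in every state, yet the value difference is positive because the new product changes where the market goes next.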

Our proposed testing procedure allows for sequential monitoring and online updating. [Key terms: testing procedure; sequential monitoring and online updating.]
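
Sequential monitoring means looking at the data at several interim points without inflating the overall type I error. The abstract does not say which rule the authors use, so the sketch below shows one generic device, an alpha-spending budget with a power-family spending function. The function name, the exponent `theta`, and the look schedule are assumptions, and the sketch only allocates the error budget; it does not compute stopping boundaries.

```python
def alpha_spent(information_fraction, alpha=0.05, theta=3.0):
    """Cumulative type I error allowed by information fraction t in (0, 1].

    Power-family spending function alpha * t**theta; a larger theta saves
    more of the error budget for later looks.
    """
    if not 0.0 < information_fraction <= 1.0:
        raise ValueError("information fraction must lie in (0, 1]")
    return alpha * information_fraction ** theta


# Four equally spaced interim looks (hypothetical schedule).
looks = [0.25, 0.5, 0.75, 1.0]
cumulative = [alpha_spent(t) for t in looks]
# Error budget newly released at each look.
increments = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
for t, inc in zip(looks, increments):
    print(f"look at t={t:.2f}: spend {inc:.5f}")
```

The increments add up to the overall level `alpha`, so the experiment can be stopped early at any look while the total error budget stays fixed.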

It is generally applicable to a variety of treatment designs in different industries. [The procedure is generally applicable: it covers a variety of treatment designs in different industries.]

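In addition, we systematically investigate the theoretical properties (e.g., size and power) of our testing procedure. [Key term: theoretical properties. Size and power are hypothesis-testing terms: size is the type I error rate, and power is the probability of rejecting the null hypothesis when it is false.]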
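
As a reminder of what size and power measure, the following sketch estimates both by Monte Carlo for a plain one-sided z-test on independent observations. This is a textbook stand-in, not the paper's test; the sample size, effect size, and number of replications are assumptions.

```python
import random
from statistics import NormalDist


def rejection_rate(true_effect, n=100, reps=2000, alpha=0.05, seed=0):
    """Fraction of simulated experiments in which a one-sided z-test rejects H0: effect <= 0.

    Observations are drawn as N(true_effect, 1), so the standard deviation is known to be 1.
    """
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(true_effect, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5        # sample mean divided by 1/sqrt(n)
        rejections += z > critical
    return rejections / reps


size = rejection_rate(true_effect=0.0)   # rejection rate when H0 holds; should be near alpha
power = rejection_rate(true_effect=0.3)  # rejection rate when the effect is real
print(size, power)
```

A well-calibrated test keeps the size near the nominal level while achieving high power against real effects.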

Finally, we apply our framework to both simulated data and a real-world data example obtained from a technological company to illustrate its advantage over the current practice. [The framework is applied to simulated data and to a real-world data example from a technology company, showing its advantage over current practice.]

A Python implementation of our test is available at https://github.com/callmespring/CausalRL. Supplementary materials for this article are available online. [A Python implementation of the test is available at the GitHub link above, and the supplementary materials can be consulted online.]