Reinforcement Learning (RL) is an A/B and multivariate testing framework for Drupal where every visitor click is treated as human feedback (RLHF-style). Each page view is a trial, each conversion is a reward, and the algorithm continuously shifts traffic to whichever variant is winning. No fixed test horizons. No manual winner picking. No third-party SaaS.
RL is part of the DXPR marketing CMS stack and ships in DXPR CMS.
What you can A/B test with RL
- RL: A/B Test Views Content (rl_sorting): the order of items in any Drupal View
- RL: A/B Test Page Titles (rl_page_title, bundled): page titles for nodes, View pages, and any controller
- RL: A/B Test Menu Links (rl_menu_link, bundled): labels in any menu link
- DXPR Builder integration: variant slots inside builder blocks
Where experimentation fits in your AI workflow
Human review catches what's obviously off-brand, off-message, or factually wrong. It can't catch what merely fails to convert; only visitors can do that. Harvard Business School research on enterprise gen AI concludes that "designing targeted experiments and using scientific methods to test, refine, and scale promising solutions" is the layer between review and full rollout.
RL is that layer for Drupal. After your team approves a variant, RL tests it in production against the alternatives, shifts traffic toward what works, and sends the result back to the report. Variants can be hand-written, AI-generated, or both; RL is indifferent to authorship.
Source: Berndt et al., A Systematic Approach to Experimenting with Gen AI, Harvard Business Review, January-February 2026.
Why RL instead of fixed-horizon A/B testing?
Traditional A/B tests run for a fixed window (say two weeks) and split traffic 50/50 the whole time, even when one variant is obviously losing. RL turns the experiment into a feedback loop: every click adjusts the model, traffic shifts toward the leader as soon as evidence emerges, and the test never has to "end". You can run dozens or thousands of variants simultaneously (true multivariate testing), and a newly added variant is in play on the next render with no manual setup.
How it works
RL uses a multi-armed bandit (Thompson Sampling). Each variant has a reward distribution; on each render the algorithm samples once from every distribution and picks the variant with the highest sample. A win shifts that variant's distribution toward higher rewards; a loss shifts it lower. The math lives in ThompsonCalculator.php.
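As an illustration of the selection loop (a minimal sketch, not the module's ThompsonCalculator.php), each variant can be modeled with a Beta(rewards + 1, turns − rewards + 1) posterior. The arm names, counts, and the order-statistic sampling trick below are assumptions chosen for brevity: for integer a and b, a Beta(a, b) sample equals the a-th smallest of a + b − 1 uniform draws, which avoids needing a gamma sampler.

```javascript
// Minimal Thompson Sampling sketch (illustrative only; the module's
// real math is in ThompsonCalculator.php).
// For integer a, b: a Beta(a, b) sample is the a-th smallest of
// (a + b - 1) Uniform(0, 1) draws.
function sampleBeta(a, b) {
  const draws = Array.from({ length: a + b - 1 }, Math.random);
  draws.sort((x, y) => x - y);
  return draws[a - 1];
}

// Sample each arm's posterior and return the arm with the highest draw.
function selectArm(arms) {
  let best = null;
  let bestScore = -1;
  for (const [id, stats] of Object.entries(arms)) {
    const score = sampleBeta(stats.rewards + 1, stats.turns - stats.rewards + 1);
    if (score > bestScore) {
      bestScore = score;
      best = id;
    }
  }
  return best;
}

// Hypothetical counts: variant-a converts at ~30%, variant-b at ~5%,
// so nearly every draw routes traffic to variant-a.
const arms = {
  'variant-a': { turns: 1000, rewards: 300 },
  'variant-b': { turns: 1000, rewards: 50 },
};
let wins = 0;
for (let i = 0; i < 200; i++) {
  if (selectArm(arms) === 'variant-a') wins++;
}
console.log('variant-a chosen ' + wins + '/200 times');
```

Note how exploration never fully stops: the losing arm can still be drawn, so a variant that improves later (an evergreen experiment) can win traffic back.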
Features
- Multivariate by default: 2 to thousands of variants, no manual configuration
- Real-time RLHF loop: visitor clicks update the model on every page
- Fast HTTP REST API: optimized JSON endpoint for tracking and decisions
- Admin reports: per-experiment performance, traffic, and confidence
- Service-based architecture: extensible decorators, custom variant selectors
- Data sovereignty: no cloud, no third-party SaaS, all data stays in your Drupal database
- GDPR-friendly tracking: only anonymous interaction counts, no user IDs or cookies
You need RL if
- You want to A/B or multivariate test any part of your site without third-party SaaS
- You want continuous optimization rather than fixed-horizon experiments
- You want to add or remove variants on the fly without restarting tests
- You want a core API you can call from any module, View, or block
Prefer a turnkey demo site?
Spin up DXPR CMS: Drupal pre-configured with DXPR Builder, DXPR Theme, RL, and security best practices.
Installation
```shell
composer require drupal/rl
drush en rl
```
Verify rl.php access
RL ships an .htaccess file that allows direct access to rl.php (same pattern as Drupal core's statistics.php). Test it:
```shell
curl -X POST -d "action=ping" http://example.com/modules/contrib/rl/rl.php
```
If the test fails:
- Apache: ensure .htaccess files are processed (AllowOverride All)
- Nginx: copy the rewrite rules from .htaccess to your server config
- Security modules: whitelist /modules/contrib/rl/rl.php
If server policies prevent direct access, use the Drupal Routes API instead.
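For Nginx, a hypothetical location block might look like the sketch below. The fastcgi socket path and parameter set are assumptions for a typical PHP-FPM setup; the authoritative rewrite rules are the ones shipped in the module's .htaccess.

```nginx
# Hypothetical sketch: allow direct POSTs to rl.php, mirroring the
# pattern many Drupal configs use for core's statistics.php.
# Adapt fastcgi_pass to your PHP-FPM socket.
location = /modules/contrib/rl/rl.php {
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/var/run/php/php-fpm.sock;
}
```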
Drush command reference
| Category | Commands | Description |
|---|---|---|
| Discovery | rl:list, rl:status, rl:performance, rl:trends | List A/B tests, check phase/confidence, arm-level stats, historical trends |
| Analysis | rl:analyze, rl:export | Full analysis with recommendations, export experiment data |
| Experiment CRUD | rl:experiment:create, rl:experiment:update, rl:experiment:delete | Create, update, and delete A/B tests with --dry-run support |
| Configuration | rl:config:get, rl:config:set, rl:config:list, rl:config:reset | Get/set module settings, list all with current values, reset to defaults |
| Setup | rl:setup-ai | Install AI assistant skill files for Claude Code, Codex, Gemini, Copilot, Cursor |
AI coding assistant integration
RL ships a built-in Agent Skills file that teaches AI coding assistants how to manage A/B tests through natural language. Compatible with Claude Code, Codex CLI, Gemini CLI, GitHub Copilot, Cursor, and any tool supporting the standard.
After installing the module, run drush rl:setup-ai to enable AI assistant support. Your assistant will then respond to prompts like:
- "List all running A/B tests"
- "Analyze the hero_cta_test experiment"
- "Create a new A/B test for the homepage banner"
- "What's the conversion rate for variant B?"
API
```php
// Get the experiment manager.
$experiment_manager = \Drupal::service('rl.experiment_manager');

// Record a trial (content shown).
$experiment_manager->recordTurn('my-experiment', 'variant-a');

// Record a reward (user clicked).
$experiment_manager->recordReward('my-experiment', 'variant-a');

// Get scores for the variants currently in play.
$scores = $experiment_manager->getThompsonScores('my-experiment', NULL, ['variant-a', 'variant-b']);

// Pick a winner.
$ts_calculator = \Drupal::service('rl.ts_calculator');
$best_arm = $ts_calculator->selectBestArm($scores);
```
JavaScript API
Attach the rl/api library to get Drupal.rl on the page:
```javascript
Drupal.rl.turn('hero_cta', 'v0');
Drupal.rl.reward('hero_cta', 'v0');
Drupal.rl.decide('hero_cta', ['v0', 'v1', 'v2']).then(function (armId) {
  showVariant(armId);
});
```
All three methods feed a shared 500 ms batch window, so every A/B test on the page rides one POST to rl.php. See the README for the HTTP wire format and server-side patterns.
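As an illustration of that batching pattern (a generic sketch, not the module's rl/api implementation), calls arriving within one window can be queued and flushed together; the `createBatcher` helper and its event shape below are invented for this example.

```javascript
// Generic sketch of a shared batch window: calls made while a window
// is open are collected and flushed in one callback, the way multiple
// A/B tests on a page could share a single POST.
function createBatcher(flush, windowMs = 500) {
  let queue = [];
  let timer = null;
  return function enqueue(event) {
    queue.push(event);
    if (timer === null) {
      // First event opens the window; later events join the same batch.
      timer = setTimeout(() => {
        const batch = queue;
        queue = [];
        timer = null;
        flush(batch); // e.g. one POST to rl.php carrying all events
      }, windowMs);
    }
  };
}

// Three calls in quick succession produce a single flush.
const batches = [];
const record = createBatcher((events) => batches.push(events), 500);
record({ action: 'turn', experiment: 'hero_cta', arm: 'v0' });
record({ action: 'reward', experiment: 'hero_cta', arm: 'v0' });
record({ action: 'turn', experiment: 'other_test', arm: 'v1' });

setTimeout(() => {
  console.log(batches.length + ' batch of ' + batches[0].length + ' events');
}, 600);
```

The design trade-off is latency for volume: each event waits up to the window length before it is sent, in exchange for one request per page interaction burst instead of one per call.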
FAQ
Does RL store my A/B test's variants?
It stores their performance data, not the authoritative list. Every variant that has received traffic has a row in rl_arm_data with turn and reward counts. But "which variants are in play right now" is owned by your module, not RL.
Different consumer modules keep the live variant list in different places:
- rl_sorting: the content returned by a View
- rl_page_title: fields on a content entity
- rl_menu_link: labels on a menu link
- DXPR Builder: slots inside a block component
On each call your module passes its current list (getThompsonScores($id, NULL, $arms) in PHP or Drupal.rl.decide(id, arms) in JS) and RL matches it against the stored stats to pick a winner. A newly added variant is in play on the next render; a removed one stops appearing. There is no second saved copy that could drift out of sync with your module's UI.
When do I pick a winner and end an A/B test?
Only when you want to. RL has no fixed horizon and no significance gate to wait out. It just shifts traffic to whatever variant is winning right now and keeps adapting as evidence changes.
Two patterns, depending on what you're testing:
- Converging tests: a better page title, a clearer checkout button, a stronger hero image. Once the report shows a confident winner, lock it in and move on.
- Evergreen experiments: blog post lists, banner ads that fade as returning visitors tune them out, seasonal calls to action. Leave them running. RL follows the winner as it shifts.
In both cases the loser of a pair just stops receiving traffic on its own, so there's no urgency to declare a winner by hand. If you're used to fixed-horizon A/B tools, this is the biggest mental shift: there's no "test complete" flag to chase.
Related modules
- RL: A/B Test Views Content (rl_sorting): A/B test the order of any Drupal View
- Analyze: content analysis and quality scoring for Drupal
- AI Content Strategy: AI-driven content strategy recommendations
Project information
- Project categories: Automation, Content display, User engagement
- 103 sites report using this module
- Created by jurriaanroelofs
Stable releases for this project are covered by the security advisory policy.