Reinforcement Learning (RL) is an A/B and multivariate testing framework for Drupal where every visitor click is treated as human feedback (RLHF-style). Each page view is a trial, each conversion is a reward, and the algorithm continuously shifts traffic to whichever variant is winning. No fixed test horizons. No manual winner picking. No third-party SaaS.
RL is part of the DXPR marketing CMS stack and ships in DXPR CMS.
What you can A/B test with RL
- RL: A/B Test Views Content (rl_sorting): the order of items in any Drupal View
- RL: A/B Test Page Titles (rl_page_title, bundled): page titles for nodes, View pages, and any controller
- RL: A/B Test Menu Links (rl_menu_link, bundled): labels in any menu link
- DXPR Builder integration: variant slots inside builder blocks
Where experimentation fits in your AI workflow
Human review catches what's obviously off-brand, off-message, or factually wrong. It can't catch what merely fails to convert; only visitors can do that. Harvard Business School research on enterprise gen AI concludes that "designing targeted experiments and using scientific methods to test, refine, and scale promising solutions" is the layer between review and full rollout.
RL is that layer for Drupal. After your team approves a variant, RL tests it in production against the alternatives, shifts traffic toward what works, and sends the result back to the report. Variants can be hand-written, AI-generated, or both; RL is indifferent to authorship.
Source: Berndt et al., A Systematic Approach to Experimenting with Gen AI, Harvard Business Review, January-February 2026.
Why RL instead of fixed-horizon A/B testing?
Traditional A/B tests run for a fixed window (say two weeks) and split traffic 50/50 the whole time, even when one variant is obviously losing. RL turns the experiment into a feedback loop: every click adjusts the model, traffic shifts toward the leader as soon as evidence emerges, and the test never has to "end". You can run dozens or thousands of variants simultaneously (true multivariate testing), and a newly added variant is in play on the next render with no manual setup.
How it works
RL uses a multi-armed bandit (Thompson Sampling). Each variant has a reward distribution; on each render the algorithm samples once from every distribution and picks the variant with the highest sample. A win shifts that variant's distribution toward higher rewards; a loss shifts it lower. The math lives in ThompsonCalculator.php.
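As an illustration of the selection loop (a minimal sketch, not the module's ThompsonCalculator.php), each variant can be modeled with a Beta(rewards + 1, turns − rewards + 1) posterior. The arm names, counts, and the order-statistic sampling trick below are assumptions chosen for brevity: for integer a and b, a Beta(a, b) sample equals the a-th smallest of a + b − 1 uniform draws, which avoids needing a gamma sampler.

```javascript
// Minimal Thompson Sampling sketch (illustrative only; the module's
// real math is in ThompsonCalculator.php).
// For integer a, b: a Beta(a, b) sample is the a-th smallest of
// (a + b - 1) Uniform(0, 1) draws.
function sampleBeta(a, b) {
  const draws = Array.from({ length: a + b - 1 }, Math.random);
  draws.sort((x, y) => x - y);
  return draws[a - 1];
}

// Sample each arm's posterior and return the arm with the highest draw.
function selectArm(arms) {
  let best = null;
  let bestScore = -1;
  for (const [id, stats] of Object.entries(arms)) {
    const score = sampleBeta(stats.rewards + 1, stats.turns - stats.rewards + 1);
    if (score > bestScore) {
      bestScore = score;
      best = id;
    }
  }
  return best;
}

// Hypothetical counts: variant-a converts at ~30%, variant-b at ~5%,
// so nearly every draw routes traffic to variant-a.
const arms = {
  'variant-a': { turns: 1000, rewards: 300 },
  'variant-b': { turns: 1000, rewards: 50 },
};
let wins = 0;
for (let i = 0; i < 200; i++) {
  if (selectArm(arms) === 'variant-a') wins++;
}
console.log('variant-a chosen ' + wins + '/200 times');
```

Note how exploration never fully stops: the losing arm can still be drawn, so a variant that improves later (an evergreen experiment) can win traffic back.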
Features
- Multivariate by default: 2 to thousands of variants, no manual configuration
- Real-time RLHF loop: visitor clicks update the model on every page
- Fast HTTP REST API: optimized JSON endpoint for tracking and decisions
- Admin reports: per-experiment performance, traffic, and confidence
- Service-based architecture: extensible decorators, custom variant selectors
- Data sovereignty: no cloud, no third-party SaaS, all data stays in your Drupal database
- GDPR-friendly tracking: only anonymous interaction counts, no user IDs or cookies
You need RL if
- You want to A/B or multivariate test any part of your site without third-party SaaS
- You want continuous optimization rather than fixed-horizon experiments
- You want to add or remove variants on the fly without restarting tests
- You want a core API you can call from any module, View, or block
Prefer a turnkey demo site?
Spin up DXPR CMS: Drupal pre-configured with DXPR Builder, DXPR Theme, RL, and security best practices.
Installation
```shell
composer require drupal/rl
drush en rl
```
Verify rl.php access
RL ships an .htaccess file that allows direct access to rl.php (same pattern as Drupal core's statistics.php). Test it:
```shell
curl -X POST -d "action=ping" http://example.com/modules/contrib/rl/rl.php
```
If the test fails:
- Apache: ensure .htaccess files are processed (AllowOverride All)
- Nginx: copy the rewrite rules from .htaccess to your server config
- Security modules: whitelist /modules/contrib/rl/rl.php
If server policies prevent direct access, use the Drupal Routes API instead.
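For Nginx, a hypothetical location block might look like the sketch below. The fastcgi socket path and parameter set are assumptions for a typical PHP-FPM setup; the authoritative rewrite rules are the ones shipped in the module's .htaccess.

```nginx
# Hypothetical sketch: allow direct POSTs to rl.php, mirroring the
# pattern many Drupal configs use for core's statistics.php.
# Adapt fastcgi_pass to your PHP-FPM socket.
location = /modules/contrib/rl/rl.php {
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/var/run/php/php-fpm.sock;
}
```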
Drush command reference
| Category | Commands | Description |
|---|---|---|
| Discovery | rl:list, rl:status, rl:performance, rl:trends | List A/B tests, check phase/confidence, arm-level stats, historical trends |
| Analysis | rl:analyze, rl:export | Full analysis with recommendations, export experiment data |
| Experiment CRUD | rl:experiment:create, rl:experiment:update, rl:experiment:delete | Create, update, and delete A/B tests with --dry-run support |
| Configuration | rl:config:get, rl:config:set, rl:config:list, rl:config:reset | Get/set module settings, list all with current values, reset to defaults |
| Setup | rl:setup-ai | Install AI assistant skill files for Claude Code, Codex, Gemini, Copilot, Cursor |
AI coding assistant integration
RL ships a built-in Agent Skills file that teaches AI coding assistants how to manage A/B tests through natural language. Compatible with Claude Code, Codex CLI, Gemini CLI, GitHub Copilot, Cursor, and any tool supporting the standard.
After installing the module, run drush rl:setup-ai to enable AI assistant support. Your assistant will then respond to prompts like:
- "List all running A/B tests"
- "Analyze the hero_cta_test experiment"
- "Create a new A/B test for the homepage banner"
- "What's the conversion rate for variant B?"
API
```php
// Get the experiment manager.
$experiment_manager = \Drupal::service('rl.experiment_manager');

// Record a trial (content shown).
$experiment_manager->recordTurn('my-experiment', 'variant-a');

// Record a reward (user clicked).
$experiment_manager->recordReward('my-experiment', 'variant-a');

// Get scores for the variants currently in play.
$scores = $experiment_manager->getThompsonScores('my-experiment', NULL, ['variant-a', 'variant-b']);

// Pick a winner.
$ts_calculator = \Drupal::service('rl.ts_calculator');
$best_arm = $ts_calculator->selectBestArm($scores);
```
JavaScript API
Attach the rl/api library to get Drupal.rl on the page:
```javascript
Drupal.rl.turn('hero_cta', 'v0');
Drupal.rl.reward('hero_cta', 'v0');
Drupal.rl.decide('hero_cta', ['v0', 'v1', 'v2']).then(function (armId) {
  showVariant(armId);
});
```
All three methods feed a shared 500 ms batch window, so every A/B test on the page rides one POST to rl.php. See the README for the HTTP wire format and server-side patterns.
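As an illustration of that batching pattern (a generic sketch, not the module's rl/api implementation), calls arriving within one window can be queued and flushed together; the `createBatcher` helper and its event shape below are invented for this example.

```javascript
// Generic sketch of a shared batch window: calls made while a window
// is open are collected and flushed in one callback, the way multiple
// A/B tests on a page could share a single POST.
function createBatcher(flush, windowMs = 500) {
  let queue = [];
  let timer = null;
  return function enqueue(event) {
    queue.push(event);
    if (timer === null) {
      // First event opens the window; later events join the same batch.
      timer = setTimeout(() => {
        const batch = queue;
        queue = [];
        timer = null;
        flush(batch); // e.g. one POST to rl.php carrying all events
      }, windowMs);
    }
  };
}

// Three calls in quick succession produce a single flush.
const batches = [];
const record = createBatcher((events) => batches.push(events), 500);
record({ action: 'turn', experiment: 'hero_cta', arm: 'v0' });
record({ action: 'reward', experiment: 'hero_cta', arm: 'v0' });
record({ action: 'turn', experiment: 'other_test', arm: 'v1' });

setTimeout(() => {
  console.log(batches.length + ' batch of ' + batches[0].length + ' events');
}, 600);
```

The design trade-off is latency for volume: each event waits up to the window length before it is sent, in exchange for one request per page interaction burst instead of one per call.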
FAQ
Does RL store my A/B test's variants?
It stores their performance data, not the authoritative list. Every variant that has received traffic has a row in rl_arm_data with turn and reward counts. But "which variants are in play right now" is owned by your module, not RL.
Different consumer modules keep the live variant list in different places:
- rl_sorting: the content returned by a View
- rl_page_title: fields on a content entity
- rl_menu_link: labels on a menu link
- DXPR Builder: slots inside a block component
On each call your module passes its current list (getThompsonScores($id, NULL, $arms) in PHP or Drupal.rl.decide(id, arms) in JS) and RL matches it against the stored stats to pick a winner. A newly added variant is in play on the next render; a removed one stops appearing. There is no second saved copy that could drift out of sync with your module's UI.
When do I pick a winner and end an A/B test?
Only when you want to. RL has no fixed horizon and no significance gate to wait out. It just shifts traffic to whatever variant is winning right now and keeps adapting as evidence changes.
Two patterns, depending on what you're testing:
- Converging tests: a better page title, a clearer checkout button, a stronger hero image. Once the report shows a confident winner, lock it in and move on.
- Evergreen experiments: blog post lists, banner ads that fade as returning visitors tune them out, seasonal calls to action. Leave them running. RL follows the winner as it shifts.
In both cases the loser of a pair just stops receiving traffic on its own, so there's no urgency to declare a winner by hand. If you're used to fixed-horizon A/B tools, this is the biggest mental shift: there's no "test complete" flag to chase.
Related modules
- RL: A/B Test Views Content (rl_sorting): A/B test the order of any Drupal View
- Analyze: content analysis and quality scoring for Drupal
- AI Content Strategy: AI-driven content strategy recommendations
Project information
- Project categories: Automation, Content display, User engagement
- 103 sites report using this module
- Created by jurriaanroelofs
Stable releases for this project are covered by the security advisory policy.