We have a Drupal 10 site exposing ~13 custom REST API endpoints used by a .NET service layer and AngularJS frontend. Some endpoints are expensive to generate (5–30 seconds on cold builds) due to complex entity loading, paragraph traversal and taxonomy resolution across hundreds of nodes.
Below is the existing caching approach in our REST resources (before the two-layer strategy was added):
$response = new ResourceResponse([
  'data' => $nodeToArray,
]);
$cacheTags = ['node:' . $node->id()];
$cacheTags[] = 'node_list:' . ContentTypeBBundle::getStaticBundle();
$this->setItemCache($response, [$node]);
$response->addCacheableDependency((new CacheableMetadata())
  ->addCacheTags($cacheTags));
$response->addCacheableDependency($this->getUrlContextCacheableMetadata());
return $response;

While dynamic_page_cache is enabled, frequent cache invalidation in an active editorial environment leads to repeated cold builds. We previously faced a production issue where concurrent cold builds exhausted the MySQL connection pool.
To address this, a colleague suggested a two-layer caching approach. I’d like to understand if this approach is valid and recommended.
L1 (Primary cache)
- Uses cache.data with custom keys per endpoint (is it recommended to move from built-in cache tags to custom cache keys, considering the additional invalidation logic and maintenance involved?)
- Serves most requests (~5–20 ms)
- On L1 miss, falls through to L2 (a sketch of this layer follows the list)
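For illustration, a minimal sketch of the L1 read/write path. The cache ID, $langcode and buildEndpointPayload() are hypothetical names, not taken from our code:

use Drupal\Core\Cache\Cache;

// L1 read: the cache.data service is exposed as the 'data' bin.
$cid = 'myapi:endpoint_x:' . $langcode;
$l1 = \Drupal::cache('data');
if ($cached = $l1->get($cid)) {
  return $cached->data;
}

// L1 miss: build the payload (expensive) and store it permanently;
// it is only removed by our explicit deleteMultiple() call on node save.
$payload = $this->buildEndpointPayload();
$l1->set($cid, $payload, Cache::PERMANENT);
return $payload;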
L2 (Fallback cache)
- Uses the State API (key_value table), which survives drush cr since drupal_flush_all_caches() does not touch key_value
- Only read on an L1 miss (~50 ms), overwritten on each rebuild (no accumulation); a sketch follows the list
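A minimal sketch of the L2 fallback, assuming a hypothetical state key that the warmer overwrites on every rebuild:

// L2 read, only attempted on an L1 miss. State lives in the key_value
// table, which drupal_flush_all_caches() does not clear.
$stale = \Drupal::state()->get('myapi.endpoint_x');
if ($stale !== NULL) {
  return $stale;
}

// L2 write during a rebuild: the same key is overwritten each time,
// so the key_value table does not grow.
\Drupal::state()->set('myapi.endpoint_x', $payload);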
On node save
- hook_node_presave(): L1 custom keys are explicitly deleted via deleteMultiple() before the HTTP response is sent. L2 is intentionally preserved to serve stale responses during the rebuild window.
- drupal_register_shutdown_function(): after the response has been sent to the editor, warmer services rebuild the affected endpoints and write fresh data to both L1 and L2. (Is this the preferred approach in Drupal?) A sketch of this flow follows.
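A rough sketch of that flow; the cache IDs, bundle name and warmer service (mymodule.endpoint_warmer) are hypothetical:

use Drupal\node\NodeInterface;

/**
 * Implements hook_node_presave().
 */
function mymodule_node_presave(NodeInterface $node) {
  if ($node->bundle() !== 'content_type_b') {
    return;
  }

  // Drop the affected L1 keys so later requests cannot read stale L1 data.
  \Drupal::cache('data')->deleteMultiple([
    'myapi:endpoint_x',
    'myapi:endpoint_y',
  ]);

  // L2 (State) is left untouched so it can serve stale responses while the
  // rebuild below runs, after the editor's response has been flushed.
  drupal_register_shutdown_function(function () {
    \Drupal::service('mymodule.endpoint_warmer')->warmAll();
  });
}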
On drush cr via terminal
- Heavy endpoints (two in our case) are warmed synchronously before the pipeline exits
- Remaining endpoints are warmed via a background nohup drush process, with Queue Workers as a fallback (a worker sketch follows the list)
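As an illustration, the Queue Worker fallback might look roughly like this; the plugin ID, queue contents and warmer service are hypothetical:

namespace Drupal\mymodule\Plugin\QueueWorker;

use Drupal\Core\Queue\QueueWorkerBase;

/**
 * Rebuilds one API endpoint per queue item.
 *
 * @QueueWorker(
 *   id = "mymodule_endpoint_warmer",
 *   title = @Translation("API endpoint cache warmer"),
 *   cron = {"time" = 60}
 * )
 */
class EndpointWarmer extends QueueWorkerBase {

  /**
   * {@inheritdoc}
   */
  public function processItem($data) {
    // $data is an endpoint machine name queued during deployment.
    \Drupal::service('mymodule.endpoint_warmer')->warm($data);
  }

}

Items would be queued at deploy time, e.g. \Drupal::queue('mymodule_endpoint_warmer')->createItem('endpoint_x'), and processed on cron if the background process dies.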
Questions:
- State API usage: We are storing ~9000 entries (a fixed set, overwritten each time). Are there known performance or scalability concerns with using the State API for this purpose?
- Custom cache bin alternative: Would creating a custom cache bin that survives drupal_flush_all_caches() be a more idiomatic approach, or does that risk breaking expected Drupal behavior?
- Post-response rebuilds: Is drupal_register_shutdown_function() reliable for cache warming, or would Queue Workers be a safer pattern?
- Overall approach: Would it be better to continue optimizing the API code and rely on the existing endpoint-level caching, rather than introducing a more complex L1/L2 caching strategy? If code optimization alone is not sufficient, and we risk SNAT or connection exhaustion issues during deployment (e.g. when multiple teams run drush cr simultaneously), what would be the recommended approach to handle that scenario?
Would appreciate feedback on whether this approach aligns with Drupal best practices, or if there’s a more standard way to handle this use case.
Comments
Yes, it works, and many experienced Drupal developers have done similar things (especially for expensive external API calls or heavy computed data). That said, while it works well, it's not the most idiomatic solution.
I would use the Cache API and create a custom cache bin. Is this something you've considered? Is there a reason not to do that?
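For context, registering and using a custom bin typically looks roughly like this (module and bin names are hypothetical, not the commenter's; note that drupal_flush_all_caches() clears every bin registered with the cache.bin tag, so a custom bin does not automatically survive drush cr):

// In mymodule.services.yml (registers the bin):
//
//   cache.myapi:
//     class: Drupal\Core\Cache\CacheBackendInterface
//     tags:
//       - { name: cache.bin }
//     factory: ['@cache_factory', 'get']
//     arguments: [myapi]

use Drupal\Core\Cache\Cache;

// Use the bin with cache tags so normal entity invalidation still applies.
$bin = \Drupal::cache('myapi');
$bin->set('endpoint_x', $payload, Cache::PERMANENT, ['node_list:content_type_b']);

if ($item = $bin->get('endpoint_x')) {
  $payload = $item->data;
}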