A while back, our team encountered a puzzling production bug: URLs generated by the UrlResolver would randomly differ depending on who accessed them. The bug has since been fixed, and a patch is now available for installation.
After some initial investigation and discussion with the team, we confirmed this was one of those elusive issues: it showed up only in certain browsers or for specific users. We couldn't reproduce it locally or in our integration environment.
So what was going on?
If you're familiar with Optimizely DXP, you know it runs on Azure App Services, with your app scaled out across multiple instances. For those new to the concept: scaling out means your application code runs on several servers in parallel to handle high web traffic efficiently.
But there's a caveat: when code runs across multiple nodes, keeping their state synchronized becomes critical.
Optimizely CMS handles this using Azure Service Bus to propagate key events and updates across all nodes. Whether you're publishing content or stopping a scheduled job, those actions are broadcast so that all instances stay in sync.
In our case, however, the problem was a cache invalidation issue across nodes. One node properly refreshed its cache and generated the updated URL, while the others kept serving stale data. The URL a user saw therefore depended on which server handled the request, which explains the apparent randomness in user reports.
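To make the failure mode concrete, here is a minimal toy model in Python. It is not Optimizely's implementation: the `EventBus` and `Node` classes, the `listening` flag, and the URLs are all hypothetical, and the sketch simply assumes that the invalidation event is never applied on two of the three nodes.

```python
class EventBus:
    """Toy stand-in for a message bus: fans each event out to its subscribers."""

    def __init__(self):
        self.subscribers = []

    def publish(self, key):
        for handler in self.subscribers:
            handler(key)


class Node:
    """One app instance with its own in-memory URL cache (hypothetical)."""

    def __init__(self, name, store, bus, listening=True):
        self.name = name
        self.store = store   # shared source of truth, e.g. the database
        self.cache = {}      # local to this node only
        if listening:
            bus.subscribers.append(self.invalidate)

    def invalidate(self, key):
        self.cache.pop(key, None)

    def resolve_url(self, key):
        # Serve from the local cache; fall back to the shared store on a miss.
        if key not in self.cache:
            self.cache[key] = self.store[key]
        return self.cache[key]


store = {"page-1": "/old-url/"}
bus = EventBus()
nodes = [
    Node("A", store, bus),
    Node("B", store, bus, listening=False),  # assumption: event never applied
    Node("C", store, bus, listening=False),  # assumption: event never applied
]

# Every node serves the page once, so each one caches the old URL.
for node in nodes:
    node.resolve_url("page-1")

# The content changes and an invalidation event is published.
store["page-1"] = "/new-url/"
bus.publish("page-1")

results = {node.name: node.resolve_url("page-1") for node in nodes}
print(results)
# {'A': '/new-url/', 'B': '/old-url/', 'C': '/old-url/'}
```

Nothing here ever fails loudly. Each node returns a perfectly valid URL, and the inconsistency only becomes visible when you compare answers across nodes.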
Here's a visual to illustrate the scaled-out app structure with multiple nodes, one Azure Service Bus, and a single point of entry.
                    +----------------------+
                    |    Load Balancer     |
                    +----------+-----------+
                               |
          +--------------------+--------------------+
          |                    |                    |
    +-----v-----+        +-----v-----+        +-----v-----+
    |  Node A   |        |  Node B   |        |  Node C   |
    | (Updated) |        |  (Stale)  |        |  (Stale)  |
    +-----------+        +-----------+        +-----------+
          \                    |                    /
           \                   |                   /
            \         +--------v--------+         /
             +-------->  Azure Service  <--------+
                      |       Bus       |
                      +-----------------+
You can test this behavior yourself (in production or preproduction) if scale-out is enabled. Open your browser's developer tools, inspect your cookies, note the values of ARRAffinity and ARRAffinitySameSite, and delete both. Reload the page. If the cookies come back with a different value, you've been routed to a different node. If the value is unchanged, repeat the steps, since the load balancer may send you to the same node again.
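If you would rather script the cookie check than click through developer tools, the comparison can be sketched in Python. This is a rough sketch under assumptions: the helper names `affinity_value` and `distinct_nodes` are hypothetical, and the header values below are made-up samples rather than real cookies.

```python
def affinity_value(set_cookie_headers, name="ARRAffinity"):
    """Return the affinity cookie value from a list of Set-Cookie headers."""
    for header in set_cookie_headers:
        # The name=value pair is always the first ';'-separated part.
        first_pair = header.split(";", 1)[0]
        key, _, value = first_pair.partition("=")
        if key.strip() == name:
            return value.strip()
    return None


def distinct_nodes(responses):
    """Collect the distinct affinity values seen across several responses."""
    values = {affinity_value(headers) for headers in responses}
    values.discard(None)
    return values


# Made-up Set-Cookie headers from three requests sent without any cookies.
responses = [
    [
        "ARRAffinity=aaa111;Path=/;HttpOnly;Secure;Domain=example.com",
        "ARRAffinitySameSite=aaa111;Path=/;HttpOnly;SameSite=None;Secure;Domain=example.com",
    ],
    ["ARRAffinity=bbb222;Path=/;HttpOnly;Secure;Domain=example.com"],
    ["ARRAffinity=aaa111;Path=/;HttpOnly;Secure;Domain=example.com"],
]

seen = distinct_nodes(responses)
print(len(seen))  # 2
```

To run it against a real site, send each request without cookies and pass in the Set-Cookie headers from each response, for example via `urllib.request.urlopen(url).headers.get_all("Set-Cookie")`. Seeing more than one distinct value suggests the requests landed on different nodes.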
The takeaway? If a bug appears non-deterministically across users or browsers, consider your application’s distributed nature. Multi-node environments can introduce quirks that don't show up locally or in single-instance testing.