Cold cache, cache hit, no bears

Cache Cache – Wicked Whupsie

Why did a wicked wiki page take seconds to load when nobody was editing it? Why did a whups ticket report recalculate the same numbers over and over again even though nothing had changed? And why were we spending CPU time trying to calculate available categories and queues on every single request?

Conventional wisdom has it that if it ain’t broke, you’d better not try to fix it. But sometimes we ought to do just that. The team recently went deep into the woods to chop some logs and find out why our wicked wiki and the whups ticket system often made the server so busy that other services lagged and barely worked. There are some lessons to be learned from this.

Something for nothing – What to cache

The first lesson was that not every calculation deserves to be remembered. Many caching strategies are elaborate about their keys and still miss the point. We are looking for items that rarely change but are read often, or that are too expensive to calculate on access.

On the horde.org website we pull two RSS feeds and present them in a box: the “Planet Horde” blog roll and the “Latest News” section, which repeats the latest releases and other news. Early in our investigation we suspected that if the system as a whole was slow, pulling the feeds from Jonah through an HTTP client would make the page render even slower. So we cached the feeds [1][2]. Easy. But it wasn’t enough. We also decided to pre-warm the cache with a background script which pulls the feeds independently of visits [3]. The first visitor after a quiet period no longer pays the cost of downloading the feed.
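The split between a visitor-facing read and a background warm-up can be sketched roughly as follows. This is a minimal illustration, not the horde-web code: the `FeedCache` class, the example URL and the 600 second lifetime are all assumptions, the in-memory array stands in for a cache backend shared between the CLI script and the web request, and `$fetch` stands in for the HTTP client call.

```php
<?php
/**
 * Hypothetical feed cache: stores each feed body with the time it was fetched.
 * The array is a stand-in for a backend shared by the cron job and the website.
 */
final class FeedCache
{
    /** @var array<string, array{body: string, fetchedAt: int}> */
    private array $entries = [];

    public function __construct(
        private \Closure $fetch,     // fn(string $url): string, stands in for the HTTP client
        private int $lifetime = 600  // seconds a cached feed counts as fresh (assumed value)
    ) {}

    /** Called by the background script: always refetch and overwrite. */
    public function warm(string $url, int $now): void
    {
        $this->entries[$url] = ['body' => ($this->fetch)($url), 'fetchedAt' => $now];
    }

    /** Called while rendering the page: only fetch when missing or expired. */
    public function get(string $url, int $now): string
    {
        $entry = $this->entries[$url] ?? null;
        if ($entry === null || $now - $entry['fetchedAt'] >= $this->lifetime) {
            $this->warm($url, $now);
        }
        return $this->entries[$url]['body'];
    }
}

// Usage: count how often the (fake) HTTP client is actually called.
$calls = 0;
$cache = new FeedCache(function (string $url) use (&$calls): string {
    $calls++;
    return "<rss>feed from $url</rss>";
});

$url = 'https://example.org/feed';   // placeholder URL
$cache->warm($url, 1000);            // the cron job runs
$body = $cache->get($url, 1300);     // a visitor arrives: served from cache
```

As long as the warm-up runs more often than the lifetime, visitors never trigger a download themselves; the expiry check in `get()` is only a safety net for the case where the script stops running.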

The lesson was: cache only the calculations that are expensive, repeatable and broadly reused.

In Wicked, a fresh render on every request is only needed for pages which carry live content, such as references to Horde application blocks. All other pages only need to re-render when they change. So we only cache those pages which don’t have dynamic content [4]. That’s about 95% of them. We didn’t bother to split the pages with live content into cacheable “real wiki markup” and live dynamic content on top, but it would be a trivial extension. It’s a problem we currently don’t feel. Patches welcome.
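A rough sketch of that decision, reusing the detection pattern quoted in footnote 4. Everything else here is an assumption for illustration: the function names, the plain array used as a cache, the `$render` callback standing in for the wiki renderer, and the block name in the usage example.

```php
<?php
/** Detects block references such as "[[block ...]]", using the pattern quoted in footnote 4. */
function hasDynamicContent(string $text): bool
{
    return preg_match('/\[\[block /s', $text) === 1;
}

/**
 * Renders wiki text, consulting the cache only for pages without live content.
 * $cache is a plain array standing in for a real cache backend;
 * $render stands in for the wiki markup renderer.
 */
function renderPage(string $pageKey, string $text, array &$cache, callable $render): string
{
    if (hasDynamicContent($text)) {
        return $render($text); // live content: always render fresh
    }
    return $cache[$pageKey] ??= $render($text); // static page: render once, then reuse
}

// Usage: count how often the renderer actually runs.
$renders = 0;
$render = function (string $text) use (&$renders): string {
    $renders++;
    return '<p>' . htmlspecialchars($text) . '</p>';
};

$cache = [];
renderPage('Home', 'Welcome to the wiki', $cache, $render);
renderPage('Home', 'Welcome to the wiki', $cache, $render);      // cache hit
renderPage('Dash', '[[block whups/summary]]', $cache, $render);  // hypothetical block name
renderPage('Dash', '[[block whups/summary]]', $cache, $render);  // rendered again
```

The static page is rendered once and the page with a block twice, so the renderer runs three times for four requests.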

In Whups Ticketing, there are two high-priority candidates. Caching every single ticket isn’t attractive, but almost every page access retrieves categories, queues and ticket types: metadata which changes only rarely and through very specific actions [5]. The other high-priority item is report generation. Reports are slow and expensive [6]. They aren’t even used that much, but they generate quite some load.
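Because the metadata only changes through a handful of admin actions, a read-through cache with explicit invalidation in exactly those actions is enough. The following is a sketch under assumptions: the `MetadataCache` class, the `whups.queues` key and the queue names are invented for illustration, and the array stands in for the database table.

```php
<?php
/** Hypothetical read-through cache for rarely changing ticket metadata. */
final class MetadataCache
{
    private array $store = [];

    /** Returns the cached value, computing it via $loader on a miss. */
    public function remember(string $key, callable $loader): mixed
    {
        if (!array_key_exists($key, $this->store)) {
            $this->store[$key] = $loader();
        }
        return $this->store[$key];
    }

    /** Called from the few admin actions that change the metadata. */
    public function forget(string $key): void
    {
        unset($this->store[$key]);
    }
}

$queues = ['Support', 'Bugs'];  // stands in for the database table
$queries = 0;
$loadQueues = function () use (&$queues, &$queries): array {
    $queries++;                 // each call represents one database query
    return $queues;
};

$cache = new MetadataCache();
$cache->remember('whups.queues', $loadQueues); // miss: one query
$cache->remember('whups.queues', $loadQueues); // hit: no query

// An admin adds a queue: write to storage first, then drop the cached list.
$queues[] = 'Feature requests';
$cache->forget('whups.queues');
$fresh = $cache->remember('whups.queues', $loadQueues); // miss again, sees the new queue
```

The point of the design is that ordinary page views never need to think about freshness; only the code paths that modify queues, categories or types have to call `forget()`.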

There ain’t – When not to cache

Caching reports as generated was an intuition, but it led down the wrong path. The individual reports actually used differ slightly from each other, which reduces the number of cache hits. But they are composed of some common building blocks. Identifying these cache-worthy items allowed us to cache not the final report but the hard parts [7]. This brought sufficient performance improvements.
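To illustrate the idea of caching building blocks instead of finished reports, here is a simplified sketch. It is not the whups change from footnote 7, which moves the expensive parts into the driver; the `ReportBlocks` class, the ticket fields and the two example reports are assumptions chosen to show how differing reports can share one cached aggregate.

```php
<?php
/** Memoizes expensive aggregates so that differing reports can share them. */
final class ReportBlocks
{
    private array $cache = [];
    public int $computations = 0;

    /** @param list<array{queue: string, state: string}> $tickets */
    public function __construct(private array $tickets) {}

    /** Shared building block: ticket counts grouped by one field. */
    public function countBy(string $field): array
    {
        return $this->cache[$field] ??= $this->compute($field);
    }

    /** The "hard part": in a real system this would be a slow query. */
    private function compute(string $field): array
    {
        $this->computations++;
        return array_count_values(array_column($this->tickets, $field));
    }
}

$blocks = new ReportBlocks([
    ['queue' => 'Bugs', 'state' => 'open'],
    ['queue' => 'Bugs', 'state' => 'closed'],
    ['queue' => 'Support', 'state' => 'open'],
]);

// Two different reports, neither worth caching whole, reuse the same block.
$openShare = $blocks->countBy('state')['open'] / array_sum($blocks->countBy('state'));
$summary = [
    'byState' => $blocks->countBy('state'),
    'byQueue' => $blocks->countBy('queue'),
];
```

The per-state counts are requested three times across the two reports but computed only once; the reports themselves stay cheap to assemble and are never cached.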

Don’t jump in – When to invalidate and when not to

Not all caches are equal. Sometimes you want extra performance but your primary worry is being correct and current. Think about caching permissions. Permissions may be expensive to calculate and attractive to cache, but you must be laser-focused on invalidation. Serving an outdated approval from cache is a no-go and a security nightmare.

On the other hand, the website example is just fine without any invalidation. The cache will update when the cache warming script runs again. Users will not see the latest news or blog content for a few minutes. Horde.org is not a news website and the world won’t hold its breath to read the latest updates. We will be fine with a bit of stale cache.

An even smarter option is building systems in a way that you don’t have to invalidate the cache at all. Think about a wiki page. A user edits the wiki page and generates a new revision of the page. New content is served under the same URL as before. But wicked doesn’t bother to invalidate the cache. Wicked keys the cache not by the page ID alone but also by the page revision [8]. It is safe to never actively expire the cached rendering result. It will age out, and it won’t be used a lot, as most access goes to the most recent version. Access to older revisions is relatively rare.
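A minimal sketch of revision-keyed caching, using the key shape quoted in footnote 8. The parameter types, the array used as a cache and the trivial renderer are assumptions for illustration only.

```php
<?php
/** Builds the key in the shape quoted in footnote 8: page ID plus revision. */
function renderCacheKey(int $pageId, string $version): string
{
    return 'wicked.render.' . $pageId . '.' . $version;
}

$cache = [];   // stands in for the real cache backend
$renders = 0;

// Serves a page revision, rendering only when this exact revision is not cached yet.
$show = function (int $pageId, string $version, string $text) use (&$cache, &$renders): string {
    $key = renderCacheKey($pageId, $version);
    if (!isset($cache[$key])) {
        $renders++;
        $cache[$key] = '<p>' . htmlspecialchars($text) . '</p>';
    }
    return $cache[$key];
};

$show(7, '1.0', 'First draft');
$show(7, '1.0', 'First draft');             // cache hit
$edited = $show(7, '1.1', 'Second draft');  // new revision, new key, no invalidation call
```

After the edit, the entry for revision 1.0 is still in the cache but nothing asks for it anymore, so it can simply age out under whatever lifetime the backend applies.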

There ain’t no bears in there

Caching is great, but sometimes your problems are more fundamental. Sometimes you need the performance of a statically rendered page rather than just a cached pre-rendered content area in an otherwise still dynamically PHP-rendered canvas.
Think further: should the user’s request hit your server at all? Should a caching front proxy or HTTP-level caching prevent the request from even hitting your primary application? There are no universal answers, and render-to-static approaches have their own limitations and issues. The solution should fit your problem, and over-generalizing rarely helps.
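As one possible shape of HTTP-level caching, the sketch below computes a status code and headers for a cacheable response. The function name, the 300 second `max-age` and the return format are assumptions; the `If-None-Match` handling is deliberately simplified to a single exact match and ignores lists and weak validators.

```php
<?php
/**
 * Decides status and headers for a cacheable response.
 * $ifNoneMatch is the value of the request's If-None-Match header, if any.
 */
function cacheableResponse(string $body, ?string $ifNoneMatch, int $maxAge = 300): array
{
    $etag = '"' . sha1($body) . '"';
    $headers = [
        // Lets browsers and a front proxy reuse the response for $maxAge seconds.
        'Cache-Control' => 'public, max-age=' . $maxAge,
        'ETag' => $etag,
    ];
    if ($ifNoneMatch === $etag) {
        // The client already holds this exact content: send no body.
        return ['status' => 304, 'headers' => $headers, 'body' => ''];
    }
    return ['status' => 200, 'headers' => $headers, 'body' => $body];
}

$first = cacheableResponse('<html>hi</html>', null);
$again = cacheableResponse('<html>hi</html>', $first['headers']['ETag']);
```

With `Cache-Control: public`, a caching proxy in front of the application can answer repeat requests on its own until the lifetime runs out. The ETag check only saves bandwidth, because this sketch still has to produce the body before it can compare.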

Footnotes

  1. Add caching for the feeds (horde-web)

  2. Make feed URL and timeout configurable (horde-web)

  3. Add a CLI script for feed retrieval / cache warming (horde-web)

  4. Allow caching pages which don’t contain dynamic content (wicked). Dynamic content detection: preg_match('/\[\[block /s', $text) === 1

  5. Cache for metadata which only changes on admin actions (whups)

  6. Cache generated reports (whups)

  7. Offload expensive parts of report generation to driver (whups)

  8. Same commit as 4 (wicked) — cache key by revision: 'wicked.render.' . $this->currentPageId . '.' . $this->currentPageVersion

