LifeInVistaprint

February 3rd, 2015

Disk contention in ASP.NET display templates

Author: Gary Schorer

The holiday season is peak time for Vistaprint. Activity in the days surrounding Black Friday and Cyber Monday can easily double that of normal days, mostly thanks to holiday cards and personalized gifts, at times reaching 2,000 requests per second. In preparation, we run stress tests on our customer-facing servers. We simulate load by running on fewer and fewer servers until our metrics say we’ve pushed them too far. Then we extrapolate those numbers to decide if we have enough capacity for the upcoming peak (plus some headroom, of course).

This year, we saw some concerning results when testing our gallery pages, which showcase designs that customers can choose from. The pages became unresponsive at a load level that was dangerously close to our projections for Cyber Monday.

[Figure: A gallery page (excerpt)]

The Problem

Internal monitoring indicated that the pages were tripping over each other when accessing disk, as shown by the partial stack trace below:

In a nutshell, TemplateHelpers tries to find a view for an editor template. It asks a ViewEngine, which asks a BuildManager, which eventually checks disk.

Before we dig into that further, let’s pause for a brief summary of editor and display templates, which are a feature of ASP.NET MVC. They provide a convenient mechanism for views to have an inheritance structure similar to classes. Display templates render objects in a read-only format, while editor templates render them read-write.

Suppose you have an Automobile class, with Car and Truck classes inheriting from it. Templates allow you to add files named like “Car.cshtml” and “Automobile.cshtml” to a specially named directory (/Views/Shared/EditorTemplates for editor templates, /Views/Shared/DisplayTemplates for display templates). When a view calls @Html.DisplayFor(m => m.Vehicle), a template is selected by inspecting the object’s class hierarchy. For an object of type Car, ASP.NET MVC finds Car.cshtml and uses it. For a Truck it finds no such file, so it moves up the class hierarchy and checks for Automobile.cshtml, which in this example does exist. If no template is found anywhere in the hierarchy, ASP.NET MVC generates one based on the members of the type.
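The lookup order described above can be modeled in a few lines. This is a language-neutral sketch in Python, not ASP.NET code: the set of files and the function names are assumptions made for illustration, and Python's root type is named "object" where .NET's is "Object".

```python
# Language-neutral model of the template lookup; not the ASP.NET MVC implementation.
class Automobile: pass
class Car(Automobile): pass
class Truck(Automobile): pass

# Hypothetical contents of the DisplayTemplates directory.
TEMPLATES_ON_DISK = {"Car.cshtml", "Automobile.cshtml"}

def candidate_names(obj_type):
    """Yield template file names from the most specific type to the root type."""
    for cls in obj_type.__mro__:
        yield f"{cls.__name__}.cshtml"

def resolve_template(obj_type, templates=TEMPLATES_ON_DISK):
    """Return the first matching template, or None when one must be generated."""
    for name in candidate_names(obj_type):
        if name in templates:
            return name
    return None

print(resolve_template(Car))    # Car.cshtml
print(resolve_template(Truck))  # Automobile.cshtml
print(resolve_template(int))    # None: no template anywhere in the hierarchy
```

Note that the last case is the expensive one: every name in the hierarchy is tried before the lookup gives up.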

Based on the stack traces we saw above, we monitored disk I/O on our servers and saw many attempts to read files from disk. This behavior was both undesirable and unexpected, given that we had precompiled our web app. Where was it coming from? We were seeing many attempts to search for files like List`1.cshtml, Collection.cshtml and Object.cshtml. The problem turned out to be a confluence of the caching strategies used by TemplateHelpers and BuildManager.

Let's walk through what's happening behind the stack trace in a little bit more detail, though simplified for readability.

[Figure: template lookup sequence diagram]

There's a lot going on here, so let's recap the highlights:

  • Looping can occur in several places. TemplateHelpers loops over each ViewEngine. Internally, each ViewEngine loops over directories it knows about, file extensions it knows about, and display modes it knows about.
  • BuildManager keeps a static cache of views it has found on disk.
  • TemplateHelpers keeps a request-scoped cache of what it actually used to render an object of a given type.

Note what happens when we look for Truck. There is no static cache keeping track of the fact that, for some types, there is nothing on disk! Each request may ask for a Truck view only once, but it still happens on every request, even though the answer never changes (at least for precompiled applications). Also, each request for a non-existent view actually hits disk several times. At Vistaprint, we have two view engines (Razor and WebForms), each of which has two directories to search, two file extensions to inspect, and (potentially) two display modes to consider. A single missing link in the chain can therefore hit disk as many as 16 times (2 × 2 × 2 × 2). And that penalty is paid on every request, for every missing type in the hierarchy. The absolute worst outcomes result when there are no templates specified at all. ASP.NET will gladly generate one for you on the fly, but not before walking the entire class hierarchy. This penalty can also be unexpectedly harsh for collections (e.g. List<T>), where it will search for List`1, Collection and Object before finally giving up.
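To make the arithmetic concrete, here is a small Python sketch that enumerates the paths probed for one missing template name. It is a model only: the directory names, extensions, and display-mode suffix are assumptions chosen to match the two-by-two-by-two-by-two count given above, not a dump of our actual configuration.

```python
from itertools import product

# Assumed values for illustration; only the counts (2 of each) come from the text.
VIEW_ENGINES = ["Razor", "WebForms"]
DIRECTORIES = ["Views/Gallery", "Views/Shared"]          # hypothetical
EXTENSIONS = {"Razor": ["cshtml", "vbhtml"], "WebForms": ["ascx", "aspx"]}
DISPLAY_MODES = ["Mobile", ""]                           # "" is the default mode

def probe_paths(type_name):
    """Yield every path checked when no template exists for type_name."""
    for engine, directory, mode in product(VIEW_ENGINES, DIRECTORIES, DISPLAY_MODES):
        for ext in EXTENSIONS[engine]:
            suffix = f".{mode}" if mode else ""
            yield f"{directory}/DisplayTemplates/{type_name}{suffix}.{ext}"

def probes_for_missing_chain(type_names):
    """Total probes when every name in the chain is missing."""
    return sum(len(list(probe_paths(name))) for name in type_names)

# One missing name costs 16 probes.
print(len(list(probe_paths("Truck"))))
# A collection with no templates tries three names (List`1 is the CLR name of List<T>).
print(probes_for_missing_chain(["List`1", "Collection", "Object"]))
```

Under these assumptions a single untemplated collection costs 48 probes, so a page rendering a handful of untemplated types reaches hundreds of probes quickly.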

At the end of the day, we concluded that every hit to our gallery page was checking disk more than 300 times looking for files that were never present.

The Solution

We considered several strategies for resolving this problem. Our first idea was to add templates for the most frequently accessed types to eliminate the cache misses. There were dozens of templates, however, so this would’ve required extensive time and testing to confirm that we didn’t unintentionally change the layout of the page. We didn’t think we had enough time to execute that successfully.

The other option was to manipulate the request-scoped cache used by TemplateHelpers. This ultimately seemed like the most viable resolution. We decided to prime the cache using an action filter. Our filter keeps its own static cache of the complete misses, i.e. the cases where no templates exist at all and ASP.NET has to generate them programmatically. At the beginning of each request for this page, our filter copies that data into the request-scoped cache used by TemplateHelpers. This isn’t exactly a supported operation, but TemplateHelpers’ cache lives in HttpContext, so it is easy enough to leave something there for it to find later on. While the request is executing, it may find additional templates it needs, which also get added to its request-scoped cache. At the end of the request, our filter runs again. This time, it looks for any changes made during the latest request and copies them back to the global cache. Each request benefits from the discoveries of the previous one, and in pretty short order the cache is filled and disk hits almost disappear.

After verifying the change didn’t introduce any new bugs, we patched half the servers and reran our stress test. The patched servers went from hundreds of disk hits per page to just a handful of hits per page. Under normal operating conditions, this didn’t produce a significant performance gain. Under load, however, the results were more dramatic. Unpatched servers immediately ground to a halt, with page load times in the 20-30 second range. Patched servers kept in line with their normal performance, handling the page in about one second.

[Figure: template cache elapsed time]

When Cyber Monday arrived, our load projections proved accurate. Had we not addressed this disk contention issue, our gallery pages would have become unresponsive. Today’s lesson: while editor and display templates are great, use them with discretion. Deep class hierarchies, with corresponding template structures that are sparsely populated, can produce very poor performance under load.

The source code for the caching attribute can be found here. However, take it with a grain of salt. We targeted this at a single controller. It would need a little TLC before it could handle things like multiple areas.

Gary Schorer is a Senior Lead Software Engineer at Vistaprint.
