Meltdown/Spectre – the security/performance tradeoff

It is axiomatic in the InfoSec realm that keeping patch levels current is one of the most critical steps for any IT shop. In the performance realm, it is a rule of thumb (Rule of Thumb = Axiomatic - m) that any update causing significant performance degradation should be thoroughly investigated, evaluated and, where possible, ameliorated before deployment into production. The Meltdown/Spectre security flaws now present a clear case where the security risk may be in direct opposition to the performance requirements.

In my previous post I spoke of the necessity of quantifying the performance impact. This must be done for ALL enterprise applications. The risk of not patching systems must also be quantified. In applications having a multi-level architecture, both the risk and the performance effects must be quantified at each level. (As an aside, I’d do this one level at a time, starting with the level furthest from the workload source: test each level separately, then add levels from the far end back toward the source.) The choice between that which is axiomatic and that which is RoT-ic can only be made at the business owner level. I do not envy them the choice.
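
One way to read that ordering, sketched in Python. The tier names, function names and timing figures are hypothetical and only illustrate the idea; they do not come from any particular tool, and your own architecture may call for a different sequence.

```python
def patch_test_plan(tiers_from_source):
    """Return the sets of tiers to patch for each test run.

    `tiers_from_source` lists the tiers starting at the workload source
    (for example a web tier) and ending at the far end (for example a
    database tier). This is an assumed reading of the ordering above:
    first each tier on its own, farthest first, then cumulative sets
    built from the far end back toward the source.
    """
    far_to_near = list(reversed(tiers_from_source))
    isolated = [[tier] for tier in far_to_near]
    cumulative = [far_to_near[:n] for n in range(2, len(far_to_near) + 1)]
    return isolated + cumulative


def degradation_pct(baseline_ms, patched_ms):
    """Percentage increase in response time relative to the unpatched baseline."""
    return (patched_ms - baseline_ms) * 100.0 / baseline_ms


# Hypothetical three-tier application.
for run in patch_test_plan(["web", "app", "db"]):
    print("patch and measure:", run)
```

Each run yields a baseline and a patched measurement, and `degradation_pct` turns the pair into the percentage that the decision needs.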

For my personal computing environment, where the risk is near 0 and the probability of a performance hit is near 1, the choice seems clear. In a mission-critical business application, the decision will be more difficult. Even in an environment where the security risk is low and the probability of missed SLAs is high, this won’t be an easy choice. Quantifying both risk and performance will at least provide a sound basis for decision making.

Being mostly a Performance Engineer, I inhabit a world of tradeoffs, so let me propose a few Rules of Thumb:

·     If the performance degradation is low (< 5%) or otherwise acceptable – patch.

·     If the performance hit is between 10% and 25% and the risk is low to medium, then:

o  If you will not miss SLAs – patch and begin tuning to gain back what you lost.

o  If you will miss SLAs – don’t patch and start tuning efforts so you can patch.

·     If the performance hit is > 25%, the risk is above medium, and missing SLAs is not a viable option, then:

o  Delay patching

o  Begin vigorous tuning efforts across the architecture to decrease execution/response times.

o  Investigate alternative deployments on unaffected chipset architectures.

o  Manage the risk while strengthening your intrusion prevention (IP) and intrusion detection (ID) capabilities, shrinking the risk window until such time as the patch will not torpedo your business.
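
These rules can be written down as a small decision function. What follows is a minimal sketch, not a definitive policy: the `Risk` levels, the function name and its parameters are all hypothetical. The rules leave some cases uncovered (a hit between 5% and 10%, for instance), and the sketch assumes those go back to the business owner.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Hypothetical three-step risk scale."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def recommend(hit_pct, risk, will_miss_sla,
              hit_acceptable=False, sla_miss_tolerable=False):
    """Map a measured performance hit and an assessed risk to an action."""
    # Rule 1: low (< 5%) or otherwise acceptable degradation.
    if hit_pct < 5 or hit_acceptable:
        return "patch"
    # Rule 2: moderate hit, low-to-medium risk; the SLA decides.
    if 10 < hit_pct < 25 and risk <= Risk.MEDIUM:
        if will_miss_sla:
            return "hold the patch; tune so you can patch"
        return "patch; tune to gain back what you lost"
    # Rule 3: large hit, elevated risk, SLAs cannot be missed.
    if hit_pct > 25 and risk > Risk.MEDIUM and not sla_miss_tolerable:
        return ("delay patching; tune, investigate unaffected platforms, "
                "manage the risk")
    # Assumption: anything the rules do not cover is escalated.
    return "no rule applies; escalate to the business owner"


print(recommend(15, Risk.LOW, will_miss_sla=True))
```

Writing the rules this way also makes the gaps visible, which is a useful prompt for deciding what your own shop should do in those ranges.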
