April 23rd, 2014 - by Andrew Sytsko
Your processing power – or, more specifically, your hardware or host’s processing power – is your ability to do business online. A loss of processing power leaves your customers struggling to complete straightforward interactions with your business. More often than not, this creates an impression of sloppy customer service, one that deepens the longer your technical support team takes to troubleshoot the problem.
The crucial point to recognize is that such performance regressions are not isolated technical issues but genuine flaws in service provision. Regular monitoring of both your CPU load at single moments and graphs of these ‘snapshots’ over weeks, months and years will give you an accurate picture of your system’s capabilities.
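As a rough sketch, the snapshot-versus-trend idea can be expressed in a few lines of Python. The figures below are hypothetical hourly load averages, not data from any real system:

```python
from statistics import mean

# Hypothetical CPU-load snapshots (percent), averaged per day over two weeks.
week_one = [35, 40, 38, 42, 37, 41, 39]
week_two = [55, 58, 61, 57, 60, 59, 62]

def trend(old, new):
    """Percentage-point rise in average load between two monitoring periods."""
    return mean(new) - mean(old)

# A single snapshot from week two (say, 62%) tells you little on its own;
# a sustained rise between periods is what signals a capacity problem.
rise = trend(week_one, week_two)
```

Any one reading in either week could be dismissed as noise; it is the comparison across periods that turns snapshots into an accurate picture.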
Fail to prepare, and you’re preparing for failure
If your graphs are rising – demonstrating increased load on your service and longer waiting times for your customers – this is a fair indication that your system requires some kind of improvement.
Depending on the functions at play, you might need to upgrade your hardware – or your hosting service may soon be emailing to raise its fees for your over-exuberant CPU usage. Either way, understanding precisely which solutions are required makes sophisticated monitoring essential.
For example, a snapshot CPU usage of 100% sounds dire: that’s all your capacity utilized, with nothing spare if another customer should enter a transaction on your site. However, 100% usage might be entirely justified depending on the function engaged or the number of users active.
The only way you’ll know is by analyzing your peak computational power – comparing how specific functions perform over time. If a given process hogs 100% of your CPU for 20 minutes every day at 3am, this might well be justifiable and unproblematic for business. Scheduling high-burden tasks for when fewer customers are online makes good sense. However, high usage that does not originate from user requests is suspicious. And if a task that executed in 2 seconds yesterday takes 5 seconds of CPU time today, that spells technical problems for customers and losses for your business. Peak computational power will tell you whether you should be worried in the first place – and, if so, where your solution lies.
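The “2 seconds yesterday, 5 seconds today” check can be automated. This is a minimal sketch with made-up task names and timings; a real version would read both dictionaries from your monitoring tool’s logs:

```python
# Hypothetical per-task CPU times (seconds), yesterday vs. today.
yesterday = {"checkout": 2.0, "report": 1.2, "nightly_backup": 30.0}
today     = {"checkout": 5.0, "report": 1.3, "nightly_backup": 29.0}

def regressions(before, after, threshold=1.5):
    """Flag tasks whose runtime grew by more than `threshold` times,
    returning each offender with its slowdown ratio."""
    return {task: after[task] / before[task]
            for task in before
            if task in after and after[task] > before[task] * threshold}

flagged = regressions(yesterday, today)  # only "checkout" crosses the line
```

The threshold keeps day-to-day jitter (the report task’s 1.2s to 1.3s) from raising false alarms, while a genuine regression like checkout’s 2.5x slowdown is caught.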
The key issue is to be able to predict capacity, in order to provide a reliable service for your clients.
Solutions, not stop-gaps
Depending on the sophistication of your system monitoring, you’ll be able to identify which CPU usage is problematic (as we’ve seen, not all high CPU usage is bad usage). You’ll then be able to pinpoint the solutions that work best for your system, and for your business.
Increased CPU usage might point to hardware upgrades, enhanced quotas from your host – or more serious underlying performance regressions, which you’ll want to investigate in order to avoid unnecessary overhaul and upgrade costs. Pragmatic, rather than technical, solutions include rescheduling lengthy processes to hours when site activity is low. You can also cancel unnecessary processes altogether, once your system monitoring has shown you where they are consuming CPU.
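Picking out which processes to reschedule or cancel is a simple filter once your monitoring gives you a process table. The table below is invented for illustration – the key distinction, as above, is between load driven by user requests and background load:

```python
# Hypothetical process table, as a monitoring tool might report it:
# (name, cpu_percent, driven_by_user_requests)
processes = [
    ("db_backup",    95.0, False),
    ("web_worker",   40.0, True),
    ("thumbnailer",  25.0, False),
    ("cron_cleanup",  5.0, False),
]

def reschedule_candidates(procs, cpu_floor=20.0):
    """Background processes above the CPU floor are candidates for
    off-peak scheduling or cancellation; user-driven load is left alone."""
    background = [(name, cpu) for name, cpu, user_driven in procs
                  if not user_driven and cpu >= cpu_floor]
    return sorted(background, key=lambda item: item[1], reverse=True)
```

Here the backup and thumbnailer jobs surface as candidates for a 3am slot, while the web worker – however busy – is exactly the load your CPU exists to serve.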
If the problem indicated by your rising graphs is a software bug, for instance, you’ll be constantly chasing greater CPU capacity when the long-term fix lies with your software developers. A familiar problem, readily detected through diligent monitoring, sees CPU usage rocket when unoptimized SQL queries execute – whether because of inadequate software updates or unsynchronized versions. Of course, malware is another common cause of overactive CPU usage, making anti-virus scans indispensable.
Monitoring for the win
Monitoring your CPU usage achieves two objectives: it enables you to identify overactive CPU usage, and it provides evidence of performance regressions for third parties.
In the first place, you only want to expend time, energy and resources on problems that require a fix. As long as you’ve performed due diligence in monitoring your CPU usage, high usage in a snapshot moment might well be justified, in that it doesn’t interfere with business or pose any threat to system performance over time. Of course, you’ll need the context of performance over time to tell you this. Comparing timeline graphs – such as user inquiry execution time – will enable you to channel your energies into providing steady performance, with improvements over time.
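One simple way to build that timeline context is to smooth snapshot readings with a rolling average, so a one-off spike doesn’t get mistaken for a trend. The response times here are hypothetical:

```python
# Hypothetical daily execution times (ms) for a user inquiry; day 5 has a spike.
samples = [120, 118, 125, 122, 180, 119, 121]

def rolling_average(values, window=3):
    """Average each run of `window` consecutive readings, smoothing out
    isolated spikes so the underlying trend is visible."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

smoothed = rolling_average(samples)
```

In the smoothed series the day-5 spike barely moves the needle – evidence of steady performance, not a regression worth chasing.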
In the second place, where host servers or external software developers are implicated in performance regressions, CPU usage monitoring gives you the indisputable evidence you need to demonstrate unacceptable system impact. If your virtual machine sits on a server on which the provider decides to run a couple more virtual machines, smart monitoring will show you this. Where performance losses mean lost customers, CPU usage monitoring is self-evidently crucial.