anturis.com

Get Insight into Your Website Performance with Key Apache Statistics

- Clifford -

The Apache web server has been with us since 1995 and has been the most popular web server in the world since April 1996. For any given website there is more than a 50% probability that it is running on Apache.

While it is highly efficient and secure, with performance comparable to modern ‘high-performance’ web servers, Apache has an extensible modular architecture and is known for its configuration complexity. An incorrect or misaligned Apache configuration may cause a variety of problems for the websites it serves, such as slow responses or occasional denial of service.

First Steps

Understanding website performance and uptime should begin with external monitoring, which allows you to discover problems from a user’s perspective. Such problems include:

  • Response timeout – no timely response from the server;
  • HTTP errors instead of a response; and
  • Slow response – say, compared with last month’s average.

All these problems should be confirmed with monitoring from multiple locations to rule out network-related problems. Only when the same issue is observed from several locations at the same time can we conclude that something is wrong on the server side.

The next step is to check that everything is OK with the server itself. This is where basic metrics like CPU, memory, disk and swap come into play. They add more shades to the picture and give you some guidance on where to dig further.

The most frequent causes of website performance issues are a malfunctioning, buggy or unoptimised application (e.g. a PHP script) and cumbersome DB queries. But before digging deep into profiling an application and its queries, it makes sense to check the Apache configuration and statistics – and this is what this post is about.

Introducing Apache Statistics Module: mod_status

Mod_status provides a current view of key Apache parameters and a snapshot of all its request handlers with their statuses. The information is provided in the form of a web page – you can take a look at the Apache Project’s own status page here: http://www.apache.org/server-status.

Apache is normally compiled with mod_status built in but not enabled. Check your Apache configuration file (often at /etc/httpd/conf/httpd.conf – but it can be in many other locations, so you may want to search for httpd.conf), and find these or similar lines within one of your hosts or virtual hosts:

	<Location /server-status>
		SetHandler server-status
		Order deny,allow
		Deny from all
		Allow from 127.0.0.1
	</Location>

Some default Apache configurations have these lines commented out. The above configuration makes statistics available at http://127.0.0.1/server-status. The Allow statement ensures that this URL is only available locally – it is wise for security reasons to limit access to this information. Note that Order, Deny and Allow are Apache 2.2 access-control directives; on Apache 2.4 a similar restriction is written as Require ip 127.0.0.1.

To enable extended status you’ll need to add the following line to the end of the httpd.conf file:

	ExtendedStatus On

Note that the collection of extended status information may slightly slow down the server.

After you make and save changes to the configuration file, reload Apache as shown below (again, the command may differ depending on the Apache distribution):

	service httpd reload

Now you can access Apache statistics. You may also want to make your monitoring system read and store this data. That way you can set up alerts for abnormal situations (such as too few idle workers – read about this below), track historical trends and compare data for different time periods.
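For a monitoring script, mod_status also offers a machine-readable variant of the page, obtained by appending ?auto to the URL. The following Python sketch parses that output into a dictionary. The function names parse_status and fetch_status and the sample text are illustrative assumptions; the URL simply matches the Location block shown earlier, and the exact set of fields depends on your Apache version and on whether ExtendedStatus is on.

```python
from urllib.request import urlopen

# Assumed URL: matches the <Location /server-status> block shown earlier.
STATUS_URL = "http://127.0.0.1/server-status?auto"


def _convert(value):
    """Return value as int or float when possible, else as the original string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_status(text):
    """Turn the 'Key: value' lines of server-status?auto into a dict."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep:
            continue  # skip lines that are not 'Key: value' pairs
        value = value.strip()
        # The scoreboard is always kept as a string of state characters.
        status[key] = value if key == "Scoreboard" else _convert(value)
    return status


def fetch_status(url=STATUS_URL, timeout=5):
    """Fetch and parse the status page (needs a running, reachable Apache)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_status(response.read().decode("ascii", "replace"))


# Made-up sample output, used here so the sketch runs without a server.
SAMPLE = """Total Accesses: 1200
Total kBytes: 4800
ReqPerSec: 2.5
BytesPerSec: 10240
BytesPerReq: 4096
BusyWorkers: 4
IdleWorkers: 6
Scoreboard: _RW_K___W_
"""

stats = parse_status(SAMPLE)
print(stats["BusyWorkers"], stats["IdleWorkers"], stats["ReqPerSec"])  # 4 6 2.5
```

A monitoring system would call fetch_status() on a schedule and store the resulting values for later comparison.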

Let’s now examine what kind of statistics are provided by Apache and how these may be useful.

Apache processes and threads architecture

Recent Apache versions implement a hybrid multi-process, multi-threaded model to serve requests. This means that you will see multiple Apache processes running and that each process contains multiple threads. This offers a good trade-off between efficiency and stability. Each server thread that processes requests (there are also other types of threads) is called a worker.

Apache always maintains a number of idle (spare) workers across all the processes, as this allows it to immediately assign a request to a thread for processing, without the need to spawn a thread, which would heavily increase processing latency. Workers that are already processing requests are called busy workers. Depending on the number of idle workers, Apache forks new processes or kills existing ones. So under normal conditions the number of idle workers should be more or less stable thanks to Apache self-regulation.

The way Apache forks processes and threads is defined by a particular Multi-Processing Module (MPM) implementation. These modules are also responsible for binding to ports, accepting connections and dispatching them to workers. There are several MPMs, depending on the OS, with prefork and worker being the most popular on the Unix family. The difference is that prefork doesn’t use threads and forks all the necessary processes in advance, while worker makes use of both processes and threads. Thus prefork is less memory-efficient but offers more stability with applications that are not thread-safe.

Workers’ configuration

A typical Multi-Processing Module (MPM) configuration looks like this (taken from apache.org):

ServerLimit       16
StartServers      2
MaxClients        150
MinSpareThreads   25
MaxSpareThreads   75
ThreadsPerChild   25
MaxRequestsPerChild  10000

The following is a brief explanation of these configuration directives:

  • ServerLimit is a hard limit on the number of active Apache child processes. It should follow this rule: ServerLimit >= MaxClients / ThreadsPerChild.
  • StartServers is the number of child processes launched initially.
  • MaxClients is a very important parameter that sets the maximum number of workers (all threads in all processes), and therefore the maximum number of client requests that may be served simultaneously. Any connection attempts over the MaxClients limit will normally be queued, up to a number set by the ListenBacklog directive. Note that in Apache version 2.4 this directive is renamed to MaxRequestWorkers.
  • MinSpareThreads and MaxSpareThreads are the lower and upper bounds on the number of idle workers.
  • ThreadsPerChild specifies the fixed number of threads created by each child process.
  • MaxRequestsPerChild is the number of served requests (or connections, depending on the particular type of MPM in use) after which the child process will die. The purpose of this directive is to limit the effect of accidental memory leaks.
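The ServerLimit rule can be checked with a few lines of arithmetic. This is a minimal Python sketch using the values from the sample configuration above; the helper name processes_needed is made up for the illustration.

```python
import math

# Values copied from the sample MPM configuration above.
config = {
    "ServerLimit": 16,
    "MaxClients": 150,
    "ThreadsPerChild": 25,
}


def processes_needed(max_clients, threads_per_child):
    """Child processes required to run max_clients worker threads."""
    return math.ceil(max_clients / threads_per_child)


needed = processes_needed(config["MaxClients"], config["ThreadsPerChild"])
print(needed)                           # 6
print(config["ServerLimit"] >= needed)  # True
```

Here 150 workers at 25 threads per process need 6 child processes, so a ServerLimit of 16 leaves room to raise MaxClients later.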

Know what workers are working on

Mod_status provides information about what each worker is doing in the form of a scoreboard, which looks like this:

_RRR_RRRRRKR_WR___R_KWW_RRR_RR_RWRWR_R_RWRR_RK__K_RRRRRR__RRRWRR
_RRR__W_K__RR___WR___RW_RRR_WRR__WK_R_RKR__R_RRR_KRWWWRR_RRRW___
________________________________________________________________
_R_____________________________________R________________________
_R______R_________R___R_______________W__________W___K__________
__R______________R_RR______________________R____________________
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
RRK_RR_WR___R____RRR_R_R_R__RR_RRWWW__R__R__RRK_R__R_RWW____R__R
W_RR____RW_RRW____R___RRW__RWR_RR__KRWKR_R___R_WR____R_RRRRR_RKR
................................................................
................................................................
................................................................

Each character has the following meaning:

Idle workers:
"_" Waiting for Connection
Busy workers:
"S" Starting up
"R" Reading Request
"W" Sending Reply
"K" Keepalive (read)
"D" DNS Lookup
"C" Closing connection
"L" Logging
"G" Gracefully finishing
"I" Idle cleanup of worker
No worker running (but configuration allows it to be started if needed):
"." Open slot with no current process

Normally, the majority of workers should be in the R/W or idle (“_”) state. If you see a large number of workers in other states such as “K”, “D” or “L”, this hints at a corresponding problem with keep-alive settings, DNS resolution or logging.
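Counting the scoreboard characters by hand is tedious, so a script can do it. The Python sketch below tallies idle workers, open slots and busy workers by state, using the legend above; the function name summarize_scoreboard and the short sample string are made up for the illustration.

```python
from collections import Counter

# Scoreboard characters as listed in the legend above.
IDLE = "_"
OPEN_SLOT = "."


def summarize_scoreboard(scoreboard):
    """Count idle workers, open slots and busy workers (broken down by state)."""
    # Drop line breaks and spaces so a multi-line scoreboard can be pasted in.
    counts = Counter("".join(scoreboard.split()))
    idle = counts.pop(IDLE, 0)
    open_slots = counts.pop(OPEN_SLOT, 0)
    return {
        "idle": idle,
        "open": open_slots,
        "busy": sum(counts.values()),
        "busy_by_state": dict(counts),
    }


summary = summarize_scoreboard("_RRK_W..\n__KD_R..")
print(summary["idle"], summary["open"], summary["busy"])  # 5 4 7
print(summary["busy_by_state"]["K"])                      # 2
```

A growing share of a single state such as K, D or L in busy_by_state points to the matching setting to review.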

Anything unusual with the traffic?

Another good starting point for troubleshooting website problems is to check whether traffic was or is unusual. An unexpected surge in traffic may be caused by many factors, ranging from a DoS attack to a marketing campaign that the sysadmin was not aware of. Mod_status provides the following information about the traffic (if ExtendedStatus is enabled): the number of requests per second, the number of bytes served per second and the average number of bytes per request. Comparing these values to historical data may reveal abnormalities, such as unusually high or low traffic.
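One simple way to compare a current value with historical data is to measure how many standard deviations it lies from the historical mean. The Python sketch below does this; the threshold of 3 and the sample figures are arbitrary assumptions, not recommendations, and a real monitoring system may use a more robust method.

```python
from statistics import mean, stdev


def is_unusual(current, history, threshold=3.0):
    """Return True if current lies more than `threshold` sample standard
    deviations away from the mean of the historical values."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    spread = stdev(history)
    if spread == 0:
        # All historical values are identical: any difference is unusual.
        return current != history[0]
    return abs(current - mean(history)) / spread > threshold


# Hypothetical requests-per-second samples for the same hour on earlier days.
history = [40.0, 42.0, 38.0, 41.0, 39.0]
print(is_unusual(43.0, history))   # False
print(is_unusual(120.0, history))  # True
print(is_unusual(10.0, history))   # True (unusually low traffic)
```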

Take a look at Apache logs

Apache, like any other good software, writes logs. There are two logs: error.log and access.log. The first stores all errors, such as the failure to start a module or a process. It also contains errors encountered when processing visitor requests, including those reported to the visitor, such as ‘document not found’ and other 400-series errors. It also contains any PHP or other app-level errors encountered by Apache modules. So the error log is definitely worth a look.

Access.log keeps a record of all visitors and all their requests. As there are usually a great many such requests, this log is most useful in combination with traffic analytics software. Good traffic analytics can spot unusual visitor behavior, such as malicious activity when someone tries to hack your site.
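As a small taste of what such analysis involves, the Python sketch below counts requests per client and per status code. It assumes the log lines start with the Common Log Format fields (host, identity, user, time, request, status, size); your LogFormat directive may differ. The function name and the sample lines are made up.

```python
import re
from collections import Counter

# Common Log Format prefix: host ident user [time] "request" status size
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def summarize_access_log(lines):
    """Count requests per client host and per HTTP status code."""
    hosts, statuses = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # ignore lines in an unexpected format
        hosts[match["host"]] += 1
        statuses[match["status"]] += 1
    return hosts, statuses


# Made-up sample lines; real input would be an open access.log file.
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2023:13:55:37 +0000] "GET /wp-login.php HTTP/1.1" 404 196',
    '198.51.100.7 - - [10/Oct/2023:13:55:38 +0000] "GET /about.html HTTP/1.1" 200 1024',
]
hosts, statuses = summarize_access_log(sample)
print(hosts.most_common(1))  # [('203.0.113.5', 2)]
print(statuses["404"])       # 1
```

A single host with many 404 responses for login or admin pages is the kind of pattern worth a closer look.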

Sizing Apache to the server

The most important server characteristic in relation to a web server is the amount of RAM. A web server should never swap, because swapping makes request processing latency unacceptable. So it is important to understand the maximum number of workers that fit in the RAM of your server and to adjust the MaxClients (or MaxRequestWorkers) directive accordingly. To do this, observe how much RAM is consumed by how many Apache processes under normal conditions. Divide the first by the second to get the average memory per process, then divide your total available physical memory by that figure to estimate how many Apache processes you can have on this server.
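The calculation can be sketched in Python as follows. All the figures are hypothetical, and the reserved_mb parameter (memory set aside for the OS, the database and other services) is an assumption added for the illustration. Treat the result as a rough estimate, since memory shared between processes makes simple sums imprecise.

```python
def max_apache_processes(total_ram_mb, reserved_mb, apache_ram_mb, process_count):
    """Estimate how many Apache processes fit in RAM without swapping."""
    avg_per_process_mb = apache_ram_mb / process_count
    available_mb = total_ram_mb - reserved_mb
    return int(available_mb // avg_per_process_mb)


# Hypothetical measurements: 10 Apache processes use 600 MB in total on a
# 4096 MB server, with 1024 MB reserved for the OS and other services.
processes = max_apache_processes(4096, 1024, 600, 10)
print(processes)  # 51

# With the worker MPM, each process runs ThreadsPerChild threads, which
# gives an upper bound for MaxClients (MaxRequestWorkers).
threads_per_child = 25
print(processes * threads_per_child)  # 1275
```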

Knowing the maximum number of Apache workers per server will in turn give you some insight into how much traffic the server can handle. You can use this information to anticipate future upgrades of your infrastructure.

Watching busy and idle workers

Watching the number of busy and idle workers is a good, proactive way to spot Apache configuration problems early.

If the number of idle workers approaches or hits zero during peak traffic, some requests may be queued, waiting to be processed by an available worker. A queued request must wait for older requests to be processed, which results in slower response times for your website. To improve the situation you should consider increasing MaxClients (or MaxRequestWorkers), which is the limit on the number of simultaneous connections.

But be aware that more workers need more server resources, and if the server doesn’t have additional resources (first of all, RAM) then raising MaxClients/MaxRequestWorkers will be counterproductive. In that case the only option is to remove the resource bottleneck: upgrade the server or buy another one and load-balance; move static data away from the server; or migrate to nginx (e.g. use nginx with php-fpm).
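A monitoring check for the idle-worker situation described in this section could look like the Python sketch below. The function name worker_headroom and the 10% warning ratio are assumptions chosen for the illustration; a suitable threshold depends on your traffic pattern.

```python
def worker_headroom(busy, idle, warn_ratio=0.1):
    """Classify worker headroom as 'ok', 'warning' or 'critical'."""
    if idle == 0:
        return "critical"  # no idle workers: new requests have to queue
    if idle / (busy + idle) < warn_ratio:
        return "warning"   # close to exhausting the worker pool
    return "ok"


print(worker_headroom(busy=40, idle=35))   # ok
print(worker_headroom(busy=145, idle=5))   # warning
print(worker_headroom(busy=150, idle=0))   # critical
```

The busy and idle counts can be taken from the mod_status page or derived from the scoreboard.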

Conclusion

Apache statistics provided by mod_status are a great tool for optimizing the performance of your website or web application. Monitoring the statistics will guide you as to which Apache configuration parameters can be tuned to achieve the best performance in your particular case. Unfortunately there is no silver bullet – different recipes are good for different environments and web applications.
