ApacheConNA: Monitoring httpd and Tomcat



Monitoring is a task that is generally either neglected or overdone during
development. Yet it is vital enough to wake people from well-earned sleep at
night when done wrong. Rainer Jung provided some valuable insights on how to
monitor Apache httpd and Tomcat.


Of course failure detection, alarms and notifications are all part of good
monitoring. However, so are the avoidance of false positives and the
collection and visualisation of metrics ahead of time, which help with capacity
planning and uncover irregular behaviour.


In general the standard pieces being monitored are load, cache utilisation,
memory, garbage collection and response times. What these do not reveal is
time spent waiting for the backend, looping in code, or stuck in blocked
threads.


When it comes to monitoring Java, JMX is pretty much the standard choice. Data
is grouped in management beans (MBeans). Each Java process has default beans;
on top of those come the beans provided by Tomcat, and there may be
application-specific ones as well.


For remote access, there are Java clients that know the protocol. The server
must be configured to accept their connections, though. Remember to open any
firewall in between as well. Well-known clients include JVisualVM (nice for
interactive inspection) and jmxterm (a command-line client).


The only issue: most MBeans mirror source code structure, whereas what you
really need is change rates. In general those are easy to infer, though, by
sampling a counter twice and dividing the difference by the elapsed time.
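
As a minimal sketch of inferring a change rate from raw counter values: the
function name and the sample numbers below are made up for illustration, and
the counter is assumed to only ever increase until the process restarts.

```python
def change_rate(prev_value, prev_time, curr_value, curr_time):
    """Return the per-second rate of a monotonically increasing counter.

    Returns None if no time has passed or the counter went backwards
    (for example after a restart), because no rate can be inferred then.
    """
    elapsed = curr_time - prev_time
    delta = curr_value - prev_value
    if elapsed <= 0 or delta < 0:
        return None
    return delta / elapsed


# Two hypothetical samples of a request counter, taken 60 seconds apart.
rate = change_rate(prev_value=12000, prev_time=0.0,
                   curr_value=12600, curr_time=60.0)
print(rate)  # 10.0 requests per second
```

Returning None on a counter reset keeps a restart from showing up as a huge
negative spike in graphs.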


On the server side for Tomcat there is the JMXProxy in the Tomcat manager that
exposes MBeans. In addition there is Jolokia (including JSON serialisation) or
the option to roll your own.
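
The JMXProxy answers queries with plain text, which is easy to pick apart in
a script. The sketch below assumes a response shaped like the sample: a
status line, then one "Name:" line per MBean followed by "attribute: value"
lines. The sample text and the function name are illustrative, not captured
from a real server.

```python
SAMPLE = """OK - Number of results: 1

Name: Catalina:type=ThreadPool,name="http-nio-8080"
modelerType: org.apache.tomcat.util.modeler.BaseModelMBean
currentThreadsBusy: 3
maxThreads: 200
"""


def parse_jmxproxy(text):
    """Parse JMXProxy-style text into {bean name: {attribute: value}}.

    All values are kept as strings; callers convert what they need.
    """
    beans = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("OK -"):
            continue  # skip blank lines and the status line
        key, sep, value = line.partition(": ")
        if not sep:
            continue  # not an "attribute: value" line
        if key == "Name":
            current = beans.setdefault(value, {})
        elif current is not None:
            current[key] = value
    return beans


beans = parse_jmxproxy(SAMPLE)
pool = beans['Catalina:type=ThreadPool,name="http-nio-8080"']
print(int(pool["currentThreadsBusy"]), "of", int(pool["maxThreads"]), "threads busy")
```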


So what kind of information is in MBeans?




  • OS - load, process CPU time, physical memory, global OS level
    stats. As an example: dividing process CPU time by elapsed time gives you
    the average CPU concurrency.


  • Runtime MBean gives uptime.

  • Threading MBean gives information on count, max available threads etc.

  • Class Loading MBean values should stabilise unless you are using dynamic
    languages or have enabled class unloading for JSPs in Tomcat.

  • Compilation contains HotSpot compiler information.

  • Memory contains information on all regions lumped together. If you need
    more fine-grained information, look at the Memory Pool and GC MBeans.
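
The CPU concurrency calculation mentioned in the OS entry can be sketched as
follows. It assumes the process CPU time is reported in nanoseconds and the
uptime in milliseconds, as HotSpot JVMs do for the ProcessCpuTime and Uptime
attributes; the function name and the numbers are illustrative.

```python
def average_cpu_concurrency(process_cpu_time_ns, uptime_ms):
    """Average number of CPU cores kept busy by the process since start.

    process_cpu_time_ns: accumulated process CPU time (assumed nanoseconds).
    uptime_ms: JVM uptime (assumed milliseconds).
    """
    if uptime_ms <= 0:
        raise ValueError("uptime must be positive")
    cpu_seconds = process_cpu_time_ns / 1e9
    wall_seconds = uptime_ms / 1e3
    return cpu_seconds / wall_seconds


# 90 seconds of CPU time burnt during 60 seconds of uptime.
print(average_cpu_concurrency(90_000_000_000, 60_000))  # 1.5
```

Feeding the function differences between two samples instead of totals since
start gives the concurrency for that interval rather than the lifetime
average.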




As for Tomcat specific things:




  • Threadpool (for each connector) has information on size, number of busy
    threads.

  • GlobalRequestProc has request counts, processing times, max time, bytes
    received/sent, error count (those that Tomcat notices that is).

  • RequestProcessor exists once per thread, it shows if a request is
    currently running and for how long. Nice to see if there are long running
    requests.

  • DataSource provides information on Tomcat provided database connections.
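
Spotting long-running requests from RequestProcessor data could look like the
sketch below. It assumes each processor has already been fetched into a dict
with the attributes stage, requestProcessingTime (milliseconds) and
currentUri, and that stage value 3 means "request is being serviced"; check
both assumptions against your Tomcat version. The sample data is invented.

```python
STAGE_SERVICE = 3  # assumed stage value for "request is being serviced"


def long_running_requests(processors, threshold_ms):
    """Return processors that are servicing a request for at least
    threshold_ms, longest running first."""
    hits = [
        p for p in processors
        if p.get("stage") == STAGE_SERVICE
        and p.get("requestProcessingTime", 0) >= threshold_ms
    ]
    return sorted(hits, key=lambda p: p["requestProcessingTime"], reverse=True)


# Hypothetical snapshot of three request processors.
snapshot = [
    {"stage": 3, "requestProcessingTime": 12000, "currentUri": "/report"},
    {"stage": 7, "requestProcessingTime": 0, "currentUri": ""},
    {"stage": 3, "requestProcessingTime": 45000, "currentUri": "/export"},
]

for p in long_running_requests(snapshot, threshold_ms=10000):
    print(p["currentUri"], p["requestProcessingTime"], "ms")
```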




Per Webapp there are a couple of more MBeans:




  • ManagerMBean has information on session management - e.g. session
    counter since start, login rate, active sessions, expired sessions, max
    active sessions since restart (this value can be reset), number of rejected
    sessions, average alive time, processing time it took to clean up sessions,
    creation and expiration rates for the last 100 sessions.

  • ServletMBean contains request count, accumulated processing time.

  • JspMBean (together with an activated loading/unloading policy) has
    information on unload and reload stats and provides the max number of
    loaded JSPs.
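
Since the servlet beans only carry a request count and an accumulated
processing time, the average time per request has to be derived. A minimal
sketch, with invented servlet names and numbers and times assumed to be in
milliseconds:

```python
def average_processing_time_ms(request_count, processing_time_ms):
    """Average time per request; 0.0 if the servlet was never called."""
    if request_count == 0:
        return 0.0
    return processing_time_ms / request_count


def rank_servlets(servlets):
    """Return (name, average ms) pairs, slowest servlet first.

    servlets maps a servlet name to (request_count, processing_time_ms).
    """
    averages = [
        (name, average_processing_time_ms(count, total))
        for name, (count, total) in servlets.items()
    ]
    return sorted(averages, key=lambda item: item[1], reverse=True)


# Hypothetical values read from three servlet beans of one webapp.
stats = {"default": (1000, 5000), "jsp": (200, 8000), "api": (0, 0)}
for name, avg in rank_servlets(stats):
    print(name, avg)
```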




For httpd the goals of monitoring are pretty similar. The only difference is
the protocol used - in this case provided by the status module. As an
alternative, use the scoreboard connections.


You will find information on




  • restart time, uptime

  • server load

  • total number of accesses and traffic

  • idle workers and number of requests currently processed

  • CPU usage - though that is only accurate once all children have stopped,
    which isn't particularly likely in production.
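
The status module also offers a machine-readable variant of its page (the
"?auto" form of the server-status URL) made up of "Key: value" lines. A
minimal parsing sketch follows; the sample text is hand-written, the exact
set of keys depends on the httpd version and configuration, and the function
name is illustrative.

```python
SAMPLE_STATUS = """Total Accesses: 1520
Total kBytes: 9841
Uptime: 3600
ReqPerSec: .422222
BusyWorkers: 3
IdleWorkers: 7
Scoreboard: _W_K__R___......
"""


def parse_status(text):
    """Parse "Key: value" lines; numeric values become floats."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep:
            continue
        if key == "Scoreboard":
            status[key] = value  # keep the state string untouched
            continue
        try:
            status[key] = float(value)
        except ValueError:
            status[key] = value  # non-numeric fields stay as text
    return status


status = parse_status(SAMPLE_STATUS)
workers = status["BusyWorkers"] + status["IdleWorkers"]
print("busy fraction:", status["BusyWorkers"] / workers)
```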




The lines that indicate what each thread is doing contain states such as
waiting, reading a request and sending a reply; more information is documented
online.
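
Counting worker states from a scoreboard string can be sketched like this.
The character key follows the legend printed on the status page as far as I
recall it; treat the mapping as an assumption and verify it against the
documentation of your httpd version. The sample string is invented.

```python
from collections import Counter

# Assumed scoreboard legend (verify against your httpd documentation).
SCOREBOARD_KEY = {
    "_": "waiting for connection",
    "S": "starting up",
    "R": "reading request",
    "W": "sending reply",
    "K": "keepalive",
    "D": "DNS lookup",
    "C": "closing connection",
    "L": "logging",
    "G": "gracefully finishing",
    "I": "idle cleanup",
    ".": "open slot",
}


def summarise_scoreboard(scoreboard):
    """Return {state description: number of slots in that state}."""
    counts = Counter(scoreboard)
    return {
        SCOREBOARD_KEY.get(ch, "unknown (%s)" % ch): n
        for ch, n in counts.items()
    }


summary = summarise_scoreboard("_W_K__R___......")
for state, n in sorted(summary.items()):
    print(state, n)
```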


When monitoring, make sure to monitor not only production but also your stress
tests so that you can make meaningful comparisons.