ApacheConNA: Monitoring httpd and Tomcat

2013-05-13 20:23

Monitoring is a task that is generally neglected, or overdone, during
development, yet it is vital enough to wake people up from well-earned sleep at
night when done wrong. Rainer Jung provided some valuable insights on how to
monitor Apache httpd and Tomcat.

Of course failure detection, alarms and notifications are all part of good
monitoring. However, so are the avoidance of false positives and the collection
and visualisation of metrics in advance, to help with capacity planning and to
uncover irregular behaviour.

In general the standard pieces being monitored are load, cache utilisation,
memory, garbage collection and response times. What all of that does not show
is time spent waiting for the backend, looping in code, or in blocked threads.

When it comes to monitoring Java, JMX is pretty much the standard choice. Data
is grouped in management beans (MBeans). Each Java process has default beans;
on top of those there are beans provided by Tomcat, and on top of those there
may be application specific ones.

For remote access, there are Java clients that know the protocol, though the
server must be configured to accept their connections. Remember to open any
firewall in between as well. Well known clients include JVisualVM (nice for
interactive inspection) and jmxterm as a command line client.

The only issue: most MBeans mirror the structure of the source code, whereas
what you really need are change rates. In general those are easy to infer
though.
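Inferring a change rate usually comes down to sampling a counter twice and
dividing the difference by the elapsed time. The following is a minimal sketch
of that idea, not something shown in the talk; the Sample type and the
change_rate function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One reading of a monotonically increasing counter (hypothetical type)."""
    timestamp: float  # seconds since some fixed point in time
    value: float      # raw counter value, e.g. a request count


def change_rate(previous: Sample, current: Sample) -> float:
    """Return the counter's increase per second between two samples."""
    elapsed = current.timestamp - previous.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be ordered in time")
    delta = current.value - previous.value
    if delta < 0:
        # The counter went backwards, most likely because the process was
        # restarted and began counting from zero again.
        delta = current.value
    return delta / elapsed


if __name__ == "__main__":
    earlier = Sample(timestamp=0.0, value=1200)
    later = Sample(timestamp=60.0, value=1500)
    print(change_rate(earlier, later))  # 300 more requests within 60 seconds
```

Treating a drop in the counter as a restart is a design choice; a collector
could just as well discard that interval.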

On the server side for Tomcat there is the JMXProxy in the Tomcat manager that
exposes MBeans. In addition there is Jolokia (including JSON serialisation) or
the option to roll your own.
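As a rough illustration of what the JSON route looks like on the consuming
side, here is a small sketch that unpacks the reply to a Jolokia read request.
It assumes the reply carries the "value", "timestamp" and "status" fields
described in the Jolokia protocol documentation; the sample body and its
numbers are made up, and parse_read_response is a hypothetical helper name.

```python
import json
from typing import Any

# Made-up example body in the shape of a Jolokia "read" reply (assumption).
SAMPLE_RESPONSE = """
{
  "request": {"type": "read", "mbean": "java.lang:type=Memory",
              "attribute": "HeapMemoryUsage"},
  "value": {"init": 268435456, "committed": 257425408,
            "max": 3817865216, "used": 64212792},
  "timestamp": 1368476580,
  "status": 200
}
"""


def parse_read_response(body: str) -> tuple[int, Any]:
    """Return (timestamp, value) from a read reply, or raise on an error status."""
    reply = json.loads(body)
    if reply.get("status") != 200:
        raise RuntimeError(f"agent reported an error: {reply.get('error', 'unknown')}")
    return reply["timestamp"], reply["value"]


if __name__ == "__main__":
    timestamp, heap = parse_read_response(SAMPLE_RESPONSE)
    print(timestamp, heap["used"], heap["max"])
```

Keeping the timestamp next to the value makes it straightforward to turn two
successive readings into a change rate later on.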

So what kind of information is in MBeans:

  • OS - load, process cpu time, physical memory, global OS level
    stats. As an example: Here dividing cpu time by time gives you the average cpu

  • Runtime MBean gives uptime.

  • Threading MBean gives information on count, max available threads etc.

  • Class Loading MBean numbers should become stable unless you are using
    dynamic languages or have enabled class unloading for JSPs in Tomcat.

  • Compilation contains HotSpot compiler information.

  • Memory contains information on all regions thrown in one pot. If you need
    more fine grained information look out for the Memory Pool and GC MBeans.
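The CPU example from the OS bullet above can be sketched as follows. This
assumes the ProcessCpuTime attribute of the OperatingSystem MBean is reported
in nanoseconds and the Uptime attribute of the Runtime MBean in milliseconds,
as on HotSpot JVMs; average_cpu is a hypothetical helper name.

```python
def average_cpu(process_cpu_time_ns: int, uptime_ms: int) -> float:
    """Average number of CPU cores the JVM kept busy since it started.

    process_cpu_time_ns: ProcessCpuTime from the OperatingSystem MBean
                         (assumed to be nanoseconds).
    uptime_ms:           Uptime from the Runtime MBean (assumed to be
                         milliseconds).
    """
    if uptime_ms <= 0:
        raise ValueError("uptime must be positive")
    cpu_seconds = process_cpu_time_ns / 1e9
    wall_seconds = uptime_ms / 1e3
    return cpu_seconds / wall_seconds


if __name__ == "__main__":
    # 90 seconds of CPU time spread over 10 minutes of uptime.
    print(average_cpu(90_000_000_000, 600_000))
```

A result above 1.0 simply means that more than one core was busy on average.
Applying the same division to the difference between two samples gives the
average over that interval instead of since JVM start.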

As for Tomcat specific things:

  • Threadpool (for each connector) has information on size, number of busy

  • GlobalRequestProc has request counts, processing times, max time, bytes
    received/sent, error count (those that Tomcat notices that is).

  • RequestProcessor exists once per thread, it shows if a request is
    currently running and for how long. Nice to see if there are long running

  • DataSource provides information on Tomcat provided database connections.
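Two checks that these Tomcat beans make possible are sketched below: how full
a connector's thread pool is, and which request processors have been busy with
a single request for too long. The sketch assumes the values have already been
read from JMX; busy_ratio, long_running and the processor names are
hypothetical, and the attribute names mentioned in the comments
(currentThreadsBusy, maxThreads) should be verified against your Tomcat
version.

```python
def busy_ratio(current_threads_busy: int, max_threads: int) -> float:
    """Fraction of a connector's thread pool that is currently busy.

    The two inputs are meant to come from the ThreadPool MBean, for example
    its currentThreadsBusy and maxThreads attributes (assumed names).
    """
    if max_threads <= 0:
        raise ValueError("max_threads must be positive")
    return current_threads_busy / max_threads


def long_running(elapsed_ms_by_processor: dict[str, int], threshold_ms: int) -> list[str]:
    """Names of request processors whose current request exceeds threshold_ms."""
    return sorted(
        name
        for name, elapsed_ms in elapsed_ms_by_processor.items()
        if elapsed_ms > threshold_ms
    )


if __name__ == "__main__":
    print(busy_ratio(150, 200))
    print(long_running({"proc-1": 120, "proc-2": 45_000, "proc-3": 0}, 30_000))
```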

Per webapp there are a couple more MBeans:

  • ManagerMBean has information on session management - e.g. session
    counter since start, login rate, active sessions, expired sessions, max active
    sessions since restart (here a restart is possible), number of rejected
    sessions, average alive time, processing time it took to clean up sessions,
    create and expire rate for last 100 sessions

  • ServletMBean contains request count, accumulated processing time.

  • JspMBean (together with activated loading/unloading policy) has
    information on unload and reload stats and provides the max number of loaded

For httpd the goals with monitoring are pretty similar. The only difference is
the protocol used - in this case provided by the status module. As an
alternative use the scoreboard connections.

You will find information on

  • restart time, uptime

  • server load

  • total number of accesses and traffic

  • idle workers and number of requests currently processed

  • cpu usage - though that is only accurate when all children are stopped
    which in production isn't particularly likely.

The lines that indicate what the threads are doing contain states such as
waiting, reading request and sending reply - more information is documented
online.
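To make the thread states concrete, here is a small sketch that reads the
machine-readable output of the status module (the "?auto" variant of the
status page) and tallies the scoreboard. The sample text is made up and only
contains a subset of the fields; parse_status and scoreboard_summary are
hypothetical helper names, and the state descriptions follow the key printed
on the status page.

```python
from collections import Counter

# Scoreboard key as printed by the status page.
SCOREBOARD_STATES = {
    "_": "waiting for connection",
    "S": "starting up",
    "R": "reading request",
    "W": "sending reply",
    "K": "keepalive (read)",
    "D": "DNS lookup",
    "C": "closing connection",
    "L": "logging",
    "G": "gracefully finishing",
    "I": "idle cleanup of worker",
    ".": "open slot with no current process",
}

# Made-up sample of the "?auto" output (assumption, subset of fields).
SAMPLE = """\
Total Accesses: 1042
Total kBytes: 8310
Uptime: 3600
BusyWorkers: 3
IdleWorkers: 5
Scoreboard: _W_RK___....
"""


def parse_status(text: str) -> dict[str, str]:
    """Split 'Key: value' lines into a dictionary of raw strings."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(": ")
        if separator:
            fields[key] = value
    return fields


def scoreboard_summary(scoreboard: str) -> dict[str, int]:
    """Count how many worker slots are in each state."""
    counts = Counter(scoreboard)
    return {
        SCOREBOARD_STATES.get(symbol, f"unknown ({symbol})"): number
        for symbol, number in counts.items()
    }


if __name__ == "__main__":
    status = parse_status(SAMPLE)
    print(status["BusyWorkers"], status["IdleWorkers"])
    print(scoreboard_summary(status["Scoreboard"]))
```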

When monitoring make sure to monitor not only production but also your stress
tests to make meaningful comparisons.

ApacheConEU - part 06

2012-11-15 20:48
For the next session I joined the Tomcat crowd in Mark Thomas' talk to learn more about Tomcat reverse proxy configurations. One rather common setup is to have Tomcat connected to an httpd instance. A common issue with this setup, in particular when running httpd with the event MPM, is thread exhaustion on Tomcat's side. Fixes include always having more active Tomcat threads than there can be httpd threads at any one time, and disabling persistent connections. Keep in mind that Tomcat performance does not degrade gracefully here - in case of thread exhaustion it just goes downhill very quickly. One famous example was an issue with the ASF Jira: after weeks of debugging bad performance, and after blaming the hardware, the OS, the JVM and the Java implementation in general, the number of threads was finally increased, resulting in a smoothly running system...
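The sizing rule can be written down as a simple sanity check. This is only a sketch: it assumes you know the worker limit of every httpd instance in front of Tomcat and the thread limit of the Tomcat connector, and has_thread_headroom is a hypothetical helper name.

```python
def has_thread_headroom(httpd_worker_limits: list[int], tomcat_max_threads: int) -> bool:
    """True if Tomcat has more threads than all httpd front ends can use at once.

    httpd_worker_limits: maximum number of concurrent workers per httpd
                         instance that proxies to this Tomcat.
    tomcat_max_threads:  thread limit of the Tomcat connector they talk to.
    """
    return tomcat_max_threads > sum(httpd_worker_limits)


if __name__ == "__main__":
    print(has_thread_headroom([150, 150], 400))  # two front ends, 300 workers
    print(has_thread_headroom([256, 256], 400))  # 512 workers against 400 threads
```

Running such a check whenever either side's configuration changes is cheaper than discovering the mismatch under load.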

Another common configuration problem is to rename the deployed web application war - for instance in order to keep the version number in the war name itself - and change the path on httpd's side. This is bad for at least four reasons:

  • redirects will fail - you can configure ProxyPassReverse which will fix some issues but does not affect all http headers
  • cookie paths break - you can configure ProxyPassReverseCookiePath here
  • links that are generated in the web app will fail - you can use mod_sed, mod_substitute or mod_proxy_html to fix that - however those configurations tend to become messy and are error prone
  • custom headers usually also break

If the only reason for doing such a thing is to keep the version number in the file name, it might be an option to use "foo##bar.1.2.3" as the file name - Tomcat will ignore anything after the double hash when deriving the context path.
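The naming rule can be sketched like this. It is a simplified mirror of the naming scheme described in the Tomcat documentation (a double hash separates the version, a single hash becomes a slash, ROOT maps to the empty context path) and not Tomcat's own code; context_from_war is a hypothetical helper name.

```python
def context_from_war(filename: str) -> tuple[str, str]:
    """Return (context path, version) for a WAR file name.

    Simplified sketch of the documented naming rules: everything after "##"
    is the version, "#" in the remaining name stands for "/", and the base
    name ROOT denotes the root context.
    """
    base = filename[:-4] if filename.lower().endswith(".war") else filename
    name, _, version = base.partition("##")
    if name == "ROOT":
        return "", version
    return "/" + name.replace("#", "/"), version


if __name__ == "__main__":
    print(context_from_war("foo##bar.1.2.3.war"))
    print(context_from_war("ROOT.war"))
```

With this scheme the path on httpd's side and the context path on Tomcat's side stay identical, so none of the rewriting listed above is needed.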

When proxying traffic, make sure to inform Tomcat about https termination so that it can correctly handle secure cookies and sessions. This is done automatically with mod_jk and mod_proxy_ajp; mod_proxy_http needs some more manual work. When dealing with virtual hosting, make sure to use ProxyPreserveHost in order to be able to switch hosts on Tomcat's side.

Julien Nioche shared some details on the nutch crawler. Being the mother of all Hadoop projects (as in Hadoop was born out of developments inside of nutch), the project has become rather quiet, with a steady stream of development in the recent past. Julien himself uses nutch for gathering crawled data for several customer projects - feeding this data into an NLP pipeline based on Behemoth that glues Mahout, UIMA and Gate together.

The basic crawling steps including building the web graph, computing a link based ranking method and indexing are still the same since last I looked at the project - just that for indexing the project now uses solr instead of their own lucene based solution.

The main advantage of nutch is its pluggability: the protocol parser, html filter, url filter and url normaliser can all be exchanged for your own implementations.

In their 2.0 version they moved away from using their own plain hdfs storage to a table schema - mapped to the real database through Gora, an abstraction layer to connect to e.g. Cassandra or HBase. The schema itself is based on Avro but can be adapted to your needs. The advantages are obvious: Though still distributed, this approach is much easier and simpler in terms of logic kept in nutch itself. Also it is easier for third parties to connect to the data - all you need is the schema as well as Gora. The current disadvantage lies in its configuration overhead and instability compared to the old solution. Most likely at least the latter will go away as version 2.0 stabilises.

In terms of future work the project focuses on stabilisation and on synchronising the features of version 1.x and 2.x (link ranking is only available in version 1.x while support for Elasticsearch is only available in version 2.x). In terms of functionality the goal is to move to Solr Cloud, support sitemaps (as implemented by crawler-commons), and add more (pluggable?) indexers.

The goal is to delegate implementations - it was already done for Tika and Solr. Most likely it will also happen for the fetcher, protocol handling, robots.txt handling, url normalisation and filtering, graph processing code and others.

The next talk in the Solr/Lucene track dealt with scaling Solr to big data. The goal of the speaker was to index 100 million documents - the number of documents was expected to grow in the future. Being under heavy time pressure and having a bash wizard on the project, they started building lots of their glue code and components as bash scripts: There were scripts for starting/stopping services, remote indexing, performance monitoring, content extraction, ingestion and deployment. Short term this was a very good idea - it allowed for fast iterations and learning. In the long run they slowly replaced their custom software with standard components (tika for content extraction, puppet for deployment etc.).

They quickly learnt to value property files in order to easily reconfigure the system even in production (relying heavily on bash, XML was of course not an option). One problem this came in handy with was adjusting the sharding configuration - going from a simple random sharding to old vs new to monthly, they could optimise the configuration to their load. What worked well for them was to separate JVM startup from Solr core startup - they would start with empty solrs

Tomcat Tuesday talk

2009-05-21 09:07
For several months now we have had a talk at neofonie each Tuesday, given by external or internal developers on various subjects. Usually these presentations are a nice way to get an overview of emerging technologies and current conference topics, or to gain insight into interesting internal projects.

This week we had Apache Tomcat committer and PMC member Peter Rossbach here at neofonie to talk about the Tomcat architecture and Tomcat clustering solutions. He gave two pretty in-depth presentations on the Tomcat internals, Tomcat optimization and extension points.

Some points that were especially interesting to me: The project started out in the late nineties, initiated by a bunch of developers who just wanted to see what it takes to write a web application container that fulfils the spec. The goal basically was a reference implementation. Soon enough however users declared the resulting code production ready and used it.

There are a few caveats from this history that are still visible. One is the lack of tests in the codebase. Sure, each release is tested against the Sun TCK - but these tests cannot be opened to the general public. So if you as a developer make extensions or modifications to the code base there is no easy way of knowing whether you broke something or not.

For me as a developer it was interesting to see how complex it quickly gets to cluster Tomcat deployments and make them failure resistant. Some tools mentioned that help to automate and ease deployment are Puppet and FAI. One issue however that is still on the developers' agenda is Tomcat monitoring.

To summarize: The conference room was packed with developers expecting two very interesting talks. Thanks to Peter Rossbach for coming to neofonie and explaining more on the internals of the Tomcat software, the project and the community behind it.