JAX: Tales from production
In a second presentation, Peter Roßbach and Andreas Schmidt went into more
detail on what logging entails in real-world projects. What starts out as
development messages turns into valuable information for uncovering issues and
system downtime, capacity planning, measuring the effect of software changes,
and analysing resource consumption under real-world usage. In addition to these
technical use cases there is a need to provide business metrics.
When dealing with multiple systems, you also have to correlate values across
machines and systems and provide meaningful visualisations so that the right
decisions can be made.
When thinking about your logging architecture, consider storing more than just
log messages. Facts like release numbers should be tracked somewhere as well,
ready to be joined in when you need to correlate behaviour with a release
version. To make that possible, also track events such as rolling out a release
to production. Launching in a new market or switching traffic to a new system
could be other events worth tracking. Beyond pure log messages, provide
aggregated metrics and counters. All of these pieces should be stored and
tracked automatically to free operations for more important work.
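
As a rough illustration of joining release events to measurements, the sketch below tags each metric sample with the release that was live when it was taken. The names `ReleaseEvent`, `release_at` and `tag_samples`, as well as the sample data, are made up for this example and were not shown in the talk.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseEvent:
    timestamp: float  # seconds since epoch when the rollout happened
    version: str      # release identifier


def release_at(events, timestamp):
    """Return the version live at `timestamp`, or None before the first rollout.

    `events` must be sorted by timestamp in ascending order.
    """
    times = [e.timestamp for e in events]
    idx = bisect_right(times, timestamp)
    return events[idx - 1].version if idx > 0 else None


def tag_samples(events, samples):
    """Attach the live release version to each (timestamp, value) sample."""
    return [(ts, value, release_at(events, ts)) for ts, value in samples]


# Hypothetical data: two rollouts and three response-time samples.
events = [ReleaseEvent(100.0, "1.0"), ReleaseEvent(200.0, "1.1")]
samples = [(50.0, 120), (150.0, 130), (250.0, 310)]
tagged = tag_samples(events, samples)
```

With the release attached to every sample, a jump in response time can be read directly against the version that introduced it.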
Have you ever thought about documenting not only your software, its interfaces
and input/output formats, but also the information it logs? What about the
fields contained in each log message? Are they documented, or do people have to
infer their meaning from the content? What about valid ranges for values: are
they noted down somewhere? Did you record whether a specific field can only
contain integers or whether some day it could also contain letters? What about
the number format: is it decimal or hexadecimal?
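
One way to keep such documentation honest is to make it machine-readable and check log records against it. The following is a minimal sketch under that assumption; `FieldSpec`, `LOG_SCHEMA`, `validate` and the three example fields are hypothetical and only illustrate the questions raised above (meaning, type, valid range, number format).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    description: str                     # what the field means
    py_type: type                        # allowed Python type of the value
    minimum: Optional[float] = None      # lowest valid value, if any
    maximum: Optional[float] = None      # highest valid value, if any
    number_format: Optional[str] = None  # e.g. "decimal" or "hexadecimal"


# Hypothetical schema for a web application's log records.
LOG_SCHEMA = {
    "status": FieldSpec("HTTP status code", int, 100, 599, "decimal"),
    "duration_ms": FieldSpec("Time spent handling the request", float, 0.0),
    "session_id": FieldSpec("Opaque session id", str, number_format="hexadecimal"),
}


def validate(record, schema=LOG_SCHEMA):
    """Return a list of problems; an empty list means every field conforms.

    Only fields present in the record are checked; missing fields are ignored.
    """
    problems = []
    for name, value in record.items():
        spec = schema.get(name)
        if spec is None:
            problems.append(f"{name}: undocumented field")
            continue
        if not isinstance(value, spec.py_type):
            problems.append(
                f"{name}: expected {spec.py_type.__name__}, got {type(value).__name__}"
            )
            continue
        if spec.minimum is not None and value < spec.minimum:
            problems.append(f"{name}: {value} below minimum {spec.minimum}")
        if spec.maximum is not None and value > spec.maximum:
            problems.append(f"{name}: {value} above maximum {spec.maximum}")
        if spec.number_format == "hexadecimal" and isinstance(value, str):
            try:
                int(value, 16)
            except ValueError:
                problems.append(f"{name}: {value!r} is not hexadecimal")
    return problems
```

Running such a check in a test suite or on a sample of production logs would flag fields that were added without being documented.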
For a nice documentation of the BBC's architecture, check out
"Winning the metrics battle" on the BBC dev blog.
There’s an abundance of tools out there to help you with all sorts of
logging-related topics:
- For visualisation and transport: Datadog, Kibana, logstash, statsd, graphite, syslog-ng
- For providing the values: JMX, metrics, Jolokia
- For collection: collectd, statsd, graphite, New Relic, Datadog
- For storage: typical RRD tools including RRD4j, MongoDB, OpenTSDB based on HBase, Hadoop
- For charting: Munin, Cacti, Nagios, Graphite, Ganglia, New Relic, Datadog
- For profiling: Dynatrace, New Relic, Boundary
- For events: Zabbix, Icinga, OMD, OpenNMS, HypericHQ, Nagios, JBoss RHQ
- For logging: Splunk, Graylog2, Kibana, logstash
Make sure to provide metrics consistently and be able to add them with minimal
effort. Self-adaptation and automation are useful for this. Make sure developers,
operations and product owners are able to use the same system so there is no
information gap on either side. Your logging pipeline should be tailored to
provide easy and fast feedback on the implementation and features of the
product.
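
To give an idea of what "minimal effort" could look like for a developer, here is a small sketch in which one decorator line adds a call counter and a duration measurement to a function. The in-memory `counters` and `timings` registries and the `measured` decorator are assumptions made for this example; a real setup would forward the values to one of the collection tools listed above.

```python
import time
from collections import defaultdict
from functools import wraps

# In-memory stand-in for a metrics backend (hypothetical).
counters = defaultdict(int)
timings = defaultdict(list)


def measured(name):
    """Decorator: count calls to the wrapped function and record their duration."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even if the call raises, so failures are counted too.
                counters[name + ".calls"] += 1
                timings[name + ".seconds"].append(time.perf_counter() - start)
        return wrapper
    return decorator


@measured("checkout")
def checkout(cart):
    # Hypothetical business function: total price of the items in the cart.
    return sum(cart)
```

Because every function is instrumented the same way, metric names and units stay consistent across the code base.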
To reach a decent level of automation, a set of tools is needed for:
- Configuration management (where to store passwords, URLs or IPs, log levels etc.). Typical names here include Zookeeper, but also CFEngine, Puppet and Chef.
- Deployment management. Typical names here are UC4, udeploy, glu, etsy deployment.
- Server orchestration (e.g. what is started when during boot). Typical names include UC4, Nolio, Marionette Collective, rundeck.
- Automated provisioning (think "how long does it take from server failure to bringing that service back up online?"). Typical names include kickstart, vagrant, or typical cloud environments.
- Test-driven/behaviour-driven environments (think about adjusting not only your application but also firewall configurations). Typical tools that come to mind here include Serverspec, rspec, cucumber, c-puppet, chef.
- When it comes to defining the points of communication for the whole pipeline, there is no tool you can use that is better than traditional pen and paper, socially getting both development and operations into one room.
The tooling to support this process goes from simple self-written bash scripts
in the startup model to frameworks that support the flow partially, up to
process based suites that help you. No matter which path you choose the goal
should always be to end up with a well-documented, reproducible step into
production. When introducing such systems, problems in your organisation may
become apparent. Sometimes it helps to just create facts: It’s easier to ask for
forgiveness than permission.