May 17, 2013
BigDataCon # Together with Uwe Schindler I published a series of articles on Apache
Lucene in Software and Support Media's Java Mag several years ago. Earlier this
year S&S kindly invited me to their BigDataCon, co-located with JAX, to give a
talk of my choosing that at least touches upon Lucene.
Thinking back and forth about what topic to cover, what came to my mind was to
...
May 16, 2013
Hadoop Summit Amsterdam # About a month ago I attended the first European Hadoop Summit, organised by
Hortonworks in Amsterdam. The two-day conference brought together both vendors
and users of Apache Hadoop for talks, exhibition and after-conference beer
drinking.
Russell Jurney kindly asked me to chair the Hadoop applied track during
ApacheCon EU. As a result I had a good excuse to attend the event. Overall
...
May 15, 2013
ApacheConNA: Misc # In his talk on SPDY, Matthew Steele explained how he implemented the SPDY protocol
as an Apache httpd module - working around most of the safety measures and
design decisions in the current httpd version. Essentially, to get httpd to
support the protocol, all you need now is mod_spdy plus a modified version of
mod_ssl.
The keynote on the last day was given by the Puppet founder.
...
May 14, 2013
ApacheConNA: Hadoop metrics # Have you ever measured the general behaviour of your Hadoop jobs? Have you
sized your cluster accordingly? Do you know whether your workload really is IO
bound or CPU bound? Legend has it no one except Allen Wittenauer over at
LinkedIn, formerly at Y!, ever did this analysis for his clusters.
Steve Watt gave a pitch for actually going out into your datacenter, measuring
what is going on there and adjusting the deployment accordingly: In small
...
May 13, 2013
ApacheConNA: Monitoring httpd and Tomcat # Monitoring - a task generally neglected - or overdone - during development.
But still vital enough to wake people up from well-earned sleep at night when
done wrong. Rainer Jung provided some valuable insights on how to monitor Apache httpd and Tomcat.
Of course, failure detection, alarms and notifications are all part of good
monitoring. However, so is avoidance of false positives and metric collection,
...
May 12, 2013
ApacheConNA: On Security # During the security talk at ApacheCon a topic commonly glossed over by
developers was covered in quite some detail: With software being developed that
is being deployed rather widely online (over 50% of all websites are powered
by the Apache webserver) naturally security issues are of great concern.
Currently there are eight trustworthy people on the foundation-wide security
response team, subscribed to security@apache.org. The team was started by
...
May 11, 2013
ApacheConNA: On documentation # In her talk on documentation in OSS, Noirin gave a great wrap-up of
what documentation to create for a project and how to go about that task.
One way to think about documentation is to keep in mind that it fulfills
different tasks: There is conceptual, procedural and task-reference
documentation. When starting to analyse your docs you may first want to debug
...
May 10, 2013
ApacheConNA: On delegation # In her talk on delegation Deb Nicholson touched upon a really important topic in
OSS: Your project may live longer than you are willing to support it yourself.
The first important point about delegation is to delegate - and to not wait
until you have to do it. Soon you will realise that mentoring and delegation
actually are a way to multiply your resources.
In order to delegate, you need people to delegate to.
...
May 9, 2013
ApacheConNA: First keynote # All three ApacheCon keynotes were focused on the general theme of open
source communities. The first one, given by Theo, had very good advice for the
engineer striving not only to work on open source software but also to become an
excellent software developer:
Be loyal to the problem instead of to the code: You shouldn't be
addicted to any particular programming language or framework and refuse to work
...
May 8, 2013
Apache Hadoop Get Together Berlin # This evening I joined the group over at Immobilienscout 24 for today’s Hadoop Get Together. David Obermann had invited Dr. Falk-Florian Henrich from CeleraOne to talk about their real-time analytics on live data streams.
Their system is being used by the New York Times Springer’s Die Welt for traffic analysis. The goal is to identify recurring users that might be willing to pay for the content they want to read.
...