BigDataCon
Several years ago, Uwe Schindler and I published a series of articles on Apache
Lucene in Software and Support Media's Java Mag. Earlier this year S&S kindly
invited me to their BigDataCon, co-located with JAX, to give a talk of my
choosing that at least touches upon Lucene.
After going back and forth on which topic to cover, I settled on showing how
easy it is to do text classification with Mahout when relying on Apache Lucene
for text analysis, tokenisation and token filtering.
Essentially all the classes needed to integrate Lucene Analyzers with Mahout
vector generation are already in place. Vector generation is needed, for
example, as a pre-processing step for classification or text clustering.
Feel free to check out some of my sandbox code over at
<a href="http://github.org/MaineC/sofia">github</a>.
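To make the pre-processing idea concrete, here is a minimal sketch of the two stages mentioned above: an analysis step (tokenisation plus token filtering) followed by vector generation. It deliberately uses only the Java standard library and is not the Lucene or Mahout API; the class name `AnalysisSketch`, the methods `analyze` and `vectorize`, the tiny stop-word list and the hashed term-count encoding are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java stand-in for an analysis chain feeding a vectoriser.
// Not Lucene or Mahout API: every name in this class is hypothetical.
final class AnalysisSketch {

    // Tiny illustrative stop-word list (an assumption, not Lucene's default set).
    private static final Set<String> STOP_WORDS =
            Set.of("a", "an", "and", "the", "of", "to", "is");

    // Tokenisation plus token filtering:
    // split on non-letters, lower-case each token, drop stop words.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}]+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Vector generation: hash each token into one of `dimensions` buckets
    // and count how many tokens land in each bucket.
    static double[] vectorize(List<String> tokens, int dimensions) {
        double[] vector = new double[dimensions];
        for (String token : tokens) {
            int bucket = Math.floorMod(token.hashCode(), dimensions);
            vector[bucket] += 1.0;
        }
        return vector;
    }

    public static void main(String[] args) {
        List<String> tokens = analyze("The quick brown fox and the lazy dog");
        System.out.println(tokens); // prints [quick, brown, fox, lazy, dog]
        System.out.println(Arrays.toString(vectorize(tokens, 8)));
    }
}
```

In a real setup the `analyze` step would be replaced by a Lucene Analyzer and the `vectorize` step by Mahout's own vector generation; the point here is only how the token stream from the first stage becomes the input of the second.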
Having attended the conference, I can only recommend that everyone interested
in Java programming and able to understand German buy a ticket. It's really
well executed: a great selection of talks (though the sponsored keynotes
usually aren't particularly interesting), tasty meals, and interesting people
to chat with.