Inductive Bias

FOSDEM 2013 - 01

February 13, 2013
hardware, Brussels, Fosdem, Event

On Friday morning our train left for this year’s FOSDEM. Though the trip is a bit longish, I strongly prefer going by train, as this gives more time and opportunity for hacking (in my case trying out Elastic Search), reading (in my case the book “Team Geek”) and chatting with other FOSDEM visitors. Monday morning was mostly busy with meeting people - at the FSFE, Debian and Apache Open Office booths, and generally in the hallways. ...

Elastic Search meetup Berlin – January 2013

February 1, 2013
Lucene, tfidf, Free Software, elastic search

The first meetup I went to this year started with a large bag of good news for Elastic Search users. The meetup started at 7 p.m. last Tuesday in the offices of Sys Eleven (thanks for hosting). Simon Willnauer gave an overview of what to expect from the upcoming major release of Elastic Search: in all 0.20.x versions, ES features a shard allocator that is ignorant of which index a shard belongs to, of machine properties and of usage patterns. ...

Linux vs. Hadoop - some inspiration?

January 16, 2013
Hadoop, brainstorming, standardisation, Linux, design, Hacking

This (even by my blog’s standards) long-ish blog post was inspired by a talk given late last year at Apache Con EU as well as by discussions around what constitutes “Apache Hadoop compatibility” and how to make extending Hadoop easier. The post is based on conversations with at least one guy close to the Linux kernel community and another developer working on Hadoop. ...

ABC - die Katze lief im Schnee

January 11, 2013
Relocating to Berlin, Berlin, winter, snow

Seen this morning in Berlin: A little impression of what the city looked like in the weeks before it turned green at Christmas: For winter images of other years see also previous posts. Title taken from a children’s song:

On Taming Text

January 1, 2013
Mahout, Lucene, review, book, Science

This time of the year I would usually post pictures of my bicycle standing in the snow somewhere in Tierpark. This year however I was tricked into using public transport instead: a) After my husband found a new job, we now share some of the route to work - and he isn’t crazy going by bike when it’s snowing. b) I got myself a Nexus7 earlier this month, which made it unnecessary to take paper books with me when using public transport. ...

Thanks for all the help

December 31, 2012
Free Software, Thanks, General

This year was a blast: It started with the ever great FOSDEM in Brussels (see you there in 2013?) and an invitation to GeeCon in Poznan (if you ever get an invitation to speak there - do accept, the organisers do an amazing job at that event). In summer we had Berlin Buzzwords in Berlin for the third time with 700 attendees (to retain the community feel of the conference we decided to limit tickets in 2013, so make sure you get yours early). ...

RecSys Stammtisch Berlin - December 2012

December 30, 2012
Mahout, Science, recommendation, music, General

Earlier this month I attended the fourth Recommender Stammtisch in Berlin. The event was kindly hosted by Soundcloud - who, on top of organising the speakers, provided a really yummy buffet by Kochzeichen D. The evening started with Paul Lamere giving a very entertaining but also very packed talk on why music recommendation is special - or, put more generally, why all recommender systems are special: ...

Elastic Search meetup Berlin

November 28, 2012
Lucene, elasticsearch, Get Together

Today Retresco hosted the (to my knowledge fourth) Elastic Search User Group Berlin - a group dedicated to using Lucene as part of Elastic Search. With roughly fifteen attendees the meetup attracted a decent crowd - most interestingly, many of the people there were already using the software either in production or for closed beta projects. The first talk was given by people from ferret-go - a company doing media monitoring for brands focused on the German market. ...

ApacheConEU - part 11 (last part)

November 20, 2012
buildr, ApacheCon, Apache Con, log4j, apacheconeu

One of the last sessions covered logging frameworks for Java. Christian Grobmeier started by detailing the common requirements for all logging frameworks: Speed - developers do not want to pay a disproportionate penalty for using a logging framework. Fail-safety and reliability - under no circumstances should your logging framework kill your application. In addition, it would be most annoying to find that the one log message that would help you decipher the problem your application ran into is missing. ...

ApacheConEU - part 10

November 19, 2012
Lucene, tika, ApacheCon, Apache Con, apacheconeu

In the next session Jukka introduced Tika - a toolkit for parsing content from files, including a heuristics-based component for guessing the file type: based on file extension, magic bytes and certain patterns in the file, the file type can be guessed rather reliably. Some anecdotes: not all mime types are registered with IANA, there are of course conflicting file extensions, Microsoft Word not only localises its interface but also the magic in the file, ...