Science

On Taming Text

January 1, 2013
Mahout, Lucene, review, book, Science

This time of the year I would usually post pictures of my bicycle standing in the snow somewhere in Tierpark. This year, however, I was tricked into using public transport instead: a) After my husband found a new job, we now share some of the route to work - and he isn’t crazy about going by bike when it’s snowing. b) I got myself a Nexus 7 earlier this month, which made it unnecessary to take paper books with me when using public transport. ...

RecSys Stammtisch Berlin - December 2012

December 30, 2012
Mahout, Science, recommendation, music, General

Earlier this month I attended the fourth Recommender Stammtisch in Berlin. The event was kindly hosted by SoundCloud - who, on top of organising the speakers, provided a really yummy buffet by Kochzeichen D. The evening started with a very entertaining but also very packed talk by Paul Lamere on why music recommendation is special - or, put more generally, why all recommender systems are special: ...

Data Scientists - researchers' perspectives

August 3, 2012
Science, data science, Statistics, Machine Learning

“Data scientist” as a term has caught quite some attention of late (together with all the big data, scalability and cloud hype). Instead of re-hashing arguments seen in other sources, I thought it might make more sense to link to a few of the thought-provoking posts I came across recently. In his post Mikio Braun analyses the factors motivating research in academia vs. ...

FrOSCon 2012

July 31, 2012
Science, embedded, FrOSCon, FSFE, Event

On August 25th/26th the Free and Open Source Conference (FrOSCon) will again kick off in Sankt Augustin, Germany. The event is completely community-organised and hosted by the FH Sankt Augustin. It covers a broad range of free software topics like Arduino microcontrollers, Git goodies, politics, strace, OpenNebula, Wireshark and others. Three highlights that are on my schedule: I’ll make sure I do not miss Thilo Fromm’s presentation on building a platform project on top of OpenEmbedded. ...

Book: Search Patterns

July 28, 2012
search, Science, review, patterns, book, Design, oreilly

I got the book months ago during FOSDEM - the O’Reilly book table is always a pretty dangerous place as a meeting point for me: Search Patterns - Design for Discovery is one of those small, deceptively beautiful books that manages to explain effective search engine design by focusing on end-user needs while also going into some detail on the basics of search engine backends. ...

Need your input: Failing big data projects - experiences from the wild

July 18, 2012
big data, Science, Hadoop, Strata, fail, Hacking, Event

A few weeks ago my talk on “How to fail your big data project quick and rapidly” was accepted at the O’Reilly Strata conference in London. The basic intention of this talk is to share some anti-patterns, embarrassing failure modes and “please don’t do this at home” kind of advice with those entering the buzzwordy space of big data. ...

GeeCon - TDD and its influence on software design

May 22, 2012
Science, Hacking, testing, Free Software, geecon

The second talk I went to on the first day was on the influence of TDD on software design. Keith Braithwaite did a really great job of first introducing the concept of cyclomatic complexity and then showing, using the example of Hudson as well as many other open source Java projects, that the average and mean cyclomatic complexity of all those projects is actually pretty close to one and, when plotted for all methods, pretty much follows a power-law distribution. ...

GeeCon - Randomized testing

May 21, 2012
Science, testing, Free Software, randomized, geecon, Hacking

I arrived late, during lunchtime on Thursday, for GeeCon - but just in time to listen to one of the most interesting talks when it comes to testing. Did you ever have the issue of writing code that runs well in your development environment but crashes as soon as it’s rolled out to customers, only to find out that their Locale setting was causing the issues? ...

Learning Machine Learning with Apache Mahout

December 13, 2011
Mahout, Science, electure, linear algebra

Once in a while I get questions like “Where do I start learning more about machine learning?” Other than the official sources, I think there is also quite good coverage in the Mahout community: since it was founded, several presentations have been given that provide an overview of Apache Mahout, introduce special features or even go into more detail on particular implementations. Below is an attempt to create a collection of talks given so far, without any claim to contain links to all videos or lectures. ...

Machine learning problem settings

August 6, 2011
Apache Mahout, Science, Theory, Machine Learning

Together with Sebastian Schelter I held a Nokia-sponsored (Thank you!) lecture on large-scale data analysis and data mining during the past few months. After supervising a few successful university projects based on Apache Mahout, the goal of this lecture was to introduce students to some of the basic concepts and problems encountered today in a world where huge datasets are generally available and are easy to process with Apache Hadoop. ...