Archive for September, 2009

Slides are up

September 30th, 2009

The slides for yesterday’s talks just arrived and are now available online.

Videos will be online early next week.

Apache Hadoop Get Together Berlin

Apache Hadoop Get Together Berlin

September 29th, 2009

The Get Together started just a few minutes ago. The room is packed with more than 35 people this time. This is the first Hadoop Get Together in Berlin that will be recorded on video. Thanks to Martin from newthinking for doing the recording and post-processing, and to Cloudera for sponsoring the videos.

The first talk was given by Thorsten Schuett on solving puzzles with MapReduce. His disclaimer: working at ZIB Berlin, he had a large cluster in the basement to put to good use. However, the cluster does not run Hadoop; it is based on Lustre FS and does not rely on commodity hardware. So he implemented a solver for 4×4 sliding puzzles in a MapReduce framework targeted at “his” cluster.
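To give a rough idea of how a sliding puzzle fits the map/reduce model, here is a minimal single-process sketch. It is not Thorsten's implementation: the function names, the 3×3 board size and the breadth-first formulation are my own assumptions for illustration. Each round maps every board in the current frontier to its successors, then reduces by board to drop duplicates and boards already seen.

```python
from itertools import groupby

WIDTH = 3                               # assumption: 3x3 board to keep the sketch fast
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)      # 0 marks the blank


def neighbours(state, width=WIDTH):
    """Yield every board reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, width)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < width and 0 <= c < width:
            target = r * width + c
            board = list(state)
            board[blank], board[target] = board[target], board[blank]
            yield tuple(board)


def map_phase(frontier, depth):
    """Map: emit (successor, depth + 1) for every board in the frontier."""
    for state in frontier:
        for nxt in neighbours(state):
            yield nxt, depth + 1


def reduce_phase(pairs, seen):
    """Reduce: group by board, keep unseen boards with their smallest depth."""
    new = {}
    for state, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        if state not in seen:
            new[state] = min(d for _, d in group)
    return new


def solve(start, goal=GOAL):
    """Return the minimum number of moves from start to goal, or None."""
    seen = {start: 0}
    frontier = [start]
    depth = 0
    while frontier and goal not in seen:
        new = reduce_phase(map_phase(frontier, depth), seen)
        seen.update(new)
        frontier = list(new)
        depth += 1
    return seen.get(goal)
```

Calling solve((1, 2, 3, 4, 5, 6, 0, 7, 8)) walks two rounds before the goal shows up. The 4×4 puzzle has roughly 10^13 reachable boards, so a toy like this will not get there - which is where a cluster comes in.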

The second talk was by Thilo Goetz on JAQL, a language for querying JSON documents that can run queries on top of a Hadoop cluster.

In the third and last talk, Uwe Schindler gave an overview of the new features and performance improvements of last week’s Lucene 2.9 release.

Once the talks are over and the Hadoop books donated by O’Reilly have been raffled, we will move to a nearby bar to have some beer and continue the discussions. A summary that includes more details as well as links to the slides will be online soon.

Update: I had reserved a table at Cafe Aufsturz close to newthinking store for about 15 people - maybe less, maybe more. We ended up going there with more than 25 people - really glad there were still enough tables left for us :)

Update 2: The next meetup is on December 16th. I have already received one definite and two tentative proposals for talks.

Apache Hadoop Get Together Berlin

AWS User Group Berlin

September 29th, 2009

On Monday the first AWS user group meeting took place at newthinking store, Berlin. The event featured talks by Martin Buhr from Amazon as well as presentations by AWS users such as Dawanda, Peritor and SoundCloud.

Unfortunately, the most interesting question concerning Elastic MapReduce was left unanswered by Martin: does using EMR make it possible to exploit the data locality and rack locality optimizations that Hadoop supports? The question of whether Amazon uses the AWS APIs internally as well was answered positively, though of course they have not published all of their system infrastructure.

The next meeting is scheduled to take place in two months’ time. Thanks to Peritor for organizing the meetup.

*Camp

Looking for a dancing school in Berlin

September 24th, 2009

I am looking for a dancing school (standard as well as Salsa Cubana) in Berlin Schöneberg. So in case you have any recommendations - please leave a comment.

Freetime

Upcoming: Apache Hadoop Get Together Berlin

September 23rd, 2009

This is a friendly reminder that the next Apache Hadoop Get Together takes place next week on Tuesday, 29th of September* at newthinking store (Tucholskystr. 48, Berlin).

  • Thorsten Schuett, Solving Puzzles with MapReduce.
  • Thilo Götz, Text analytics on jaql.
  • Uwe Schindler, Lucene 2.9 Developments.

Big thanks go to newthinking store for providing the venue for free and to Cloudera for sponsoring videos of the talks. Links to the videos will be posted on the upcoming page linked above, as well as on the Cloudera Blog, soon after the event. Yet another thanks goes to O’Reilly for providing three “Hadoop: The Definitive Guide” books to be raffled at the event.

The 7th Get Together is scheduled for December 16th. If you would like to submit a talk or sponsor the event, please contact me.

Hope to see you in Berlin next week.

* The event is scheduled right before the UIMA workshop in Potsdam, which may be of interest to you if you are a UIMA user.

Apache Hadoop Get Together Berlin

Scrum Tisch

September 23rd, 2009

Title: Scrum Tisch
Description: The Scrumtisch on October 11th will feature a talk by Mary
Poppendieck.

She will join that extraordinary Scrumtisch at 6pm.

The location is not defined yet, because Marion first needs to know how many
people are coming.
Start Time: 18:00
Date: 2009-10-11

Scrum

Scrum Tisch

September 23rd, 2009

Title: Scrum Tisch
Location: La Vecchia Trattoria
Description: The next Scrum Tisch organized by Marion Eickmann takes place this Thursday. For the first time in quite a while, the format will again be open for questions, prioritized by the participants.

The venue is at Niederbarnimstraße 25, near the U-Bahn station Samariterstraße.
Start Time: 18:30
Date: 2009-09-24

Scrum

Mahout@TU WS 09/10

September 9th, 2009

Title: Mahout@TU WS 09/10

There is going to be a project/seminar course at TU Berlin on Apache Mahout. The goal is to introduce students to working on a free software project, interacting with the community and building production-ready software.

Students will be given several potential tasks, ranging from optimizing existing implementations and implementing new algorithms to (depending on their prior knowledge) improving, scaling and parallelizing existing algorithms.

Successful completion of the course depends on a number of factors: interaction of the student with the community, the ability to write tested (as in test-first-developed) code that performs well in large-scale environments, the ability to show incremental development progress at each iteration, the ability to review patches and improvements, and the use of tools like SCM, issue tracker and mailing lists. Of course the theoretical background - that is, understanding existing publications as well as extending their ideas - is crucial as well.

If you are a student interested in Mahout and still missing some course work, consider signing up for the Mahout course at DIMA Berlin (linked below). The goal is for your work to be integrated into one of the next releases, once the community is satisfied.

If you are a Mahout developer or user and have an issue that you consider suitable for a student to solve, please share your ideas.


Location: TU Berlin
Link out: Click here
Start Date: 2009-10-01
End Date: 2010-03-31

Apache

GSoC at Mahout

September 9th, 2009

GSoC 2009 is about to finish: Final evaluations are through, most of the code submitted by Mahout’s students has been committed to svn, code samples are on their way to Google.

In Mahout, we had three students joining the project: Robin worked on an HBase-based Naive Bayes extension and on frequent itemset discovery, David contributed a distributed LDA implementation, and Deneche worked on a Random Forest implementation. All three of them have done great work during this summer, contributing not only code but also valuable input on the project’s mailing lists. As a result, all three of them were given committer status by the end of GSoC.

Apart from three new additions to the code base, summer also brought quite a bit of traffic to the user list - not only in terms of subscriptions but also in terms of developers contributing to the discussions online. Currently, it looks like the project is really gaining momentum, as also noted in Grant Ingersoll’s post.

Discussions on the dev list about the future road map of Mahout clearly showed that the developers share the vision of a scalable, potentially distributed, stable machine learning library, and that the focus should be on production-ready code under a commercially friendly license instead of bleeding-edge research implementations. Last but not least, the goal is to build a lively, diverse community around the project to guarantee further development and user support.

2009 brought quite a few talks on Mahout in both Germany and the US (besides all the events on Hadoop, scalable databases and cloud computing in general), with an ApacheCon US talk introducing Mahout in Oakland still to come.

Yesterday, a great article introducing Apache Mahout with hands-on examples was published on IBM developerWorks by Grant Ingersoll. Check it out if you want to learn more about Mahout and machine learning in general.

Apache

First NoSQL Meetup in Germany

September 9th, 2009

On October 22nd, 2009 the first NoSQL Meetup Germany is going to take place at newthinking store, Berlin: http://nosqlberlin.de

Please submit your presentation proposals by September 22nd. Accepted speakers will be notified soon after.

If you would like to sponsor the event, feel free to contact us. We would be very happy to provide videos after the event and free drinks for everyone during the event.

Hope to see you soon in Berlin.

*Camp, Apache, Events