Inductive Bias

Apache Hadoop Event Blog

August 24, 2009
Hadoop, Event, Software Foundation

As Apache Hadoop becomes ever more popular in both industry and research, user groups, conferences and hacking days are being scheduled around the world. The goal of the event calendar blog hosted on wordpress.com is to provide a common space where organizers can announce their events and potential participants can look for new conferences.

Fellow now

August 23, 2009
Free Software, FSFE

After two years of volunteering as booth staff for the FSFE at the Chemnitzer Linuxtage, explaining the advantages of becoming an FSFE fellow, I have now been a fellow myself for two days ;) I first got in contact with the FSFE through Fernanda Weiden during my time in Zürich in 2006. In the meantime I have learned more and more about the political activities of the FSFE, mostly during the local Berlin meetups in the newthinking store and as a booth member in Chemnitz. ...

Flying back home from Cologne

August 23, 2009
Mahout, Germany, Free Software, Software Foundation, FrOSCon

Last weekend FrOSCon took place in Sankt Augustin, near Cologne. FrOSCon is organized yearly at the university of applied sciences in Sankt Augustin. It is a volunteer-driven event with the goal of bringing developers and users of free software projects together. This year, the conference featured 5 tracks, two examples being cloud computing and the Java track. Unfortunately, this year the conference started with a little surprise for me and my boyfriend: Both being speakers, we had booked a room in Hotel Regina via the conference committee. ...

Converting a git repo to svn

August 17, 2009
svn, Hacking, git

Unlikely though it may seem, there are cases when one might want to convert a git repo to svn and still keep all revisions intact. There is a nice explanation of how to do that on the Google Open Source blog.

September 2009 Hadoop Get Together Berlin

August 17, 2009
JAQL, Hadoop, Software Foundation, Lucene, Event, Get Together

The newthinking store Berlin is hosting the Hadoop Get Together user group meeting. It features talks on Hadoop, Lucene, Solr, UIMA, katta, Mahout and various other projects that deal with making large amounts of data accessible and processable. The event brings together leaders from the developer and user communities. The speakers present projects that build on top of Hadoop and case studies of applications being built and deployed on Hadoop. ...

AMQP Erlang user group talk

July 10, 2009
Messaging, Hacking, Erlang, Free Software, General

Last Wednesday at the Erlang user group Berlin, Matthias Radestock from the RabbitMQ project gave a talk on RabbitMQ, AMQP and messaging in general. Slides are available online. First, Matthias motivated the need for an open standard for messaging: So far, there are a few providers of middleware systems, like Tibco and IBM. But those solutions are usually closed, expensive and cumbersome to handle. In short, they do not fit into a world where people rely on open standards for communication, free software for development and lightweight implementations. ...

Solr at AOL

July 2, 2009
Solr, Hacking, Free Software, Software Foundation

Grant Ingersoll has posted a very interesting interview with Ian Holsman on Solr at Relegence, now AOL. It describes the business side of the decision to switch to an open source solution, provides some insight into the size of the installation and details which technological reasons drove the decision to switch from a proprietary implementation to Solr: http://www.lucidimagination.com/Community/Hear-from-the-Experts/Podcasts-and-Videos/Interview-Ian-Holsman-Relegence

Lucene slides online

June 30, 2009
Lucene, Get Together, General

The slides of the Lucene talk at the last Apache Hadoop Get Together Berlin are available online: Lucene Slides. Especially interesting to me are the last few slides, which detail both index size and machine setup: The installation is running on two standard PCs with 2 dual-core processors (usual speed, bought in January 2008 for about 4000 Euro). They have 32 GB RAM, of which 24 GB are used as a ramdisk for the index. ...

Data serialization

June 26, 2009
Avro, Data Serialization, General, Protocol Buffers, Etch, Get Together, Thrift

XML, JSON and others are currently the standard data exchange formats. Their main benefit is being human-readable but still structured enough to be easily parsed by programs. The problems are overhead in size and parsing time. In addition, XML at least is not really as human-readable as it could be. Binary formats are an alternative. Yet those are often not platform independent (offering only C++, Java or Python bindings) or are not upgradable (what if your boss comes along and wants you to add yet another field? ...
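
To make the size trade-off a bit more concrete, here is a minimal sketch in Python that encodes the same record once as JSON and once with a hand-rolled fixed binary layout. The record, its field names and the layout string are made up for illustration; this is not how Avro, Thrift, Protocol Buffers or Etch encode data, only a rough picture of why a binary format can be smaller and why a naive one is hard to upgrade.

```python
import json
import struct

# Hypothetical record, invented for this example.
record = {"user_id": 42, "score": 3.5, "active": True}

# Text encoding: the field names travel with every single record.
as_json = json.dumps(record).encode("utf-8")

# Naive binary encoding with an assumed fixed layout:
# little-endian int32, float64, bool. Field names and their
# order live in the code on both sides, not in the payload.
LAYOUT = "<id?"
as_binary = struct.pack(
    LAYOUT, record["user_id"], record["score"], record["active"]
)


def decode_binary(payload):
    """Decode a payload written with LAYOUT back into a dict."""
    user_id, score, active = struct.unpack(LAYOUT, payload)
    return {"user_id": user_id, "score": score, "active": active}


if __name__ == "__main__":
    print("JSON bytes:  ", len(as_json))
    print("binary bytes:", len(as_binary))
    print("round trip:  ", decode_binary(as_binary) == record)
```

The binary payload is smaller because it carries no field names, but that is also its weakness: as soon as a field is added, `LAYOUT` changes and old readers can no longer decode new payloads. That is the upgrade problem mentioned above.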