FOSDEM 2013 - 01

2013-02-13 22:38
On Friday morning our train left for this year's FOSDEM. Though a bit longish I have a strong preference for going by train as this gives more time and opportunity for hacking (in my case trying out Elastic Search), reading (in my case the book “Team Geek”) and chatting with other FOSDEM visitors.

Monday morning was mostly busy with meeting people - at the FSFE, Debian, Apache Open Office booths, generally in the hallways. And with getting some coffee to the Beaglebone booth where my husband helped out . For really fun videos on the hardware they had there see:

if you want to get the hardware underneath talk to circuitco.

Unfortunately I didn't make it to the community and marketing room – too full during the talks that I wanted to see (as a general shout-out to people attending conferences: If you do not find a seat, move into the room instead of standing right next to the door, if you do have a seat and a free one just next to you, move to the seat next to you).

If you missed some of the talks you might want to try your luck with the FOSDEM video archive - it's really extensive featuring videos taken at previous editions as well and is a great resource to find talks of the most important tracks.

FOSDEM - Sunday - smaller bits and pieces

2011-02-18 20:17

With WebODF the Office track featured a very interesting project that focusses on providing a means to open ODF documents in your favourite browser: Content and formatting are converted to a form that can easily be dealt with by using a combination of HTML and CSS. Advanced editing is then supported by using JavaScript.

With Open Stack the following talk focussed on an open cloud stack project that was started by NASA and Rackspace as both simultanously needed support for an open source, openly designed, developed cloud stack that strives for community inclusion. According to the speaker the goal is to be as ubiquitous a cloud project as Apache is for web servers - he probably was not quite aware of how close to even the foundation side of Apache that development model is.

The closing keynote dealt with the way kernel development takes place. There were a few very interesting pieces of information for contributors that are valid for any open source project really:

  • Out of tree code is invisible to the kernel developers and users. As such the longer it remains out of tree code the harder it becomes to actually go out there and feel the wind.
  • In contrast open code means giving up control: Maintainership means responsibility but it does not come with any power or control over the source code. Similarly opening code up as patch or separate project at Apache means giving up control - means working towards turning the project into a community that can live on its own.
  • For kernel patches the general rule is to not break things and not go backward in quality: What is working for users today must be working with the next release as well. To be able to spot any compat issues it is necessary to take part on the wider disucssion lists - not only in your limited development community. Developers should focus on coming up with a problem solution instead of getting their original code into the project.

Or in short: The kernel is no research project, as such it must not break existing applications. Visionary brilliance really is no excuse for poor implementation. Conspiracy theories such as "hey, developer x declined my patch only because it is out of scope for his employer's goals" are not going to get you anywhere. Such things do happen, but in general kernel developers first think of themselves as kernel developers - being employee somewhere only comes after that.

Keep in mind that the community remembers past actions. In the end you need not convince business people or users but the developers themselves who might end up with the maintanance burden for your patch. To get your patch accepted it greatly helps to not express it in terms of the implementation needs only but to clearly formulate your requirements - independent of implementation. And as in any open source project, helping with cleanup (that is not only white space fixes, but real cleanup as in refactoring) does help build a positive attitude.

Why you should go for kernel development never the less? It's a whole lot of fun. It's a way to influence the kernel to support the features that you need. It's sort of like becoming part of an elite club - and which developer does not like the feeling of belonging to the elite changing the way the world looks tomorrow? In addition as with an substantial open source involvement being visible in the kernel community also most likely means being visible to your future employer.

FOSDEM - HBase at Facebook Messaging

2011-02-17 20:17

Nicolas Spiegelberg gave an awesome introduction not only to the architecture that powers Facebook messaging but also to the design decisions behind their use of Apache HBase as a storage backend. Disclaimer: HBase is being used for message storage, for attachements with Haystack a different backend is used.

The reasons to go for HBase include its strong consistency model, support for auto failover, load balancing of shards, support for compression, atomic read-modify-write support and the inherent Map/Reduce support.

When going from MySQL to HBase some technological problems had to be solved: Coming from MySQL basically all data was normalised - so in an ideal world, migration would have involved one large join to port all the data over to HBase. As this is not feasable in a production environment instead what was done was to load all data into an intermediary HBase table, join the data via Map/Reduce and import all into the target HBase instance. The whole setup was run in a dark launch - being fed with parallel life traffic for performance optimisation and measurement.

The goald was zero data loss in HBase - which meant using the Apache Hadoop append branch of HDFS. The re-designed the HBase master in the process to avoid having a single point of failure, backup masters are handled by zookeeper. Lots of bug fixes went back from Facebooks engineers to the HBase code base. In addition for stability reason rolling restarts were added for upgrades, performance improvements, consistency checks.

The Apache HBase community received lots of love from Facebook for their willingness to work together with the Facebook team on better stability and performance. Work on improvements was shared between teams in an amazing open and inclusive model to development.

One additional hint: FOSDEM videos of all talks including this one have been put online in the meantime.

FOSDEM - Django

2011-02-16 20:17

The languages/ cloud computing track on Sunday started with the good, the bad and the ugly of Django's architecture. Without much ado the speaker started by giving a high level overview of the general package layout of Django - unfortunately not going into too much detail on the architecture itself.

What he loves about Django are the model layer abstractions that really are no ORM only - instead both relational and non-relational databases can be supported easily. Abstractions in Django are made by task solved - there are multiple implementations available for caching, mailing, session handling etc. There is great geo support with options for defining geo objects, querying single points on a map for all their overlaying geo objects. Being a community of test driven people Django features awesome debugging and testing tools. To avoid cross side request forgery Django comes with built in protection mechanisms.

There is multi database support for building applications. Being a small core implementation features can be turned on and off as needed. In addition the framework comes with great documentation: No feature addition is accepted unless it comes with decent documentation - which fits nicely with the common perception that anything that is untested and undocumented does not exist.

The bad things about Django according to the speaker? Well, the old CSRF protection implementation that might lead to token leakage. Schema changes and migrations currently really are hard to handle. Though there is south to handle at least some of the migrations pain. The templating implementation could use some improvement as well - being designed to make inclusion of logic in the templates hard some use cases are just to clumsy to implement.

As for the ugly things: There is quite a bit of magic at work which generally leads to harder tracing of applications - that is about to get better. Too many parts of Django rely on unwieldy regular expressions. Anything that spans more than 4 lines on a screen probably is to be considered unmanageable and unchangeable. Authentication cannot really be customised - the information that is stored per user is hard coded and fixed.

Over time what was learned: Refactoring cannot be avoided as requirements change. However being consistent in what you do makes it so much easier for users to pick up the framework. What helps with creating a great open source project: People that have the time to invest - never under estimate the time needed to really go from prototype to production ready.

FOSDEM - Saturday

2011-02-15 20:17

Day one at FOSDEM started with a very interesting and timely keynote by Eben Moglen: Starting with the example of Egypt he voted for de-centralized distributed and thus harder to take over communication systems. In terms of tooling we are already almost there. Most use cases like micro blogging, social networking and real time communications can already be implemented in a distributed, fail safe way. So instead of going for convenience it is time to think about digital independence from very few central providers.

I spent most of the morning in the data dev room. The schedule was packed with interesting presentations ranging from introductory overview talks on Hadoop to more in depth treatment of the machine learning framework Apache Mahout. With an analysis of the Wikileaks cables the schedule also included case studies on what use cases can be implemented by thourough data anlysis. The afternoon featured presentations on the background to more data analytics for better usability at Wikimedia as well as talks on buiding search applications.

In the lightning talks room a wide variety of projects was presented - in only ten minutes Pieter Hintjens explained the gist of using 0MQ for messaging. That talk included "Hintjens law of concurrency: e = m * c^2, where e is effort needed to implement and maintain, m is mass - that is the amount of code written and c is complexity.

For me the day ended with a very interesting presentation by Matthias Kirschner/FSFE on one of their campaigns: has the very narrow and well scoped goal of getting links to unfree software off of governmental web pages. Using a really intuitive example they were able to convince officials of linking to their vendor neutral list of pdf readers: "Just imagine a road in your city. At this road drivers will find a sign that tells them the road is well suited to be used by VW cars. Those cars can be obtained for test drive at the following address. Your government." As unthinkable as such as sign may be that same text is included in nearly all governmental web pages linking to the acrobat reader.

What made pdfreaders successful is the combined effort of volunteers, its very narrow and clear scope, it's scalability by nature: People were asked to submit "broken" web pages to a bug tracker, campaign participants would then go and send out paper letters to these institutions and mark the bugs fixed as soon as the links were changed. Letters were pre-written and well prepared. So all that was needed was money for toner, paper and stamps.

One final cute example of how that worked out can be seen at

FOSDEM - video recordings online

2010-02-14 20:32
As published in the FOSDEM blog the video recordings are available online - at least for the main track and the lightning talks. Happy video watching!

FOSDEM 2010 - part 2

2010-02-09 21:00
The event itself featured 306 talks - so pretty hard to choose what to watch on two days. This time, not only the main tracks were awesome, but also several dev rooms featured very interesting talks by well known FOSS developers.

Saturday started with a FOSDEM birthday dance done by all attendees. The first keynote speaker Brooks Davis explained his experiences promoting open source methods at a large company. After that Richard Clayton gave an amazing talk on the evil on the internet. He explained not only how phishing works on a technical level but also included an explanation of the economics behind these attacks, explained how the money flow from victims to attackers works.

On the afternoon Bernard Li gave an introduction to the cluster monitoring tool Ganglia. Directly after that Lindsay Holmwood gave an overview of the monitoring and notification tools flapjack and cucumber-nagios.

The evening was filled with the speakers dinner. Thanks for the organisers for providing that. We had a really nice evening together with some of the organisers, Andrew Tanenbaum and Elena Reshetova at our table.

FOSDEM 2010 - part 1

2010-02-08 21:00
Four years ago I was working in Saarbrücken. From there it is a very short ride over to FOSDEM (little more than 300km). So I decided - hey, why not stay there for a weekend. I found a very nice Brussels bed and breakfast hotel called Rovignon - featuring not only comfortable rooms at reasonable prizes but also cats in the house.

Back then, I barely knew anyone at the conference. However the lineup of speakers including St Peter from XMPP and Georg Greve from FSFE was impressive.

As a result it became a loved tradition of Thilo and myself to drive over to Brussels, attend FOSDEM and watch great talks. Over time there were more and more familiar faces, e.g. at the FSFE booth, among the Debian people...

Last weekend I had an awesome time in Brussels at FOSDEM for the fourth time in a row. I am honoured to have been invited by the FOSDEM organisers for a main track talk on Hadoop in the scalability slot (in Janson...).

We arrived on Friday afternoon, however being awefully tired we unfortunately could not join the Friday evening beer event (though, as I am not drinking beer, I would probably have missed quite a bit of the fun).