Archive

Posts Tagged ‘CouchDB’

Scalability

June 23rd, 2010 at 11:17am

For Berlin Buzzwords we concentrated pretty heavily on scalable systems and architectures: We had talks on Hadoop for scaling data analysis; HBase, Cassandra and Hypertable for scaling data storage; Lucene and Solr for scaling search.

A recurring pattern was people telling success stories about projects that involve either large amounts of data or growing user numbers. Of course the whole topic of scalability is extremely interesting for ambitious developers: Who would not be happy to solve internet-scale problems, have petabytes of data at their fingertips or tell others that their “other computer is a data center”?

There are however two aspects of scalability that people tend to forget pretty easily: First off, if you are designing a new system from scratch that implements a so far unknown business case - your problem most likely is not scalability. It’s way more likely that you have to solve marketing tasks, just getting people to use your cool new application. Only after observing what users actually do and use do you have the slightest chance of spotting the real bottlenecks and optimising with clear goals in mind (e.g. reduce database load for user interaction x by 40%).

The second issue people tend to forget about scalability is that the term is about scaling systems - some developers easily mix that up with high performance. The goal is not to be able to deal with a high workload, but to build a system that can deal with an increasing (or decreasing) workload. Ultimately this means that not only your technology must be scalable: Any architecture can only scale to a certain load. The organisation building the system must be willing to continuously monitor the application they built - and be willing to revisit architectural decisions if the environment changes.

Jan Lehnardt had a very interesting point in his talk on CouchDB: When talking about scalability, people usually look into the upper right corner of the application benchmark. However, to be truly scalable one should also look into the lower left corner: Being scalable should not only mean being able to scale systems up - but also being able to scale them down. In the case of CouchDB this means that not only large installations at the BBC are possible - but running the application on mobile devices should be possible without problems as well. It’s an interesting point in the current “high scalability” hype.

Free Software

Dev House Berlin 2.0

October 4th, 2009 at 8:04pm

This weekend DevHouseBerlin took place in the Box119, kindly organized by Jan Lehnardt and sponsored by Upstream and StudiVZ. There were about 30 people gathered in Friedrichshain, hacking and discussing various projects: Mostly Python/Django, Ruby/Rails and Erlang people.

The first day was reserved for hacking and exchanging ideas. In the late afternoon attendees put together a list of talks that were then rated and ranked, with the top three chosen for presentation on Sunday. The list included topics on CouchDB, RestMS, Hadoop, Concurrency in Erlang, P2P CouchDB and many more. The first three topics were chosen by the participants for presentation.

During the time at DevHouse I finally got a list of topics and papers up at the Mahout TU project - now only the exact credit system for the Mahout course at TU is missing. I got some time to work on Mahout improvements and documentation. Unfortunately I was too tired today to complete the code review for MAHOUT-157 - I promise to do that early next week.

Spending one weekend with like-minded people and being able to pair with someone else on more complex problems made the weekend a great time for me. I am planning to be there again next year. Thanks to the sponsors and organisers for making this happen.

*Camp, Hacking