Archive

Archive for the ‘Scrum’ Category

Part 2: Tracking tasks, or - Where the hack did my time go to last week?

September 3rd, 2010

After summarising some strategies for not loosing track of tasks, meetings and conferences in the last post, this one is going to focus on the retrospect on achievements. If at some point in time you have asked yourself “Where the hack did time go to?” - maybe after two busy weeks of work this article might have a few techniques for you.

Usually when that happens to me it’s either a sign that I’ve been on vacation (where that is totally fine) or that too many different, sometimes small but numerous tasks have sneaked into my schedule.

Together with Thilo I have found a few techniques helpful in dealing with these kind of problems. The goals in applying them (at least for me) have been:

  • Configure the planned work load to a manageable amount.
  • Make transparent and trackable (to oneself and others) which and how many tasks have been finished.
  • Track over time any changes in number of tasks accomplished per time slot.

After hearing about Scrum and its way of planning tasks I thought about using it not only for software development but for task planning in general. Scrum comprises some techniques that help achieving the goals described above:

  1. In Scrum, development is split into sprints: Iterations of focussed software development that are confined to a fixed length. Each sprint is filled up with tasks. The number of tasks put into one sprint is defined by the so-called velocity of the team.
  2. Tasks are ordered by priority by the product owner. Priority here is influenced by factors like risk (riskier tasks should be attacked earlier than safe ones), ROI (those tasks that promise to increase ROI most should of course be done and launched first) and a few more. After priorisation, tasks are estimated in order - that way those tasks most important to the product owner are guaranteed to have an estimated complexity defined even if there was not enough time to estimate all items.
  3. Complexity here does not mean “amount of time to implement a feature” - it’s more like how much time do we need, how much communication overhead is involved, how complex is the feature. A workable way to come up with reasonably sensible numbers is to chose one base item, assing complexity of one to it and estimate all coming items in terms of “is as complex as the base item”, “has double the complexity” - and so on - according to the fibonacci series. Fibonacci is well suited for that task as do not increase linearly - similarly humans are better at estimating small things (be it distances or complexities). As soon as items get too big, estimation also tends to be way off the real number.
  4. To come up with a reasonable estimate of what can be done in any week, I usually just look back to past weeks and use that as an estimate. That technique is close enough to the real number to be a working approach.

To track what got done during the past week, we use a whiteboard as Scrum Board. Putting tasks into the known categories of todo, checked out and done. That way when resetting the board after one week and adding tasks for the following week it is pretty obvious which actions ate up most of the time. The amount of work that goes onto the board is restricted to not be larger than what got accomplished during the past week.

So what goes onto the whiteboard? Basically anything that we cannot track as working hours: The Hadoop Get Together can be found just next to doing the laundry. Writing and sending out the long deferred e-mail is put right next to going out for dinner with potential sponsors for free software courses at university.

Now that weekly time tracking is set-up - is there a way to also come up with a nice longer term measure? Turns out, there are actually three:

First and most obviously the whiteboard itself provides an easy measure: By tracking weekly velocity and plotting that against time it is easy to identify weeks with more or less freetime. As a second source of information a quick look into ones calendar quickly shows how many meetings and conferences one attended over the course of a year. Last but not least it helps to track talks given on a separate webpage.

It helps to look back from time to time: To evaluate the benefit of the respective activities, to not loose track of the tasks accomplished, to prioritise and maybe re-order stuff on the ToDo list. Would be great if you’d share some of your techniques of tracking and tracing time and tasks - either in the comments or as a separate blog post.

Freetime, Hacking, Scrum ,

Berlin Scrumtisch - open discussion

August 25th, 2010

This evening the Berlin Scrumtisch took place in Friedrichshain. More than thirty participants followed Marion’s invitation for discussions on Scrum, wine and pizza at Vecchia Trattoria.

As there were several new participants, Felix started out with a very brief summary of the very core concepts of Scrum itself: Most important to know is the basic assumption of Scrum, that is planning ahead of time in a very detailed way is impossible. Defining goals and letting those who do the acutal work take the decisions on how to reach that goal is way easier and more promising. The whole process relies on fast feedback loops enabling developers and business people to run experiments on how to improve their work in a controlled environment.

Scrum comes with three roles: The development team responsible for delivering quality software, the product owner responsible for defining development goals that maximise return on investment and the scrum master as the moderator and facilitator who takes care that the roles and rituals are not broken.

Scrum comes with three plus one rituals: The daily standup (about 15min) used by the development team to get everyone up to date on a daily basis on everyone’s status, the Scrum Review and the Scrum Planning. In addition very important each sprint includes a retrospective that serves the purpose of improving the scrum team’s processes.

Scrum comes with three artifacts: The sorted product backlog of all user stories, the sprint backlog and the burndown chart showing the team’s progress.

However Scrum is just a framework - it tells you more on the goals, but less on exactly how to reach them. It should serve as a basis to adapt one’s processes to the project’s needs.

In the usual meetup planning phase we collected potential topics for discussions and ranked them by voting on them in the end. The topics proposed were:

  • Applications of Scrum for non-software-development projects.
  • How to convince teams of Scrum?
  • Awareness of the definition of done.
  • How to integrate testers in a team, extended by discussing the values of cross-functional teams.
  • How to be a tech PO.
  • Adding agile/XP to Scrum.
  • How to keep the team on focus.
  • Decision making in self organising teams.
  • Bonus HR in Scrum.

The topics rated highest were on raising the awareness for the definition of done and on decision making in self organising teams.

Definition of done

To introduce the topic, Felix repeated the goal of Scrum teams to deliver potentially quality - ahem - shipable software. ;) The problem of the guest and his co-workers was described as follows: Teams have definitions that are inconsistant not only across teams but also within teams.

Ultimately the goal of the definition of done is to enable teams to produce shipable software. One option to make the team aware of the need for better quality software might be to make them feel the pain their releases cause. It does not help to dictate a company-wide definition of done: It’s up to the team to define it. However to learn more on what shipable means the team must be allowed to make mistakes. They will fail - but learn from that failure as soon as they feel and see what gets influenced by their mistakes. As a resulst, they will refine their definition of done.

As the person to make happy in Scrum iterations is the PO, this could mean that the PO simply does not accept features, after all he is the one to define what shipable means. One factor that is a pre-condition for teams to be able to learn is to keep them stable. Learning needs time - teams need to be allowed to evolve. If yesterday’s team does not feel the pain their mistakes caused just because the team does not exist anymore or has been reconfigured - how would people be able to learn at all?

Decision making in self organising teams

The person proposing this question has the problem that in his teams some developers turn into leaders dictating the way software gets implemented. Other team members rarely join into that discussion and close to never take decisions. The result are endless discussions w/o real results.

The first idea that came up was for the Scrum Master to act as moderator. Marion came up with the proposal to use well known mediation techniques. She promised to share the links - would be great to have them published on the Scrumtisch Berlin blog as well. Thilo mentioned there are courses on mediation and moderation that can help him play that role.

As for long discussions: Felix mentioned a few typical patterns (or anti-patterns) that tend to lead to developers discussing endlessly:

  • Fear: fear of punishment for taking the wrong decision usually leads developers to avoid decisions altogether. Fix for that would be establishing and open culture that allows for failure and that enables people to learn from failure.
  • Striving for the 100% solution: Developers are not used to incremental thinking and try to solve all problems at once. Fix would be to teach them they get time for refactoring and are thus not punished for adhering to the YAGNI principle.
  • Personal conflicts in teams can lead to the described situation as well and can only be fixed by double-checking the team configuration, potentially changing it.

There is a very good book by Cohen on “Succeeding with agile” that has a whole chapter on what makes a good Scrum Master. Checking these properties against your chosen Scrum master might help as well.

When discussing this topic we soon discovered one problem with the team configuration as-is: Scrum masters used to be system architects or senior software developers - that is, highly respected, influencial people. Maybe simply re-configuring teams might help already.

Thanks to Marion for organising the evening - and thanks to all attendees for your questions and input on discussion topics. Looking forward to the next edition of the Scrumtisch.

Disclaimer: I usually just take notes on an old-fashioned paper-notebook, typing stuff into the blog after the meeting is over. Only reason I do it the same evening is the goal of keeping the list of draft postings as short as possible.

Scrum , , , , ,

Scrumtisch August Berlin

August 14th, 2010

Just seen it - the next Scrumtisch Berlin has been scheduled for 25th August 2010 at 18:30 Uhr. So far, no official talk has been scheduled, so please expect two topics on Scrum and its application to be selected for discussion according to Marion’s agile topic selection algorithm.

Please talk to Marion Eickmann if you would like to attend the next meetup.

Scrum , , ,

Series: Getting things done

July 30th, 2010

Probably not too unusual for people working on free software mostly (though no longer exclusively) in their spare time, the number of items that appear in my private calendar have increased steadily in the past months and years:

  • Every three months I am organising the Apache Hadoop Get Together in Berlin.
  • I have been asked (and accepted the offer) to publish articles on Hadoop and Lucene in magazines.
  • There are various conferences I attend - either as speaker or simply as participant: FOSDEM, Froscon, Apache Con NA, Devoxx, Chemnitzer Linuxtag - to name just a few.
  • For Berlin Buzzwords I did get quite a bit of time for organisation, still some issues leaked over to what others would call free time.
  • I am mentoring one of Mahout’s GSoC students which is a lot of fun.
  • At least I try to spend as much time as possible on the Mahout mailing lists keeping up with what is developed and discussed there.

There are various techniques to cope with increased work load and still find enough time to relax. Some of them involve simply remembering what to do at the right time, some involve prioritization, others deal with measuring and planning what to do. In this tiny series I’ll explain the techniques I employ - or at least try to - in the hope of getting your feedback, and comments on how to improve the system. After all, the most important task is to constantly improve ones own processes.

Freetime, Hacking, Scrum , , ,

Scrum - prepare your meetings

June 27th, 2010

“But Scrum involves so many meetings - we already have meetings like all day and don’t get to coding anything.” - “However we do need transparency and communication to build great software.” - “Actually scheduling and re-prioritising items during scrum planning takes so much time.” Does that sound familiar to you?

What if you could fit Sprint Review and Planning I for a team of three people doing three week sprints into one hour? Impossible? Well, not quite so. As always there is a very easy trick: Be prepared.

Before the review re-visit all items you have accomplished. Get the application into a state that makes it easy to demonstrate all new features to your product owner. If you are a team working on internal projects with many internal clients - get one of them to test our the new features early on (as in way before review) and get their input onto the table.

Get a list of all items up on a whiteboard. Then with the PO work you way through these items, either demonstrating them or referring to what the “real client” said about the feature. After that compute the velocity of your past sprint.

Then go over to the list of still to be done items. (You did estimate them in separate estimation meetings already, didn’t you? You also got the prioritised by your PO beforehand, didn’t you?) According to the computed velocity simply check out what you can do in the upcoming sprint.

After that team and PO are done. Guess what: Working with pen, whiteboard and paper did speed up our process considerably. After all, these are meetings of at least four people: Don’t want to waste working hours of four people by not using the fastest and most flexible planning method.

So - who’s responsible for getting all this information up? And - if using a digital planning tool, who is supposed to get it back into the tool? Trivial: It’s the scrum master’s job to provide all help and tools to the team to make it a high performing team. It’s way better to spend 2 extra hours preparing and “tidying up” the sprint planning than have four people spend 2 hours (total 6 vs. 8 hours) in a sub-optimal meeting.

Together with estimation meetings for me this means that each sprint one to two days go into organisational work. Still this is very low overhead compared to what the team is able to accomplish in that time.

Scrum

Using Scrum for software development

June 25th, 2010

A few months ago I entered a new team of until then two software developers. Being so small and with a rather busy product owner, until then people had followed the rituals of Scrum only loosely. When starting to work on a new component the three of us decided to change a few thing several weeks ago:

  • We would setup infrastructure to be somewhat similar to what we knew from Apache projects: All issues to be accomplished were to be tracked in our issue tracker. We would have a dev list that mirrors whatever goes to JIRA, a commit list to mirror whatever goes into svn and a user list.
  • We would try to follow Scrum rituals more closely: Dailies were re-introduced, we setup estimation meetings and got a cooking timer to stick with 60min per meeting, review and planning was supposed to be prepared and done with paper and pens on a whiteboard. After each planning we would have a planning II as well as a retrospective to improve our processes.

After the first weeks of following these ideas we did notice several improvements that are going to be described in upcoming blog posts.

Lets start with the first (trivial) change we made: Introducing a commit mailing list. With developers sitting all together in one room, communicating openly and regularly this may seem like a huge overkill. However there are a few changes that brought to how we develop:

People would suddenly start to think about what goes into svn: Being very obviously publicly readable, commit messages became much more readable, explaining what’s going on. Changes would be checked-in only if they belong to one logical change set. In return others would review what was checked in sort of automatically - spotting problems or questionable code and configuration very early. Changes that got done were made transparent to other team members very easily.

Of course the information is available anyway. However to be honest - I never really closely check each change set that got committed after issuing an svn up. Getting the stuff pushed to my inbox hugely improved the situation while still keeping a faster development cycle than when working with a review-before-commit model. This is especially important for cases where only smaller adjustments are being made. Where larger refactoring steps were necessary we would still get the code reviewed by our colleagues before checking it in.

Scrum

Velocity update

June 14th, 2010

After Berlin Buzzwords is over now, I thought it might be time to update a formerly published velocity graph. If you have been following this blog during the past few months you may know already, that we are using Scrum @ Home for several months already. I even published results of a Scrum Nokia test earlier this year.

Today I created another one of these nice velocity charts. What is interesting about this graph are two things. First of all: There are two spikes in the velocity graph. As our scrum board is used only for things done during our spare time, these spikes roughly correspond to German vacation days (Christmas and Easter).

More interestingly the work load shortely before Berlin Buzzwords happend was not much higher than usual. Weekend was totally free of any work - except for meeting with friends, going to town and attending the Barcamp. A sure sign of fantastic preparation and organisation with that few tasks left to do. Julia, thanks for helping with scheduling tasks such that no panicky late-night action was needed before, during or after the conference.

Scrum ,

Coaching self-organising teams

March 30th, 2010

Today, the Scrumtisch organised by Marion Eickmann from Agile 42 met in Berlin Friedrichshain. Though no talk was scheduled for this evening the room was packed with guests from various companies and backgrounds interested in participating in discussions on Scrum.

As usual we started collecting topics (timeboxed to five minutes). The list was rather short, however it contained several interesting pieces:

  • (6) Management buy-in
  • (6+) CSP - Certified Scrum Professional - what changes compared to the practitioner?
  • (4) Roles of Management in Scrum - how do they change?
  • (13) Coaching self-organising teams - team buy in.

Team buy-in

As prioritised by the participants the first topic discussed was on coaching self organising teams - with a heavy focus on team buy-in. The problem described dealt with transforming water fall teams that are used to receiving their work items into self organising teams that voluntarily accept responsibility for the whole project instead of just their own little work package.

The definition of self organising here really is about teams, that have no (and need no) dedicated team leader. On the contrary: leadership is automatically transferred to the person who - based on his skills and experiences - is best suited for the user story that is being implemented at any given time.

The problem the question targets is about teams, that really are not self organising, where developers do not take responsibility for the whole project, but just for their little pieces: They have their gardens - with fences around that protect them from others but also protect themselves from looking into other pieces of the project. Even worse - they tend to take these fences with them as soon as work items change.

Several ways to mitigate the problem were discussed:

  • Teams should work in a collaborative environment, should have clear tasks and priorities, whould get some pressure from the outside to get things done.
  • Some teams need to learn what working in a team - together - really means. It may sound trivial, but what about solving problems together: Spending one day climbing hills?
  • Committments should not happen on tasks (which by definition are well defined and small) but rather on Features/ user stories. Task breakdown should happen after the committment.
  • There are patterns to break user stories that are too large into multiple user stories. (Marion: Would be great, if I could add a link here ;) )
  • Teams need to be coached - not only the scrum master should get education, but the complete team. There are people interested in management that tend to read up on the topic after working hours - however these are rather rare…
  • Teams must be empowered - they must be responsible for the whole project and for the user stories they commit to. In return they must get what the need to get their tasks done.
  • Newly formed teams inexperienced with Scrum have to get the chance to make mistakes - to fail - and to learn from hat.

A great way to explain Scrum really is as a two-fold process: First half is about getting a product done, reviewing quality by the end of each sprint during the review. Second half is about improving the process to get the product done. Meeting to review the process quality is called retrospective.

Management buy-in

The second topic discussed was on the role of management in scrum - and how to convince management of Scrum. To some extend, Scrum means loosing power and control for management. Instead of micro-manageing people it’s suddenly about communicating your vision and scope. To get there, it helps to view lean management as the result of a long transformation:

  • First there is hierarchical management - with the manager at the top and employees underneath.
  • Second there is shared management - with the manager sitting between his employees enabling communication.
  • Third there is collaborative management - here the manager really is part of the team.
  • Fourth comes empowering management - this time the manager is only responsible for defining goals.
  • Last but not least there is lean management - where managers are merely coordinating and communicating the vision of a project.

To establish a more agile management form, there are several tasks to keep in mind: First and foremost, do talk to each other. Explain your manager what you are doing and why you are working in pairs, for instance. Being a manager, do not be afraid to ask questions - understanding what your developers do, helps you trust their work. Scrum is established, however there needs to be a clear communication of what managers loose - and what they win instead.

Scaling can only be done via delegation - however people need to learn how to delegate tasks. In technology we are used to learning new stuff every few years. In management this improvement cycle is not yet very common. However especially in IT it should be.

Being able to sell Scrum to customers is yet another problem: You need good marketing to sell Scrum to your customers. “Money for nothing change for free” is a nice to read on formulating agile contracts. Keep in mind, that the only way to really win all benefits is by doing all of Scrum - cherry picking may work to some extend, however you won’t get the full benefit from it. In most cases it works worse than traditionally managed projects.

After two very interesting and lively discussions moderated by Andrea Tomasini we finally had pizza, pasta and drinks - taking some of the topics offline.

Looking forward to seeing you in F-Hain for the next Scrumtisch in April.

Scrum , , , , , ,

Alan Atlas at Scrumtisch Berlin

February 17th, 2010

At the last Berlin Scrumtisch, @AlanAtlas gave a presentation on how he introduced Scrum at Amazon (starting as early as back in 2004). Introducing Scrum at Amazon by that time seemed natural due to a few factors:

  • Amazon was and is always very customer centric. The original methodology of working backwards in time - that is starting with the press release, from there writing the FAQ and manual and finally get to the code - really made people concentrate on the product.
  • Teams were sized according to the two-pizza rule: Each team is supposed to be no larger than can reasonably be fed for lunch with two pizzas. Turns out that is about five to ten people, given regular american pizzas.
  • Management didn’t really care exactly *how* the job of software development was done. They only cared *that* it was done. This proved as both - an advantage as it gave quite a bit of freedom - and a disadvantage - as indifference leads to impediments in the process that cannot be easily resolved.

The goal of Alan’s team was to build an infinitely scalable, zero downtime storage system that had a web service based interface - it was the S3 team. Given the task at hand, it was clear that the task wasn’t going to be anything even close to trivial. Alan’s idea: Try Scrum on that.

He himself came to the idea after attending an Agile conference and hearing about Scrum for the first time there. Alan requested a Scrum training from his manager, who approved it, provided it took place in Seattle - close to Amazon HQ that is. Turns out, the Scrum trainer available in Seattle happened to be Ken Schwaber.

The idea of Scrum basically was spread by word of mouth propaganda: In the hallways, in the smokers lounge, close to the coffee machine. People started adopting it - with some teams doing better, some worse. Allan himself got two days a quarter to give people an introduction to this funny new methodology called Scrum.

18 months later, S3 was shipped - as a side effect, the number of subscribers to the internal scrum mailing list had increased from three to 100, there were 150 CSM graduates, still three teams using XM, reputation of scrum was mostly positive.

In January 2007 others started giving their own introductions to Scrum. Developers generally liked it, there were significant failure cases, lots of resistance and misunderstanding mostly on the management side who sometimes confused it with lean production.

In June 2008 Allen became a fulltime internal Scrum trainer. First thing organised was a Scrum gathering - in an open spaces like setup people were invited to discuss their issues and questions on Scrum.

In January 2009, about 50% Scrum usage was reached, their were 600 subscribers on the Scrum mailing list, 450 CSM graduates, reputation of Scrum was as good (or bad) as months before. However it became more and more clear that further restructuring would only be possible against a lot of resistance.

There were a few lessons learned from introducing Scrum in other companies as well:

  • You cannot introduce Scrum if there is no notion of comparably stable teams (as in for at least six months, not five projects at a time per developer).
  • You need permission and ownership of committments to really get going.
  • You need to know at least the basics of Scrum.

There are a few factors that make introduction easier: Community matters. People need to be able to talk to each other, exchange experiences and learn from one another. Coaching matters. More support, immediate success in lots of cases is the result of getting a coach on the team. Credibility for management increases, Scrum implementation is easier understood that way, skepticism and resistance reduced. Management matters. Middle management must be part of the process. They will mess up scrum if not educated correctly. Going bottom-up works up to a point - but to go the whole way, you do need management.

For Amazon, that is the point where implementation got stuck for Allan: Management was neither interested nor really involved in Scrum. However there are some impediments that cannot be fixed w/o. Still Scrum is working to some extend - teams are still trying to get better and improve the process. So it is not really “Scrum, but”. Even going only part of the way helped a lot already.

As for the questions: There were a few unusual ones - e.g. on what to do with a team that is itself skeptical about Scrum introduction - especially if that was done by higher management. There are a few ways to remedy that: Point out success to the negative team member. Make that member a Scrum master to let him go through the process and really understand what is going on. And above all make the reasons for going that way clear and transparent. Including promises and wishes from that change.

Another question dealt with how to convince management of Scrum. Clearly better performing teams with happy developers (”you cannot buy developers with money only”) are some valid reasons.

Yet another question dealt with convincing customers. Allan’s way is to sneak Scrum into the process: Speak the customers language. If he does not like spring planning, call it a project status meeting.

Asked for cases of developers feeling like they work in a hamster wheel when doing Scrum, the general consensus was that it needs to be made sure that people do not only get more responsibilities but get the benefits from Scrum as well. Otherwise Scrum with all its numbers and measurements can be perfectly abused and turned into an amazing micro management tool.

Asked for other bay area companies using Scrum - yes, apple does so, Microsoft at least tries to. Google certainly uses the ideas, without calling it Scrum. Facebook seems not to use it. Xing (not bay area) uses Scrum. There seem to be about 50% of all software development companies worldwide who are using Scrum.

After the talk there was time to gather and have some pizza and pasta, time for drinks and discussions. I really appreciated the comments and ideas exchanged after the talk. Hope to see you all next time around.

Hadoop, Scrum , ,

The 7 deadly sins of (Java) software developers

January 23rd, 2010

On Lucid Imaginations Blog Jay Hill published a great article on The seven deadly sins of solr. Basically it is a collection of his experiences “analyzing and evaluating a great many instances of Solr implementations, running in some of the largest Fortune 500 companies”. It is a collection of common mistakes, mis-configurations and pitfalls in Solr installations in production use.

I loved the article very much. However, many of the symptoms that Jay described in his article do not apply to Solr installations only. In the following I will try to come up with a more general classification of errors that occur when your average Java developer starts using a sufficiently large framework that is supposed to make his work easier. Happy about any input on your favourite production issues.

Remark: What is printed in italic is quoted as is.

Sin number 1: Sloth - I’ll do it later

Let’s define sloth as laziness or indifference. This one bites most of us at some time or another. We just can’t resist the impulse to take a shortcut, or we simply refuse to acknowledge the amount of effort required to do a task properly. Ultimately we wind up paying the price, usually with interest.

There is even a name for it in Scrum: Technical debt. It may be ok to take a shortcut, given this is done based on an informed decision. As with regular debt, you may get a big advantage like launching way earlier than your competitor. However as with real debt, it does come at a prize.

Lack of commitment

Jay describes the problems that are especially frequent when switching search applications: Humans in general do not like giving up their habits. A nice example described in more detail in a recent Zeit article is what happens each year in December/ January when the first snow falls: It is by no means irregular or not to be expected that it starts snowing in December in Germany. However there will be lots of people who are not prepared for that. They refuse to put on winter tiers in late autumn. They use their car instead of public transport despite warnings in public press. The conclusion of the article was simple: People are simply not willing to change habits they got used to. It does take longer and is a bit less flexible to get to work by public transport instead of your own car. It does require adjusting your daily routine, optimising your processes.

Something similar happens to a developer that is “forced” to switch technology, be it the search server, the database, the build system or simply the version control system: The old ways of doing stuff simply may not work as expected. New tools might be called for. New technologies to learn. However in not so seldom cases developers just blame the new tools: “But with the old setup this would always work.”

Developing software - probably more than anything else - means constant development, constant change. Technologies shift as tasks shift, tools are improved as workflows change. Developing software means to constantly watch closely what you are doing, reflecting on what works and what doesn’t and changing things that don’t work. Accepting change, seeing it as a chance rather than an obstacle is critical.

If however change is imposed on developers though good arguments in favour of the old approach exist, it may be worth the effort to at least take the technical view into account to make an informed decision.

Not reviewing, editing, or changing the default configuration files.

I have extended this one a bit: Developers not changing default configuration files are not that uncommon. Be it the default database configuration, default logging configuration for your modules or default configuration of your servlet container. Even if you are using software pre-packed by your distribution, it is still worth the effort to review configuration files for your services and adjust them to your needs. Usually they are to be used as examples that still need tweaking and customization after roll-out.

JVM settings and GC

If you are running Java application there is no way around to adjust GC settings as well as general JVM settings to your particular use case. There are great tutorials at sun.com that explain both the settings themselves as well as several rules-of-thumb of where to start. Still nothing should stop you from measuring your particular application and its specific needs - both, before and after tuning. Along with that goes the obvious recommendation to simply “know-your-tools” - learning load testing tools shortly before launch time is certainly no good choice. Trying to find out more on Java memory analysis late in the development cycle just because you need to find that stupid memory leak like *now* is no good idea neither.

There are several nice talks as well as several tutorials available online on the topic of JVM tuning, debugging memory as well as threading issues, one of them being the talk by Rainer Jung at Frocson 2008.

Sin number 2: Greed

Running a service on insufficient hardware (be it main memory, harddisks, bandwidth, …) is not only an issue with Solr installations. There are many cases where just adding hardware may help in the short run, but is a dead-end in the long run:

  • Given a highly inefficient implementation, identifying bottlenecks, profiling, benchmarking and optimization go a long way.
  • Given an inappropriate architecture, redesign, reimplementation and maybe even switching base technologies does help.

However as Jay pointed out, running production servers with less power than your average desktop Mac has does not help neither.

Sin number 3: Pride

Engineers love to code. Sometimes to the point of wanting to create custom work that may have a solution in place already, just because: a) They believe they can do it better. b) They believe they can learn by going through the process. c) It “would be fun”. This is not meant to discourage new work to help out with an open-source project, to contribute bug fixes, or certainly to improve existing functionality. But be careful not to rush off and start coding before you know what options already exist. Measure twice, cut once.

Don’t re-invent the wheel.

As described in Jay’s post, there are developers who seem to be actively searching for reasons to re-invent the wheel. Sure, this is far easier with open source software than with commercial software. Access to code here makes the difference: Understanding, learning from, sharing and improving the software is key to free software.

However there are so many cases where improve does not mean re-implement but submitting patches, fixing bugs, adding new features to the orignal project or just refactoring the original code and ironing out some well known bumbs to make life easier for others.

Every now and then a new query abstraction language for map reduce pops up. Some of those really solve distinct problem settings that cannot (and should not) be solved within one language. Especially if a technology is young, this is pretty usual as people try out different approaches to see what works and what does not work out so well. Good and stable things come from that - in general the fittest approach survives. However, too often I have heard developers excusing their re-invention by “having had too few time to do a throughough evaluation of existing frameworks and libraries”. The irony here really is that usually, coding up your own solution does take time as well. In other cases the excuse was missing support for some of the features needed. How about adding those features, submitting them upstream and benefitting from what is already there and an active community supporting the project, testing it, applying fixes and adding further improvements?

Make use of the mailing lists and the list archives.

Communication is key to success in software development. According to Conway’s law “Organizations
which design systems are constrained to produce systems which are copies of the communication structures of these organizations.” I guess it is pretty obvious that developing software today generally means designing complex systems.

In Open source, mailing lists (and bug trackers, the code itself, release notes etc.) are all ways for communication. (See also Bertrand’s brilliant talk on open source collaboration tools for that). With in-house development there is even added benefit as face-to-face communication or at least teleconferencing is possible.

However software developers in general seem to be reluctant to ask questions, to discuss their design, their implementation and their needs for changes. It just seems simpler to work-around a situation that disturbs you instead of propagating the problem to its source - or just asking for the information you need. A nice article on a related topic was published recently it-republik.

However asking these questions, taking part in these discussions is what makes software better. It is what happens regularly within open source projects in terms of design discussions on mailing lists, discussions on focussed issues in the bug tracker as well as in terms of code review.

There are several best practices that come with Agile Development that help starting discussions on code. Pair programming is one of these. Code reviews are another example. Having more than two eye balls look at a problem usually makes the solution more robust, gives confidence in what was implemented and as a nice side effect spreads knowledge on the code avoiding a single point of failure with just one developer being familiar with a particular piece of code.

Sin number 4: Lust

Must have more!You’ll have to grant me artistic license on this one, or else we won’t be able to keep this blog G-rated. So let’s define lust as “an unnatural craving for something to the point of self-indulgence or lunacy”. OK.

Setting the JVM Heap size too high, not leaving enough RAM for the OS.

Jay describes how setting the JVM RAM allocation too high can lead to Java eating up all memory and leaving nothing for the OS. The observation does not apply to Solr deployments only. Tomcat is just yet another application where this applies as well. Especially with IO-bound applications giving too much memory to the JVM is grave as the OS does not longer have enough space for disk caches.

The general take-away probably should be to measure and tune according to the real observed behaviour of your application. A second take-home message would be to understand your system - not only the Java part of it, but the whole machine from Java, the OS down to the hardware - to tune it effectively. However that should be a well known fact anyway. For Java developers, it sometimes helps to simply talk to your operations guys to get the bigger picture.

Too much attention on the JVM and garbage collection.

There are actually two aspects here: For one, as described by Jay it should not be necessary to try every arcane JVM or GC setting unless you are a JVM expert. More precisely, simply trying various options w/o understanding, what they mean, what side-effects they have and in which situations they help obviously isn’t a very good idea.

The second aspect would be developers starting with JVM optimization only to learn later on that the real problem is within their own application. Tuning JVM parameters really should be one of the last steps in your optimization pipeline. First should be benchmarking and profiling your own code. At the same stage you should review configuration parameters of your application (size of thread pools, connection pools etc.) as well your libraries and frameworks (here come solr’s configuration files, Tomcat’s configuration, RDBMs configuration parameters, cache configurations…). Last but not least should be JVM tuning - starting with adjusting memory to a reasonable amount, setting the GC configuration that makes most sense to your application.

Sin number 5: Envy

Bah!

Wanting features that other sites have, that you really don’t need.

It should be good engineering practice to start with your business needs and distill user stories from that and identify the technology that solves your problem. Don’t go from problem to solution without first having understood your problem. Or even worse: Don’t go from solution (that is from a technology you would love to use) to searching for a problem that this solution might solve: “But there must be a RDBMS somewhere in our architecture, right?”

Wanting to have a bigger index than the other guy.

The antithesis of the “greed” issue of not allocating enough resources. “Shooting for the moon” and trying to allow for possible growth over the next 20 years. Another scenario would be to never fix your system but leave every piece open and configurable, in the end leading to a system that is harder to configure than sendmail is. Yet another scenario would be to plan for billions of users before even launching: That may make sense for a new Google gadget, however for the “new kid on the block”? Probably unlikely, unless you have really good marketing guys. Plan for what is reasonable in your project, observe real traffic and identify real bottlenecks once you see them. Usually estimations of what bottlenecks could be are just plain wrong unless you have lot’s of experience with the type of application you are building. As Jeff Dean pointed out in his WSDM 2009 keynote, the right design for X users may still be right with 10x the amount of users. But do plan a rewrite at about the time you start having 100x and more the amount of users.

Sin number 6: Gluttony

“Staying fit and trim” is usually good practice when designing and running Solr applications. A lot of these issues cross over into the “Sloth” category, and are generally cases where the extra effort to keep your configuration and data efficiently managed is not considered important.

Lack of attention to field configuration in the schema.

Storing fields that will never be retrieved. Indexing fields that will never be searched. Storing term vectors, positions and offsets when they will never be used. Unnecessary bloat. Understand your data and your users and design your schema and fields accordingly.

On a more general scale that might be wrapped into the general advise of keeping only data that is really needed: Rotate logs on a schedule fit to your business, operations needs and based on available machines. Rotate data written into your database backend: It may make sense to keep users that did not interact with your application for 10 years. If you have a large datacenter for storage that may make even more sense. However usually keeping inactive users in your records simply eats up space.

Unexamined queries that are redundant or inefficient.

Queries that catch too much information, are redundant or multiple queries that could be folded into one are not only a problem for Solr users. Anyone using data sources that are expensive to query probably knows how to optimize those queries for reduced cost.

Sin number 7: Wrath

Now! While wrath is usually considered to be synonymous with anger, let’s use an older definition here: “a vehement denial of the truth, both to others and in the form of self-denial, impatience.”

Assuming you will never need to re-index your data.

Hmm - don’t only backup. Include recovery in your plans! Admittedly with search applications, this includes keeping the original documents - it is not unusual to add more fields or to want to parse data differently from the first indexing run. Same applies if you are post-processing data that has been entered by users or spidered from the web for tasks like information extraction, classifier training etc.

Rushing to production.

Of course we all have deadlines, but you only get one chance to make a first impression. Years ago I was part of a project where we released our search application prematurely (ahead of schedule) because the business decided it was better to have something in place rather than not have a search option. We developers felt that, with another four weeks of work we could deliver a fully-ready system that would be an excellent search application. But we rushed to production with some major flaws. Customers of ours were furious when they searched for their products and couldn’t find them. We developed a bad reputation, angered some business partners, and lost money just because it was deemed necessary to have a search application up and running four weeks early.

Leaving that as is - just adding, this does not apply to search applications only ;)

So keep it simple and separate, stay smart, stay up to date, and keep your application on the straight-and-narrow (YAGNI ;) ). Seek (intelligently) and ye shall find.

Free Software, Hacking, Lucene, Scrum , , ,