Open Source Summit - Day 3

2017-10-29 08:35
Open Source Summit Wednesday started with a keynote by members of the Banks family telling a packed room how they approached raising a tech family. The first hurdle that Keila (the teenage daughter of the family) talked about was something I personally had never thought about: Communication tools like Slack that are in widespread use come with an age restriction excluding minors. So for minors, simply trying to communicate with open source projects means violating those terms.

A bit more obvious was their advice on how to raise kids' engagement with tech: Try to find topics that they can relate to. What works fairly often are reverse engineering projects that explain how things actually work.

The Banks are working with a goal-based model where children get ten goals to pursue during the year, with regular quarterly reviews. An interesting twist though: Eight of these ten goals are chosen by the children themselves, two are reserved for the parents to help with guidance. As obvious as this may seem, having clear goals and being able to influence them yourself is something that I believe is applicable in the wider context of open source contributor and project mentoring, as well as employee engagement.

The speakers also talked about embracing children's fear. Keila told the story of how she was afraid to talk in front of adult audiences - in particular at the keynote level. The advice her father gave that did help her: You can trip on the stage, you can fall, none of that matters as long as you can laugh at yourself. Also remember that no project is the perfect project - there's always something you can improve - and that's ok. This is fairly in line with the feedback given a day earlier during the Linux Kernel Panel, where people mentioned how today they would never accept the first patch they themselves had once written: Be persistent, learn from the feedback you get and seek feedback early.

Last but not least, the speakers advised to not compare your family to anyone, not even to yourself. Everyone arrives at tech via a different route. It can be hard to move people from being averse to tech to embracing it - start with a tiny little bit of motivation, and from there on rely on self-motivation.

The family's current project turned business supports L.A. schools in helping children get a handle on tech.

The TAO of Hashicorp

In the second keynote Hashimoto gave an overview of the Tao of Hashicorp - essentially the values and principles the company is built on. What I found interesting about the talk was that these values were written down very early in the process of building up Hashicorp - when the company didn't have much more than five employees - comprising vision, roadmap and product design pieces, and have been applied to everyday decisions ever since.

The principles themselves cover the following points:
  • Workflows - not technologies. Essentially describing a UX first approach where tools are being mocked and used first before diving deeper into the architecture and coding. This goes as far as building a bash script as a mockup for a command line interface to see if it works well before diving into coding.
  • Simple, modular and composable. Meaning that tools built should have one clear purpose instead of piling features on top of each other in one product.
  • Communicating sequential processes. Meaning to have standalone tools with clear APIs.
  • Immutability.
  • Versioning through Codification. When having a question, the answer "just talk to X" doesn't scale as companies grow. There are several fixes to this problem. The one that Hashicorp decided to go for was to write knowledge down in code - instead of having a README.md detailing how startup works, have something people can execute.
  • Automate.
  • Resilient systems. Meaning to strive for systems that know their desired state and have means to go back to it.
  • Pragmatism. Meaning that the principles above shouldn't be applied blindly but adjusted to the problem at hand.
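The "Versioning through Codification" point above lends itself to a short sketch: instead of a README describing how startup works, keep the knowledge as something executable. This is a hypothetical illustration - the step names and `myapp` commands are invented, not from the talk:

```python
# Sketch: startup knowledge kept as executable steps rather than prose.
# The commands below are placeholders for whatever your service needs.

def startup_steps():
    """Return the ordered startup steps as (name, command) pairs."""
    return [
        ("check config", "test -f /etc/myapp/config.yml"),
        ("run migrations", "myapp migrate"),
        ("start service", "myapp serve --port 8080"),
    ]

def as_runbook(steps):
    """Render the steps as a shell runbook anyone can execute directly."""
    lines = ["#!/bin/sh", "set -e"]  # abort on the first failing step
    for name, command in steps:
        lines.append(f"echo '==> {name}'")
        lines.append(command)
    return "\n".join(lines)

if __name__ == "__main__":
    print(as_runbook(startup_steps()))
```

The point is the shape, not the tooling: the ordering and preconditions live in code that is versioned and runnable, so "just talk to X" is no longer required.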


While the content itself differs, I find it interesting that Hashicorp decided to communicate in terms of their principles and values. This kind of setup reminds me quite a bit of the way the Amazon Leadership Principles are applied and used inside of Amazon.

Integrating OSS in industrial environments - by Siemens

The third keynote was given by Siemens, a 170 year old German corporation with 350k employees, focussed on industrial appliances.

In their current projects they are using OSS in embedded projects related to power generation, rail automation (Debian), vehicle control, building automation (Yocto), medical imaging (xenomai on big machines).

Their reason for tapping into OSS more and more is to grow beyond their own capabilities.

A challenge in their applications relates to long term stability, meaning supporting an appliance for 50 years and longer. Running these appliances unmodified for years is no longer feasible today due to policies and corporate standards that require updates in the field.

Trouble they are dealing with today lies in the cost of software forks - both self-inflicted and supplier-caused ones. The cost attached to these is one of the reasons for Siemens to think upstream-first, both internally as well as when choosing suppliers.

Another reason for this approach lies in trying to become part of the community, for three reasons: Keeping talent. Learning best practices from upstream instead of failing on one's own. Better communication with suppliers through official open source channels.

One project Siemens is involved with at the moment is the so-called Civil Infrastructure Platform project.

Another huge topic within Siemens is software license compliance. Being a huge corporation they rely on Fossology for compliance checking.

Linus Torvalds Q&A

The last keynote of the day was an on-stage interview with Linus Torvalds. The introduction to this kind of format was lovely: There's one thing Linus doesn't like: Being on stage and giving a prepared talk. Giving his keynote in the form of an interview with questions not shared prior to the actual event meant that the interviewer had to prep the actual content. :)

The first question asked was fairly technical: Are RCs slowing down? The reason that Linus gave had a lot to do with proper release management. Typically the kernel is released on a time-based schedule, with one release every 2.5 months. So if some feature doesn't make it into a release it can easily be integrated into the following one. What's different with the current release is Greg Kroah-Hartman having announced it would be a long term support release, so suddenly devs are trying to get more features into it.

The second question related to a lack of new maintainers joining the community. The reasons Linus sees for this are mainly related to the fact that being a maintainer today is still a fairly painful job: You need experience to quickly judge patches so the flow doesn't get overwhelming. On the other hand you need to have shown the community that you are around 24/7, 365 days a year. What he wanted the audience to know is that despite occasional harsh words he loves maintainers, and the project does want more of them. What's important to him isn't perfection - but having people that will own up to their mistakes.

One fix to the heavy load mentioned earlier (which was also discussed during the kernel maintainers' panel a day earlier) revolved around the idea of having a group of maintainers responsible for any single sub-system in order to avoid volunteer burnout, allow for vacations to happen, share the load and ease hand-over.

Asked about kernel testing Linus admitted to having been sceptical about the subject years ago. Today he's a really big fan of random testing / fuzzing in order to find bugs in code paths that are rarely if ever exercised by developers.

Asked about what makes a successful project, his take was the ability to find commonalities that many potential contributors share and the ability to find agreement, which seems easier for systems with less user visibility. An observation that reminded me of the bikeshedding discussions.

Also he mentioned that the problem you are trying to solve needs to be big enough to draw a large enough crowd. When it comes to measuring success though his insight was very valuable: Instead of focussing too much on outreach or growth, focus on deciding whether your project solves a problem you yourself have.

Asked about what makes a good software developer, Linus mentioned that the community over time has become much less homogeneous compared to when he started out in his white, male, geeky, beer-loving circles. The things he believes are important for developers are caring about what they do, and being able to invest in their skills for a long enough period to develop perfection (much like athletes train a long time to become really successful). Also having fun goes a long way (though in his eyes this is no different when trying to identify a successful marketing person).

While Linus isn't particularly comfortable interacting with people face-to-face, e-mail for him is different. He does have side projects beside the kernel. Mainly for the reason of being able to deal with small problems, actually provide support to end-users, do bug triage. In Linux kernel land he can no longer do this - if things bubble up to his inbox, they are bound to be of the complex type, everything else likely was handled by maintainers already.

His reason for still being part of the Linux Kernel community: He likes the people, likes the technology, loves working on stuff that is meaningful, that people actually care about. On vacation he tends to check his mail three times a day so as not to lose track and be overwhelmed when he gets back to work. There are times when he goes offline entirely - however typically after one week he is longing to be back.

Asked about what further plans he has, he mentioned that for the most part he doesn't plan ahead of time, spending most of his life reacting and being comfortable with this state of things.

Speaking of plans: It was mentioned that likely Linux 5.0 is to be released some time in summer 2018 - numbers here don't mean anything anyway.

Nobody puts Java in a container

Jörg Schad from Mesosphere gave an introduction to how container technologies like Docker really work and how that applies to software run in the JVM.

He started off by explaining the advantages of containers: Isolating what's running inside, supplying standard interfaces to deployed units, sort of the write once, run anywhere promise.

Compared to real VMs they are more lightweight, however with the caveat of sharing the host kernel - meaning that crashing the kernel crashes all container instances running on that host as well. In turn they are faster to spin up, and need less memory and less storage.

So which properties do we need to look at when talking about having a JVM in a container? Resource restrictions (CPU, memory, device visibility, blkio etc.) are being controlled by cgroups. Process spaces for e.g. pid, net, ipc, mnt, users and hostnames are being controlled through libcontainer namespaces.

Looking at cgroups there are two aspects that are obviously interesting for JVM deployments: For memory one can set hard and soft limits. However, much in contrast to the JVM, there is no OutOfMemoryError thrown when the limit is exhausted - the kernel's OOM killer simply terminates the offending process. For the CPUs available there are two ways to configure limits: cpushares lets you give processes a relative priority weighting; cpusets lets you pin groups to specific CPUs.

General advice is to avoid cpusets, as it removes one degree of freedom from scheduling and often leads to less efficiency. However it's a good tool to avoid CPU-bouncing and to maximise cache usage.
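To illustrate how relative the cpushares weighting is, here is a small sketch (the cgroup names and share values are made up) of the CPU fraction each group receives when all of them are busy at once:

```python
def cpu_fraction(shares):
    """Given cpu.shares weights for co-located cgroups, return the CPU
    fraction each one receives when all of them contend for the CPU.
    Shares are purely relative weights: idle groups give their time away."""
    total = sum(shares.values())
    return {name: weight / total for name, weight in shares.items()}

# Two JVMs at the common default weight of 1024 plus a batch job at 512:
fractions = cpu_fraction({"jvm-a": 1024, "jvm-b": 1024, "batch": 512})
```

Under full contention jvm-a and jvm-b each get 40% and the batch job 20%; with cpusets the assignment would instead be an absolute pinning to specific cores.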

When trying to figure out the caveats of running JVMs in containers one needs to understand what the memory requirements for JVMs are: In addition to the well known, configurable heap memory, each JVM needs a bit of native JRE memory, permgen/metaspace, JIT bytecode space, JNI and NIO space as well as additional native space for threads. With permgen space turned into native metaspace, that means class loader leaks are capable of maxing out the memory of the entire machine - one good reason to lock JVMs into containers.
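The memory areas listed above add up to noticeably more than the heap alone, which matters when picking a container limit. A rough, illustrative budgeting sketch - all overhead numbers here are invented placeholders, not JVM defaults:

```python
MB = 1024 * 1024

def container_memory_budget(heap_mb, threads,
                            metaspace_mb=128, code_cache_mb=64,
                            thread_stack_mb=1, native_overhead_mb=64):
    """Estimate the container memory limit (in bytes) needed for one JVM:
    the heap plus metaspace, JIT code cache, per-thread stacks, and a
    fudge factor for JNI/NIO buffers and other native allocations.
    All default values are illustrative assumptions."""
    return (heap_mb + metaspace_mb + code_cache_mb
            + threads * thread_stack_mb + native_overhead_mb) * MB
```

For a 512 MB heap with 200 threads this comes out near 1 GB - setting the cgroup limit equal to -Xmx would get the process OOM-killed.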

The caveats of putting JVMs into containers are related to JRE initialisation defaults being influenced by information like the number of cores available: It influences the number of JIT compilation threads, hotspot thresholds and limits.

One extreme example: When running ten JVM containers on a 32 core box this means that:
  • Each JVM believes it's alone on the machine, configuring itself to the maximally available CPU count.
  • Pre-Java-9 the JVM is not aware of cpusets, meaning it will think it can use all 32 cores even if configured to use less than that.
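The CPU-quota side of this can be sketched as follows - a simplified version of the computation a container-aware runtime has to perform, using the cgroup v1 convention that a quota of -1 means unlimited:

```python
import os

def effective_cpus(quota_us, period_us=100_000, host_cpus=None):
    """CPUs a containerised process should assume it has: the cgroup CPU
    quota divided by the period, falling back to the host CPU count when
    no quota is set (quota_us == -1 in cgroup v1)."""
    if host_cpus is None:
        host_cpus = os.cpu_count() or 1
    if quota_us < 0:
        return host_cpus  # unlimited: all host CPUs are visible
    return max(1, quota_us // period_us)
```

A container limited to a quota of 200000µs per 100000µs period should size thread pools for 2 CPUs - but a pre-Java-9 JVM on the 32 core box above would report 32.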


Another caveat: JVMs typically need more resources on startup, leading to a need for overprovisioning just to get it started. Jörg promised a blog post to appear on how to deal with this question on the DC/OS blog soon after the summit.

Also for memory, Java 9 provides the option to respect memory limits set through cgroups. The (still experimental) flag for that: -XX:+UseCGroupMemLimitForHeap

As a conclusion: Containers don't hide the underlying hardware - which is both, good and bad.

Goal - question - metric approach to community measurement

In his talk on applying the goal-question-metric approach to software development management, Jose Manrique Lopez de la Fuente explained how to successfully choose and use metrics in OSS projects.

He contrasted the OKR-based approach to goal setting with the goal-question-metric approach. In the latter, one first thinks about a goal to achieve (e.g. "We want a diverse community."), goes from there to questions that help understand the path to that goal better ("How many people from underrepresented groups do we have?"), and on to actual metrics that answer each question.
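A toy sketch of wiring the three levels together - the goal and question mirror the talk's diversity example, while the data structure and field names are invented for illustration:

```python
def underrepresented_share(contributors):
    """Metric answering 'how many people from underrepresented groups do
    we have?' as a share of all contributors (0.0 for an empty list)."""
    if not contributors:
        return 0.0
    hits = sum(1 for c in contributors if c["underrepresented"])
    return hits / len(contributors)

# Goal -> question -> metric, kept together so the metric never floats
# free of the goal it serves:
gqm = {
    "goal": "We want a diverse community.",
    "question": "How many people from underrepresented groups do we have?",
    "metric": underrepresented_share,
}
```

Keeping the metric attached to its goal like this makes it easy to retire the metric once the goal changes - matching the talk's point that metrics are only useful as long as they are linked to a goal.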

Key to applying this approach is a cycle that integrates planning, making changes, checking results and acting on them.

Goals, questions and metrics need to be in line with project goals, involve management and involve contributors. Metrics themselves are only useful for as long as they are linked to a certain goal.

What it takes to make this approach successful is a mature organisation that understands the metrics' value and refrains from gaming the system. People will need training on how to use the metrics, as well as transparency about the metrics themselves.

Projects dealing with applying more metrics and analytics to OSS projects include Grimoire Lab, CHAOSS (Community Health Analytics for OSS).

There are a couple of interesting books: Managing inner source projects and Evaluating OSS projects, as well as the Grimoire training - all freely available online.

Container orchestration - the state of play

In his talk Michael Bright gave an overview of current container orchestration systems. In his talk he went into some details for Docker Swarm, Kubernetes, Apache Mesos. Technologies he left out are things like Nomad, Cattle, Fleet, ACS, ECS, GKE, AKS, as well as managed cloud.

What became apparent from his talk was that the high level architecture is fairly similar from tool to tool: Orchestration projects make sense where there are enough microservices to be unable to treat them like pets with manual intervention needed in case something goes wrong. Orchestrators take care of tasks like cluster management, micro service placement, traffic routing, monitoring, resource management, logging, secret management, rolling updates.

Often these systems build a cluster that apps can talk to, with masters managing communication (coordinated through some sort of distributed configuration management system, maybe a RAFT-based consensus implementation to avoid split-brain situations) as well as workers that handle requests.
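The split-brain avoidance mentioned here boils down to majority quorums - a minimal sketch of the arithmetic that RAFT-style consensus systems rely on:

```python
def majority_quorum(cluster_size):
    """Smallest number of nodes that constitutes a strict majority."""
    return cluster_size // 2 + 1

def has_quorum(reachable_nodes, cluster_size):
    """A partition may only elect a leader if it still sees a majority -
    so in an even 2/2 split of a 4-node cluster, neither half proceeds,
    which is exactly how split brain is avoided."""
    return reachable_nodes >= majority_quorum(cluster_size)
```

This is also why such clusters are usually deployed with an odd number of masters: a 4-node cluster tolerates no more failures than a 3-node one.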

Going into details, Michael showed the huge uptake of Kubernetes compared to Docker Swarm and Apache Mesos, up to the point where even AWS joined the CNCF.

For Thursday I went to see Rich Bowen's keynote on the Apache Way at MesosCon. It was great to hear how interested people were in the greater context of what Apache provides to the Mesos project in terms of infrastructure and mentoring. There were also quite a few questions at the Apache booth at MesosCon about what that thing called The Apache Software Foundation actually is.

Hopefully the initiative started on the Apache Community development mailing list on getting more information out on how things are managed at Apache will help spread the word even further.

Overall Open Source Summit, together with its sister events like KVM Forum and MesosCon as well as co-located events like the OpenWRT summit, was a great chance to meet up with fellow open source developers and project leads, and to learn about technologies and processes both familiar as well as new (in my case the QEMU on UEFI talk clearly was above my personal comfort zone for understanding things - here it's great to be married to a spouse who can help fill the gaps after the conference is over). There was a fairly broad spectrum of talks, from Linux kernel internals to container orchestration, OSS licensing, community management, diversity topics, compliance, and economics.

Open source summit - Day 2

2017-10-25 10:58
Day two of Open Source Summit for me started a bit slow for lack of sleep. The first talk I went to was on "Developer tools for Kubernetes" by Michelle Noorali and Matt Butcher. Essentially the two of them showed two projects (Draft and Brigade) to help ease developing apps for Kubernetes clusters. Draft here is the tool to use for developing long running, daemon-like apps. Brigade has the goal of making event driven app development easier - almost like providing shell script like composability to Kubernetes deployed pipelines.

Kubernetes in real life

In his talk on K8s in real life Ian Crosby went over five customer cases. He started out by highlighting the promise of magic from K8s: Jobs should automatically be re-scheduled to healthy nodes, traffic re-routed once a machine goes down. As a project it came out of Google as a re-implementation of their internal, 15-year-old system called Borg. Currently the governance of K8s lies with the Cloud Native Computing Foundation, part of the Linux Foundation.

So what are some of the use cases that Ian saw talking to customers:
  • "Can you help us setup a K8s cluster?" - asked by a customer with one monolithic application deployed twice a year. Clearly that is not a good fit for K8s. You will need a certain level of automation, continuous integration and continuous delivery for K8s to make any sense at all.
  • There were customers trying to get into K8s in order to be able to hire talent interested in that technology. That pretty much gets the problem the wrong way around. K8s also won't help with organisational problems where dev and ops teams aren't talking with each other.
  • The first question to ask when deploying K8s is whether to go for on-prem, hosted externally or a mix of both. One factor pulling heavily towards hosted solution is the level of time and training investment people are willing to make with K8s. Ian told the audience that he was able to migrate a complete startup to K8s within a short period of time by relying on a hosted solution resulting in a setup that requires just one ops person to maintain. In that particular instance the tech that remained on-prem were Elasticsearch and Kafka as services.
  • Another client (government related, huge security requirements) decided to go for on-prem. They had strict requirements to not connect their internal network to the public internet, resulting in people carrying downloaded software on USB sticks from one machine to the other. The obvious recommendation here is to relax the security requirements at least a little bit to ease things.
  • In a third use case the customer tried to establish a prod cluster, staging cluster, test cluster and one dev cluster per developer - pretty much turning into a maintenance nightmare. The solution was to go for a one-cluster architecture using shared resources, with namespaces to create virtual clusters, role based access control for security, network policies to restrict which services can talk to each other, and service level TLS to secure communications. Looking at CI this can even be taken one level further - spinning up clusters on the fly when they are needed for testing.
  • In another customer case Java apps were dying randomly - apparently because what was deployed was using the default settings. Lesson learnt: Learn how it works first, go to production after that.

Rebuilding trust through blockchains and open source

Having pretty much no background in blockchains - other than knowing that a thing like bitcoin exists - I decided to go to the introductory "Rebuilding trust through blockchains and open source" talk next. Marta started off by explaining how societies are built on top of trust. However today (potentially accelerated through tech) this trust in NGOs, governments and institutions is being eroded. Her solution to the problem is called Hyperledger, a trust protocol to build an enterprise grade distributed database based on a permissioned blockchain with trust built-in.

Marta went on detailing eight use cases:
  • Cross border payments: Currently, using SWIFT, these take days to complete, cost a lot of money and are complicated to do. The goal of rolling out blockchains for this would be to make reconciliation real-time and to put information on a shared ledger to make it auditable as well. At the moment ANZ, Wells Fargo, BNP Paribas and BNY Mellon are participating in this POC.
  • Healthcare records: The goal is to put pointers to medical data on a shared ledger so that procedures like blood testing are being done just once and can be trusted across institutions.
  • Interstate medical licensing: Here the goal is to make treatment reimbursement easier, probably even allowing for handing out fixed-purpose budgets.
  • Ethical seafood movement: Here the goal is to put information on supply chains for seafood on a shared ledger to make tracking easier, auditable and cheaper. The same applies to other supply chains - think diamonds, coffee etc.
  • Real estate transactions: The goal is to keep track of land title records on a shared ledger for easier tracking, auditing and access. Same could be done for certifications (e.g. of academic titles etc.)
  • Last but not least there is a POC showing how to use shared ledgers to track ownership of creative works in a distributed way and take the middleman distributing money to artists out of the loop.

Kernel developers panel discussion

For the panel discussion Jonathan Corbet invited five different Linux kernel hackers in different stages of their career, with different backgrounds to answer audience questions. The panel featured Vlastimil Babka, Arnd Bergmann, Thomas Gleixner, Narcisa Vasile, Laura Abbott.

The first question revolved around how people had gotten started with open source and kernel development, and what advice they would have for newbies. The one piece of advice shared by everyone, other than scratch your own itch and find something that interests you: Be persistent. Don't give up.

Talking about release cycles and moving too fast or too slow there was a comment on best practice to get patches into the kernel that I found very valuable: Don't get started coding right away. A lot of waste could have been prevented if people just shared their needs early on and asked questions instead of diving right into coding.

There was discussion on the meaning of long term stability. General consensus seemed to be that long term support really only includes security and stability fixes. No new features. Imagine adding current devices to a 20 year old kernel that doesn't even support USB yet.

There was a lovely quote by Narcisa on the dangers and advantages of using C as a primary coding language: With great power come great bugs.

There was discussion on using "new-fangled" tools like GitHub instead of plain e-mail. Sure, e-mail is harder to get into as a new contributor. However current maintainer processes rely heavily on it as a tool for communication. There was a joke about implementing their own tool for that, just like was done with git. One argument for using something less flexible that I found interesting: Apparently it's hard to switch between subsystems just because workflows differ so much, so agreeing on a common workflow would make that easier.
  • Asked what would happen if Linus was eaten by a shark when scuba diving, the answer was interesting: Likely at first there would be a hiding game because nobody would want to take up his workload. Then a team of maintainers would likely develop, collaborating in a consensus-based model to keep up with things.
  • In terms of testing - that depends heavily on hardware being available to test on. Things like the kernel CI community help a lot with that.

I closed the day going to Zaheda Bhorat's talk "Love what you do - everyday" on her journey in the open source world. It's a great motivation for people to start contributing to the open source community and become part of it - often changing what you do for life, in ways you would never have imagined before. Lots of love for The Apache Software Foundation in it.
Open Source Summit Prague 2017 - part 1

2017-10-23 11:18
Open Source Summit, formerly known as LinuxCon, this year took place in Prague. Drawing some 2000 attendees to the lovely Czech city, the conference focussed on all things Linux kernel, containers, community and governance.

Keynotes

The first day started with three crowded keynotes: The first one by Neha Narkhede on Apache Kafka and the Rise of the Streaming Platform. The second one by Reuben Paul (11 years old) on how hacking today really is just child's play: The hack itself might seem like toying around (getting into the protocol of children's toys in order to make them do things without using the app that was intended to control them). Taken into the bigger context of a world that is getting more and more interconnected - from regular laptops, over mobile devices, to cars and the little sensors running your home - the lack of thought that goes into security when building systems today is both startling and worrying at the same time.

The third keynote of the morning was given by Jono Bacon on what it takes to incentivise communities - be it open source communities, volunteer-run organisations or corporations. According to his perspective there are four major factors that drive human actions:

  • People strive for acceptance. This can be exploited when building communities: Acceptance is often displayed by some form of status. People are more likely to do what makes them proceed in their career, gain the next level on a leaderboard, gain some form of real or artificial title.
  • Humans are a reciprocal species. Ever heard of the phrase "a favour given - a favour taken"? People who once received a favour from you are more likely to help in the long run.
  • People form habits through repetition - but it takes time to get into a habit: You need to make sure people repeat the behaviour you want them to show for at least two months until it becomes a habit that they themselves continue to drive without your help. If you are trying to roll out peer-review-based, pull-request-based working as a new model - it will take roughly two months for people to adopt this as a habit.
  • Humans have a fairly good bullshit radar. Try to remain authentic: instead of automated thank yous, extend authentic (I would add qualified) thank you messages.


When it comes to the process of incentivising people Jono proposed a three step model: From hook to reason to reward.

Hook here means a trigger. What triggers the incentivising process? You can look at how people participate - the number of pull requests, the amount of documentation contributed, time spent giving talks at conferences. Those are all action-based triggers. What's often more valuable is to look out for validation-based triggers: Pull requests submitted, reviewed and merged. He showed an example of a public hacker leaderboard that had its evaluation system published. While that's lovely in terms of transparency, IMHO it has two drawbacks: It makes it much easier to evaluate known, wanted contributions than contributions people might not have thought of as valuable when setting up the leaderboard. With that it also heavily influences which contributions will come in, and might invite a "hack the leaderboard" kind of behaviour.
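The action- vs validation-based distinction can be made concrete by how a leaderboard scores activity: weight validated contributions above raw action counts. A hypothetical scoring sketch - the signal names and weights below are made up, not from the talk:

```python
def contribution_score(activity, weights=None):
    """Score a contributor's activity, weighting validation-based signals
    (merged PRs, completed reviews) above action-based ones like merely
    opening a PR. Signals without a weight score zero - which is exactly
    the blind spot for contributions nobody anticipated."""
    if weights is None:
        weights = {"prs_opened": 1, "reviews_done": 3, "prs_merged": 5}
    return sum(weights.get(signal, 0) * count
               for signal, count in activity.items())
```

Note how the zero-weight default encodes both drawbacks at once: unanticipated contribution types are invisible, and the published weights tell people precisely which numbers to inflate.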

When thinking about the reason there are two types of incentives: The reason could be invisible up-front - Jono called these submarine rewards. Without clear prior warning people get their reward for something that was wanted. Or the reason could be stated up front: "If you do that, then you'll get reward x". Which type to choose heavily depends on your organisation, the individual giving out the reward as well as the individual receiving the reward. The deciding factor often is which one is more authentic to your organisation.

In terms of the reward itself: There are extrinsic motivators - swag like stickers, t-shirts, give-aways. Those tend to be expensive, in particular if shipping them is needed. Something that is often overlooked in professional open source projects are intrinsic rewards: A thank you goes a long way. So does a blog post, or a social media mention. Invitations help. So do referrals to one's own network. Direct lines to key people help. Testimonials help.

Overall, measurement is key. So is focusing on incentivising shared value.

Limux - the loss of a lighthouse

In his talk, Matthias Kirschner gave an overview of Limux - the Linux rollout project of the Munich city administration: How it started, what went wrong during evaluation, and which way political forces were pulling.

What I found very interesting about the talk were the questions that Matthias raised at the very end:

  • Do we suck at desktop? Are there too many apps depending on it?
  • Did we focus too much on the cost aspect?
  • Is the community supportive enough of people trying to monetise open source?
  • Do we harm migrations by volunteering - as in single people supporting a project without a budget, burning out in the process - instead of setting up sustainable projects with a real budget? Instead of teaching the pros and cons of going for free software so people are in a good position to argue for a sustainable project budget?
  • Within administrations: Did we focus too much on the operating system instead of freeing the apps people are using on a day-to-day basis?
  • Did we focus too much on one star project instead of collecting and publicising many different free software based approaches?


As a lesson from these events, the FSFE launched the "Public Money? Public Code!" initiative to drive development of publicly funded code under free licenses.

Dude, Where's My Microservice

In his talk Dude, Where's My Microservice? Tomasz Janiszewski from Allegro gave an introduction to what projects like Marathon on Apache Mesos, Docker Swarm, Kubernetes or Nomad can do for your microservices architecture. While the examples given in the talk referred to specific technologies, they are intended to be general purpose.

Coming from a virtual machine based world where apps are tied to virtual machines, which themselves are tied to physical machines, what projects like Apache Mesos try to do is abstract that exact machine mapping away. As a first result of this decision, how to communicate between microservices becomes a lot less obvious. This is where service discovery enters the stage.

    When running in a microservice environment, one goal when routing requests to service instances is to avoid unhealthy targets. In terms of resource utilization, instead of overprovisioning the goal is to use just the right amount of your resources in order to avoid wasting money on idle machines - while still avoiding overload of individual services.

    Looking at an example of three physical hosts running three services in a redundant manner, how can assigning tasks to these instances be achieved?

    • One very simple solution is to go for a proxy based architecture. There will be a single point of change, there aren't any in-app dependencies to make this model work. You can implement fine-grained load balancing in your proxy. However this comes at the cost of having a single point of failure, one additional hop in the middle, and usually requires using a common protocol that the proxy understands.
    • Another approach would be to go for a DNS based architecture: Have one registry that holds information on where services are located, but talking to these happens directly instead of through a proxy. The advantages here: No additional hop once the name is resolved, no single point of failure - services can work with stale data, it's protocol independent. However it does come with in-app dependencies. Load balancing has to happen local to the app. You will want to cache name resolution results, but every cache needs some cache invalidation strategy.


    In both solutions you will also still need logic e.g. for de-registering services. You will also have to make sure to register your service only once it has successfully booted up.
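    To make the registration, de-registration and cache invalidation logic more concrete, here is a minimal sketch of a DNS-style registry with a TTL-caching client. The class names, the in-memory registry and the TTL value are my own illustration, not anything from the talk:

```python
import time


class ServiceRegistry:
    """Toy DNS-style registry: maps service names to instance addresses."""

    def __init__(self):
        self._instances = {}  # name -> set of "host:port" strings

    def register(self, name, address):
        # only call this once the instance has successfully booted up
        self._instances.setdefault(name, set()).add(address)

    def deregister(self, name, address):
        self._instances.get(name, set()).discard(address)

    def resolve(self, name):
        return sorted(self._instances.get(name, set()))


class CachingClient:
    """Client-side resolver with a TTL cache - every cache needs an
    invalidation strategy; here an entry simply expires after `ttl`."""

    def __init__(self, registry, ttl=30.0):
        self._registry = registry
        self._ttl = ttl
        self._cache = {}  # name -> (expiry_timestamp, addresses)

    def lookup(self, name, now=None):
        now = time.monotonic() if now is None else now
        entry = self._cache.get(name)
        if entry is None or entry[0] < now:
            addresses = self._registry.resolve(name)
            self._cache[name] = (now + self._ttl, addresses)
            return addresses
        return entry[1]  # possibly stale, but no extra hop
```

Note how within the TTL the client happily serves stale data - exactly the trade-off mentioned above: no single point of failure, at the cost of local invalidation logic.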

    Enter the service mesh architecture, e.g. based on Linkerd or Envoy. The idea here is to have what Tomek called a sidecar added to each service that talks to the service mesh controller to take care of service discovery, health checking, routing, load balancing, authn/z, metrics and tracing. The service mesh controller holds information on which services are available, available load balancing algorithms and heuristics, retries, timeouts and circuit breaking, as well as deployments. As a result the service itself no longer has to take care of load balancing, circuit breaking, retry policies, or even tracing.
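    To illustrate one of the policies such a sidecar takes off the service's shoulders, here is a minimal circuit breaker sketch - a toy illustration of the general pattern, not how Linkerd or Envoy actually implement it:

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker: after `max_failures` consecutive
    failures the circuit opens and calls fail fast until
    `reset_timeout` seconds have passed."""

    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self._max_failures = max_failures
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                raise RuntimeError("circuit open - failing fast")
            # half-open: let one probe call through
            self._opened_at = None
            self._failures = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self._max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result
```

The point of running this in a sidecar rather than in the app is exactly the lesson of the talk: the service code stays free of load balancing, retry and circuit breaking logic.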

    After that high level overview of where microservice orchestration can take you, I took a break, following a good friend to the Introduction to SoC+FPGA talk. It's great to see Linux support for these systems - even if it's not quite as stable as it would be in an ideal world.

    Trolling != Enforcement

    The afternoon for me started with a very valuable talk by Shane Coughlan on how trolling doesn't equal enforcement. The talk was related to what was published on LWN earlier this year.

    Shane started off by explaining some of the history of open source licensing: from times when it was unclear whether documents like the GPL would hold up in court, to how projects like gplviolations.org proved that these are indeed valid legal contracts that can be enforced in court. What he made clear was that those licenses are the basis for equal collaboration: They are a common set of rules that parties not knowing each other agree to adhere to. As a result, following the rules set forth in those licenses creates trust in the wider community and thus leads to more collaboration overall. On the flipside, breaking the rules erodes this very trust: It leads to less trust in the companies breaking the rules, and to less trust in open source itself if projects don't follow the rules as expected.

    When it comes to copyright enforcement though, the case of Patrick McHardy raises the question of whether all copyright enforcement is good for the wider community. In order to understand that question we need to look at the method Patrick McHardy employs: He gets in touch with companies over seemingly minor copyright infringements, asks for a cease and desist to be signed and gets a small sum of money out of his target. In a second step the process repeats, except the sum extracted increases. Unfortunately what this approach demonstrates is a viable business model that hadn't been tapped into before. So while the activities of Patrick McHardy probably aren't so bad in and of themselves, they do set a precedent that others might follow, causing way more harm.

    Clearly there is no easy way out. Suggestions include establishing common norms for enforcement and ensuring that hostile actors are clearly unwelcome.
For companies, steps that can be taken include understanding the basics of the legal requirements, understanding community norms, and having processes and tooling in place to address both. As one step in that direction, the OpenChain project publishes material on the topics of open source copyright, compliance and compliance self-certification.

    Kernel live patching

    Following Tomas Tomecek's talk on how to get from Dockerfiles to Ansible Containers I went to a talk that was given by Miroslav Benes from SuSE on Linux kernel live patching.

    The topic is interesting for a number of reasons: As early as 2008, MIT developed something called Ksplice, which uses jumps patched into functions for call redirection. The project was acquired by Oracle - and discontinued.

    In 2014 SuSE came up with something called kGraft for Linux live patching, based on immediate patching but lazy migration. At the same time Red Hat developed kpatch, based on an activeness check.

    In the case of kGraft the goal was to be able to apply limited scope fixes to the Linux kernel (e.g. for security, stability or corruption fixes), require only minimal changes to the source code, have no runtime cost impact, no interruption to applications while patching, and allow for full review of patch source code.

    The way it is implemented is fairly obvious - in hindsight: It's based on re-using the ftrace framework. kGraft uses the tracer for interception, but then asks ftrace to return to a different address, namely the start of the patched function. So far the feature is available for x86 only.

    Now while patching a single function is easy, making changes that affect multiple functions gets trickier. This creates the need for lazy migration that ensures consistency, based on a consistency model. In kGraft this is based on a per-thread flag that marks all tasks in the beginning and makes it possible to wait for all of them to be migrated.
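    As a userspace analogy of that consistency model - emphatically not the kernel mechanism itself, just an illustration of the per-thread flag idea - one could sketch the lazy migration like this:

```python
import threading


class LazyPatch:
    """Toy analogy of kGraft-style lazy migration: every thread keeps
    calling the old function until it passes a 'safe point' (in the
    kernel: crossing the kernel/userspace boundary); once all tracked
    threads have migrated, the patch is complete."""

    def __init__(self, old_fn, new_fn):
        self._old_fn = old_fn
        self._new_fn = new_fn
        self._local = threading.local()  # per-thread migration flag
        self._pending = set()
        self._lock = threading.Lock()

    def track(self, thread_name):
        # mark the task as not-yet-migrated when patching starts
        with self._lock:
            self._pending.add(thread_name)
        self._local.migrated = False

    def safe_point(self, thread_name):
        # the task reached a consistent state: flip it to the new code
        self._local.migrated = True
        with self._lock:
            self._pending.discard(thread_name)

    def __call__(self, *args):
        if getattr(self._local, "migrated", False):
            return self._new_fn(*args)
        return self._old_fn(*args)

    def fully_migrated(self):
        with self._lock:
            return not self._pending
```

The names (`track`, `safe_point`) are my own; the point is merely that old and new code run side by side until the last task has been flipped.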

    From 2014 onwards it took a year to get the ideas merged into mainline. What is available there is a mixture of both kGraft and kpatch.

    What are the limitations of the merged approach? There is no way right now to deal with data structure changes, in particular when thinking about spinlocks and mutexes. Consistency reasoning right now is done manually. Architectures other than x86 are still an open issue. Documentation and better testing are open tasks.

    Open development and inner source for fun and profit

    2017-05-26 07:17
    Last in a row of interesting talks at the Adobe Open Source Summit was one on open development/inner source and how it benefits internal projects, given by Michael Marth. Note: He knows there are subtle differences between inner source and open development, but said he would use the terms interchangeably in his talk.

    So what is inner source all about? Essentially: Use all the tools and processes that already work for open source projects, just internally. (Company-)public mailing lists, documentation, chat software, issue trackers. Reduced to its core this is very simplistic though. The more interesting aspects are waiting when looking at the people interaction patterns that emerge.

    First off: The goal of making all interaction public and easy to follow for anyone is to attract more contributors. The richest source of contributors can be tapped into if your users are tech savvy as well. Based on this assumption, inner source works best when dealing with infrastructure or platform software where downstream users are developers themselves.

    As a general rule of thumb: From 100 users, 10 will contribute, one will stick around and become a long term committer. This translates into a lot of effort for gaining additional hands for your project.

    So - assuming what you want is a wildly successful open source project: You put your code on GitHub (or whatever the code hosting site of the day is), start some marketing effort - but still no magic happens: no stars, no unicorns, maybe ten pull requests, but that's about it. What happened?

    Architecting a community around an open source project is a long term investment: Over time you'll end up training numerous newbies, help people getting started and convince some of those to contribute back.

    According to Michael, where that works best is for infrastructure projects: Where users can be turned into contributors and where projects can be turned into platform software that lasts for a decade and longer. In his opinion two factors are key: enabling distributed decision making to let others participate, and a management style that lets the community take its own decisions instead of having one entity control the project. Usually what emerges from that is a distributed, peer-to-peer networked organisational structure with distributed teams, no calls, no standups, and consensus based decision making.

    In Michael's experience what works best is to adopt an open source working model from the very start. His recommendation for projects coming from commercial entities is to go to the Apache Software Foundation: There, proven principles and rules have been identified already. In addition, going there gives the project much more credibility when it comes to convincing partners that decisions are actually being made in a distributed fashion, without being controllable by one single entity. So telling a customer "We have to check this with the community first" as an answer to a feature request becomes much more credible.

    The result of this approach: projects that under his guidance gained ten times as many contributors outside of the original entity as inside of it. Michael observed partners that were much more likely to stick with the technology by means of co-owning it; partners were participating in development. The projects also made for a lovely recruiting pipeline filled with people already familiar with the technology.

    Note to self - slides for staying sane when maintaining a popular open source project

    2017-05-26 07:00
    For further reference - Simon MacDonald has a great collection of good advice on how to stay sane when running and maintaining a popular open source project. Link here: http://s.apache.org/sanity

    Some things he mentioned:

    Include a README. It should tell people what the project is about but also what the project is not about. It should have some sort of getting started guide. Potentially link to a CONTRIBUTING doc.

    Contribution guidelines should outline social rules like a code of conduct, technical instructions like how to submit a pull request, a style guide, information on how to build the project and make changes etc.

    Add a LICENSE file - any OSS license really, because without one it won't be open source in any jurisdiction. Add file headers to each file you publish.

    Decide how to handle questions vs. issues: Either both in the issue tracker, or in separate venues.

    Add an issue template that asks the user whether they already searched for the issue, and asks for expected behaviour, actual behaviour, logs, reproduction code and the version number used. A note on issues: Having many issues is a good thing - it shows your project is popular. Having many stale issues, though, is a bad thing - it shows nobody is caring for the project anymore.

    Close issues that don't follow the template. Close issues that are duplicates. Close issues that have gone inactive after you asked for user input a while ago. Repeated issues asking for seemingly obvious things: Turn those into additional documentation. Requests for easy to add functionality: Let them sit for a while to give others a chance to do the work and get involved. Same for bugs that are easy to fix.

    Overall people are more difficult than code. Expect trolls to show up. Remain empathetic, respectful but firm in your communication. Don't be afraid to say no to external requests even if they are urgent for the requester.

    Add a pull request template that asks for a description, related issue, type tag. Remember that you don't have to merge every pull request.

    Build a community: Make it easy to contribute, identify beginner bugs, document the hell out of your project, turn contributors into maintainers, thank people for their effort.

    Have tests but keep build times low.

    Add documentation, at the very least a README file, a how to contribute file, break those files into a separate website once they grow too large.

    As for releasing: Automate as much as you can. Three options: time based release schedule, release on every commit, release "when it's done".

    Async decision making

    2017-05-16 06:45
    This is the second in a series of posts on inner source/open source. Bertrand Delacretaz gave an interesting talk on how to avoid meetings by introducing an async way of making decisions.

    He started off with a little anecdote related to Paul Graham's maker's vs. manager's schedule: Bertrand's father was a carpenter. He was working in the same house that his family was living in, so joining the family for lunch was common for him. However there was one valid excuse for him to skip lunch: "I'm glueing." How's that? Glueing together a chair is a process that cannot be interrupted once started without ruining the entire chair - it's a classical maker task that can either be completed in one go, or not at all.

    Software development is pretty similar to that: People typically need several hours of focus to get anything meaningful done, in my personal experience at least two (for smaller bugs) or four (for larger changes). That's the motivation to keep forced interruptions low for development teams.

    Managers tend to be on a completely different schedule: When you are context switching and communicating all day, adding another one-hour meeting to the schedule doesn't make much of a difference.

    The implication of this observation: Adding one hour of meeting time to an engineer's schedule comes at an enormous cost if we factor the interruption into the equation. Adding that lots of meetings actually fail for various reasons (lack of preparation, participants not preparing, bad audio or video quality, missing participants, delayed start times, bad summarisation), it seems valid to ask whether there is a way to reduce the number of face to face meetings while still remaining operational.

    As communication still remains key to a functional organisation, one approach taken by open development at Adobe (as well as at the Apache Software Foundation really) is to differentiate between things that can only be discussed in person and decisions that can be taken in an asynchronous fashion. While this doesn't reduce the amount of time communicating (usually quite the contrary happens) it does allow for participants to participate pretty much on their own schedule thus reducing the number of forced interruptions.

    How does that work in practice? In Bertrand's experience decision making tends to be a four step process: Starting from an open brainstorming session, options need to be condensed, consensus established and finally a decision made.

    In terms of tooling in his experience what works best is to have one and only one shared communication medium for brainstorming. At Apache those are good old mailing lists. In addition there is a need for a structured issue tracker/ case management tool to make options and choices obvious, decisions visible and archived.

    When looking at tooling we are missing one important ingredient though: Each meeting needs extensive preparation and thorough post processing. As an example, let's take the monthly Apache board of directors' meeting: It's scheduled to last no longer than two hours. Given that each of hundreds of projects is required to report on a quarterly basis, that executive officers need to provide reports on a monthly basis, that each month at least one major decision item comes up, and that there are still day to day decisions about personnel, budget and the like to be taken: How can all that fit into two hours? The secret sauce is a text file in svn plus a web frontend to it called Whimsy.

    Directors will read through those reports ahead of the meeting. They will add comments to them (which are mailed automatically to the projects); often those comments are used by directors to communicate with each other as well. They will pre-approve reports, or mark them for discussion if there is something fishy. Some people will check projects' lists to match them up with what's being discussed in the report, some will focus on community stuff, some will focus on seeing releases mentioned. If a report gets enough pre-approvals and no "to be discussed" mark, it is not shown or touched in the real meeting.

    That way most of the discussion happens before the actual meeting, leaving time for those issues that are truly contentious. As the meeting is open for anyone in the foundation to attend, questions raised beforehand that could not be resolved in writing can be answered in the voice call fairly quickly.

    Speaking of the call: How does the actual meeting proceed? All participants dial in via good old telephone. Since everyone is on a telephone, the issue of "discussions among people in the same room are hard to follow for remote participants" doesn't occur. In addition to the telephone there's an IRC backchannel for chatter, jokes and less relevant discussion. All discussion that has to be archived is kept on the voice channel though. During the meeting the board's chair moderates through the agenda. In addition the secretary takes notes of who attended, which decisions were made and which arguments were exchanged. Those notes are shared after the meeting, approved at the following month's meeting and published thereafter. If you want to dig deeper into any project's history, there's tooling to drill down into meeting minutes per project back to the very beginning of the foundation.

    Does the above make decision making faster? Probably not. However it enables an asynchronous work mode that fits best with a group of volunteers working together in a global, distributed community where participants do not only live in different geographies and timezones but are on different schedules and priorities as well.

    Notebook - OSS office at Adobe

    2017-05-12 06:46
    tl;dr: This post summarises what I learnt at the Adobe Open Source Summit last week about which aspects to think of when running an open source office. It's mainly a mental note for myself, hopefully others will find it useful as well.

    Longer version:

    This is another post in a series of articles on "random stuff I learnt in Basel last week". When I was invited to Adobe's open source summit I had no idea what to expect from it. In retrospect I'm glad I accepted the invitation - it was a great mix of talks, and a valuable insight into how others are adopting open source processes as well as their motivation for open sourcing code and helping out upstream.

    The first in the series of talks gave an overview of Adobe's open source office. Currently the expertise collected there includes one human with a software development background, one with a marketing background and one with a program management background - showing the breadth of topics faced when interacting with open source projects.

    The self-set goal of these people is to make contributing to open source as easy as possible. That includes making people comfortable internally with the way many open source projects work. It means making (e.g. legal, branding) review processes of contributions internally as easy as possible. It means supporting teams with checking contributions for license conformance, sensitive (e.g. personal) information that shouldn't be shared publicly. It also means supporting teams that want to run their own open source projects.

    One idea that I found very intriguing was to run their team through a public GitHub repository, in a sort-of eat-your-own-dogfood kind of way.

    One of the artifacts maintained in that repository is a launch checklist that people can follow which includes reminders for creating a blog post, publishing on the website, sharing information on social media. Most likely those are aspects that are often forgotten even in established projects when it comes to topics like releasing and promoting a new software release version.

    Another aspect I found interesting was the creation of an OSS advisory board - a group of people interested in open development and open source: a wider group tasked with figuring out how to best approach open development, and how to get the organisation to move away from a private-by-default strategy towards an open-by-default way of collaborating across team boundaries.

    Many of these topics fit well within what was shared earlier this year in the Linux foundation webcast on how to go about creating an open source office in organisations.

    Agile, inner source, open development, open source development

    2017-05-07 15:53
    Last week I had the honour of giving a keynote at Adobe's Open Source Summit EU in Basel. Among many interesting talks they hosted a panel to answer questions around all things open source, strategy, inner source and open development. One of the questions I found interesting is how inner source/open development and agile are related.

    To get everyone on the same page, what do I mean when talking about inner source/open development? I first encountered the concept at ApacheCon EU Amsterdam in 2008: Bertrand Delacretaz, then Day Software, now Adobe, was talking about how to apply open source development principles in a corporate environment. A while ago I ran across http://innersourcecommons.org, which essentially wants to achieve the same thing. What I find interesting about it is the idea of applying values like transparency and openness to contributions, as well as concepts like asynchronous communication and decision making, in a wider context.

    What does that have to do with agile? Looking at the mere mechanics of agile - being co-located, talking face to face frequently, syncing daily in in-person meetings - this seems to be pretty much the opposite of what open source teams do.

    Starting with the values and promises of agile though, I think the two can complement each other quite nicely:

    "Individuals and interactions over processes and tools" - there's a saying in many successful popular projects that the most important ingredient to a successful open source project is to integrate people into a team working towards a shared vision. Apache calls this Community over code, ZeroMQ's C4 calls it "People before code", I'm sure there are many other incarnations of the same concept.

    Much like in agile, this doesn't mean that tools are neglected altogether. Git was created because there was nothing quite like it before. However, at the core of its design was the desire for ideal support for the way the Linux community works together.

    "Working software over comprehensive documentation" - While people strive for good documentation, I'd be so bold as to suggest that for the open source projects I'm familiar with, the most pressing motivation to provide good documentation is to lower the number of user questions - and thus the number of questions people involved with the project have to answer.

    For projects like Apache Lucene or Apache httpd I find the documentation that comes with the project to be exceptionally good. In addition, both projects are even supported with dead tree documentation. My guess would be that working in geographically distributed teams whose members work on differing schedules is one trigger for that: Teams that are distributed over multiple time zones mean that most communication happens in writing anyway. So instead of re-typing the same answers over and over, it's less cost intensive to type them up in a clean way and post them to a place that has a stable URL and is easy to discover. So yes, while working software clearly is still the focus, the lack of frequent face to face communication can have a positive influence on the availability of documentation.

    "Customer collaboration over contract negotiation" - this is something open source teams take to the extreme: Anybody is invited to participate. Answers to feature requests like "patches welcome" typically are honest requests for support and collaboration, usually if taken up resulting in productive work relationships.

    "Responding to change over following a plan" - in my opinion this is another value that's taken to the extreme in open source projects. Without control over your participants, with no ability to order people to do certain tasks and no influence over when and how much time someone else can dedicate to a certain task, reacting to change is core to each open source project consisting of more than one maintaining entity (be it a single human being or simply one company paying for all currently active developers).

    One could look further, digging deeper into how the mechanics of specific agile frameworks like Scrum match against what's being done in at least some open source projects (clearly I cannot speak for each and every project out there given that I only know a limited set of them), but that's a topic for another post.

    The idea of applying open source development practices within corporations has been hanging in the air for over 15 years, driven by different individuals over time. Now that people are gathering to collect these ideas in one place and roll them out in more corporations, I think discussions around the topic are going to become interesting: Each open source community has its own specifics of working together, even though from very high above all seem to follow pretty much the same style. I'm curious to see what common values and patterns that work across organisations can be derived from that.

    Note to self: Backup bottlenecks

    2014-03-23 18:26
    I learnt the following relations the hard way ten years ago when trying to back up a rather tiny amount of data, and went through the computation again three years ago. Still, I had to re-do the computation this morning when trying to pull a final full backup from my old MacBook. Posting here for future reference. Note 1: Some numbers like 10BASE-T are included only for historic reference. Note 2: I excluded the Alice DSL uplink speed - if included, the left-hand chart would no longer be particularly helpful or readable...
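    The underlying relation is simple enough to script instead of re-deriving it every few years - a quick sketch (the 80% efficiency factor and the 250 GB example size are my own assumptions, not numbers from the original chart):

```python
def transfer_hours(gigabytes, link_mbit_per_s, efficiency=0.8):
    """Hours needed to push `gigabytes` (decimal GB) through a link of
    `link_mbit_per_s`, assuming `efficiency` usable throughput to
    account for protocol overhead (a rough rule of thumb)."""
    bits = gigabytes * 8 * 1000 ** 3
    seconds = bits / (link_mbit_per_s * 1_000_000 * efficiency)
    return seconds / 3600


# a few historic and current link speeds, in MBit/s
links = {"10BASE-T": 10, "100BASE-T": 100, "Gigabit Ethernet": 1000}
for name, speed in links.items():
    print(f"{name:>16}: {transfer_hours(250, speed):6.1f} h for 250 GB")
```

The moral hasn't changed in a decade: over anything slower than Fast Ethernet, a full backup is an overnight job at best.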

    How hard can it be - organising a conference

    2014-03-15 19:37
    Set up a CfP, select a few talks, publish a schedule, book a venue, sell a few tickets - have fun: Essentially all it takes to organise a conference, isn't it?

    In theory maybe - in practice, not so much. Without wanting to scare you away from running your own, here's my experience with setting up Berlin Buzzwords (after two years of running the Berlin Hadoop Get Together and putting up a NoSQL half day meetup).

    Disclaimer: Though there's a ton of long-ish posts in my blog, this one is going to be exceptionally long. For years I've wanted to write all this up for others to learn from, and in order to be able to point others to it - but I never got around to it.

    Venue booking

    Let's start with the most risky part: Booking a venue. If you are a student - great - talk to your university to get your event hosted. As much hassle as this may entail when it comes to bureaucracy, the venue is by far the largest monetary risk factor in your calculation, so if there is any way to get rid of this factor, use it. FOSDEM, FrOSCon, Linux Tage Chemnitz and several other events have handled this really well.

    If this is not an option for you depending on the number of targeted attendees you are faced with a smaller (some 500 Euro for 100 people per day and track) or larger (some 20.000 Euros for 300 people, two days and two tracks including chairs, tables, microphones, screens, projectors, lighting, security and technicians) sum that you will have to pay up front, potentially even before your attendees purchased any tickets (here's one reason why there is such a thing as an early bird ticket: The earlier money flows in the less risk there is around payments).

    In Germany there are three ways to limit this risk: You create a so-called e.V., essentially a registered association that will cover the risk. You create a GmbH, essentially a company with limited liability, namely only up to the 25.000 Euros you had to put into the company when founding it. The third way is to find a producing event agency that assumes all or part of the risk. After talking to several people (those behind the Chaos Communication Congress, those setting up the Berlin DjangoCon back in 2010, those working on JSConf, those backing Hadoop World), for me it was logical to go for the third way: I had no idea how to handle ticket sales, taxes etc., and I was lucky enough to find an agency that would assume all risk in turn for receiving all profit made from the event.

    One final piece of advice when selecting your venue: Having a place that is flexible when it comes to planning goes a long way towards a relaxed conference experience. As much as it is desirable, you won't be able to anticipate all needs in advance. It also helps a lot to keep the event on your home turf: One of the secrets of Berlin Buzzwords is having a lot of local knowledge in the organising team and lots of contacts to local people who can help, support and provide insight:

    By now at Berlin Buzzwords we have a tradition of having an after conference event on Monday evening. The way the tradition started relied heavily on local friends: What we did was ask friends to take out up to 20 attendees to their favourite bar or restaurant close to the venue, in turn for a drastically reduced ticket price. This approach was highly appreciated by all attendees not familiar with Berlin - several even asked for a similar setup for Tuesday evening. Starting with the second edition we were able to recruit sponsors to pay for beers and food on Monday evening. Pro tip: Don't ship your attendees to the BBQ by bus - otherwise the BBQ cook will hate you.

    Something similar was done for our traditional Sunday evening barcamp: In the first two years it took place at the famous Newthinking Store - a location that used to be available for rent for regular events and free of charge for community events, and essentially a subsidiary of our producer. In the second year the barcamp moved to one of the first hacker spaces in the world - c-base. That was essentially possible only because some of the organisers had close contacts among the owners of this space.

    The same is true for the hackathons and meetups on the Wednesday after the main conference: Knowing local (potential) meetup organisers helps recruiting meetups. Knowing local startups helps recruiting space for those people organising a meetup while themselves travelling in, e.g. from the UK.

    Catering

    Another risk factor is providing catering: If catering is included in your ticket price, the caterer will probably want to know at least one week before doors open roughly how many people will attend (+/- 10 people usually is fine, +/- 100 people not really). When dealing with geeks this is particularly tricky: These guys and gals tend to buy their tickets on the Saturday/Sunday before Berlin Buzzwords starts on Monday morning. In the first two years that meant lots of sleepless nights and lots of effort spent advertising the conference. In the third year I decided that it was time for some geek education: I suggested introducing a last minute ticket that is expensive enough to motivate the majority of attendees to buy their tickets at least two weeks in advance. Guess what: Ever since, the problem of "OMG everyone is buying their ticket last minute" went away - and at least I got my sleep back.

    The second special thing about catering: As much as I would like to change it, Berlin Buzzwords has an extremely skewed gender distribution. Caterers usually only want to know how many people will attend. If you forget to tell them that all your attendees are relatively young and male, they will assume a regular gender distribution - which often leads to you running out of food before everyone is fed.

    Sponsoring

    Are there any options to acquire money apart from tickets? Sure: Convince companies to support your event as sponsors. The most obvious perk is including said company's logo on your web page. You can also sell booth space (remember to rent additional space for that at your venue). There are plenty of other creative ways to sell visibility. For package sizing I got lots of support from our producer (after all, they were responsible for budgeting). Convincing companies to give us money was mostly on Simon, Jan and myself. Designing the contracts and collecting the money again was left to the producer. Without a decent social network on our side, finding this kind of financial support would not have been possible. In retrospect I'm extremely glad the actual contract handling was not on me - guess what, there are sponsors who simply forget to pay the sum promised even though there is a signed contract...

    Speaking

    So, what makes people pay for tickets? A convincing speaker line-up, of course. In our case we had decided to go for two invitation-only keynote speakers and two days with two tracks of CfP-based talks. Keynote speaker slots are still filled by Simon and myself. In the early years CfP ratings, the schedule grid (when and how long are breaks? how many talks will fit?) and scheduling itself were on Simon, Jan and myself. As submission numbers went up we decided to share the review load with people whose judgement we trust - ensuring that each submission gets three reviews. Everything after that is fairly algorithmic: See an earlier blog post for details.
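    The "fairly algorithmic" part can be sketched as a plain rank-and-cut over review scores. This is a hypothetical illustration, not the actual Buzzwords tooling - only the three-reviews-per-submission rule comes from the text; titles, scores and the data layout are invented:

```python
from statistics import mean

# Rank submissions by their average review score (three reviews each)
# and fill the open slots from the top. All data is illustrative.
def select_talks(submissions, open_slots):
    ranked = sorted(submissions, key=lambda s: mean(s["reviews"]), reverse=True)
    return [s["title"] for s in ranked[:open_slots]]

submissions = [
    {"title": "Scaling Search", "reviews": [5, 4, 5]},
    {"title": "Intro to NoSQL", "reviews": [3, 3, 4]},
    {"title": "Stream Processing", "reviews": [4, 5, 4]},
]
print(select_talks(submissions, 2))  # the two highest-rated talks make the cut
```

    In practice the cut is of course softened by hand - balancing tracks and topics - but the ranking itself needs no discussion once the reviews are in.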

    A note on review feedback: As much as we would like to give detailed feedback to those who didn't make it: We usually get three times as many submissions as there are open slots. So far I haven't seen a single submission (except for very clear product pitches lacking technical details) that wasn't great. So in the end, most feedback would boil down to "It was really close, but we only have a limited number of slots to fill. Sorry."

    Back in 2011 we tried an experiment to fit more talks into the schedule than usual: Submissions that were meant as long talks but ranked low were accepted as short versions, forcing speakers to focus. Unfortunately this experiment failed: It seems people would rather get a rejection mail than be accepted for a short version. You also need really good speakers who prepare exceptionally well for the event - in our case many shortened talks would still include all the introductory slides even though the speaker just one slot earlier had already covered those.

    Marketing

    The next step is to tell people that there is going to be an event. In the case of Berlin Buzzwords this meant telling people all over the world and convincing them to buy not only a conference ticket, but also a hotel room and a flight to the venue (we started with only half of all attendees coming from Germany and have pretty much kept this profile until today). As a first step this meant writing a press release and convincing local tech publishers (t3n, heise, Golem, Software und Support, Open Source Press and many more) to actually cover the event. For some of these I am an author myself, so I knew which people to talk to; for some of these newthinking as the producing event agency could help. It was only years later, in a media training at Apache Con by Sally Khudairi, that I really learnt how to do press releases.

    From that day on, a lot of time went into convincing potential speakers to submit talks, potential sponsors to donate money and potential attendees to make it to the event. With a lot of event organising experience under their belt (they run a multi-thousand attendee new media conference called re:publica each year - in addition to many "smaller" events), newthinking told me upfront that they would need half of my time to cover those marketing activities, essentially because I was the only one who knew the community. The offer was to reimburse 20h per week from the conference budget. I was lucky enough, though, to convince my manager at neofonie (the company I was working for back then) that sponsoring the event with my time in return for a silver sponsorship would be a great opportunity. One of the reasons for doing that on their side was that they were themselves providing services in the big data and search space. Without this arrangement Berlin Buzzwords 2010 would have been a whole lot harder to pull off.

    Now how do you reach people for a tech conference? You talk to the press, and you create and promote a Twitter account. I still have the credentials for @berlinbuzzwords - by now though it is fully managed by our social media expert; until 2011 I was the face behind it. Ever since, my Twitter client has been set to follow anything that contains bbuzz, berlinbuzzwords or "berlin buzzwords" - so I can still follow any conversation relating to the conference. You create and maintain LinkedIn, Xing and Facebook groups for people to follow. You use whatever channels your target audience is reading - in our case several Apache mailing lists, a NoSQL mailing list and sourceforge-hosted lists. Remember to sign up for these before posting, otherwise your mail won't get through. Also make sure that people at least roughly know your name and trust you, and that you provide enough context in your mail for others not to view your invitation as plain spam.

    Finally you go through your personal address book and personally talk to people you know would be interested. As a result you will wake up to 20 new mails each morning and answer another 20 new mails every evening. For me this got better after I had shaped all contacts into a format that I could hand over to newthinking. However, even today every single mail you send to [target]@berlinbuzzwords.de, every comment you submit through the contact form and every talk wish you submit through our wishlist still ends up in my personal inbox - as well as in the inboxes of Simon and everyone involved with the conference at newthinking.

    Essentially you need this kind of visibility once for announcing the event, once for filling the CfP with proposals, once when the schedule is published, once to convince sponsors to support the event and finally once to convince people to buy tickets.

    As for the website - the most frustrating part is being a technical person but lacking the time to "just do things right". Drupal, I'm sorry, but I have learnt to hate your blogging functionality - in particular the WYSIWYG editor. Your administrative backend could be much simpler (I gave those rights away, I believe, in 2012). I learnt to hate comment spam more than anything else - in particular given that I would have loved to get everyone involved and able to contribute content. The only thing that helped me accept the deficiencies here was forcing myself to hand off any and all content handling to the capable event managers at newthinking.

    Misc

    Videos: A great way to get the word out (and to follow the talks yourself - even if organisers find time to go to talks, they'll never remember the actual content due to information overload). Make them free - people pay for tickets to take part in the live event. If you fear selling fewer tickets as a result, make them available only a little while after the event is over.

    Pictures: Get someone with a good camera - this can be a volunteer, a professional or anything in between. It has happened many times that I went over the pictures again half a year later and suddenly realised how many nice people attended Buzzwords without me knowing them at the time - except they remembered my face when we met again (sorry if you are one of those people I should have remembered :( )

    Inbox: Yours will never be empty again. Especially if, as in our case, your mail address was used as the primary point of contact and reference in the first few years. It took two editions to train people to use info@berlinbuzzwords.de instead of my personal mail address. Trust me - this mail address actually does get attention: Mails sent there end up in my private inbox, they end up in Simon's private inbox and, most importantly, they end up in the inboxes of those event managers involved with Buzzwords at newthinking. Answers typically don't take much longer than half a day, even during holidays. Today there is no reason left to contact either Simon or myself privately - using info@ is way faster. If you still aren't convinced: Even I'm using that same address for general inquiries and proposals.

    Incentives: Think early about which behaviour you would like to see, and on site behave accordingly. Pre-conference, set incentives: Two weeks before doors open our ticket prices go up drastically to motivate people to buy tickets before our catering deadline bites us. Speakers get travel support and their hotel room paid for. However, it costs us nothing to list their employer as a travel sponsor should they decide to pay for the speaker - and it provides an incentive for the speaker to get their employer to pay for travel costs.

    Ticket prices: You will get people arguing that ticket prices are too high. Know where you stand in comparison to other conferences of the same type. Clearly you want students if the main focus of your sponsors is recruiting - provide drastically reduced student tickets to anyone with a student id, including PhD students. For everyone else: Buying early should be rewarded. You'll also need help on site (people moderating sessions, keeping full rooms closed, people helping with video taping, networking etc.) - hand out free tickets to helpers. If those complaining aren't willing to help, their need for a cheap ticket probably isn't that large.

    The "they are stealing our attendees" syndrome: Unless there is a clear trademark infringement there's no way to stop other people from running events that at first sight look similar to yours. First of all, make it hard to beat your offering - not in terms of price but in terms of content and experience. After you've done that, follow the "keep your friends close but your enemies closer" principle by embracing those who you believe are running competing events. What we had in the past were events close in topic to ours but not quite overlapping. Where there was enough overlap but still enough distinction we would go out and ask for a partnership. This usually involved cross-raffling tickets. It also meant getting better visibility for our event through different channels. Usually the end result was one of two things: A) The competing event was a one-off or otherwise short-lived. B) The seemingly competing event targeted an audience that was much more different from ours than we had first believed.

    On being special

    What makes people come back? I have been told Berlin Buzzwords has a certain magic to it that makes people want to come back. I'm not sure about the magical part - however, involving people, providing space and time for networking, choosing a venue that is not a conference hotel and always at least trying to deliver the best experience possible goes a long way towards making attendees feel comfortable. As a last note: Once you have organised a meetup or conference, you will never attend other events without at least checking what others do - you will suddenly see all the little glitches that otherwise slip your attention (overly full trash bins, anyone?). On the other hand, each event brings at least one story that looked horrible when it happened but turns out to be hilarious and gets told over and over again later ;)