Strata EU - part 2

The second keynote touched upon the topic of data literacy: in an age in which growing amounts of data are being generated, being able to make sense of that data becomes a crucial skill for citizens, just like reading, writing and computing. The speaker's message was two-fold: a) People currently are not being taught how to deal with that data. Instead they are being taught that all that growing data is evil, like an enemy hiding under their bed just waiting to jump at them. b) When it comes to making the people around you literate, the common wisdom is to simplify, simplify, simplify. Her approach is a little different: don't simplify. Instead, give people the option to learn and improve. As a trivial comparison: just because her own little baby does not talk yet doesn't mean she shouldn't talk to it. Over time the little human will learn and adapt and have great fun communicating with others. Similarly, we shouldn't over-simplify but give others a chance to learn.

The last keynote gave a really nice perspective on information overload and the history of information creation. It started back in the age of clay tablets, when about 90% of writing was used for accounting only, with tablets being tagged for easier findability. It continued with the invention of paper, back then still in the form of scrolls as opposed to books: scrolls facilitated easy sequential reading but made random access hard. The obvious next step was books, which allow for random access reads. Next came the initial printing efforts in an age where books were still a scarce resource, followed by the age of the printing press with movable type, when books became ubiquitous. That introduced the need for more metadata attached to books, like title pages, TOCs and indexes, for better findability. As book production became simpler and cheaper, people soon had to think of new ways to cope with the ever growing amount of information available to them. Compared to that, the current big data revolution does not look so unfamiliar anymore: much like the printing press allowed for more and more books to become available, Hadoop allows for more and more data to be stored in clusters. As a result we will have to think about new ways to cope with the increasing amount of data at our disposal. It is time to start going beyond the mere production processes and deal with the implications for society. Each past data revolution left both winners and losers, mostly unintended by those who invented the production processes. The same will happen with today's data revolution.

After the keynotes I joined some of the nerdcore track talks on Clojure for data science and Cascalog for distributed data analysis, briefly joined the talk on data literacy for those playing with self-tracking tools, and finally joined some friends heading out for an Apache Dinner. It is always great to meet people you know in cities abroad. Thanks to the cloud of people who facilitated the event!