I am thoroughly impressed by Finnish efficiency!

On the first day I signed my contract, got my badge, a laptop, a phone, and the keys to my personal office, set up my university account, and received my license to operate the coffee machine. 🙂

On the second day I registered with the police, got my foreigner ID, registered as a resident, applied for social security, got my tax card, applied for a transportation card, and opened a bank account.

Couldn’t be any faster!

On the other hand, some things don’t seem very welcoming for foreigners. For example:

  • I had to pay 50€ to register with the police, which boils down to a hefty “immigration tax” that should not be there.
  • All my social benefits (social security, health insurance, transport discount, etc.) expire the moment my contract terminates.
  • I need to wait for 3 payslips before the bank gives me access to web banking (and before I can change the PIN of my card).

Some friends even told me about circular dependencies in the set of documents you need to register (though I didn’t have any problem). Definitely an interesting mix!


The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here's an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 11,000 times in 2014. If it were a concert at Sydney Opera House, it would take about 4 sold-out performances for that many people to see it.


2012 was for sure the year “big data” went ballistic, and throughout 2013 and 2014 it became commonplace and commodified. It is so prevalent nowadays in both industry and academia that it has almost lost any meaning. But when did this trend start? Or, to be more concrete, when was the term “big data” coined?

It was for sure before 2010. One of the possible culprits is Randal E. Bryant, who also coined the term DISC (Data-Intensive Scalable Computing), which I prefer over “big data” to describe tools such as Hadoop because it’s just much more precise. However, that was only in 2008, just around the corner.

You might think that “big data” is a recent thing, at least more recent than 2000. Well, think again. This paper by Diebold shows a few references from the ’90s. In particular, footnote 9 says:

On the academic side, Tilly (1984) mentions Big Data, but his article is not about the Big Data phenomenon and demonstrates no awareness of it; rather, it is a discourse on whether statistical data analyses are of value to historians. On the non-academic side, the margin comments of a computer program posted to a newsgroup in 1987 mention a programming technique called “small code, big data.” Fascinating, but off-mark. Next, Eric Larson provides an early popular-press mention in a 1989 Washington Post article about firms that assemble and sell lists to junk-mailers. He notes in passing that “The keepers of Big Data say they do it for the consumer’s benefit.” Again fascinating, but again off-mark. (See Eric Larson, “They’re Making a List: Data Companies and the Pigeonholing of America,” Washington Post, July 27, 1989.) Finally, a 1996 PR Newswire, Inc. release mentions network technology “for CPU clustering and Big Data applications…” Still off-mark, neither reporting on the Big Data phenomenon nor demonstrating awareness of it, instead reporting exclusively on a particular technology, the so-called high-performance parallel interface.

The best guess at when the term was coined is 1998, by John Mashey (retired former Chief Scientist at SGI), who produced a slide deck entitled “Big Data and the Next Wave of InfraStress”. However, the famous 3V’s of big data came later, around 2001, introduced by Laney at Gartner.

A thoughtful piece on industrial labs and research styles. My favorite bit is the part about “empowering researchers”. All the managers dealing with researchers should read this paper by Roy Levin.

Windows On Theory

The closing of MSR-SV two months ago raised a fair bit of discussion, and I would like to contribute some of my own thoughts. Since the topic of industrial research is important, I would like the opportunity to counter some misconceptions that have spread. I would also like to share my advice with anyone that (like me) is considering an industrial research position (and anyone that already has one).


On Thursday 09/18/2014, an urgent meeting was announced for all but a few in MSR-SV. The short meeting marked the immediate closing of the lab. By the time the participants came back to their usual building, cardboard boxes were waiting for the prompt packing of personal items (to be evacuated by the end of that weekend). This harsh style of layoffs was one major cause for shock and it indeed seemed unprecedented for research labs of this sort. But I find the following…


Science is the belief in the ignorance of experts.

Richard Feynman

Watch your thoughts; they become words.
Watch your words; they become actions.
Watch your actions; they become habit.
Watch your habits; they become character.
Watch your character; it becomes your destiny.

Lao Tzu

Because when we sit down and think about a problem, when we take the time to not only understand what our feature space “is” and what it “implies” in the real-world — then we are acting like machine learning scientists. Otherwise, we [are] just a bunch of machine learning engineers, blindly performing black box learning and operating a set of R, MATLAB, and Python libraries.

The takeaway is this: machine learning isn’t a tool. It’s a methodology with a rational thought process that is entirely dependent on the problem we are trying to solve.

Get off the deep learning bandwagon and get some perspective – PyImageSearch