Archive for the ‘Technology’ Category

Against any law of physics, you can create entire worlds out of nothing!



In the future, thanks to the Grain, a chip which can be implanted on a hard drive in the brain, every single action that a person makes is recorded and may be played back. Liam, a lawyer, married with a child, suspects that his wife Fi is having a fling with the brash Jonas, whom they meet at a dinner party. After playing clips from his own ‘Grain’ his suspicions are confirmed, and he gets drunk and attacks Jonas, forcing the guilty pair to show him what is in their memory banks. In fact the affair has been going on for months and Liam attacks Jonas, demanding that he erase all memories of Fi from his brain. This, however, does not result in marital reconciliation.

The more I look at videos about Google Glass the more it reminds me of the Grain from “The Entire History of You”, Black Mirror s1e3.

I am not sure I share the enthusiasm with which Google speaks about its project. It’s definitely a technological wonder, but how will it affect personal relationships and society in general? What about split attention? Will we be able to handle the constant information flow?

Sometimes entrance barriers and usage fees (in terms of effort and time in this case) are useful. See for example the tragedy of the commons. How will the ability to check Facebook every second of your waking time change your life?


On Data Science:

The word Data tells you that I transform raw information into actionable information. The word Scientist emphasizes my commitment to making sure that the analyses my colleagues and I produce are verifiable and repeatable—as all good science should be.

Not sure I agree with the whole argument in the post, but the definition of data science is the best I have seen so far.

Melinda Thielbar

“Any field of study followed by the word “science”, so goes the old wheeze, is not really a science, including computer science, climate science, police science, and investment science.”—Ray Rivera, Forbes Magazine

I too have engaged in my fair share of hand-wringing over “data science”, how the term is used and mis-used, the high quantity of snake oil available, and some generally sloppy practices that seem to be becoming the norm in the internet’s new data-based gold rush.

However, as my mama used to say, “I can beat up on my brothers all I want, but you, sir, are not family.”

Data, harnessed for good, is going to transform our world and the way we do business. People who understand data, the mathematics of how data streams relate to each other, and how computers interact with that data, are going to be indispensable to this process. I don’t always…


WhatsApp is a nice mobile app for real time messaging that I bought some time ago (0.99€). The app is extremely simple and many people I know use it, so it quickly became part of my toolkit for social life.

Now for the bad part. I own an iPhone 3G, and the latest OS that runs on my phone is iOS 4.2. The latest version of WhatsApp dropped support for this iOS version, which is normal in today’s accelerated market. What’s not normal is that WhatsApp forced the upgrade on my phone: the app stopped working until I upgraded it. And of course, once I tried to upgrade it, I could no longer install it because of the new requirements.

End of story: I am left with a blank tile on my iPhone, I have lost the ability to connect with my friends, I have lost all my conversations, and finally I have been robbed of my 0.99€.

Of course they blame it on Apple. Here is what support replied to my inquiry:


Thanks for your message.

In order to connect to WhatsApp, or activate your number, you will need the latest version of WhatsApp Messenger – v2.8.7 – available from the App Store on your iPhone. Please note that iPod and iPad are not supported devices.

The latest version of WhatsApp for iPhone requires iOS 4.3 or later. Regretfully, Apple does not allow new app updates to be compatible with both iOS 6 and older versions of iOS, effectively ending support for the iPhone 3G. Because of Apple’s decision to stop supporting these devices, we can no longer provide new app updates for iPhone 3G users.

If you have an iPhone 3GS or newer device, please update to the latest version of iOS. Instructions for updating can be found at this Apple Support page: http://support.apple.com/kb/HT4972

WhatsApp is also supported on Android, Windows Phone, BlackBerry, and select Nokia devices. Find out more at http://www.whatsapp.com/

However, this is total BS. In any normally managed company, backwards compatibility is a priority if you don’t want to alienate the user base. It’s perfectly normal to stop supporting an old version of the app and stop providing new features. It’s totally crazy to force the upgrade on people and make the app stop working.

Thanks WhatsApp, seriously great customer care.

Update: WhatsApp has changed its mind and now allows the old version (2.8.4) to access its servers. So if you have an iPhone 3G and the old version you should be all set. If you (like me) tried to update, then you have to download WhatsApp 2.8.4.ipa from somewhere on the internet, and downgrade your version (as easy as double-clicking on the .ipa). I think I am not allowed by law to put my .ipa bundle online, however here is its md5 hash in order to avoid fakes:

Disclaimer: I am reasonably sure the .ipa I have is the correct one, but you never know. Don’t blame me if things blow up 🙂
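For anyone who wants to compare a downloaded file against a published md5 hash, here is a minimal Python sketch. It only uses the standard `hashlib` module; the function name `md5_of_file` and the file name in the usage note are placeholders of mine, not anything provided by WhatsApp or Apple.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Usage would look like `md5_of_file("WhatsApp 2.8.4.ipa") == published_hash`, where both the path and `published_hash` are whatever you have at hand. Keep in mind that a matching md5 only tells you the file is the same one that was hashed, not that the file is trustworthy.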


S4 and Storm are two distributed, scalable platforms for processing continuous unbounded streams of data.

I have been involved in the development of S4 (I designed the fault-recovery module) and I have used Storm for my latest project, so I have gained a bit of experience with both and I want to share my views on these two very similar and competing platforms.

First, some commonalities.
Both are distributed stream processing platforms, run on the JVM (S4 is pure Java while Storm is part Java part Clojure), are open source (Apache/Eclipse licenses), are inspired by MapReduce and are quite new. Both frameworks use keyed streams as their basic building block.

Now for some differences.

Programming model.

S4 implements the Actors programming paradigm. You define your program in terms of Processing Elements (PEs) and Adapters, and the framework instantiates one PE for each unique key in the stream. This means that the logic inside a PE can be very simple, very much like MapReduce.

Storm does not have an explicit programming paradigm. You define your program in terms of bolts and spouts that process partitions of streams. The number of bolts to instantiate is defined a priori and each bolt will see a partition of the stream.

To make things clearer, let’s use the classic “hello world” program from MapReduce: word count.

Let’s say we want to implement a streaming word count. In S4, we can define a word to be a key, and our PE would need to keep track of the number of instances it processes by using a single long (again, very much like MapReduce). In Storm, we need to program each bolt as if it had to process the whole stream, so we would use a data structure like a Map<String, Long> to keep track of the word counts. The distribution and parallelism are orthogonal to the program.
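To make the contrast concrete, here is a rough sketch in plain Python. It uses neither the S4 nor the Storm API: `WordCountPE`, `KeyedDispatcher` and `WordCountBolt` are made-up names, and the dispatcher is only a stand-in for what the S4 framework would do for you.

```python
from collections import defaultdict


class WordCountPE:
    """S4 style: one instance per key, so the state is a single counter."""

    def __init__(self, word):
        self.word = word
        self.count = 0

    def on_event(self):
        self.count += 1


class KeyedDispatcher:
    """Stand-in for the framework: creates one PE per unique key on demand."""

    def __init__(self):
        self.pes = {}

    def dispatch(self, word):
        pe = self.pes.get(word)
        if pe is None:
            pe = self.pes[word] = WordCountPE(word)
        pe.on_event()


class WordCountBolt:
    """Storm style: one instance sees a whole partition, so it keeps a map."""

    def __init__(self):
        self.counts = defaultdict(int)

    def execute(self, word):
        self.counts[word] += 1


words = "to be or not to be".split()

dispatcher = KeyedDispatcher()
bolt = WordCountBolt()
for w in words:
    dispatcher.dispatch(w)
    bolt.execute(w)

s4_counts = {w: pe.count for w, pe in dispatcher.pes.items()}
storm_counts = dict(bolt.counts)
```

Both variants end up with the same counts. The difference is where the per-key bookkeeping lives: in the framework (the dispatcher here) for the S4 style, and inside your own code (the map in the bolt) for the Storm style.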

In short, in S4 you program for a single key, while in Storm you program for the whole stream. Storm gives you the basic tools to build a framework, while S4 gives you a well-defined framework. To use an analogy from Java build systems, Storm is more like Ant and S4 is more like Maven.

My personal preference here goes to S4, as it makes programming much easier. Most of the time in Storm you will end up mimicking the Actors model anyway by implementing a hash-based structure on a key, as in the example above.

Data pipeline.

S4 uses a push model: events are pushed to the next PE as fast as possible. If receiver buffers get full, events are dropped, and this can happen at any stage in the pipeline (from the Adapter to any PE).

Storm uses a pull model. Each bolt pulls events from its source, be it a spout or another bolt. Event loss can thus happen only at ingestion time, in the spouts if they cannot keep up with the external event rate.

In this case my preference goes to Storm, as it makes deployment much easier: you need to tune buffer sizes to deal with peaks and event loss in only a single place, the spout. If your deployment is badly sized in terms of parallelism level, at worst you get a performance hit in terms of throughput and latency, but the algorithm will produce the same result.
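The difference between the two models can be illustrated with a toy simulation in plain Python. This is not how either platform is implemented: `run_push`, `run_pull`, the buffer size and the "one event every `drain_every` ticks" receiver are all assumptions made for the sake of the example, and the source here is an in-memory list that can be read at leisure (with a live external source, the pull model would move the loss to ingestion instead).

```python
from collections import deque


def run_push(events, buffer_size, drain_every):
    """Push model: the sender never waits; a full receiver buffer drops events."""
    buffer, processed, dropped = deque(), [], []
    for tick, event in enumerate(events, start=1):
        if len(buffer) < buffer_size:
            buffer.append(event)
        else:
            dropped.append(event)
        # The receiver is slower than the sender: it handles one event
        # every `drain_every` ticks.
        if tick % drain_every == 0:
            processed.append(buffer.popleft())
    processed.extend(buffer)  # drain what is left once the input ends
    return processed, dropped


def run_pull(events, drain_every):
    """Pull model: the receiver asks the source for the next event when ready."""
    processed, ticks = [], 0
    for event in events:
        ticks += drain_every  # a slow receiver costs time, not events
        processed.append(event)
    return processed, ticks


events = list(range(10))
pushed, dropped = run_push(events, buffer_size=2, drain_every=2)
pulled, pull_ticks = run_pull(events, drain_every=2)
```

With these numbers the push run loses events 3, 5, 7 and 9 in the middle of the pipeline, while the pull run processes all ten events and simply takes twice as many ticks.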

Fault tolerance.

S4 provides state recovery via uncoordinated checkpointing. When a node crashes, a new node takes over its task and restarts from a recent snapshot of its state. Events sent after the last checkpoint and before the recovery are lost. Indeed, events can be lost in any case due to overload, so this design makes perfect sense. State recovery is very important for long-running machine learning programs, where the state represents days’ or weeks’ worth of data.

Storm provides guaranteed delivery of events/tuples. Each tuple traverses the entire pipeline within a time interval or is declared as failed and can be replayed from the start by the spout. Spouts are responsible for keeping tuples around for replay, or can rely on external services to do so (like Apache Kafka). However, the framework provides no state recovery.

I declare a tie here. State recovery is needed for many ML applications, although guaranteed delivery makes it easier to reason about the state of applications. Having both would be ideal, but implementing both of them without performance penalties is not trivial.
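Here is a small plain-Python sketch of the two guarantees. It is only an illustration under my own assumptions: `CheckpointedCounter` and `ReplayingSource` are invented names, the checkpoint is an in-memory copy rather than a real snapshot store, and the "failure" is simulated by skipping one tuple on the first attempt.

```python
import copy


class CheckpointedCounter:
    """State recovery: snapshot the state periodically; a crash rolls back to it."""

    def __init__(self, checkpoint_every):
        self.checkpoint_every = checkpoint_every
        self.state = {}
        self.snapshot = {}
        self.seen = 0

    def process(self, key):
        self.state[key] = self.state.get(key, 0) + 1
        self.seen += 1
        if self.seen % self.checkpoint_every == 0:
            self.snapshot = copy.deepcopy(self.state)

    def crash_and_recover(self):
        # Whatever was processed after the last snapshot is lost.
        self.state = copy.deepcopy(self.snapshot)


class ReplayingSource:
    """Guaranteed delivery: keep every tuple until it is acked, replay the rest."""

    def __init__(self, events):
        self.pending = dict(enumerate(events))

    def emit_pending(self):
        return list(self.pending.items())

    def ack(self, event_id):
        del self.pending[event_id]


events = ["a", "b", "a", "c", "a"]

# Checkpointing: the fifth event arrives after the last snapshot and is lost.
counter = CheckpointedCounter(checkpoint_every=2)
for e in events:
    counter.process(e)
counter.crash_and_recover()

# Replay: tuple 3 fails on the first attempt and is delivered on the second.
source = ReplayingSource(events)
delivered = []
for attempt in range(2):
    for event_id, event in source.emit_pending():
        if attempt == 0 and event_id == 3:
            continue  # simulated failure: no ack, so the tuple stays pending
        delivered.append(event)
        source.ack(event_id)
```

After the crash the counter is back at `a: 2, b: 1, c: 1`, so the state survives but the last `a` is gone. The replaying source delivers every event, though not in the original order, and says nothing about how the receiver would rebuild its own state.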


There are many other differences, but for the sake of brevity I just present a short summary of the pros of each platform that the other one lacks.

S4 pros:

  • Clean programming model.
  • State recovery.
  • Inter-app communication.
  • Classpath isolation.
  • Tools for packaging and deployment.
  • Apache incubation.

Storm pros:

  • Pull model.
  • Guaranteed processing.
  • More mature, more traction, larger community.
  • High performance.
  • Thread programming support.
  • Advanced features (transactional topologies, Trident).

Now the hard question: “Which one should I use for my new project?”.

Unfortunately there is no easy answer; it mostly depends on your needs. I think the biggest factor to consider is whether you need guaranteed processing of events or state recovery. Also worth considering: Storm has a larger and more active user community, but the project is mainly a one-man effort, while S4 is in incubation with the ASF. This difference might be important if you are a large organization trying to decide which platform to invest in for the long term.


Great explanation of workers, executors and tasks in Storm, one of the most confusing bits in my opinion, by Michael G. Noll.

Understanding the parallelism of a Storm topology


In building there are three stages: Preparation, Production and Proving.

Preparation. The environment must be prepared before work can commence. When painting a room, we have to choose the color scheme, measure, tape up the woodwork, and buy the paint, all before we can start putting it on the wall.

Production is the steepest slope where the maximum rate of measurable work occurs—the most code is written, the most paint is applied.

Proving is the final long tail of the process. This always seems to take longer than it should partly because we invariably find out things we were not expecting—which is where the Second Order Ignorance (unknown unknowns) comes in. In painting, this is the detail work, the tricky corners and, of course, the cleanup—which also always seems to take longer than it should.

All this means that when the product is 90% complete, the activity is only about halfway through its total time (so true).

Figure: Cumulative completion when production follows a Rayleigh curve.
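The "90% done at about half the time" claim can be checked with a few lines of Python, under two assumptions of mine that are not spelled out in the text above: cumulative completion is the standard Rayleigh cumulative distribution, 1 - exp(-t² / (2σ²)), and "finished" means 99.99% complete (the curve never quite reaches 100%).

```python
import math


def cumulative_completion(t, sigma=1.0):
    """Rayleigh CDF: fraction of the work done by time t."""
    return 1.0 - math.exp(-t * t / (2.0 * sigma * sigma))


def time_to_reach(fraction, sigma=1.0):
    """Inverse of the CDF: time at which `fraction` of the work is done."""
    return sigma * math.sqrt(-2.0 * math.log(1.0 - fraction))


t_90 = time_to_reach(0.90)
t_done = time_to_reach(0.9999)  # assumption: "finished" means 99.99% complete
ratio = t_90 / t_done           # share of the total time spent reaching 90%
```

With this definition of "finished", `ratio` comes out at one half regardless of σ. The exact figure depends on the threshold: if you call the project done at 99%, the 90% mark falls at about 0.71 of the total time instead.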

Read more here (you will need access to CACM): How We Build Things
