Posts Tagged ‘Google’

In the future, thanks to the Grain, a chip which can be implanted on a hard drive in the brain, every single action that a person makes is recorded and may be played back. Liam, a lawyer, married with a child, suspects that his wife Fi is having a fling with the brash Jonas, whom they meet at a dinner party. After playing clips from his own ‘Grain’, his suspicions are confirmed, and he gets drunk and attacks Jonas, forcing the guilty pair to show him what is in their memory banks. In fact the affair has been going on for months and Liam attacks Jonas, demanding that he erase all memories of Fi from his brain. This, however, does not result in marital reconciliation.

The more I look at videos about Google Glass the more it reminds me of the Grain from “The Entire History of You”, Black Mirror s1e3.

I am not sure I share the enthusiasm with which Google speaks about its project. It’s definitely a technological wonder, but how will it affect personal relationships and society in general? What about split attention? Will we be able to handle the constant information flow?

Sometimes entry barriers and usage fees (in terms of effort and time in this case) are useful. See for example the tragedy of the commons. How will the ability to check Facebook every second of your waking life change it?

Read Full Post »

It looks like people are now realizing the need for powerful real-time analytics engines. Dremel was designed by Google as a scalable, interactive ad-hoc query system for the analysis of read-only nested data in cluster environments. It’s the perfect tool to power your Big Data dashboards. Now a bunch of open source clones are appearing. Will they have the same luck as Hadoop?

Here are the contenders:

  • The regular: Apache Drill
    Mainly supported by MapR Technologies and currently in the Apache Incubator. At the time of writing there are a total of 12 issues in Jira (for comparison, we just got to 3000 in Pig this week). Even the name still needs to be confirmed. As usual in the Apache style, this will most likely be a community-driven project, with all the pros and cons that come with it. The Java + Maven + Git combo should be familiar and enable contributors to get up to speed quickly.
  • The challenger: Cloudera Impala
    Just open-sourced a couple of days ago. Surprisingly, it even offers the possibility of joining tables (something Dremel didn’t do for efficiency reasons). Unfortunately it is all C++, which I hung up for good after my master’s thesis. I hope this choice won’t scare away contributors. However, I suspect that Cloudera wants to drive most of the development in-house rather than build a community project.
  • The outsider: Metamarkets Druid
    Even though it has been around for a year, it has only recently become open source. It’s interesting to read how these guys were frustrated by existing technology, so they just decided to roll their own solution. My (unsupported) feeling is that this is by far the most mature of the three clones. One interesting feature of Druid is real-time ingestion of data. From what I gather, Impala relies on Hive tables and Drill on Avro files, so my guess is that neither of them can do the same. (For the record, this one is also Java + Maven + Git.)

As a technical side note, and a personal curiosity, I wonder whether these projects would benefit from YARN integration. I guess it will be easier for Drill than for the others. However, startup latency could be an issue in this case.

The whole situation seems like a déjà vu of the Giraph/Hama/GoldenOrb clones of Pregel. Who will win the favor of the Big Data crowd?
Who will be able to grow the largest community? Technical issues are only a part of the equation in this case.

I am quite excited to see this missing piece of the Big Data ecosystem getting more attention and thrilled by the competition.

PS: I have read somewhere around the Web that this will be the end of Hadoop and MapReduce. Nothing could be further from the truth. Dremel is the perfect complement for MapReduce. Indeed, what better way to analyze the results of your MapReduce computation? Often the results are at least as big as the inputs, so you need a way to quickly generate small summaries. Hadoop counters have been (ab)used for this purpose, but more flexible and powerful post-processing capabilities are needed.
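To make the "small summaries" point concrete, here is a toy sketch in Python. It is not code from Dremel or any of the clones; the tab-separated record layout, the sample data and the `summarize` helper are all my own assumptions, picked only to show the kind of group-by aggregation you would want to run interactively over a job's output.

```python
from collections import defaultdict

# Hypothetical MapReduce output: one "key<TAB>value" record per line.
# Real outputs would be far larger and spread over many part files.
SAMPLE_OUTPUT = """\
it\t120
us\t340
it\t80
de\t45
us\t60
"""


def summarize(lines):
    """Group records by key and return {key: (record_count, value_total)}."""
    summary = defaultdict(lambda: [0, 0])
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, value = line.split("\t")
        entry = summary[key]
        entry[0] += 1           # number of records seen for this key
        entry[1] += int(value)  # running total of the values
    return {key: (count, total) for key, (count, total) in summary.items()}


if __name__ == "__main__":
    result = summarize(SAMPLE_OUTPUT.splitlines())
    for key, (count, total) in sorted(result.items()):
        print(f"{key}\t{count}\t{total}")
```

A single-machine loop like this obviously does not scale. The appeal of a Dremel-style engine is getting the same kind of answer over the full output in seconds, without writing and scheduling another batch job.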

PPS: Just to be clear, there is nothing tremendously innovative in the science behind these products. Distributed query execution engines have been around for a while in parallel database systems. What remains to be seen is whether they can deliver on their promise of extreme scalability, which parallel database systems have failed to offer.

Read Full Post »

Google 20%


Dilbert - Google 20%

Read Full Post »

Are you really aware of the price you are paying for Web commodities?

So I can only imagine the reaction in the boardrooms of those traditional firms when Facebook and Google built their Psychographic Marketing Honeypots and disguised them as a social network and a search engine. “All that data we’ve worked so hard to source! Merde! People just sit there all day giving it to them!”

The world has changed though, hasn’t it? We have entered the Matrix, but it’s not our body heat they want. They want the preference model encoded in our amygdala and a list of all the people that might influence that model tomorrow.

You can read the whole post here at O’Reilly Radar: Amygdala FarmVille.

Read Full Post »

Three executives of Google Italy have been sentenced to six months in prison for failing to prevent the distribution of a video on Google Video. The video portrayed a group of students beating up and insulting another student who has Down syndrome. The executives are being held responsible for the content of the video, and this puts Google Video on the same level as a newspaper.

Repubblica (in Italian) and New York Times

Update: on the same subject and in the same tone, see TechCrunch, TechDirt, Mashable and BBC.

Read Full Post »