I was watching Jeff Dean’s keynote at the ACM Symposium on Cloud Computing 2010 (SOCC), held yesterday, and found this very interesting bit of information. It is so useful that every computer scientist and engineer should learn it by heart!
| Operation | Time (nsec) |
| --- | --- |
| L1 cache reference | 0.5 |
| Branch mispredict | 5 |
| L2 cache reference | 7 |
| Mutex lock/unlock | 25 |
| Main memory reference | 100 |
| Compress 1K bytes with Zippy | 3,000 |
| Send 2K bytes over 1 Gbps network | 20,000 |
| Read 1 MB sequentially from memory | 250,000 |
| Round trip within same datacenter | 500,000 |
| Disk seek | 10,000,000 |
| Read 1 MB sequentially from disk | 20,000,000 |
| Send packet CA -> Netherlands -> CA | 150,000,000 |
These numbers give you some insight into why random reads from a disk are a really bad idea.
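To see just how bad, here is a quick back-of-the-envelope calculation in Python using the numbers from the table. The 4 KB block size is my own assumption for illustration, not a figure from the talk:

```python
# Back-of-the-envelope: reading 1 MB from disk as 256 random 4 KB
# reads versus one sequential read, using the numbers in the table.
# (The 4 KB block size is an assumption, not from the talk.)

DISK_SEEK_NS = 10_000_000        # disk seek: 10 ms
SEQ_READ_1MB_NS = 20_000_000     # read 1 MB sequentially: 20 ms

blocks = (1 << 20) // (4 << 10)  # 256 blocks of 4 KB in 1 MB
# Each random block pays a full seek; the per-block transfer time
# is tiny by comparison, so we ignore it here.
random_ns = blocks * DISK_SEEK_NS

print(f"sequential: {SEQ_READ_1MB_NS / 1e6:.0f} ms")      # ~20 ms
print(f"random:     {random_ns / 1e6:.0f} ms")            # ~2560 ms
print(f"slowdown:   {random_ns / SEQ_READ_1MB_NS:.0f}x")  # ~128x
```

Two orders of magnitude slower for exactly the same megabyte of data, purely from the access pattern.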
This piece of information complements the very nice image from Adam Jacobs and his excellent “The Pathologies of Big Data” article.

Random is BAD (and SSD is NOT going to solve the problem)
What should we learn from all this stuff?
- Do your back-of-the-envelope calculations
- Do avoid random operations
- Do benchmark your system (a minimal sketch follows this list)
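As a starting point for that last item, here is a minimal sketch of a sequential-versus-random microbenchmark. It measures memory access order rather than disk (a real disk benchmark would also have to defeat the OS page cache), and CPython's interpreter overhead mutes the effect, but it demonstrates the pattern, and all sizes are my own assumptions:

```python
# Minimal sketch: time the same set of array accesses performed in
# sequential order versus shuffled order.
from array import array
import random
import time

N = 1 << 23                      # 8M 8-byte ints (~64 MB), an arbitrary size
data = array('q', range(N))

seq_idx = list(range(N))
rand_idx = seq_idx[:]
random.shuffle(rand_idx)         # same accesses, different order

def walk(indices):
    """Sum data at the given indices, returning (elapsed seconds, sum)."""
    start = time.perf_counter()
    total = 0
    for i in indices:
        total += data[i]
    return time.perf_counter() - start, total

t_seq, _ = walk(seq_idx)
t_rand, _ = walk(rand_idx)
print(f"sequential: {t_seq:.2f} s")
print(f"random:     {t_rand:.2f} s ({t_rand / t_seq:.1f}x slower)")
```

Run it on your own machine; the absolute times matter far less than the ratio between the two walks.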