注册 登录  
 加关注
   显示下一条  |  关闭
温馨提示!由于新浪微博认证机制调整,您的新浪微博帐号绑定已过期,请重新绑定!立即重新绑定新浪微博》  |  关闭

中吴南顾惟一笑

成功法则就是那19个字

 
 
 

日志

 
 

[转]Project Cassandra: Facebook's Open Source Alternative to Google BigTable  

2012-01-03 21:28:37|  分类: dbms |  标签: |举报 |字号 订阅

  下载LOFTER 我的照片书  |

http://www.25hoursaday.com/weblog/2008/07/14/ProjectCassandraFacebooksOpenSourceAlternativeToGoogleBigTable.aspx

The Cassandra project has been described as a cross between Google's BigTable and Amazon's Dynamo storage systems. An overview of the project is available in the SIGMOD presentation on Cassandra available at SlideShare. A summary of the salient aspects of the project follows.

The problem Cassandra is aimed at solving is one that plagues social networking sites or any other service that has lots of relationships between users and their data. In such services, data often needs to be denormalized to prevent having to do lots of joins when performing queries. However this means the system needs to deal with the increased write traffic due to denormalization. At this point if you're using a relational database, you realize you're pretty much breaking every major rule of relational database design. Google tackled this problem by coming up with BigTable. Facebook has followed their lead by developing Cassandra which they admit is inspired by BigTable. 

The Cassandra data model is fairly straightforward. The entire system is a giant table with lots of rows. Each row is identified by a unique key. Each row has a column family, which can be thought of as the schema for the row. A column family can contain thousands of columns which are a tuple of {name, value, timestamp} and/or super columns which are a tuple of {name, column+} where column+ means one or more columns. This is very similar to the data model behind Google's BigTable.

As I mentioned earlier, denormalized data means you have to be able to handle a lot more writes than you would if storing data in a normalized relational database. Cassandra has several optimizations to make writes cheaper. When a write operation occurs, it doesn't immediately cause a write to the disk. Instead the record is updated in memory and the write operation is added to the commit log. Periodically the list of pending writes is processed and write operations are flushed to disk. As part of the flushing process the set of pending writes is analyzed and redundant writes eliminated. Additionally, the writes are sorted so that the disk is written to sequentially thus significantly improving seek time on the hard drive and reducing the impact of random writes to the system. How important is improving seek time when accessing data on a hard drive? It can make the difference between taking hours versus days to flush a hundred gigabytes of writes to a disk. Disk is the new tape.

Cassandra is described as "always writable" which means that a write operation always returns success even if it fails internally to the system. This is similar to the model exposed by Amazon's Dynamo which has an eventual consistency model.



[Just like ARIES does, it doesn’t do in-place update at the commit time, instead, it just write the update image in the commit log, then later write the update pages as a batch mode. ]
  评论这张
 
阅读(67)| 评论(0)
推荐 转载

历史上的今天

在LOFTER的更多文章

评论

<#--最新日志,群博日志--> <#--推荐日志--> <#--引用记录--> <#--博主推荐--> <#--随机阅读--> <#--首页推荐--> <#--历史上的今天--> <#--被推荐日志--> <#--上一篇,下一篇--> <#-- 热度 --> <#-- 网易新闻广告 --> <#--右边模块结构--> <#--评论模块结构--> <#--引用模块结构--> <#--博主发起的投票-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 

页脚

网易公司版权所有 ©1997-2017