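There has been a lot of talk about Cassandra lately.
Twitter, Digg, Facebook, etc. all use it.
When does it make sense to:
use Cassandra, not use Cassandra, and use an RDBMS instead of Cassandra?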
Current answer
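Here I will focus on a few important aspects that can help you decide whether you really need Cassandra. The list is not exhaustive; these are just the points at the top of my mind.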
Don't make Cassandra your first choice when you have strict requirements on relationships across your dataset.
By default Cassandra is an AP system (in CAP terms), but it supports tunable consistency, which means it can be configured to behave as a CP system as well. So don't rule it out just because you read somewhere that it is AP and you are looking for a CP system. Cassandra is more accurately described as "tunably consistent": it lets you choose the level of consistency you require, balanced against the level of availability.
Don't use Cassandra if your scale is small or if a non-distributed database would do.
Think harder if your team believes that a distributed database like Cassandra will solve all of your problems. Getting started is simple because these databases ship with many defaults, but tuning and mastering one for a specific problem takes a good amount (if not a lot) of engineering effort.
Cassandra is column-oriented, but each row also has a unique key, so it can help to think of it as an indexed, row-oriented store. You can even use it as a document store.
Cassandra doesn't force you to define the fields beforehand, so if you are in startup mode or your features are still evolving (as in agile), Cassandra accommodates that. It is better to think about your queries first and then about the data needed to answer them.
Cassandra is optimized for very high write throughput. If your use case is read-heavy (like a cache), Cassandra might not be an ideal choice.
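The "queries first, data second" advice can be illustrated without a cluster. The sketch below is plain Python, not Cassandra code: the query ("latest posts by a user"), the `posts_by_user` structure, and both function names are hypothetical and chosen only for illustration. The dictionary key plays the role of a partition key, and the sort order inside each list plays the role of a clustering order.

```python
# Hypothetical query the application must answer:
#   "the latest N posts by a given user, newest first".
# The structure is designed around that query: one partition per user,
# rows inside a partition kept sorted by timestamp (newest first).
posts_by_user = {}


def write_post(user_id, ts, text):
    """Store a post in the partition that the read query will hit."""
    partition = posts_by_user.setdefault(user_id, [])
    partition.append((ts, text))
    partition.sort(key=lambda row: row[0], reverse=True)


def latest_posts(user_id, limit):
    """Answer the query from a single partition: no joins, no full scans."""
    return posts_by_user.get(user_id, [])[:limit]


write_post("alice", 1, "hello")
write_post("alice", 3, "third")
write_post("bob", 2, "hi")
write_post("alice", 2, "second")
print(latest_posts("alice", 2))
```

A second query, such as "all posts containing a given tag", would get its own structure keyed by tag, with the data written twice. That duplication is the usual price of this modeling style, and it leans on cheap writes.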
Other answers
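The general idea of NoSQL is that you should use whichever data store is the best fit for your application. If you have a table of financial data, use SQL. If you have objects that would require complex or slow queries to map to a relational schema, use an object or key/value store.
Of course, just about any real-world problem you run into falls somewhere between those two extremes, and neither solution will be perfect. You need to consider the capabilities of each store and the consequences of using one over the other, which will be very specific to the problem you are trying to solve.
When evaluating distributed data systems, you have to consider the CAP theorem: you can pick two of the following three: consistency, availability, and partition tolerance.
Cassandra is an available, partition-tolerant system that supports eventual consistency. For more information, see this blog post I wrote: Visual Guide to NoSQL Systems.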
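To make the consistency versus availability trade-off concrete, here is a minimal sketch of the replica arithmetic commonly used to reason about Cassandra's consistency levels: a read is guaranteed to overlap the latest acknowledged write when the number of replicas written plus the number of replicas read exceeds the replication factor. The function names are hypothetical, and the code only does the arithmetic; it does not talk to a cluster.

```python
def quorum(replication_factor: int) -> int:
    """Size of a majority of replicas, as used by the QUORUM level."""
    return replication_factor // 2 + 1


def read_overlaps_latest_write(replication_factor: int,
                               write_replicas: int,
                               read_replicas: int) -> bool:
    """True when every read set must share at least one replica with
    every write set, i.e. W + R > RF."""
    return write_replicas + read_replicas > replication_factor


rf = 3
# QUORUM writes and QUORUM reads: 2 + 2 > 3, so reads see the latest write.
print(read_overlaps_latest_write(rf, quorum(rf), quorum(rf)))
# ONE writes and ONE reads: 1 + 1 <= 3, so a read may hit a stale replica.
print(read_overlaps_latest_write(rf, 1, 1))
```

Lower levels such as ONE keep requests succeeding when more replicas are down, at the cost of possibly stale reads; higher levels tolerate fewer failed replicas in exchange for stronger guarantees.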
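It does not support joins across tables. Secondary indexes are not supported; they have to rely on Elasticsearch/Solr, and a custom synchronization component has to be written. It is not an ACID-compliant system. Query support is limited.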
According to DataStax, Cassandra is not the best fit when you need:
1- High-end hardware devices. 2- ACID compliance; there is no rollback (bank transactions).