What is the difference between ZooKeeper's Paxos and W+R>N in Cassandra?

qzwqbdag  posted on 2022-12-09  in  Apache

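Dynamo-like databases (such as Cassandra) can achieve consistency through quorums: choose the number of replicas that are written synchronously (W) and the number of replicas that are read (R) so that W+R>N, where N is the replication factor.
What is the difference between this approach and Paxos? Does Paxos provide guarantees that the W+R>N scheme does not?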
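
For reference, the overlap behind the W+R>N condition follows from inclusion-exclusion. This is a short derivation, where Q_W and Q_R are names introduced here for illustration: the set of replicas that acknowledged a write and the set of replicas contacted by a read.

```latex
\[
|Q_W \cap Q_R| \;=\; |Q_W| + |Q_R| - |Q_W \cup Q_R| \;\ge\; W + R - N \;>\; 0
\qquad \text{whenever } W + R > N,
\]
% since |Q_W| = W, |Q_R| = R, and |Q_W \cup Q_R| \le N.
```

With W+R=N the bound drops to 0, so a read quorum and a write quorum can be disjoint. The derivation only covers writes that were acknowledged by W replicas; it says nothing about writes that are still in progress or that failed part-way.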

piv4azn7 1#

Paxos is non-trivial to implement, and expensive enough that many systems that use it also rely on hints, or use it only for tasks such as leader election. However, it does provide guaranteed consistency in the presence of failures, subject of course to the limits of its particular failure model.
The first quorum-based systems I saw assumed some sort of leader or transaction infrastructure that ensured enough consistency for the quorum mechanism to be trusted. This infrastructure might well be Paxos-based.
Looking at descriptions such as https://cloudant.com/blog/dynamo-and-couchdb-clusters/ , it would appear that Dynamo is not based on an infrastructure that guarantees consistency for its quorum system, so is it being very clever or cutting corners? According to http://muratbuffalo.blogspot.co.uk/2010/11/dynamo-amazons-highly-available-key.html : "The Dynamo system emphasizes availability to the extent of sacrificing consistency. The abstract reads 'Dynamo sacrifices consistency under certain failure scenarios'. Actually, later it becomes clear that Dynamo sacrifices consistency even in the absence of failures: Dynamo may become inconsistent in the presence of multiple concurrent write requests since the replicas may diverge due to multiple coordinators."
So it would appear that Paxos provides stronger reliability guarantees than quorums as implemented in Dynamo.
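
The divergence described in that quote (two writes to the same key going through different coordinators) is typically detected with version vectors. The following is a minimal sketch of the comparison only, not Dynamo's actual code; the coordinator names `X` and `Y` and the `compare` function are hypothetical and chosen for illustration.

```python
from typing import Dict

VersionVector = Dict[str, int]

def compare(a: VersionVector, b: VersionVector) -> str:
    """Return 'equal', 'before', 'after', or 'concurrent' for a relative to b."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"

# Two clients update the same key through different coordinators, X and Y
# (hypothetical names), both starting from the same base version.
base = {"X": 1}
via_x = {"X": 2}            # coordinator X increments its own counter
via_y = {"X": 1, "Y": 1}    # coordinator Y increments its own counter

print(compare(base, via_x))   # before
print(compare(via_x, via_y))  # concurrent
```

Neither `via_x` nor `via_y` supersedes the other, so a store in this situation has to keep both versions or pick one by some rule such as last-write-wins. A Paxos-based system avoids the situation by agreeing on a single order of writes first.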

7hiiyaii 2#

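Paxos and the W+R>N quorum try to solve slightly different problems. Paxos is usually described as a way to replicate a state machine, but in fact it is closer to a distributed log: each item written to the log gets an index, and the different servers eventually hold the same log items with the same indexes. (A replicated state machine can be built by writing the state machine's inputs to the log; each server then replays the state machine over the agreed inputs in index order.)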
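
The parenthetical above can be sketched in a few lines. This only illustrates the replay step, assuming the servers have already agreed on the log (the part Paxos provides); the command format and the `replay` function are hypothetical.

```python
from typing import Dict, List, Tuple

# One log entry: (index, command). A command here is a hypothetical
# ("set", key, value) or ("incr", key, amount) tuple.
Entry = Tuple[int, tuple]

def replay(log: List[Entry]) -> Dict[str, int]:
    """Apply log entries in index order to an initially empty key-value state."""
    state: Dict[str, int] = {}
    for _, command in sorted(log, key=lambda e: e[0]):
        op, key, arg = command
        if op == "set":
            state[key] = arg
        elif op == "incr":
            state[key] = state.get(key, 0) + arg
        else:
            raise ValueError(f"unknown op: {op}")
    return state

agreed_log = [
    (1, ("set", "x", 10)),
    (2, ("incr", "x", 5)),
    (3, ("set", "y", 1)),
]

# Servers may learn the entries in different orders, but they agree on the indexes.
server_a = replay(agreed_log)
server_b = replay(list(reversed(agreed_log)))
print(server_a == server_b, server_a)
```

Because every server applies the same commands in the same index order, a deterministic state machine ends up in the same state everywhere.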
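The W+R>N quorum solves the problem of sharing a single value among multiple servers, which the academic literature calls a "shared register". A shared register has two operations, read and write, and we expect a read to return the value written by the previous write.
So Paxos and W+R>N quorums live in different domains and have different properties (for example, Paxos maintains an ordered list of items). However, Paxos can be used to implement a shared register, and a W+R>N quorum can be used to implement a distributed log (although very inefficiently).
That said, W+R>N quorums are sometimes not implemented in a "fully robust" way, because doing so requires more than one communication round. In systems that want low latency, the W+R>N quorum implementation may therefore provide weaker properties (for example, conflicting values can coexist).
To sum up: in theory, Paxos and W+R>N can achieve the same goals. In practice that would be very inefficient, and each is better suited to something slightly different. More practically still, W+R>N is not always implemented in full, so it trades away some consistency properties for speed.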

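Update: Paxos supports a very general failure model. W+R>N quorum schemes have various implementations, many of which assume less general failures. So the difference between the two also depends on which failures are assumed to be possible.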

zc0qhyus 3#

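There is no difference. The definition of a quorum system is that any two quorums have a non-empty intersection. Simple majority quorums are one example, not the definition. See Dr. Lamport's later paper "Vertical Paxos", where he gives some other possible quorum configurations.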
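
The intersection property is easy to check mechanically for a small cluster. Below is a minimal sketch; the node names and the "star" configuration are hypothetical examples and are not taken from the Vertical Paxos paper.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List

def is_quorum_system(quorums: Iterable[FrozenSet[str]]) -> bool:
    """True if every pair of quorums shares at least one node."""
    qs = list(quorums)
    return all(a & b for a, b in combinations(qs, 2))

def majorities(nodes: List[str]) -> List[FrozenSet[str]]:
    """All subsets of size floor(n/2) + 1."""
    k = len(nodes) // 2 + 1
    return [frozenset(c) for c in combinations(nodes, k)]

nodes = ["A", "B", "C", "D"]

# Simple majorities: one valid configuration.
print(is_quorum_system(majorities(nodes)))  # True

# A non-majority configuration: every quorum contains node A.
# (Hypothetical example; valid, though A becomes a single point of failure.)
star = [frozenset({"A", "B"}), frozenset({"A", "C"}), frozenset({"A", "D"})]
print(is_quorum_system(star))  # True

# Two disjoint pairs do not form a quorum system.
print(is_quorum_system([frozenset({"A", "B"}), frozenset({"C", "D"})]))  # False
```

The star configuration uses quorums of two nodes out of four, which is not a majority, yet it still satisfies the definition.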
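The multi-decree Paxos protocol (AKA Multi-Paxos) is, in steady state, just a two-phase commit. A change of ballot number is needed only when the leader fails.
ZooKeeper's replication protocol (ZAB) and Raft are both based on Paxos. They differ in failure detection and in the transition after a leader fails.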

wqsoz72f 4#

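As noted in the other answers, in an R+W>N system a write is not atomic across nodes. This means that while a write is in progress (or when a write fails), some nodes will hold the newer value and some the older one. Take a system with N=3, R=2, and W=2, and for clarity call the three nodes A, B, and C. Consider the following scenario: a write is in progress; node A has been updated, while B and C are still receiving the updated value. A client that reads from A and B will see the newer value (resolved using version vectors or last-write-wins), while a client that reads from B and C will see the old value. Reads like these are not linearizable. Such problems do not occur with a properly linearizable system such as Paxos or Raft.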
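
The scenario can be reproduced with a toy model. This is a sketch under simplifying assumptions: each replica stores a `(version, value)` pair, a read returns the highest-versioned value among the replicas it contacts, and the `quorum_read` helper is hypothetical rather than any real client API.

```python
from itertools import combinations
from typing import Dict, Tuple

# Each replica stores (version, value); a higher version is newer.
Replica = Tuple[int, str]

def quorum_read(replicas: Dict[str, Replica], contacted: Tuple[str, ...]) -> str:
    """Return the value with the highest version among the contacted replicas."""
    return max(replicas[name] for name in contacted)[1]

# State in the middle of a write: only A has applied version 2 so far.
replicas = {"A": (2, "new"), "B": (1, "old"), "C": (1, "old")}

R = 2
for contacted in combinations(sorted(replicas), R):
    print(contacted, "->", quorum_read(replicas, contacted))
```

Reads that include A return the new value, while a read from B and C returns the old one. A client that reads from A and B and then from B and C would see the value go from new back to old, which is what linearizability rules out.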

ldioqlga 5#

Yes, Paxos provides guarantees that Dynamo-like systems and their read/write quorums do not. The difference lies in how failures are handled and in what happens during a write. After a successful write, both kinds of systems behave similarly: the data is saved and remains available for reading until it is overwritten or deleted.
The difference appears during a write and after failures. In an eventually consistent system, until W nodes have acknowledged a write, the data may have been written to some nodes and not to others, and there is no guarantee that the whole system agrees on the current value. If you read the data back at this point, some clients may get the new data and some the old data. In other words, the system is not immediately consistent, because writes aren't atomic across nodes in these systems. There are usually mechanisms to "heal" such an inconsistency, and "eventually" the system becomes consistent again (i.e. reads once again always return the same value, until something new is written). This is why these systems are often called "eventually consistent". Inconsistencies can (and will) appear, but they will always be dealt with and reconciled eventually.
With Paxos, writes can be made atomic across nodes, so inconsistencies between nodes can be avoided. The Paxos algorithm makes it possible to guarantee that non-faulty nodes never disagree on the outcome of a write, at any point in time: either the write succeeded everywhere or nowhere. There will never be any inconsistent reads (if it's correctly implemented and all the assumptions hold, of course). This comes at a cost, however. Mainly, the system may need to delay some requests, and it may be unavailable when, for example, too many nodes (or the communication between them) aren't working. This is necessary to ensure that no inconsistent replies are given.
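
The availability cost can be made concrete with a small calculation. The sketch below assumes the common majority-quorum configuration of Paxos; the function names are hypothetical.

```python
def majority(n: int) -> int:
    """Smallest number of nodes that forms a majority of n."""
    return n // 2 + 1

def can_make_progress(n: int, reachable: int) -> bool:
    """True if a majority of the n nodes can still talk to each other."""
    return reachable >= majority(n)

for n in (3, 5):
    for reachable in range(n, 0, -1):
        status = "available" if can_make_progress(n, reachable) else "blocked"
        print(f"N={n} reachable={reachable}: {status}")
```

With N=5, for example, the system keeps accepting writes with two nodes down and blocks with three down. It does not answer from a minority, since a minority might hold stale data.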
To summarize: the main difference is that Dynamo-like systems can return inconsistent results for some time during writes or after failures (but will eventually recover), whereas Paxos-based systems can guarantee that such inconsistencies never occur, at the price of sometimes being unavailable or delaying requests.
