A distributed system is collection of nodes, each which there own local memory, which are able to communicate via message passing, cooperate to perform a computation. CAP theorem [1,2] argues that its not possible to achieve consistency, availability and partition tolerance. But machines will fail and there failures are tolerated (to varying extents) by replication. These replicates now need to agree on consistent worldviews, leading to the problem of consensus, originally proposed by , Consensus is when these nodes agree on a value, applications for this include:
A protocol for consensus must provide the following to be “correct”:
A correct node is a node that will eventency make progress so its hasn’t yet and will not experience any of the failures listed below:
QUESTION: Some papers  , list a 4th condition called unanimity, stating that if all nodes propose the same value, this will be the value chosen. It appears to me that this is just a case of validity, since the value chosen must have proposed by a node, so if all nodes propose the same value then there is only one possible value to be chosen, according to the validity condition. But this used in a few paper so I can’t be right, so why ?
The following failures are possible:
Assuming synchronous communication (reliable unicast with known bounded message delay and execution of nodes) and less than 2/3ds of nodes experience Byzantine failures , we can achieve consensus  but a completely asynchronous consensus protocol cannot guarantee consensus with just a single fail-stop node , the intuition behind this is that you can’t detect a fail-stopped node, as it my just be slow or its messages not yet delivered. This can be masked by trying to detect failures with waiting on unresponsive nodes, comprising liveness or timeouts/heartbeats comprising accuracy. All consensus algorithms provide consistency but for this they comprises partition tolerance and/or availability.
Henceforth, I would like to consider processes communicating over TCP/IP, making the following assumptions:
Two phase commit is depend on a single node, whereas three phase commit  can tolerant half its nodes failing but not network partitions or unbounded network delays.
My next post will consider some more complex approaches meeting the above requirements
 Brewer, E. Rowards Robust Distributed System. Symposium on Principles of Distributed Computing (PODC).(2000).
 Gilbert, Seth, and Nancy Lynch. “Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services.” ACM SIGACT News 33, no. 2 (2002): 51-59.
[3 Lamport, Leslie, Robert Shostak, and Marshall Pease. “The Byzantine generals problem.” ACM Transactions on Programming Languages and Systems (TOPLAS) 4, no. 3 (1982): 382-401.
 Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. 1985. Impossibility of distributed consensus with one faulty process. J. ACM 32, 2 (April 1985), 374-382.
 Skeen, Dale; Stonebraker, M. (May 1983). “A Formal Model of Crash Recovery in a Distributed System”. IEEE Transactions on Software Engineering: 219–228
 Pease, M., Shostak, R., Lamport, L.: Reaching agreement in the presence of faults. J. ACM 27(2) (April 1980) 228–234