Release of “ARC: Analysis of Raft Consensus”

 “ARC: Analysis of Raft Consensus” is now available online as a UCAM technical report. 
http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-857.pdf

Abstract

The Paxos algorithm, despite being synonymous with distributed consensus for a decade, is famously difficult to reason about and implement due to its non-intuitive approach and underspecification. In response, this project implemented and evaluated a framework for constructing fault-tolerant applications, utilising the recently proposed Raft algorithm for distributed consensus. Constructing a simulation framework for our implementation enabled us to evaluate the protocol on everything from understandability and efficiency to correctness and performance in diverse network environments. We propose a range of optimisations to the protocol and released to the community a testbed for developing further optimisations and investigating optimal protocol parameters for real-world deployments.

Thank you everyone for your feedback.

Seeking Feedback on “ARC: Analysis of Raft Consensus”

My undergraduate dissertation “ARC: Analysis of Raft Consensus” will be submitted as a UCAM tech report. A draft is available here and I would be very grateful of any feedback.

Title: ARC: Analysis of Raft Consensus

Abstract:
The Paxos algorithm, despite being synonymous with distributed consensus for a decade, is famously difficult to reason about and implement due to its non-intuitive approach and underspecification. In response, this project implemented and evaluated a framework for constructing fault-tolerant applications, utilising the recently proposed Raft algorithm for distributed consensus. Constructing a simulation framework for our implementation enabled us to evaluate the protocol on everything from understandability and efficiency to correctness and performance in diverse network environments. We propose a range of optimisations to the protocol and released to the community a testbed for developing further optimisations and investigating optimal protocol parameters for real-world deployments.

EDIT 1: Regarding the difference between this tech report and my dissertation. I have cut out material i didn’t believe would be of general interest, such as how i used VC or lessons learned. If you would like a copy of the original dissertation (probably because your a Part 2 student yourself), just email me and I’ll be happy to provide you a copy.

EDIT 2: I’m pretty much happy to take feedback by any format, Comment below or email me at hh360 @ cam . ac . uk

EDIT 3: A massive thankyou to everyone who has provided feedback and help to disseminate this draft (by retweeting it)

EDIT 4: The code is open source (MIT licence) and available on GitHub. I’ve not linked to as its currently undergoing a refactoring / documenting process ready for release of v0.1. My plan is split the code base into two separate libraries, one will be a event-based simulator for distributed system and the other will be a standalone Raft implementation. I’ll update this blog (& twitter) when the code is ready

EDIT 5:  Wow. The response to this draft has been much greater than I expected (300+ downloads so far). Thank you so much to everyone in the community and of course Diego Ongaro. Diego’s Raft paper is online here and the Raft consensus site is here.

 

Project Zygote (working title) @ CamJam

Tomorrow we will be demonstrating an early prototype of Zygote (only the working title) at CamJam, the Cambridge based Raspberry Jam, organised by  and . Despite being only a few weeks into the project, we are keen to join the very welcoming Raspberry Pi community in Cambridge and get feedback on our idea as early as possible so they can shape the development of project, instead of simplify being an after through.

If you want to test it out yourself, the code in on Github and the Raspberry Pi compilation instructions are in the README.md. This is a very early version and has many bugs, so be warned.

Screen Shot 2014-07-04 at 16.11.37

Dreaming of a new life on the edge network

Motivation

The internet has abandoned the end-to-end principles on which it was established. With IPv4 addresses depleted, devices are left behind NATs, with the transition to IPv6 yet to restore their public identity. Users have been left isolated by their ISPs, they are pushed to depend on opaque centralised services boosting usability and availability. However, data breaches, DDoS attacks, censorship and mass-surveillance have made individuals re-evaluate their decisions and look for alternatives, a search hindered by data lock-in and network externalities.

The infrastructure exists for building secure distributed systems over a user’s personal cloud of devices. Current approaches require intricate configuration to deal with the diversity of devices, middleboxes and network environments. Developers each try to re-implement solutions to establishing authenticated identities, distributed consensus and availability in the face of mobile nodes, pervasive network partitioning, asymmetric channels and Byzantine failures. Applications sit on top of an unstable stack, which without modification and violation, falls down in the face of everyday challenges, fails to utilise the resources available and slow at deploying new protocols. For example, without Explicit Congestion Notifications wireless traffic is unnecessarily throttled in the face of interference and without Multi-path TCP multiple NICs offer no resilience/speedup for a connection.

With trust in internet services wavering and ever more private data becoming available from the Internet of Things, we must improve on today’s opaque terms of service which minimise legal responsibility and offer few availability guarantees. Can we build a new representation for legally binding contracts between applications and their users, which provides upfront guarantees that are understandable to the user and provably enforced by the application?

State of the Art

Most of the time, devices are underutilised: CPUs idle, storage to spare and bandwidth unused. The premise that the required physical infrastructure already exists, relies in part, on people being willing to share their resources given a good incentive model. BitTorrent will reward you for sharing files with faster downloads, Bitcoin will trade your computation and storage on the blockchain for cryptocurrency and BOINC allows you to contribute to scientific research. Project Tor allows you to share your bandwidth with people around the world seeking anonymity or bypassing censorship, whilst the Public Access WiFi Service (PAWS) allows you to share bandwidth with your local community.

Giving data back to its owners allows individuals to make informed decisions about how exactly to distribute their data. Even if the owner chooses to utilise cloud storage for their data, they can still remain in control with systems like Priv.io which allow the user to provide their own cloud storage and grant 3rd party apps access via their browser. Community efforts to address the usability challenges often involve packaging a collection of P2P alternatives into a plug and play solution such as Freedom Box and arkOS

Approach

Inspired by the previous work in SLAs and financial contracts, I dream of replacing opaque terms of service with a formally defined contract in a domain specific language (DSL). This would allow it to be easily understood by users, stand up in a court of law and be dynamically enforced by the verified applications. But why would service provider choice to adopt such as scheme? Perhaps to minimise expensive legal battles with customers who argue that they didn’t give informed consent} and the poor publicity that follows. Or to difference themselves from the competition, by bowing to users pressure. Ultimately if adopted by a sufficient minority, then regulatory changes could make it the new norm.

Building a personal cloud of devices, ultimately depends on establishing and revoking layers of trust between devices. A popular technique is public key infrastructure, as used in SSL and DNSSEC, but this relies heavily on a trusted certificate authority and sensible key management. I intend to develop an alternative such as utilising a web of trust scheme such as PGP, authenticating a host’s public key by observing it from a range of network vantage points as used in Perspectives or authenticating hosts by consensus as used in Unmanaged Internet Architecture.

I dream we will put aside many of the assumptions which have dominated the discussion on distributed systems, to focus on life at the edge, to build a new federated layer for applications. One which provides consensus algorithms, so data will always be consistent no matter where it is accessed from, even if malicious agents try to gain control of the system. One which puts users first and manages their data responsibly. Unifying an individual’s collection of devices into a secure resilient personal cloud with incentive systems to stimulate fair sharing of excess resources, improving utility and fault tolerance.

Evaluating the project will begin with building applications such as social networking, content distribution or micro blogging over the personal cloud and testing there performance on typical set-ups, against that of centralised services and popular P2P alternatives. Followed by, formal verification of many of the components such as the enforcement of the term of service, as defined by the DSL and the consistency, availability and fault tolerance of distributed system. While a threat model will consider the authentication, encryption and confidentiality properties.

Graduation & Raspberry Pi Internship

As of Wednesday, I officially hold a BA in Computer Science. Now that the days of lectures and supervisions are up, I am in week 2 of 10, of an internship working on an education game for the Raspberry Pi.

Our open-world game teaches 11/12 years old python programming through a combination of guided challenges and independent exploration. Think Sonic Pi and the Sims, meets Minecraft Pi and Code Academy.

The problem with consensus

A distributed system is collection of nodes, each which there own local memory, which are able to communicate via message passing, cooperate to perform a computation. CAP theorem [1,2] argues that its not possible to achieve consistency, availability and partition tolerance. But machines will fail and there failures are tolerated (to varying extents) by replication. These replicates now need to agree on consistent worldviews, leading to the problem of consensus, originally proposed by [6], Consensus is when these nodes agree on a value, applications for this include:

  • mutable exclusion locks
  • committing a transaction to a database
  • distributed storage such as NFS
  • Implementing reliable broadcast
  • leader election

A protocol for consensus must provide the following to be “correct”:

  • Agreement: all correct nodes arrive at the same value, (the safety property)
  • Validity: the value chosen is one that was proposed by a correct node, (the non-triviality property)
  • Termination: all correct nodes eventually decide on a value, (the liveness property)

A correct node is a node that will eventency make progress so its hasn’t yet and will not experience any of the failures listed below:

QUESTION: Some papers [7] , list a 4th condition called unanimity, stating that if all nodes propose the same value, this will be the value chosen. It appears to me that this is just a case of validity, since the value chosen must have proposed by a node, so if all nodes propose the same value then there is only one possible value to be chosen, according to the validity condition. But this used in a few paper so I can’t be right, so why ? 

The following failures are possible:

  • Fail-stop – nodes may stop but will not restart
  • Fail-recover – nodes may stop and restart
  • Byzantine – nodes behavior unexpectedly, from fault or malicious

Assuming synchronous communication (reliable unicast with known bounded message delay and execution of nodes)  and less than 2/3ds of nodes experience Byzantine failures , we can achieve consensus [3] but a completely asynchronous consensus protocol cannot guarantee consensus with just a single fail-stop node [4], the intuition behind this is that you can’t detect a fail-stopped node, as it my just be slow or its messages not yet delivered. This can be masked by trying to detect failures with waiting on unresponsive nodes, comprising liveness or timeouts/heartbeats comprising accuracy. All consensus algorithms provide consistency but for this they comprises partition tolerance and/or availability.

Henceforth, I would like to consider processes communicating over TCP/IP, making the following assumptions:

  • Nodes have local state, no shared memory and local clocks (though they may be shared when processes are running on the same host)
  • If a path is available, message passing is reliable unicast with unbounded delay (thanks to TCP’s reliable delivery, data integrity and in order delivery) but network partitions are possible
  • Failure detection methods don’t need to be accurate, thus its ok to use timeouts and assume a node is dead when it is infact live
  • A node can tell which node sent a message (e.g. MACs/IPs) so a Byzantine node cannot forge a message which appears to come from an honest host

Approaches:

Two phase commit is depend on a single node, whereas three phase commit [5] can tolerant half its nodes failing but not network partitions or unbounded network delays.

My next post will consider some more complex approaches meeting the above requirements

 

[1] Brewer, E.  Rowards Robust Distributed System. Symposium on Principles of Distributed Computing (PODC).(2000).

[2] Gilbert, Seth, and Nancy Lynch. “Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services.” ACM SIGACT News 33, no. 2 (2002): 51-59.

[3 Lamport, Leslie, Robert Shostak, and Marshall Pease. "The Byzantine generals problem." ACM Transactions on Programming Languages and Systems (TOPLAS) 4, no. 3 (1982): 382-401.

[4] Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. 1985. Impossibility of distributed consensus with one faulty process. J. ACM 32, 2 (April 1985), 374-382.

[5] Skeen, Dale; Stonebraker, M. (May 1983). “A Formal Model of Crash Recovery in a Distributed System”. IEEE Transactions on Software Engineering: 219–228

[6] Pease, M., Shostak, R., Lamport, L.: Reaching agreement in the presence of faults. J. ACM 27(2) (April 1980) 228–234

OCaml Monthly Meeting – Live Blog

Today’s OCaml Labs Monthly Meeting is all about practise talks for OCaml2013 so in that spirit, I’ll practising a bit of live-blogging too.

13:53 – Today’s SRG Meeting is over and its time for some work before the OCaml Labs meeting at 4:00, see you then …

16:02 Techincal difficulties delayed the start

16:02 Intro from Anil

introducing Gabriel Scherer who is visiting us this week and going we are going to Maypole after this meeting. We had a cash prise from ASPLOS after winning the HiPEAC paper award and the money will go towards SRG wine for XMAS party. Signpost paper was accepted to FOCI and a HotNet paper on Trevi was also just accepted

OCL Website – Too much manual management at the moment, moving to an ocaml planet feed of blog posts. David has been busy hacking on OPAM2web, OPAM has 512 packages, Opam2web takes a subset of the OPAM packages and makes the metadata into a minisite, like on OPAM. Doesn’t require manual updates, like an ATOM feed.

Upcoming events – Tomorrow is the 2nd compiler hacking event, at the makespace. Anil will be talking at QCon on Mirage, Mirage 1.0 release date is October 22nd, so maybe a workshop before. We 3 talks for Ocaml2013 (Platform, OcamlOT and Ctypes) so here we go …

16:09 Anil practice talk on OCaml Platform 1.0

Languages take many difference approaches to platform, but what does platform even mean? As a late mover in this field, we can learn from other languages. A platforms is NOT a group of temporarily motivated hackers to build a replacement standard library. Its hard to adopt a particular approach without a domain specific purpose, there are too many opinions, we need objective way to determine what belongs in the platform, we need a genie community that is sustainable (even if a large party leaves). A platform is a bundle of tools that interoperate, with quantitative metric to judge success, built in agility and supporting developers thought the whole development life cycle. Industrial partners have a range of needs, as each work in different domains.

Tooling – Overview of 5 areas: OPAM from OCamlPro, IDE Tools, OPAM-DOC, OCaml compiler itself and Ocaml.org.

OPAM – 1.1 released today (maybe), over 100 contributors to OPAM,  500+ packages, 1500+ unique versions, external dependency solver using CUDF

IDE Support – OCaml has many intermediate files. In OCaml 4.0 onwards, we have a binary format of an abstract syntax tree with type annotations called cmt (and cmti for interface files), we can now create external tools to query this like opam-doc. ocp-index and ocp-indent from OCamlPro, and Merlin (I thinks this is EPIC) are also now available

opam-doc – Now we have cmt files, we need unified documentation across packages, this is much harder than it sounds as it touches every part of the tool stack. Not all packages can be installed at once due to conflicts. Module inclusion is tough to code in static html. (Need to make a demo) bindoc takes the Typed AST (in cmt) and generates cmd, which include the ocamldoc comments, Opamdoc takes the cmt database for opam and output a single website with your universe of packages.

ocaml.org - Demo of ocaml.org at ocaml-redesign.github.io/pkg/index.html, feedback is welcome says amir

Now we have the tools, what metrics can we extract to see how well our tools are doing.

Portability – windows compatibility ?

Maintainer – is there a place for docs and will people response to issues/comments/emails, where can issues be submitted ?

Tests – code coverage, multi variant benchmarking in core-bench

Stability – OPAM support pining, how stable are the interfaces of libraries ?

opam tracks compiler constraint, statically analyses the build system from logs (OCamlOT)

Agility – Building a platform is EXHAUSTING. We want to ask “WANT IF” questions: what if let was monomophic? what if we removed camlp4? what is the syntax precedence changes ?

Distrusted workflow – build on git, distributing tasks between 3 actors: Author (library writers), OCamlOL workers and maintainers. As we become more stable we move from staging to stable to inclusion in the platform.

We are building a tussle, we want to launch a game in janurary and let people put standard libraries into the ring, running OCamlOT to discover the winner

No clear winner: Lwt – portability, Batteries – free of syntax extensions, core – comprehensive.

16:36  Discussion over the battle of the standard libraries and talk feedback

C: talk is a bit long, not sure what to cut..

C: OPAM was dicussed last year at OCaml2013, we want to update everyone and follow on without overlapping too much

Q: Haven’t we already decided on JS’s core ?

A: No, we use all of them, i.e. Mirage used lwt extensively

Q: What if we don’t want any of the new standard libraries ? maybe I just want to use domain specific libraries from OPAM as and when I need them

A: We are not forcing the new standard libraries on anyone, but they are useful for beginners, nice to have consistent style, interoperability and few open statements e.g. Open Core.Std

Q: What if I have already decided which standard library I want to use ?

A: Again we are not forcing standard libraries on anyone, we are just trying to force effort more directly. OCaml tools will always be standard library agnoctic

C: the diagram of OCamlOT is confustion

C: how to not overlap with david talks

16:41 Davids talk on OCamlOT

State for the open source OCaml community

Outline: what is quality software? what is the user experience? what is feedback loop for package authors? How do we represent the thing underneath this all? utopian future ?

Quality: Work on every core (ANIL: We want multi-core :P ), consistent results: work or die nicely with obvious solution, not more “What have I forgotten?” questions, it should just tell you. We need addictive actions (not sure what they are), consistency, quality functions…

Universal concerns: compiler hypothesis “what if” questions (anil already said this), build system hypotheses “what strange assumuptions is the buid system making?”, package manager hypothesis and environmner hypothesis

Workflow: Make a pull request, curator observes the proposal, predict the future, proposes amendments, feedback loop and finally agreement is reached. Core is release weekly for example, we are trying to work like linux kernal patches

New workflow: promote health of OCaml community, preaching compatibility, “observe, orient, decide and act”, Computer assisted curator will help a human, to run the loop faster, human can pose questions to the computer assisted curator e.g  “will this run on ARM ?”

Repository Observation: github binding with web hooks but we are not tied to github. We merge into the world and we need dependences from each possible users prospective of the world

Dependency Orientation: capabilities with environmental dependances, packages with constriant-based dependencies, repositories with revision dependencies and artifact dependencies. example of the android repo

Triage Decisions: taking plain text error and parsing them into categories such as unsatisfiability (can’t have these two packages), dependencies (if my dependency is broken, then I am broken), transient (network down), system, metadata, external dependences (you forgot to write a dependency), build errors and a combo of many of the above.

State Action: commit intention, build, error analysis and buid results

Internet res: The agents negotiates over REST API on HTTPS, independent metadata layers (not sure about this) ,everythings an s-exp, branch consistent store explained, like git or Irminsule

Current state: github web hooks, we are conservative so one byte changes and we rebuild everything, basic triage heuristics completed, no amendment are proposed by the system atm, we don’t commit the outcome but the evidence, simple reactions to results, a website with green and red boxes in the large table

History: we have found lots of metadata issues, many packages bugs, some tool bugs like a non relocatable compiler and ocamlbuild PATH ignorer, we currently have 30+ x84-64 30+x84-32, 8 ARMs , many Linux distros , dead Raspberry Pi, panicking *nix filesystems and lots of people have set warning as error

Future: opamfu for DAG analysis, schema migration overhead, lower overhead for administrating exotic workers contributed to OCamlOT, we need to authenticate machines using ocaml-sodium, we need more advanced automation, proposed amendments, lets have a dialogue, better website integration, benchmarking your upgrades (how much improves cost), run experiments on whole OPAM universe with differential analysis and VM-based test system, to specific the worker finely.

What I think quantity is, vision of the future, how its represented underneath and what’s next,

Discussions

C: that was 20mins, feedback to David regarding content to be cut,

17:23 Ctypes by Jeremy 

This is a update not a practice talk

An examples of puts from C, how we can write no C and link in OCaml,

NEW things in Ctypes:
prettyprinting – for C types and C values, making it much eaiser to examine values for debuygging

biarray – support for lump of C memory

More type – nullable string, complex numbers

String conversions – much faster

Memory management issues – ctypes now gives the programmer more control over lifetime of OCaml passed to C,

finaliser – which you can attach to memory

Future

stub generation – instead of dynamically binding, it will generate stub code to act to the API

capability-style memory safty – one rogue pointer in a C library, can cause hell, loading each C library in a seperate address space so i library can only kill itself, you can then even run on C library on a foreign host or on a virtual machine

static strcut/union layout – checking layout of structures and unions against the API

17:40 Amir demo of ocaml-resdesign.githuib.io/docs/opam, (its look great :))

ocaml