Unanimous: System Research Group talklet

I’m looking forward to sharing my thoughts on consensus for the edge network with the SRG today. The abstract is below.

Many projects in the SRG at the moment (HAT, UCN, contacts app, MirageOS for ARM, Jitsu, databox, signposts) are trying to give individuals a viable alternative to 3rd party centralised services and put them back in control of their personal data. However, developing applications for the hostile edge network, with its heterogeneous hosts and networks, trust issues and poorly understood middleboxes, is tricky. This is made worse by the fact that consensus algorithms are famously difficult to use, underspecified and based on decade-old assumptions about the internet. In this talklet, I will motivate Unanimous, a new consensus algorithm for the modern internet.

NB: this is a practice talk for the EuroSys doctoral workshop next Tuesday, so this 5-minute talk will simply motivate a research direction rather than present a complete solution.
EDIT (17/4): these slides are now online

MOOC for edge networking and IoT

This course, titled “Fog Networks and the Internet of Things”, may be of interest to readers.

This course teaches the fundamentals of Fog Networking, the network architecture that uses one or a collaborative multitude of end-user clients or near-user edge devices to carry out storage, communication, computation, and control in a network. It also teaches the key results in the design of the Internet of Things, including consumer and industrial applications.

Link: https://www.coursera.org/course/fog

New abstraction for edge network systems

[this post is also available as a pdf]

The end-to-end principle of the internet is a fallacy. Modern distributed systems rely on the cloud rather than deal with the complexity of the edge network. We propose to explore how to provide primitives such as consistency, integrity, accessibility and authentication in the context of edge network distributed systems.

Motivation

The internet has abandoned the end-to-end principles on which it was established [2]. With IPv4 addresses depleted and the transition to IPv6 yet to restore public identities, devices are left behind NATs and firewalls. Instead of dealing with the complexity of the edge network, users opt to use centralized cloud services, offering usability and high availability.

It’s a jungle on the edge network. By Original – Highways Agency [CC BY 2.0], via Wikimedia Commons

In this post-Snowden era, users are beginning to question this decision out of fear of censorship and mass surveillance. Furthermore, a series of highly publicized data breaches and DoS attacks have shed light on the weak guarantees provided by opaque terms of service [8], which are engineered to minimize legal responsibility. Classes of applications such as multiplayer games and video conferencing can benefit from the low latency of direct peer-to-peer connections, whilst others such as local file sync and sharing can benefit from their high bandwidth and scalability. Even in this modern world, users need the ability to establish inter-device connectivity without a full internet connection, for example to isolate the processing of personal data from the Internet of Things or to connect personal devices on the go.

In response to this demand, developers are building new applications for the edge network. They are reimplementing solutions to establishing authenticated identities, consensus and availability in the face of mobile nodes, network partitions and asymmetric channels. Without a clear stack and layers of abstraction, systems fail to provide even the most basic safety guarantees. Protocols are layered on top of each other without formal agreement on the services provided at each layer. Even after this engineering effort by developers, systems still require intricate configuration to deal with the diversity of devices, middleboxes and network environments [6] on the edge network, if they are able to work at all.

Challenges

For this discussion we make the following distinction. Data requirements are needs specified by the application; for example, a distributed file system may specify that file metadata must be strongly consistent whilst the files themselves need only be eventually consistent. In contrast, environmental requirements are the set of network environments that the application needs to operate in. For example, an application might specify that nodes may be mobile and intermittently connected, but that there will always be a cloud node which is highly reliable and publicly addressable but runs on untrusted infrastructure.
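To make the distinction concrete, here is a minimal sketch of how the two kinds of requirement from the examples above might be written down as plain data. All of the type and field names (`DataRequirement`, `NodeClass`, `rendezvous_candidates` and so on) are hypothetical and chosen for illustration only, and marking personal devices as trusted is an added assumption; this is not a proposed specification language.

```python
from dataclasses import dataclass
from enum import Enum


class Consistency(Enum):
    EVENTUAL = 1
    STRONG = 2


@dataclass(frozen=True)
class DataRequirement:
    """What the application needs for one class of data."""
    item: str
    consistency: Consistency


@dataclass(frozen=True)
class NodeClass:
    """One kind of node the application must be able to operate with."""
    name: str
    mobile: bool
    always_connected: bool
    publicly_addressable: bool
    trusted: bool


# Data requirements from the distributed file system example.
data_requirements = [
    DataRequirement("file metadata", Consistency.STRONG),
    DataRequirement("file contents", Consistency.EVENTUAL),
]

# Environmental requirements from the mobile-nodes-plus-cloud example.
# Assumption: personal devices are trusted; the text only says that
# the cloud node runs on untrusted infrastructure.
environment = [
    NodeClass("personal device", mobile=True, always_connected=False,
              publicly_addressable=False, trusted=True),
    NodeClass("cloud node", mobile=False, always_connected=True,
              publicly_addressable=True, trusted=False),
]


def rendezvous_candidates(nodes):
    """Names of node classes that are always reachable, trusted or not."""
    return [n.name for n in nodes
            if n.always_connected and n.publicly_addressable]


print(rendezvous_candidates(environment))
```

Keeping the two lists separate mirrors the distinction in the text: the first describes what the application wants from its data, the second describes where it has to run.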

Our focus on the edge network means we lose the data center assumptions that have been typical in distributed systems for decades. The environmental requirements now span:

We’re in the real world, no longer the datacenter. By Hugovanmeijeren (Own work) [GFDL or CC BY-SA 3.0], via Wikimedia Commons

  • Heterogeneous network topologies: Middleboxes plague the edge network. Network topologies are complex, devices may have asymmetric reachability, link characteristics vary widely and traffic can be treated differently depending on its class.
  • Mobile nodes: We can no longer rely on IP addresses to identify nodes. Nodes may move between networks and have multiple network interfaces. Intermittent connectivity and network partitions are common.
  • Diverse hardware: Devices vary in their CPU, power supply and memory constraints. Utilizing different networks may come at different costs.
  • New failure models: We no longer assume homogeneous trust between nodes. Different nodes are subject to different failure models and expected failure patterns.

Developers make crude assumptions about their applications’ requirements. The data requirement space is large; it includes some regions that have been proved impossible and others that may yet prove impossible.

Research Questions

The key research question is how we can provide services such as consistency, accessibility and authentication in the context of edge network distributed systems. This encompasses other questions such as:

  • Which areas of the space of data and environmental requirements are covered by existing distributed algorithms, which areas are not yet covered and which areas are provably impossible to cover?
  • How can we formally express the assumptions and guarantees of distributed algorithms and their trade-offs, data and environmental requirements such that our engine can resolve them?
  • How can we evaluate such systems given the diversity of possible environmental requirements and combinations of data requirements?
  • How can we ensure that the distributed algorithms provide the stated guarantees under the assumptions? How can we construct and reason about these algorithms such that they provide stronger guarantees than conventional systems?
  • How can we combine the above to provide a stack of protocols which fulfils the data requirements, given the environmental requirements?

Approach

We propose a new common abstraction between applications and networked devices to form personal clouds. Programmers (and ultimately users) formally specify the data and environmental requirements. These requirements span domains in fault tolerance, replication, consistency, caching, accessibility, security levels and confidentiality. From a collection of distributed algorithms, each with its own set of formally specified assumptions and guarantees, an engine will stack the protocols to meet the data requirements under the environmental requirements. From this foundation, we can build new distributed systems including new systems for personal data [4]. We are currently considering building upon a suite of existing tools in this domain including a unikernel operating system [10], TLS implementation [11], a git-style distributed data store [3] and Raft consensus implementation [5].
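As a rough illustration of what such an engine might do, the sketch below treats assumptions and guarantees as sets of labels and searches for the smallest group of protocols whose assumptions all hold in the environment and whose guarantees together cover the data requirements. This is a deliberate simplification: the protocol catalogue, the property labels and the `select_stack` function are all hypothetical, and a real engine would also need to reason about layering order and about interactions between protocols.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Protocol:
    name: str
    assumes: frozenset     # environment properties the protocol relies on
    guarantees: frozenset  # properties the protocol provides


def select_stack(protocols, environment, required):
    """Return names of the smallest usable set covering `required`, or None."""
    # A protocol is usable only if every assumption holds in the environment.
    usable = [p for p in protocols if p.assumes <= environment]
    for size in range(len(usable) + 1):
        for combo in combinations(usable, size):
            provided = frozenset().union(*(p.guarantees for p in combo))
            if required <= provided:
                return [p.name for p in combo]
    return None


# Hypothetical catalogue; the labels are illustrative, not formal definitions.
catalogue = [
    Protocol("majority consensus",
             frozenset({"crash faults only", "majority reachable"}),
             frozenset({"strong consistency"})),
    Protocol("gossip replication",
             frozenset(),
             frozenset({"eventual consistency"})),
    Protocol("authenticated channels",
             frozenset({"shared keys"}),
             frozenset({"authentication", "confidentiality"})),
]

# Intermittently connected devices: no guarantee that a majority is reachable.
environment = frozenset({"crash faults only", "shared keys"})

print(select_stack(catalogue, environment,
                   frozenset({"eventual consistency", "authentication"})))
print(select_stack(catalogue, environment,
                   frozenset({"strong consistency"})))
```

In this toy environment the first request is satisfiable and the second is not, because the only protocol offering strong consistency assumes a reachable majority. Reporting an unsatisfiable request, rather than silently weakening a guarantee, is the behaviour the sketch is meant to highlight.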

State of the Art

Sapphire [13] is a programming platform to separate application and deployment logic in cloud and mobile applications. Whilst Sapphire’s motivation is similar to ours, it covers a limited space of data and environmental requirements and doesn’t provide any guarantees to applications running on the platform.

The systems community is beginning to design distributed protocols specifically to tolerate the edge network, such as achieving consistency in an environment of heterogeneous trust [12, 7]. But quantifying the environmental requirements of such protocols requires a much richer abstraction than those currently used. Some authors [1, 9] suggest we can provide stronger guarantees for distributed protocols by changing the basic programming constructs and languages used. This is something we intend to explore further.

REFERENCES

[1] Peter Alvaro, Tyson Condie, Neil Conway, Joseph M. Hellerstein, and Russell Sears. I do declare: Consensus in a logic language. SIGOPS Oper. Syst. Rev., 43(4), January 2010.

[2] Marjory S. Blumenthal and David D. Clark. Rethinking the design of the Internet: the end-to-end arguments vs. the brave new world. ACM Transactions on Internet Technology, August 2001.

[3] Thomas Gazagnaire. Irminsule; a branch-consistent distributed library database. OCaml 2014 Workshop, 2014.

[4] Hamed Haddadi, Heidi Howard, Amir Chaudhry, Jon Crowcroft, Anil Madhavapeddy, and Richard Mortier. Personal data: Thinking inside the box. arXiv preprint arXiv:1501.04737, 2015.

[5] Heidi Howard, Malte Schwarzkopf, Anil Madhavapeddy, and Jon Crowcroft. Raft refloated: Do we have consensus? SIGOPS Oper. Syst. Rev., 49(1), January 2015.

[6] Christian Kreibich, Nicholas Weaver, Boris Nechaev, and Vern Paxson. Netalyzr: illuminating the edge network. In Proceedings of the 10th annual conference on Internet measurement, IMC ’10, pages 246–259. ACM, 2010.

[7] Jed Liu, Michael D. George, K. Vikram, Xin Qi, Lucas Waye, and Andrew C. Myers. Fabric: A platform for secure distributed computation and storage. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, SOSP ’09, 2009.

[8] Ewa Luger, Stuart Moran, and Tom Rodden. Consent for all: Revealing the hidden complexity of terms and conditions. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 2687–2696. ACM, 2013.

[9] Anil Madhavapeddy. Combining static model checking with dynamic enforcement using the statecall policy language. In Proceedings of the 11th International Conference on Formal Engineering Methods: Formal Methods and Software Engineering, pages 446–465. Springer-Verlag, 2009.

[10] Anil Madhavapeddy, Richard Mortier, Charalampos Rotsos, David Scott, Balraj Singh, Thomas Gazagnaire, Steven Smith, Steven Hand, and Jon Crowcroft. Unikernels: Library operating systems for the cloud. In Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’13, 2013.

[11] Hannes Mehnert and David Kaloper Mersinjak. Transport layer security purely in OCaml. In OCaml Workshop, 2012.

[12] Isaac C Sheff, Robbert van Renesse, and Andrew C Myers. Distributed protocols and heterogeneous trust: Technical report. arXiv preprint arXiv:1412.3136, 2014.

[13] Irene Zhang, Adriana Szekeres, Dana Van Aken, Isaac Ackerman, Steven D Gribble, Arvind Krishnamurthy, and Henry M Levy. Customizable and extensible deployment for mobile/cloud applications. In Proceedings of the 11th USENIX conference on Operating Systems Design and Implementation, pages 97–112. USENIX Association, 2014.

CFP: FOG Networking for 5G and IoT

This seems like an interesting venue for work on edge network distributed systems. Nice to see that we are not the only ones who think that the edge is an area worth researching.

 

CFP: Fog Networking for 5G and IoT workshop

>>>>> In conjunction with SECON 2015 <<<<<

=========================================

22 June 2015, SEATTLE – USA

http://secon2015.ieee-secon.org/workshop-program

Important dates

Submission deadline (hard): April 1st, 2015

Notification of acceptance: April 15th, 2015

Camera Ready: April 30th, 2015

Workshop: June 22nd, 2015

Scope:

Pushing computation, control and storage into the “cloud” has been a key trend in networking in the past decade. Over-dependence on the cloud, however, indicates that availability and fault tolerance issues in the cloud would directly impact millions of end-users. Indeed, the cloud is now “descending” to the network edge and often diffused among the client devices in both mobile and wireline networks. The cloud is becoming the “fog.”

Empowered by the latest chips, radios, and sensors, each client device today is powerful in computation, in storage, in sensing and in communication. Yet client devices are still limited in battery power, global view of the network, and mobility support. Most interestingly, the collection of many clients in a crowd presents a highly distributed, under-organized, and possibly dense network. Further, wireless networks are increasingly used locally, e.g. intra-building, intra-vehicle, and personal body-area networks; and data generated locally is increasingly consumed locally.

Fog Network presents an architecture that uses one or a collaborative multitude of end-user clients or near-user edge devices to carry out storage, communication, computation, and control in a network.

It is an architecture that will support the Internet of Things, heterogeneous 5G mobile services, and home and personal area networks. Fog Networking leverages past experience in sensor networks, P2P and MANET research, and incorporates the latest advances in devices, network systems, and data science to reshape the “balance of power” in the ecosystem of computing and networking.

As the first high-quality IEEE workshop in the emergent area of Fog Networking, this workshop’s scope includes:

–       Edge data analytics and stream mining

–       Edge resource pooling

–       Edge caching and distributed data center

–       Client-side measurement and crowd-sensing

–       Client-side control and configuration

–       Security and privacy in Fog

–       Fog applications in IoT

–       Fog applications in 5G

–       Fog applications in home and personal area networking

Accepted and presented papers will be published in the IEEE FOG Networking Proceedings by the IEEE Computer Society Conference Publishing Services and IEEE Xplore Digital Library. To be published in the IEEE FOG Networking Proceedings an author of an accepted paper is required to register for the workshop at the full (member or non-member) rate and the paper must be presented by an author of that paper at the conference unless the TPC Chair grants permission for a substitute presenter arranged in advance of the event and who is qualified both to present and answer questions. Non-refundable registration fees must be paid prior to uploading the final IEEE formatted, publication-ready version of the paper. For authors with multiple accepted papers, one full registration is valid for up to 3 papers.

Workshop Co-Chairs:

Mung Chiang

Arthur LeGrand Doty Professor of Electrical Engineering

Director of Keller Center for Innovation in Engineering Education

Princeton University

Sangtae Ha

Assistant Professor, Computer Science Department

University of Colorado at Boulder

Junshan Zhang

Professor, Electrical and Computer Engineering Department

Arizona State University

Workshop Technical Program Committee:

Bharath Balasubramanian (AT&T Labs)

Suman Banerjee (University of Wisconsin)

John Brassil (HP Labs)

Gary Chan (Hong Kong University of Science and Technology)

Tian Lan (George Washington University)

Athina Markopoulou (UC Irvine)

Rajesh Panta (AT&T Labs)

Chunming Qiao (University of Buffalo)

Moo-ryong Ra (AT&T Labs)

Tao Zhang (Cisco)