MOOC for edge networking and IoT

This course may be of interest to readers: “Fog Networks and the Internet of Things”.

This course teaches the fundamentals of Fog Networking, the network architecture that uses one or a collaborative multitude of end-user clients or near-user edge devices to carry out storage, communication, computation, and control in a network. It also teaches the key results in the design of the Internet of Things, including consumer and industrial applications.

Link: https://www.coursera.org/course/fog

New abstraction for edge network systems

The end-to-end principle of the internet is a fallacy. Modern distributed systems rely on the cloud rather than deal with the complexity of the edge network. We propose to explore how to provide primitives such as consistency, integrity, accessibility and authentication in the context of edge network distributed systems.

Motivation

The internet has abandoned the end-to-end principles on which it was established [2]. With IPv4 addresses depleted and the transition to IPv6 yet to restore public identities, devices are left behind NATs and firewalls. Instead of dealing with the complexity of the edge network, users opt to use centralized cloud services, offering usability and high availability.

It’s a jungle on the edge network. By Original – Highways Agency [CC BY 2.0], via Wikimedia Commons
In this post-Snowden era, users are beginning to question that decision out of fear of censorship and mass surveillance. Furthermore, a series of highly publicized data breaches and DoS attacks has shed light on the weak guarantees provided by opaque terms of service [8], which are engineered to minimize legal responsibility. Classes of applications such as multiplayer games and video conferencing can benefit from the low latency of direct peer-to-peer connections, whilst others such as local file sync and sharing can benefit from their high bandwidth and scalability. Even in this modern world, users need the ability to establish inter-device connectivity without a full internet connection, for example to isolate the processing of personal data from the Internet of Things or to connect personal devices on the go.

In response to this demand, developers are building new applications for the edge network. They are reimplementing solutions to establishing authenticated identities, consensus and availability in the face of mobile nodes, network partitions and asymmetric channels. Without a clear stack and layers of abstraction, systems fail to provide even the most basic safety guarantees. Protocols are layered on top of each other without formal agreement on the services provided at each layer. Even after this engineering effort by developers, systems still require intricate configuration to deal with the diversity of devices, middleboxes and network environments [6] on the edge network, if they are able to work at all.

Challenges

For this discussion we make the following distinction. Data requirements are needs specified by the application; for example, a distributed file system may specify that file metadata must be strongly consistent whilst the files themselves need only be eventually consistent. In contrast, environmental requirements are the set of network environments that the application needs to operate in. For example, an application might specify that nodes may be mobile and intermittently connected, but that there will always be a cloud node which is highly reliable and publicly addressable yet runs on untrusted infrastructure.
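To make the distinction concrete, here is a minimal sketch of how the two examples above might be written down as data. Every name in it (`DataRequirement`, `NodeClass`, `strongly_consistent_items` and the field names) is hypothetical and chosen for illustration only; it is not a specification language we have settled on. Marking the edge device as trusted is also an assumption, since the example above only says that the cloud node is untrusted.

```python
from dataclasses import dataclass
from enum import Enum


class Consistency(Enum):
    EVENTUAL = 1
    STRONG = 2


@dataclass(frozen=True)
class DataRequirement:
    """What the application needs for one class of data (hypothetical type)."""
    item: str
    consistency: Consistency


@dataclass(frozen=True)
class NodeClass:
    """One kind of node the application must be able to run on (hypothetical type)."""
    name: str
    mobile: bool
    always_reachable: bool
    trusted: bool


# Data requirements from the distributed file system example.
data_requirements = [
    DataRequirement("file metadata", Consistency.STRONG),
    DataRequirement("file contents", Consistency.EVENTUAL),
]

# Environmental requirements from the mobile-nodes-plus-cloud example.
# trusted=True for the edge device is an assumption made for this sketch.
environment = [
    NodeClass("edge device", mobile=True, always_reachable=False, trusted=True),
    NodeClass("cloud node", mobile=False, always_reachable=True, trusted=False),
]


def strongly_consistent_items(requirements):
    """Return the names of the data items that need strong consistency."""
    return [r.item for r in requirements if r.consistency is Consistency.STRONG]


print(strongly_consistent_items(data_requirements))
```

Keeping the two kinds of requirement in separate types mirrors the distinction in the text: the first list says what the data needs, the second says where the application has to survive.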

We’re in the real world, no longer the datacenter. By Hugovanmeijeren (Own work) [GFDL or CC BY-SA 3.0], via Wikimedia Commons

Our focus on the edge network means we lose the data center assumptions that have been typical in distributed systems for decades. The environmental requirements now span:

  • Heterogeneous network topologies: Middleboxes plague the edge network. Network topologies are complex, devices may have asymmetric reachability, link characteristics vary widely, and traffic can be treated differently depending on its class.
  • Mobile nodes: We can no longer rely on IP addresses to identify nodes. Nodes may move between networks and have multiple network interfaces. Intermittent connectivity and network partitions are common.
  • Diverse hardware: Devices vary in their CPU, power supply and memory constraints. Utilizing different networks may come at different costs.
  • New failure models: We no longer assume homogeneous trust between nodes. Different nodes are subject to different failure models and expected failure patterns.
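As one illustration of the mobile nodes point, the sketch below names a node by a hash of its public key and treats network addresses as a changeable set of hints. This is a common pattern rather than a design we have committed to. The names `node_id`, `Peer` and `PeerTable` are hypothetical, and the key bytes are a placeholder, not a real key.

```python
import hashlib
from dataclasses import dataclass, field


def node_id(public_key: bytes) -> str:
    """Derive a stable identifier from a public key instead of an IP address."""
    return hashlib.sha256(public_key).hexdigest()


@dataclass
class Peer:
    public_key: bytes
    # Addresses where this peer has been seen; these change as the node moves.
    addresses: set = field(default_factory=set)


class PeerTable:
    """Maps key-derived identifiers to peers, whatever network they are on."""

    def __init__(self):
        self._peers = {}

    def observe(self, public_key: bytes, address: str) -> Peer:
        """Record that the holder of public_key was seen at address."""
        peer = self._peers.setdefault(node_id(public_key), Peer(public_key))
        peer.addresses.add(address)
        return peer


# Placeholder bytes standing in for a real public key (assumption).
key = b"example-public-key"
table = PeerTable()
home = table.observe(key, "192.168.1.7:4000")
roaming = table.observe(key, "10.0.0.3:4000")
print(home is roaming, len(roaming.addresses))
```

The sketch only covers naming. A real system would also have to check that the remote party holds the matching private key before trusting an observed address.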

Developers make crude assumptions about their applications’ requirements. The data requirement space is large; it includes some regions that have been proved impossible and others that may yet prove impossible.

Research Questions

The key research question is how we can provide services such as consistency, accessibility and authentication in the context of edge network distributed systems. This encompasses other questions such as:

  • Which areas of the space of data and environmental requirements are covered by existing distributed algorithms, which areas are not yet covered and which areas are provably impossible to cover?
  • How can we formally express the assumptions and guarantees of distributed algorithms and their trade-offs, data and environmental requirements such that our engine can resolve them?
  • How can we evaluate such systems given the diversity of possible environmental requirements and combinations of data requirements?
  • How can we ensure that the distributed algorithms provide the stated guarantees under the assumptions? How can we construct and reason about these algorithms such that they provide stronger guarantees than conventional systems?
  • How can we combine the above to provide a stack of protocols which fulfils the data requirements, given the environmental requirements?

Approach

We propose a new common abstraction between applications and networked devices to form personal clouds. Programmers (and ultimately users) formally specify the data and environmental requirements. These requirements span domains in fault tolerance, replication, consistency, caching, accessibility, security levels and confidentiality. From a collection of distributed algorithms, each with its own set of formally specified assumptions and guarantees, an engine will stack the protocols to satisfy the data requirements under the environmental requirements. From this foundation, we can build new distributed systems, including new systems for personal data [4]. We are currently considering building upon a suite of existing tools in this domain, including a unikernel operating system [10], a TLS implementation [11], a git-style distributed data store [3] and a Raft consensus implementation [5].
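The idea of an engine stacking protocols can be sketched as a toy under strong simplifications. Assume that assumptions and guarantees are plain string labels, and that a protocol can be stacked once everything it assumes is offered by the environment or by a protocol already in the stack. The `Protocol` type, the `build_stack` function and all the labels are hypothetical; the real engine would need a far richer formalism than set inclusion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    name: str
    assumes: frozenset   # properties that must already hold
    provides: frozenset  # properties guaranteed once the protocol is stacked


def build_stack(protocols, environment, required):
    """Greedy forward chaining over string labels.

    Returns protocol names in stacking order (lowest layer first),
    or None if the required properties cannot be reached.
    """
    available = set(environment)
    remaining = list(protocols)
    stack = []
    progress = True
    while progress and not required <= available:
        progress = False
        for p in remaining:
            # Usable if its assumptions hold and it adds something new.
            if p.assumes <= available and not p.provides <= available:
                stack.append(p.name)
                available |= p.provides
                remaining.remove(p)
                progress = True
                break
    return stack if required <= available else None


# Hypothetical protocol collection; the labels are illustrative only.
protocols = [
    Protocol("consensus",
             frozenset({"authenticated channels", "majority reachable"}),
             frozenset({"strong consistency"})),
    Protocol("secure transport",
             frozenset({"node identities"}),
             frozenset({"authenticated channels"})),
    Protocol("gossip",
             frozenset({"intermittent connectivity"}),
             frozenset({"eventual consistency"})),
]

stack = build_stack(protocols,
                    environment={"node identities", "majority reachable"},
                    required={"strong consistency"})
print(stack)
```

Being greedy, this sketch may stack a protocol that turns out not to be needed, and it ignores trade-offs between alternatives that offer the same guarantee. Resolving those is exactly the kind of question the research questions above leave open.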

State of the Art

Sapphire [13] is a programming platform to separate application and deployment logic in cloud and mobile applications. Whilst Sapphire’s motivation is similar to ours, it covers a limited space of data and environmental requirements and doesn’t provide any guarantees to applications running on the platform.

The systems community is beginning to design distributed protocols specifically to tolerate the edge network, such as achieving consistency in an environment of heterogeneous trust [12, 7]. But quantifying the environmental requirements of such protocols requires a much richer abstraction than those currently used. Some authors [1, 9] suggest we can provide stronger guarantees for distributed protocols by changing the basic programming constructs and languages used; this is something we intend to explore further.

REFERENCES

[1] Peter Alvaro, Tyson Condie, Neil Conway, Joseph M. Hellerstein, and Russell Sears. I do declare: Consensus in a logic language. SIGOPS Oper. Syst. Rev., 43(4), January 2010.

[2] Marjory S. Blumenthal and David D. Clark. Rethinking the design of the Internet: the end-to-end arguments vs. the brave new world. ACM Transactions on Internet Technology, August 2001.

[3] Thomas Gazagnaire. Irminsule; a branch-consistent distributed library database. OCaml 2014 Workshop, 2014.

[4] Hamed Haddadi, Heidi Howard, Amir Chaudhry, Jon Crowcroft, Anil Madhavapeddy, and Richard Mortier. Personal data: Thinking inside the box. arXiv preprint arXiv:1501.04737, 2015.

[5] Heidi Howard, Malte Schwarzkopf, Anil Madhavapeddy, and Jon Crowcroft. Raft refloated: Do we have consensus? SIGOPS Oper. Syst. Rev., 49(1), January 2015.

[6] Christian Kreibich, Nicholas Weaver, Boris Nechaev, and Vern Paxson. Netalyzr: illuminating the edge network. In Proceedings of the 10th annual conference on Internet measurement, IMC ’10, pages 246–259. ACM, 2010.

[7] Jed Liu, Michael D. George, K. Vikram, Xin Qi, Lucas Waye, and Andrew C. Myers. Fabric: A platform for secure distributed computation and storage. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, SOSP ’09, 2009.

[8] Ewa Luger, Stuart Moran, and Tom Rodden. Consent for all: Revealing the hidden complexity of terms and conditions. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 2687–2696. ACM, 2013.

[9] Anil Madhavapeddy. Combining static model checking with dynamic enforcement using the statecall policy language. In Proceedings of the 11th International Conference on Formal Engineering Methods: Formal Methods and Software Engineering, pages 446–465. Springer-Verlag, 2009.

[10] Anil Madhavapeddy, Richard Mortier, Charalampos Rotsos, David Scott, Balraj Singh, Thomas Gazagnaire, Steven Smith, Steven Hand, and Jon Crowcroft. Unikernels: Library operating systems for the cloud. In Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’13, 2013.

[11] Hannes Mehnert and David Kaloper Mersinjak. Transport layer security purely in OCaml. In OCaml Workshop, 2012.

[12] Isaac C Sheff, Robbert van Renesse, and Andrew C Myers. Distributed protocols and heterogeneous trust: Technical report. arXiv preprint arXiv:1412.3136, 2014.

[13] Irene Zhang, Adriana Szekeres, Dana Van Aken, Isaac Ackerman, Steven D Gribble, Arvind Krishnamurthy, and Henry M Levy. Customizable and extensible deployment for mobile/cloud applications. In Proceedings of the 11th USENIX conference on Operating Systems Design and Implementation, pages 97–112. USENIX Association, 2014.

Raft Refloated: Do We Have Consensus?

The January edition of SIGOPS Operating Systems Review is out now, and with it the aptly named “Raft Refloated: Do We Have Consensus?”. This is my first journal paper and I’m really excited to see what the community makes of it.

Title: Raft Refloated: Do We Have Consensus?
Authors: Heidi Howard, Malte Schwarzkopf, Anil Madhavapeddy and Jon Crowcroft
Paper: acm dl, open access link
Abstract: The Paxos algorithm is famously difficult to reason about and even more so to implement, despite having been synonymous with distributed consensus for over a decade. The recently proposed Raft protocol lays claim to being a new, understandable consensus algorithm, improving on Paxos without making compromises in performance or correctness.

In this study, we repeat the Raft authors’ performance analysis. We developed a clean-slate implementation of the Raft protocol and built an event-driven simulation framework for prototyping it on experimental topologies. We propose several optimizations to the Raft protocol and demonstrate their effectiveness under contention. Finally, we empirically validate the correctness of the Raft protocol invariants and evaluate Raft’s understandability claims.

Below is the key figure of the paper, showing a side-by-side comparison of the simulation results next to the authors’ original results.

[Figure: fig15-original, the Raft authors’ original results]

[Figure: fig15-replicate, our simulation results]

Bring on the Databox

Last week we released an open-access preprint of our first paper on the Databox on arXiv, titled “Personal Data: Thinking Inside the Box”. Despite not publishing in a peer-reviewed venue, the response has been greater than we expected. Most notably, we were featured in the Guardian, a British newspaper known for its pro-privacy and anti-government-surveillance views, as well as in the MIT Technology Review and Treasury Insider.

Time to start thinking inside the box? Image By Husky [Public domain], via Wikimedia Commons

In the paper, we propose there is a need for a technical platform enabling people to engage with the collection, management and consumption of personal data; and that this platform should itself be personal, under the direct control of the individual whose data it holds. Our solution is the Databox, a personal, networked service that collates personal data and can be used to make those data available.

The paper is an accessible read and does not cover any technical details; instead, it’s a brief overview of the problem space and its challenges. We are currently preparing the paper for submission, so your thoughts and ideas are more welcome than ever.

A huge thanks to my amazing co-authors Hamed Haddadi (@realhamed), Amir Chaudhry (@amirmc), Jon Crowcroft (@tforcworc), Anil Madhavapeddy (@avsm) and Richard Mortier (@mort___).

 

CFP: FOG Networking for 5G and IoT

This seems like an interesting venue for work on edge network distributed systems. Nice to see that we are not the only ones who think that the edge is an area worth researching.

 

CFP: Fog Networking for 5G and IoT workshop

>>>>> In conjunction with SECON 2015 <<<<<

=========================================

22 June 2015, SEATTLE – USA

http://secon2015.ieee-secon.org/workshop-program

Important dates

Submission deadline (hard): April 1st, 2015

Notification of acceptance: April 15th, 2015

Camera Ready: April 30th, 2015

Workshop: June 22nd, 2015

Scope:

Pushing computation, control and storage into the “cloud” has been a key trend in networking in the past decade. Over-dependence on the cloud, however, indicates that availability and fault tolerance issues in the cloud would directly impact millions of end-users. Indeed, the cloud is now “descending” to the network edge and often diffused among the client devices in both mobile and wireline networks. The cloud is becoming the “fog.”

Empowered by the latest chips, radios, and sensors, each client device today is powerful in computation, in storage, in sensing and in communication. Yet client devices are still limited in battery power, global view of the network, and mobility support. Most interestingly, the collection of many clients in a crowd presents a highly distributed, under-organized, and possibly dense network. Further, wireless networking is increasingly used locally, e.g. intra-building, intra-vehicle, and personal body-area networks; and data generated locally is increasingly consumed locally.

Fog Network presents an architecture that uses one or a collaborative multitude of end-user clients or near-user edge devices to carry out storage, communication, computation, and control in a network.

It is an architecture that will support the Internet of Things, heterogeneous 5G mobile services, and home and personal area networks. Fog Networking leverages past experience in sensor networks, P2P and MANET research, and incorporates the latest advances in devices, network systems, and data science to reshape the “balance of power” in the ecosystem of computing and networking.

As the first high-quality IEEE workshop in the emergent area of Fog Networking, this workshop’s scope includes:

–       Edge data analytics and stream mining

–       Edge resource pooling

–       Edge caching and distributed data center

–       Client-side measurement and crowd-sensing

–       Client-side control and configuration

–       Security and privacy in Fog

–       Fog applications in IoT

–       Fog applications in 5G

–       Fog applications in home and personal area networking

Accepted and presented papers will be published in the IEEE FOG Networking Proceedings by the IEEE Computer Society Conference Publishing Services and IEEE Xplore Digital Library. To be published in the IEEE FOG Networking Proceedings an author of an accepted paper is required to register for the workshop at the full (member or non-member) rate and the paper must be presented by an author of that paper at the conference unless the TPC Chair grants permission for a substitute presenter arranged in advance of the event and who is qualified both to present and answer questions. Non-refundable registration fees must be paid prior to uploading the final IEEE formatted, publication-ready version of the paper. For authors with multiple accepted papers, one full registration is valid for up to 3 papers.

Workshop Co-Chairs:

Mung Chiang

Arthur LeGrand Doty Professor of Electrical Engineering

Director of Keller Center for Innovation in Engineering Education

Princeton University

Sangtae Ha

Assistant Professor, Computer Science Department

University of Colorado at Boulder

Junshan Zhang

Professor, Electrical and Computer Engineering Department

Arizona State University

Workshop Technical Program Committee:

Bharath Balasubramanian (AT&T Labs)

Suman Banerjee (University of Wisconsin)

John Brassil (HP Labs)

Gary Chan (Hong Kong University of Science and Technology)

Tian Lan (George Washington University)

Athina Markopoulou (UC Irvine)

Rajesh Panta (AT&T Labs)

Chunming Qiao (University at Buffalo)

Moo-ryong Ra (AT&T Labs)

Tao Zhang (Cisco)

2nd Annual Oxbridge Women in Computer Science Conference

I’ve just registered to attend the 2nd Annual Oxbridge Women in Computer Science Conference on 16th March 2015. I may be presenting a poster or even giving a talk. My submitted abstract is below:

Life on the Edge (Network)

The internet has abandoned the end-to-end principles on which it was established. With IPv4 addresses depleted, devices are left behind NATs and firewalls, with the transition to IPv6 yet to restore their public identity. Instead of dealing with the complexity of the edge network, users have opted to use centralized cloud services, offering usability and high availability.

In this post-Snowden era, users are beginning to question their decision out of fear of censorship and mass surveillance. Furthermore, a series of highly publicized data breaches and DDoS attacks has shed light on the weak guarantees provided by opaque terms of service, which are engineered to minimize legal responsibility. Many peer-to-peer applications such as multiplayer gaming and video conferencing need the low-latency characteristics of edge network connections. Even in this modern world, users need the ability to establish inter-device connectivity without a full internet connection, for example to isolate local processing of personal data from the Internet of Things or to connect personal devices on the go.

In response to this demand, developers are building new applications for the edge network. They are reimplementing solutions for establishing authenticated identities, distributed consensus and availability in the face of mobile nodes, pervasive network partitioning, asymmetric channels and Byzantine failures. Without a clear stack and layers of abstraction, systems fail to provide even the most basic safety guarantees. Protocols are layered on top of each other without formal agreement on the services provided at each layer. Even after this engineering effort by developers, systems still require intricate configuration to deal with the diversity of devices, middleboxes and network environments on the edge network, if they are able to work at all. Developers make crude assumptions about their applications’ requirements, e.g. assuming all application data needs the same level of consistency and fault tolerance.

We propose a new common abstraction between applications and the networked devices to form a personal cloud for every individual. Programmers (and ultimately users) formally specify the requirements for the data items. These requirements span domains in fault tolerance, replication, consistency, caching, accessibility, security levels and confidentiality. This foundation will enable us to develop new systems for handling personal data to finally put the individual back in control of their own data.

I would like to apply for both a talk and a poster on this topic. This topic is highly interdisciplinary, both within and beyond computer science. Everyone who uses the internet will be able to relate to this topic, and thus I think all the Oxbridge women will be able to take something away from this talk regardless of their particular field of computer science and stage of study. Whilst the talk will be of interest to a wider audience, the poster will focus on the proposed architecture of the system and the technical challenges of working with the edge network.

Personal Data: Thinking Inside the Box

Our first paper on the Databox, a personal, networked service that collates personal data and can be used to make those data available, is now available (open access) on arXiv. Enjoy reading it and let me know what you think.

Title: Personal Data: Thinking Inside the Box
Authors: Hamed Haddadi, Heidi Howard, Amir Chaudhry, Jon Crowcroft, Anil Madhavapeddy, Richard Mortier
Abstract:
We propose there is a need for a technical platform enabling people to engage with the collection, management and consumption of personal data; and that this platform should itself be personal, under the direct control of the individual whose data it holds. In what follows, we refer to this platform as the Databox, a personal, networked service that collates personal data and can be used to make those data available. While your Databox is likely to be a virtual platform, in that it will involve multiple devices and services, at least one instance of it will exist in physical form such as on a physical form-factor computing device with associated storage and networking, such as a home hub.