Author Archives: Heidi-ann

Getting Client & Server Talking – Demo Pt 3

Now I have an Android phone running the client and my laptop running the server, its time to get them taking.

A quick inspection of the client code shows the line “public static final int [] IP_ADDR = {128,2362,110,172}; ” I change this to the IP address of my laptop and reload the code onto my android phone.

I launch the server on my laptop, the client on the Android phone and here the troubleshooting begins. Nothing appears to happen on either the client or server so I:

  1. check the network connection on both client and server  ✓
  2. un-install the client application and re-install it ✓
  3. check the IP address of the server is correct ✓
  4. Try running client on emulator ✓ just got the exact same problem

Taking a closer look at the IP address used:

I tried to find out the IP address of the server by visiting here but i thought that I should try to take a more technical approach. Visiting connection information in Ubuntu gives me a different IP address. Now I’m fairly sure that my server is behind a NAT as is my client.

Initally I assume that the different IP address are caused by the NAT, therefore the IP address from the website, is my public IP address (the IP address of the NAT) and the IP address given to my by my OS is the private IP address, that I am referred to by machines also behind the NAT.

This explaination fails to explain why using the IP address from the website didn’t work and the IP address given to me by my OS doesn’t have the normal structure that I would accociate with a private IP address.

So I now repeat my initial test using the different IP address (previously I used the address provided by a website and now I am using the one provided by the OS) and it works on both the android phone and the emulator.

But by use of the word “works” I mean that the server recognises that a client has connected and the GUI on the android client provides graphs showing latency and goodput, it unfortunately doesnt mean that I’ve yet found a good way to extract the data to make a comparision to the data provided by Iperf to test accuracy and reliably, on the assumption that Iperf will always produce accurate results that we can assume to be the true value.

The next useful step will be to continue to explore why this IP address works and the other one doesn’t. As the client and server are now able to connect I can test if this IP address is working only because both client and server are behind the same NAT or is there something more complex going on. Currently both the laptop and android phone are making use of the same Wi-FI network, so the next step in testing if this will work when the client and server are not behind the same NAT is to change the network that either is using.

The plan for testing with would be:

  • move server onto another WiFi network
  • find out new IP address and update client code
  • test if emulator can connect to server, it should do as it used same network connection as the host computer, which in this case is the server
  • test if android phone can connect to server, it should not as its now not behind the same NAT as the server
  • move the android phone onto the other WiFi network
  • test if android phone can connect to server, it should now be able to

As this test is not critical for now, I will move onto Part 4 next week, where I get Iperf collecting the same data between the android client and then compare the data collected

Running the Android Client – Demo Pt 2.1

An update for you on my progress on test the code to measure network performance between an android phone and a computer running a server written in Ocaml. So far, I have run the server and the client separately but not yet got them communicating. As explained in part 2 of this series, I decided to set up Eclipse with ADT Plugin on my desktop after getting some weird errors on my current laptop set up of Eclipse and ADT. The details of the problem are available here on Stackoverflow.com. There has been no clear solution to the problem. Therefore I have had to totally remove my install of Eclipse and ADT, so that I can start all over again.

DreamPlug

I’ve also been distracted the last few days by this piece of new kit, the DreamPlug. We are using it as part of the upcoming Signposts Demo at SIGCOMM in Helsinia.I will write an article on it soon

Finally I’ve fix my Eclipse set-up and I’m now able to load the client code onto the Android phone and run the server on my laptop. Part 3 will be coming very soon.

Also on a personal note, I am 20 today…

Running the Ocaml Server – Demo Pt 1

 

I’ve got some code here and here in OCaml & Java, that I have been tasked with running, testing and fixing. I’ve already taken a look at the basic syntax but now its time to get to grips with some the libraries and related packages that Signposts makes use off.

THE SERVER IN OCAML

The first thing that took me by surprise was the number of files in the repo, considering that this is suppose to be a fairly straight forward project.

Oasis

The repo contains a file called “_oasis”, which on closer inspection includes a collection of metadata on the project. OASIS is a tools to integrate a configure, build and install system into an OCaml project. The _oasis files tells me that the entry point for the program is “server.ml” and the build dependences are the findlib packages lwt.syntax, lwt.unix, re, re.str.
The creater of the program will have ran “oasis setup”, which has generated setup.ml, _tags and myocamlbiuld.ml.

According to the instructions for OASIS you can configure, build and install the Ocaml program using:

  • ocaml setup.ml -configure

  • ocaml setup.ml -build

  • ocaml setup.ml -install

Re_str

The server.ml file makes use of Re_str and two of the biuld dependences are the findlib packages re and re.str. I have found the packages here on github. Re stands for Regular expression and ocaml-re is a regular expression library doe Ocaml (that is still under development). This project has also made use of OASIS and installs without a problem

Compiling the Code

I cd into the directory and run ocaml setup.ml -configure, ocaml setup.ml -build and ocaml setup.ml -install. This generates a new dirctory “_biuld” and a new link to an executable in _biuld

Running the Code

I run a link to the execuated in _biuld and it return the following error:
Fatal error: exception Unix.Unix_error(50, “bind”, “”)
This means that the socked that the code is try to bind to is already in use. I inspected the code, identified the ports that were being used and ran “sudo netstat -ap | grep ‘: ‘ “. This identified a process already using the port, the process was in fact an earlier attempt to run the project. I identified the process ID and killed the process, re-running “sudo netstat -ap | grep ‘: ‘ “ showed that the port was now free. I can now happily run the executable in the _build directory. As this is a client-server implementation, I now need to move my attention to the client code

Running the Android Client – Demo Pt 2

[ Sorry but this is part 2 in the series, I have completed part 1 but will not be able to upload the article until tomorrow ]

This is the 2nd part of a series on my taking a look at the code here, to test it and then improve it if required. In part 1, I ran the Ocaml server on my laptop. In this part, I run the android client. Tomorrow (Monday), in part 3 I am going to get the client and server communicating. On Tuesday, I will hopefully collect some data that I can then extract in some suitable format. On Wednesday, I hope to run Iperf between my client and server to collect data and extract in some suitable format. On Thursday, I hope to have a day of collecting data using both the GitHub code and Iperf. Finally on Friday, I hope to use a statistics package such as MatLab to compare the data collected and see if there is a statistically significant difference.

Since I have managed to mess up my eclipse/android install (see part 1), this gives me a perfect excuse to set up eclipse with ADT Plugin on my desktop. I’ve also got my hands on a different android phone, since the code that testing was designed for API 10. The phone info is as follows:


Model: HTC Hero Andriod Version: 2.3.7CPU: ARMv6- compatible processor #34Available memory: 190MBMod Version: CyanogenMod-7.2.0-RC1-UNOFFICAL
Build Number: GWK74

THE BIG PLAN

The stages in setting up the Android ADK, Eclipse and loading the code that I’m testing on the Android device (spec above)

  1. Download the Android SDK (Software Development Kit)
  2. The SDK only installs the SDK tools so to install other parts of the SDK, I’ve got to open up the Andriod SDK Manager, this is done by cd’ing into the tools/ directory and executing android SDK
  3. To develop an Andriod app, I need to download the lastest Android SDK Platform-tools and at least on Android platform. The platform that I want is the one associated with Android 2.3.7 and API 10, I’m going to get the related SDK Platform. Documentation and System image.
  4. Download and install the latest Eclipse Classic
  5. Add the source of the ADT Plugin to the Repository in Eclipse and download the Developer Tools in via Eclipse
  6. Specify the location of the Android SDK directory in the Preferences panel of Eclipse
  7. Create a AVD (Android Virtual Device) in Eclipse with target platform 2.3.7 and API 10
  8. Download the code that I’m testing from GitHub
  9. Import the code into Eclpise
  10. Test Code on the AVD
  11. Plug in Android phone and test code on the Andriod phone

AND THEN THE REALITY

1

The Android SDK is avaliable from the Android Developer Site, the package for Linux is called android-sdk_r20-linux.tgz.

The download comes as a .tgz file, this can be extracted using: tar -zxf android-sdk_r20-linux.tgz after using cd to move to the directory that you downloaded the file to.

I’ve found that its useful the learn as your going along, the purpose of the arguments that you are passing to programs, so that you can then easily adjust the arguments in future to suit your needs. Using man tar, explains that -z means filter archive through gzip, -x means extract the filesand -f means that your next going to give the file name.I would highly recommend that when following tutorials with Unix commands that you use man to look up what the executable and its arguments do.

2

As with the last time I installed Android SDK, to launch the Android ADK Manager, I need to enter ./android SDK instead of android SDK. This then appears to work fine, but I would still be interested as to why I didn’t work as expected.

[To clarify This command is entered after you use cd to move into android-sdk-linux/tools]

3

I selected the required packages (that I’ve listed in the plan above) and download them, simple as.

4

(Read this whole paragraph before beginning to download Eclipse) I download Eclipse Classic 4.2 for 32 bits as a .tar.gz file and extract the file (like before using tar -zxf eclipse-SDK-4.2-linux-gtk.tar.gz ). To install Eclipse, I cd into the new eclipse directory, to find no configure file, so my normal method of installing from .tar.gz files, of using ./configure && make && sudo make install will now not work. I little bit of research highlights that installing eclipse its not as straightforward as it might seem and the best approach for integration into Ubuntu is to get it from the software center, though I would prefer to continue to use the command line, I’m happy to leave it for now and return to this point at a later date

5

I start up eclipse, select my workspace and go into Help > Install New Software. I add the ADT Plugin at the URL https://dl-ssl.google.com/android/eclipse/. I then select the Developers Tools and install

6

With a slow connection,

you could be waiting sometime

I restarted Eclipse and configured the new ADT Plugin to link it to the Android SDK. On launching Eclipse, I’m presented with a Welcome to Android wizard (not used in the instruction at https://developer.android.com/sdk/installing/installing-adt.html)

7/8/9

Using the Eclipse GUI setting up a virtual device is straightforward. I navigate to the correct page on GitHub, download the code and extract as already done twice before and then I import the code as an existing Android project into Eclipse.

UNEXPECTED DIGRESSION

On loading the code, that I know should compile, eclipse highlights 2 errors and 13 warnings. As I’m just testing the code today before reviewing it next week. I will ignore the warming but deal with the errors as they prevent me from testing the code. The first error is from the line: import com.google.gson.Gson; I download the Gson project from googlecode and add it as an external libray to the project to overcome this. The second error is from the following line: public void onClick(View v). The error is that: The method onClick(View) must override a superclass method so this can be fixed by removing the above @Override annotation

10

Before I can run the code on the emulator, I create a new andriod project and add the code so far. I then use Run As > Android Application to test the code and it works

11

I put the android phone into USB debug mode, plug it in and run as. Again it works 🙂

“The best part about being a computer scientist is the feeling of satisfaction when something finally works”

UROP Goals

I hope the following will give you an insight into what I’m studying and where I aim to go with it.

Overall Goal for the next 10 weeks: To be of use the Signposts Project and learn some useful skills on the way

Skill Required:

  1. A solid knowledge of computer networking (in particular protocols at the network and transport layers)
  2. Familiarly and practical experience with Ocaml and (in particular Lwt and using sockets)
  3. Experience with Java on Android and the platform

How I hope to achieve a solid knowledge of computer networking ?

  • review the material from my computer networking course, mostly by re-reading Kurose, J.F. & Ross, K.W. (2009). Computer networking: a top-down approach. Addison-Wesley (5th ed.), redoing past exam questions / supervision work, doing the Wireshark Lab sheets and working through the hands on material
  • Gain a deeper knowledge of particular areas by reading RFC and academic papers, ensuring that I alway take notes so that its easy to refer back to material without re-reading all of the material
  • Get update with the latest in online anonymity and security including projects/topics such as Tor, FreedomBox, Privoxy, Hamachi, Tails, Backtrack, SSL, DNSSEC, FireFox addons like NoScript and HTTPS Everywhere etc…

How I hope to achieve familiarly and practical experience of Ocaml

  • Working through Unix system programming in ocaml by Xavier Leroy and Didier Rémy
  • Keeping upto date with Ocaml project & questions on StackOverflow
  • Build Ocaml code

RFC 1034 – An insight into the 1980’s

If you think that RFC make dry reading, maybe just maybe you just haven’t given them enough of a chance yet. Today, I begin reading my first RFC, I choose RFC 1034 and I can genuinely say that I’ve found much more interesting than I was expect, in fact I’ve struggled to put it down (though I suspect that says more about me than RFCs) . RFC 1034 was written 1987 (before I was even born) and offers an insight to a world before DNS and how its designed envisioned DNS’s future use. Below are my notes on the RFC so far (this is likely to be the first collection of notes out of 3). Sorry for the bullets but its my preferred way to take notes

–>

I’ve located the Request for comments (RFC) relating to DNS from the following page http://www.zoneedit.com/doc/rfc/. The RFC that I am going to begin with reading is RFC 1034. This RFC is developed in November 1987 and it titled “Domain Names – Concepts & Facilities”

–>

Historical Context

  • Originally per host file in /etc/hosts for hostname to IP address mapping was updated via FTP.
  • The bandwidth required to update this per host file was proportional to number of host squared, hence exponential grow of hosts mean new system was required
  • The first operational packet switching network was the Advanced Research Projects Agency Network (ARPANET)
  • When changes to the per host file were make, they took along time to propagate to all hosts

Common Ideas from Proposed Solutions

  • Hierarchical name space with the hierarchy corresponding to organisations structures
  • Using a full stop “.” to mark the boundary between hierachy levels

DNS Design Goals

  • Consistent name space where names do not include network identifiers or addresses
  • Distributed management with local caching
  • Minimising attempts to collect consistent copies of entire database as this is not scalable
  • Source (not client) control tradeoff between cost, speed and accuracy of caches
  • All data associated with a name is tagged with a type
  • Name server translations should be independent of communication system that carriers them
  • The system should be useful across a wide spectrum of hosts

Usage Assumptions

  • Initially total database size is proportional to number of hosts but this will grow in time
  • System wide data will change slowly, subnet wide data will change rapidly
  • Each organisation is responsible for its own domains including providing redundant name servers
  • Clients use trusted name server before name server outside the “trusted set”
  • Availability is more important than consistency so its ok that updates are not simultaneous
  • Believe old data when new data is not yet avaliable
  • Copies of translations are distributed with time-outs for refreshing, the distributor sets time-out value
  • When a name server is presented with a query that can only be answered by some other server, either a recursive (name server asks another server to do query) or iterative (name server returns address of another name server to ask instead). The iterative approach is preferred but the recursive approach is still an option
  • All data originates from master files keep on hosts that use the DNS
  • All master files are updated by local sysadmin and all master files are text files that are read by a local name server
  • User programs access name servers through standard programs called resolvers
  • The standard format of master files allows them to be exchanged between hosts
  • Sysadmin provides zone boundaries, master files of data, updates of master fules and refreshing policies
  • DNS provides standard formats for resource data, standard method of querying the database, standard methods for name server to refresh local data from foreign name servers

DNS & Resource Records (RR)

  • The names in the name space will have a tree structure and RR hold the data associated with each name
  • Each node and leaf on the DNS tree has a set of information associated with it and a DNS query aims to extract this information
  • A DNS query includes the name of interest and the type of information that is required

Name Servers

  • These are server programs which hold information about the DNS tree structure and information set
  • A name server may cache information on DNS tree structure or information associated with particular names
  • In general, a name server has complete information from a subset of the DNS tree leaves and then pointers to other name server that can be used to get information on the other areas of the DNS tree
  • A name server has authority for the area of the DNS tree for which it has complete information
  • Authotitative information is organized into units called zones

Resolvers

  • Resolvers are programs that extract information from the name servers in response to client queries
  • All resolvers must have access to at least one name server
  • Typically a resolver is a system routine that is accessible to user programs

DNS from a client prospective

  • DNS is accessed via a procedure call to the local resolver
  • The DNS consists of a single tree and the user can request information from any section of the tree

DNS from the resolvers prospective

  • DNS is composed of an unknown number of name servers
  • Each name server hold at least one piece of information from the DNS tree
  • The DNS tree is static

DNS from the name servers prospective

  • DNS is divided into zones
  • The name server has local copies of some zones
  • The name server must periodically refresh its zones from master copies in local files or foreign name server

Name Specification

  • The term “node” is used to refer to both interior nodes and leaves
  • Each node has a resource set, though this can be empty
  • Each node has a label, 0 to 63 octets (an octet is 8 bits) in length
  • Brother nodes may not have the same label
  • The same label can be used by nodes that are not brothers
  • The root label is null
  • The domain name of a node (not the same as the label for a node) is the list of labels on the path from the node to the tree root
  • The label that compose the domain name are read from (most specific) left to (most general) right, e.g. consider the structure of the DNS name for college computers as342.priv.queens.cam.ac.uk
  • Domain names are case-insenstive
  • In domain names the labels are separated by dots (.)
  • A domain name ending in a dot represents a absolute/complete domain name
  • A domain name not ending in a dot represents an incomplete domain name that should be completed by local software (aka relative domain name)
  • The maximum domain name length is 255 octets (set of 8 bits), this includes labels and label lengths
  • A domain is identified by a domain name and consists of the tree at and below that domain name
  • A domain is a sub domain of another domain if it is contained within that domain
  • You can test if a domain is a sub-domain of another, if the domain name is the within the end of the sub-domain e.g. cam.ac.uk is a sub-domain of uk and queens.cam.ac.uk is a subdomain of ac.uk
  • Labels must follow the rules for ARPANET host names

Email

  • The usually format of email addresses is localpart@maildomain, this is mapped to a domain name so that it can be looked up in the domain name tree by converting localpart into a single label (ignoring dots) then converting maildomain into a domain name (not ignoring dots) and replacing @ with a dot to form a domain name so the email address as342@cam.ac.uk will become as342.cam.ac.uk
  • Domain names ( and therefore email addresses) are case insensative

Resource Records (RR)

  • Each node in the tree has a set of resource information, this can be empty
  • The set of resource information is composed of the set of Rrs
  • Order of RRs is not significant and need not be preserved
  • The owner of RR is the domain name where the RR is found, this is normally inplicit rather than being an explicit part of a RR
  • The type of a RR is an encoded 16 value such as:
A – a host address
CNAME – the canonical name of an alias
HINFO – CPU & OS used by a host
MX – mail exchange for domain
NS – authoritative name server for the domain
PTR – pointer to another part of domain name space

SOA – start of zone of authority

  • The class of a RR in an encoded 16 bit value such as IN – internet system or CH – Chaos System
  • The TTL of a RR is the time to live of the RR, this is a 32 bit integer in seconds, this defines how long a RR can be cached for before discarding, this limit does not apply to authoritative data in zones (TTL of 0 will prevent caching)
  • The RDATA of an RR contains different values depending on the type of class of the RR such as:
  • If type is A and class is IN, then RDATA is a 32 bit IP address
  • If type is A and class is CH, then RDATA is a domain name followed by a 16 bit octal Chaos address
  • If type is CNAME, then RDATA is a domain name
  • If type is MX, then RDATA is a 16 bit preference values followied by a host name willing to act as a mail exchange
  • If type is NS, then RDATA is a host name
  • If type is PTR, then RDATA is a domain name
  • If type is SOA, then RDATA is several fields
  • The structure of an RR is as follows:
  • start of line gives the owner of the RR, next we list the TTL, the type and the class of the RR (dig giving results in different order)
  • Domain names in RR’s could point at the primary name not another alias to avoid extra indirection
Queries
  • queries are carried in UDP or over TCP connection
  • responses to queries are either the answer to the question, refers the requester to another set of name server or signals some error condition
  • in general the resolver deals with queries on the users behalf
  • DNS queries and responses have a standard format, the start is a opcode which is one for a standard query
  • Then there are 4 section in a DNS query/response:
  • Question – Query name and other parameters
  • Answer – Carries RR’s which directly answer the query
  • Authority – Carries Rrs which describe other authoritative servers
  • Additional – Carries Rrs which may be helpful in using the Rrs in the other sections
  • A standard query specifies a target domain name (QNAME), query type (QTYPE) and query class (QCLASS)
  • QTYPE field may specify a specific type, AXFR special zone transfer QTYPE, MAILB matches all mailbox related Rrs or * which (like the regular expression) matchs all RR types
  • QCLASS field may contain a specific type or * to match all RR classes

Getting Started with Python

Why have I chosen to learn Python ?

  • Its a language sometimes used by my summer research project
  • It works particularly better than other language on the Raspberry Pi
  • I’ve covered the basics of Python before but never gone further than simple syntax
  • Its open source
  • It can be used as a scripting language, which is an area that I really want to improve
  • It supports object-oriented programming so I feel that with my previous experience of Java and C++, I will have covered the most popular OOP languages

Python 2.7 or 3 ?

The first decision that I needed to make before learning Python was whether to learn Python 2.7 or 3, since Python 3 fixes some of the “non-optional” design decision found in 2.7 and it of course more update I decided to learn Python 3. The potential problems with this is the lack of backwards compatibility with 2.7 and lack of teaching materials.

Python in my degree ?

The lecture course which is most likely to have included Python is Concepts of Programming Language which I took in Easter term this year. The course makes a few references to the language but does not cover the language to the degree that languages such as Fortran, Lisp, Algol, Pascal, Simula, Smalltalk, SML and Scala were covered.

What programming paradigms does Python support ?

Python advertises itself as a multi-paradigm programming language and from what I can see the range of supported paradigms supports this claim. This makes Python a multi-purpose language along with languages such as Java, C, C++ and SML compared to special-purpose languages such as SQL and LaTeX. The paradigm which Python support include:

  • imperative programming – describing computation by prescribing how to move the program from one state to another, imperative languages included Fortran, C and assembly
  • object-oriented programming – it supports dynamic lookup, abstraction, subtyping and inheritance, OOP language include Java, C++, Simula, Smalltalk and Scala
  • Functional programming – treats computation as the evaluation of mathematical function and avoids state and mutable data. Examples include Lisp, Ocaml, Haskell and Scala
  • Reflection programming – allowing a program to examine and modify its structure at runtime. Like Java, ML and Haskell
  • Structured programming – subroutines, blocks, for and while making code more readable compared to simple tests and goto’s. Examples include most modern languages except assembly code

 How does Python handle types?

Type checking is the process of attempting to prevent type errors by ensuring that the operations in a programm are applied properly. There are two common types of type checking:

  • Dynamic (run-time) type checking – the compiler generates code such that whenever an operation is performed when the program is ran, the code checks to make sure that the operands have the correct types. Examples of dynamically type checked languages are MATLAB, Prolog and Ruby
  • Static (compile-time) type checking – the compilers checks that program text for potential type errors as the program is compiled. Examples of statically type checked languages are  Fortran, Haskell, Java and Ocaml
Python uses Duck typing

Python is dynamically type checked, in fact it uses duck typing. Duck typing means that valid semantics of a object are dictated by the current state of methods instead of its inheritance from a class/interface like in Java. Why call it duck typing then ? the name refers to the duck test:

“When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck”

This allows use of EAFP or “It’s easier to Ask Forgiveness than Permission”. attributed to Grace Hopper.

A type system is said to be strongly typed when it specifies restrictions on how operations involving values of different data types can be intermixed. If this is not true, then we describe the language as weak. Python is strongly typed.

The built in types are as follows:

  • bool – truth value e.g. True or False
  • int – int of unlimited magnitude
  • list – mutable sequence (supports mixed types) e.g. [4, 4.0, “Four”]
  • tuple – immutable sequence (supports mixed types) e.g. ( 4, 4.0, “Four”)
  • complex – complex number e.g. 3+5j
  • float – floating point number e.g. 3.14
  • dict – group of key and value pairs {“key1”:1.0, “key2”:2.0}
  • set – unordered, no-duplicate collection of values
  • bytes – immutable sequence of bytes
  • bytearray – mutable sequence of bytes
  • str – sequence of Unicode characters

Mutable means that an objects state can be modified after its created, immutable means the opposite.

Python uses the off-side rule

Syntax & Semantics of Python ?

Unlike almost all other languages that I have studied Python uses whitespace to delimit block instead of curly braces (like Java), this feature is termed the off-side rule.

You define a function or method using the def keyword. The typically statements that you would except also apply: if, for, while, try, except and finally.

Python uses the words and, or, not for boolean expressions. Integer division (using //) is defined to round towards minus infinity and floating point division is do using /. Compare by value is achieved using == and compare by reference is achieved using is.

The syntax and semantics of Python highlight the main purpose of Python as a language for teaching programming, hence the focus on readability. This is why Python is the language being taught to school children using the Raspberry Pi

Installing the Interpreter

I install the packages for Python 3 from the Ubuntu Repositories using  sudo apt-get install python3-minimal, and then run the interpreter by entering Python3 into the terminal

Once I get going with Python programming I hope to return to this type of language analysis and review how to design decisions made in the development of Python 3 reflect the use of the language in an introduction to programming. I hope to put up a part 2 of Getting Started with Python, later this week.

As ever, feel free to comment and highlight my mistakes even the spelling/grammar ones.

I’m back from brussels

From this …

I arrived in the back in Cambridge yesterday afternoon after the IBM EMEA Best Student Recognition Event 2012 in Brussels, Belgium. The theme of this years 3 day event was “Data Analytics”. The highlight for me on Wednesday 4th July was the a semenar by Prof. Bart Goethals, a lecturer at University Antwerp on data mining and machine learning in data analytics. For Thursday 5th July my highlights were a talk by Corinna Schulze from the IBM Governmental Program on privacy and a talk by Robert-Jan Sips focusing on how data analytics is being applied to water management in the Netherlands.  And for the final day was highlights was a talk by Guy Lefever on the application of Watson to future medical care. Overall I really enjoyed the event and I would throughly recommend attending if you get the opportunity to go.

It looks as if I’m going to have some new work on Signposts next week but I don’t know as yet what this will be so over the weekend I intend to review some of the new material that I’ve learned over the my first few weeks on the Undergraduate Research Opportunities Project (UROP) here at the William Gates Building, Cambridge. My initial project brief was:

Signposts – the virtual personal internet 

Users with multiple networked devices are having a hard time linking them up when said devices reside at more than one location. Simple example is trying to synch your android phone while on the bus to the laptop in your office and the Mac Mini at home. NATs and Firewalls and other obstacles to symmetric communication abound, and obstruct. Point solutions (Dropbox, iCloud) do not deal with many cases (different OS) and only work if you pay (with money or eyeball time on adverts). In any case, they are not acceptable to privacy conscious users.

Instead, we are building a system called signposts. This is the smallest conceivable virtual machine that sits in the public internet, and looks after your personal namespace of devices. Registration and lookup of names is “effectful”, in other words, has side effects. The side effect is to use a set of possible tactics to rendezvous between the devices (various techniques for NAT traversal, for example) including possible multiple simultaneous path use, and also allowing for constraints, such as battery life, and price of use of connections. The system sits on top of DNSSEC, but more interestingly, as a piece of Computer Science, makes use of the new Cloud OS software being developed in CL, called Mirage (see http://www.openmirage.org/ and https://github.com/avsm/signpostd

This project is to help with the documentation and testing “in the wild” of signposts. We expect to uncover a lot of problems with different oddities (see http://conferences.sigcomm.org/imc/2011/docs/p181.pdf for examples) that appear, and we need to have good stories to account for what we are doing to the public and to security people. OCaml programming is pretty essential, as well as at least basic understanding of development tools (including git etc) “

Over the last few weeks, I had used a wide range of tools that are new or that I had not previously used to this extent including (in no particular order):

  • Unix command line – although this is something that i typically use, I have definitely used it more and for a wider range of applications that ever before
  • Git & GitHub – for my 2nd year group project PROTON, we took the (controversial) decision to use Mercurial and BitBucket instead of Git and GitHub like most other group so it been interesting to see how they compare
  • Tor – As alot of my work over the last few weeks has been looking at testing the performance of tools for online anonymity, I’ve made extensively use of the onion router
  • Orbot – Tor for android
  • Tails – An OS build around tor
  • Iperf – As just explained as I’ve been working with testing network performance Iperf has been an invaluable tool (though I have recently discovered Torperf)
  • Amazon Elastic Compute Cloud – I’ve been using this a server when testing network performance, its really use as it provides as global domain name and IP address and avoid pathological cases that could be created by having other cleint and server on the same network
  • SHH – for remote access to the other computers I am testing with locally and the virtual machine on Amazon EC2
  • dig-host – command line tools for performing DNS queries, just generally useful
  • CyanogenMod – the OS on my android phone which allows easy using of proxies and tethering
  • Privoxy – heavy customizable non-caching web proxy
  • Hamachi – used with privoxy so set up encrypted VPN access
  • OpenVPN – easy set up of VPNs
  • Texmaker – a nice LaTeX editor, that I would throughly recommend which I have been using to keep the notes on my work over the last few weeks
  • Mozilla Firefox & addons – previously a Chrome user myself, I’ve opted to use Firefox since the research project begin for a change and during this time I have experimented with a range of addons for privacy including: HTTPS Everywhere, NoScript,  AdBlock Plus, DoNotTrackPlus, Collusion …
  • Eclipse & Android Development tools – getting to grips with the basics of Android development in preparation for the Google European Android Camp
  • Telnet – for commend line control of Tor
  • Wireshark – very useful for learning about computer networking, allows you to watch the packets on the network
  • TraceRoute – brillient tools which used TTL to should the route that a packet takes to its destination
  • Ocaml – the main project that I’m working with over the summer makes extensive use of Ocaml

I plan to use this list to visit some of the material that I’ve learned over the last few weeks in preparation for new work next week.