Thursday, July 10, 2008

UBF and VM opcode design

UBF is a data encoding that allows structured terms (rather like XML) to be sent over the network. It also includes a protocol checking scheme to automatically determine if sequences of typed messages follow a particular protocol.

This blog entry was stimulated by this posting on the erlang mailing list.

One of the basic ideas of UBF was to send programs, not data structures. The programs were for a byte-coded stack machine. So instead of sending data structures between machines we send tiny programs which, when evaluated, create data structures.

Each byte is an opcode for a VM. The net effect of executing a UBF program is to leave a value on the stack.

The trick in UBF was not to start allocating the opcodes in the VM from zero - but to allocate them with loving care.

A common mistake in making byte coded VMs is to allocate the byte codes from zero. If you think about it the byte code for a PLUS operation can only be 43 (why? - easy - this is the ASCII code for "+").
In fact the byte code for PLUS should be 43 in all byte coded VMs - there should be laws that make it a criminal offense for the opcode to be anything other than 43 - thus it is written. There will, of course, be a problem with the opcode for TIMES - if you are familiar with your ASCII codes then you should understand why.
I have no idea where I learned this trick - it seems to be in the folklore of VM design - choose the opcodes so that the binary code is readable (if you can). Unfortunately I didn't know this when I designed the first Erlang VM, but now I know better.

So this way the byte code for start-of-tuple is 123, end-of-tuple is 125 and element-separator is 44 - unsurprisingly "{", "}" and ",". Thus "{...,...,... }" is a program and NOT a bit of syntax.

With this choice of encoding programs become human readable strings which require zero parsing - you just execute the byte codes.
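To make the "program, not syntax" idea concrete, here is a minimal sketch of such a stack machine. It is a toy and not the real UBF decoder: the module name toy_vm, the marker atom and the handling of decimal digits are assumptions made for illustration. Only the three opcodes 123, 125 and 44 come from the text above.

```erlang
-module(toy_vm).
-export([run/1]).

%% Toy byte-coded stack machine (NOT the real UBF decoder).
%% Every byte of the input string is executed as an opcode and
%% the program leaves exactly one value on the stack.
run(Program) when is_list(Program) ->
    [Value] = exec(Program, []),
    Value.

exec([], Stack) ->
    Stack;
%% 123 is "{": start-of-tuple, push a marker
exec([123 | T], Stack) ->
    exec(T, [mark | Stack]);
%% 125 is "}": end-of-tuple, pop back to the marker and build the tuple
exec([125 | T], Stack) ->
    {Items, [mark | Rest]} =
        lists:splitwith(fun(X) -> X =/= mark end, Stack),
    exec(T, [list_to_tuple(lists:reverse(Items)) | Rest]);
%% 44 is ",": element separator, nothing to do
exec([44 | T], Stack) ->
    exec(T, Stack);
%% Assumption for this toy: a run of ASCII digits pushes an integer
exec([D | _] = Program, Stack) when D >= $0, D =< $9 ->
    {Digits, T} =
        lists:splitwith(fun(C) -> C >= $0 andalso C =< $9 end, Program),
    exec(T, [list_to_integer(Digits) | Stack]).
```

Calling toy_vm:run("{1,{22,3},4}") builds the nested tuple directly from the bytes. There is no tokenizer and no parse tree, just one pass over the string.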

Contrast XML, where the data structures are human readable but require parsing. This is why constructing a term from UBF is far faster than constructing it from XML, and why the encoding is far smaller while still being human readable.

Why didn't UBF spread?

If you have something that is almost ok - then lots of people can have great fun arguing over it and polishing it at the edges.

Things which are deeply flawed, and industry standards like XML, can lead to endless discussions - great fun - lots of hot air. Project management can happily preside over "the illusion of work" - wages get paid - everybody is happy. Projects get delayed - project management becomes very happy.

The optimal point is where projects get delayed as much as possible, budget overruns are as large as possible and the project manager is almost, but not quite, sacked. This idea is explored in Putt's Law and the Successful Technocrat - recommended to me by Gilad Bracha - and a great read.

Some things (like Scheme, Pascal, ...) are pretty nearly perfect - thus there is little to do. In fact Pascal was perfect (anybody got a UCSD Pascal emulator and image? - now that was really nice).

Fixing stuff that's broke

Programmers like to have something to do - so our lot in life is to fix flawed things. Most of my time is spent fixing things that should work but are, in fact, broken.

ASN.1 (which got me started on this blog entry) is elegant - but how it has been used is not.

I am currently examining LDAP - LDAP schemas have to be seen to be believed (and yes LDAP schemas are written in ASN.1)

In LDAP schema speak a boolean is a (this is an OID, for those in the know) and is a string ...

I'm glad the LDAP schema designers didn't turn their hand to programming language design. If they had, then

     boolean x,y,z;

might have been

     type x,y,z;

The only thing that is good about LDAP schemas is that they are not XML schemas.

Saturday, June 28, 2008

Itching my programming nerve

Photo: oreillygmt

I've just got back from the first ever commercial Erlang conference. Some 40 talks in two days all related in some way or other to Erlang. It was a chance to meet old friends, make new friends and connect people together in the hope that new synergy effects would arise.

The most exciting thing was the emergence of what I think might be the first killer applications written in Erlang. I might be wrong, but my gut feeling is that what Alexander Reinefeld showed us will be the first killer application in Erlang.

Only a few language nerds are interested in programming languages in their own right. Most people are more interested in what you can do with a programming language than in the underlying language. Thus it was with Ruby. Ruby on Rails was the application that drew developers into Ruby. It made them want to learn Ruby so that they could easily build web applications.

Alexander Reinefeld told us about an Erlang implementation of the Wikipedia that not only has a stunningly beautiful architecture, but which outperforms the existing Wikipedia.

I'll talk about the Wikipedia implementation later in this posting, but it was not the only notable talk; there were many other great talks.

Claes Vikström (Klacke) gave a great lecture which was a mixture of battle stories, history and what he was doing today.

Klacke is the master of the one-line throw-away remark - "and then I implemented a DBMS ... and a web server" - this was the technical stuff, and on the business side:
"...then we started a company and made a whole lot of money..."
At the end of his lecture Klacke said something like:
"... so how come we have this great technology and people are just doing boring things and not writing stock exchanges ... there aren't any killer applications ..."
Just as an aside, Joel Reymont did stir up the Erlang mailing list with his announcement that he wanted to write an open source stock exchange as a publicity stunt, but this petered out. Perhaps later we can resurrect this idea; it would be a bit of fun.

The Wiki

Now for the fun stuff. Alexander Reinefeld (video) answered Klacke's call for action and for a non-boring application as he described how he had implemented the Wikipedia on a new p2p system now called Scalaris.

Here's my version of his story:
  1. They made a peer to peer system based on the Chord algorithm
  2. They added a replication layer using the Paxos algorithm
  3. They added a transaction layer
  4. They injected the Wikipedia
  5. It went faster than the existing Wikipedia
Applied to Wikipedia, Scalaris serves 2,500 transactions per second with just 16 CPUs, which is better than the public Wikipedia.
Alexander is a tall bespectacled academic who usually only turns up at academic conferences. He was very worried at the start of the talk when he was introduced as "Professor Reinefeld"; I think he thought it would frighten people.

The system they described won an IEEE prize for scalable systems and was also presented at the Google conference on scalability. I asked Alexander why publicity about what they had done was so hard to find.
"I'm academic, we usually publish papers," he said.
He'd also said he'd started a company that "wasn't doing very well" (tip to VCs - check this one out and give the guy some help).

So my take on this is that this is one of the sexiest applications I've seen in many a year. I've been waiting for this to happen for a long while. The work is backed by a quadzillion Ph.D.s and is really good, believe me.

On second thoughts don't believe me but check out the video lecture. You can also download the code.


When Alexander had blown my mind, Jan Lehnardt popped up for the next section and blew it even further by presenting CouchDB - am I going mad, or are we seeing the emergence of two killer applications? This cannot be.

Jan Lehnardt has a presentation technique that is a joy to watch - it reminded me of why I love programming.

Jan communicates on two levels simultaneously. His body language oozes enthusiasm - he waves his arms so fast and hops up and down so much that we think that he was either a helicopter in a previous life or that he wants to get a job as a windmill. Words tumble out of his mouth so fast that his tongue often trips over the end of his sentences and falls flat into the middle of his next sentence, conveniently missing out the middle of the last sentence.

Check out the video and you'll see what I mean - you'll see Jan almost taking off as he impersonates a helicopter. For more information see the slides of the talk. You can download CouchDB from the Apache incubator site.

This style of lecturing is amazing. Jan communicates simultaneously on two entirely different levels. His enthusiasm is received by the amygdala in the limbic system and his slides go via your eyes to the pre-frontal cortex for analytic processing.

So what was this great stuff that Jan was so enthusiastic about?

CouchDB is an Erlang application that turns a key-value JSON store into a system with a RESTful interface that stores arbitrary data structures in a way that fits nicely with the Erlang system. When I got home I downloaded CouchDB and took a look, and there it was, nicely packaged with a MochiWeb server, the same server which is used by Facebook for their Comet web chat system.

Like most good ideas, CouchDB is deceptively simple. Once you've seen it you think - yes that's how it should be, how simple, how beautiful. But designing simple things is not easy; it requires many false starts and takes a long time to get right. Hats off to Damien Katz for the initial design and to his collaborators, and thanks Jan for telling us about it.


Now it just so happens that Jan Lehnardt and Alexander Reinefeld both live in Berlin, both are working with key-value stores (the details vary) which are programmed in Erlang, both of them are working on what might be the next killer Erlang application and ... they had never met.

I introduced them and then stood back. Wow.

After a moment of shyness they changed to speaking German and Jan started bouncing up and down and speaking even faster than in English - this was getting dangerous - this time Jan did turn into a helicopter and narrowly missed causing an accident as he flew out of the room.

Going home

Klacke and I took the same flight back to Sweden the next day.

"That itched my programming nerve," said Klacke

"Precisely ..."

Tuesday, June 24, 2008

Invasion of Privacy

On 18 June the Swedish Parliament passed a law giving sweeping new powers to the FRA (Swedish Defense Radio Establishment) allowing them to wiretap people in Sweden through phone conversations, email, text messages and more.

All people in Sweden using electronic communication can have their communication monitored despite the fact that they are not suspected of committing any crime.

In my view this is in direct contravention of article 12 of the UN declaration on human rights to which Sweden is a signatory.

Article 12.

No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honour and reputation. Everyone has the right to the protection of the law against such interference or attacks.

This shameful law applies to me - if you send me email (I get a lot of emails from readers of this blog and of my books) then you should be aware of the fact that your mails are not private but will be read by FRA employees - so your rights will be violated.

The Swedish government is sensitive to foreign opinion, so if you wish to protest about the fact that your privacy would be violated if you mail me then I suggest you send mail to one of the leaders of the four political parties (Maud Olofsson, Fredrik Reinfeldt, Jan Björklund, Göran Hägglund) that voted this law through. They can be contacted through The Prime Minister and Ministers.

I urge you to make your opinions known.

My daughter asked me:

Does this mean they will read my MSN chats?
I said "yes".
She said "that sucks"

I get pretty pissed off when they confiscate my water bottle when I have to travel by air, but I guess I can live with this. Reading my email, and the email of my wife and kids, is a different matter. I would never read my wife's or kids' email - such behavior is totally unacceptable in a civilized society.

All this is done in the name of saving us from terrorism - after 9/11 western politicians promised that they would not allow acts of terrorism to change our way of life. Well, spying on 9 million people who have not broken the law is a funny way of "not changing our way of life."

This really pisses me off.

/Joe Armstrong

[In other areas Sweden is a pretty decent place to live, with decent human values, but this new legislation is totally unacceptable]

[Note also - all mail to the erlang mailing list will be monitored by FRA - if this upsets you then please mail the people responsible (see above). If this worries enough people we can move the list to a country that respects human rights]

Monday, May 26, 2008

The Road we didn't go down

I've been following an interesting discussion on the Erlang mailing list where Steve Vinoski and friends have been telling us what's wrong with RPC.

The discussion started on 22 May; the general topic of conversation was the announcement that Facebook had deployed a chat server written in Erlang.

In one of the posts Steve said:
"What all those years of CORBA taught me, BTW, is that RPC, for a number of reasons, is generally A Really Bad Idea. Call it a hard-won lesson. The Erlang flavor of RPC is great because the entire Erlang system has distribution fundamentally designed and built into it, but for normal languages, RPC creates more problems than it solves."

-- Steve Vinoski
Later posts asked Steve to elaborate on this.

Steve posted a long and brilliant summary of the problems with RPC to the Erlang mailing list:
"But if you don't have the time or energy, the fundamental problem is that RPC tries to make a distributed invocation look like a local one. This can't work because the failure modes in distributed systems are quite different from those in local systems, ..."
more ...

-- Steve Vinoski
Precisely - yes yes yes. As I read this my brain shouted YES YES YES - thank you Steve. Steve wrote more about this in RPC under fire ...

This is the road we didn't go down

Steve went down this road and saw what was there and saw that it stunk, but he came back alive and could tell us what he had seen.

The fundamental problem with taking a remote operation and wrapping it up so that it looks like a local operation is that the failure modes of local and remote operations are completely different.

If that's not bad enough, the performance aspects are also completely different. A local operation that takes a few microseconds, when performed through an RPC, can suddenly take milliseconds.

If programmers cannot tell the difference between local and remote calls then it will be impossible to write efficient code. Badly placed RPCs in the middle of some mess of software can (and do) destroy performance.
I have personally witnessed the failure of several large projects precisely because the distinction between local and remote procedure calls was unclear.
Note that this factor becomes even worse in large projects with dozens of programmers involved. If the team is small there is a chance that the participants know which calls are local and which calls are remote.

How do we do things in the Erlang world?

All Erlang programs are composed from sets of parallel processes; these processes can create other processes and send and receive messages. Doing so is easy and is a lightweight operation.

Processes can be linked together for the purposes of error handling. If A is linked to B and A fails, then B will be sent an error signal, and vice versa. The link mechanism is completely orthogonal to the message send/receive mechanism.
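As a small illustration of the link mechanism, here is a sketch in which one process links to another that fails at once. The module name link_demo and the exit reason boom are made up for the example. The process traps exits so that the error signal arrives as an ordinary message instead of terminating it too.

```erlang
-module(link_demo).
-export([run/0]).

%% Sketch: observe the error signal delivered over a link.
run() ->
    %% Trap exits so the signal is converted into a message;
    %% without this the calling process would die as well.
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(fun() -> exit(boom) end),
    Result =
        receive
            {'EXIT', Pid, Reason} -> {linked_process_died, Reason}
        after 1000 ->
            no_signal
        end,
    %% Restore the caller's previous trap_exit setting
    process_flag(trap_exit, Old),
    Result.
```

Note that nothing in run/0 sends or expects an application message. The signal travels over the link, which is what makes the mechanism orthogonal to send and receive.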

When we are programming distributed systems, various forms of RPC are often extremely useful as programming abstractions, but the exact form of the RPC varies from problem to problem and varies with architecture.

Freezing the exact form of an RPC into a rigid framework and disregarding the error cases is a recipe for disaster.

With send, receive and links the Erlang programmer can easily "roll their own RPC" with custom error handling.

There is no "standard RPC stub generator" in Erlang nor would it be wise for there to be such a generator.

In a lot of applications the simplest possible form of RPC suffices; we can define this as follows:

    rpc(Pid, Request) ->
        Pid ! {self(), Request},
        receive
            {Pid, Response} ->
                Response
        end.

Nothing complicated, this code just sends a message and waits for the reply.
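The server side of this exchange is not shown in the post. A minimal sketch of what it might look like follows, assuming a hypothetical module rpc_demo and a made-up {add, A, B} request. The only thing taken from the text is the convention that requests arrive as {From, Request} and replies go back as {ServerPid, Response}.

```erlang
-module(rpc_demo).
-export([start/0, rpc/2]).

%% Start a server process running the receive loop below
start() ->
    spawn(fun loop/0).

%% Client side: the simplest possible RPC
rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} ->
            Response
    end.

%% Server side: tag every reply with self() so that the client's
%% {Pid, Response} pattern only matches replies from this server
loop() ->
    receive
        {From, {add, A, B}} ->
            From ! {self(), A + B},
            loop();
        {From, _Other} ->
            From ! {self(), {error, unknown_request}},
            loop()
    end.
```

With this in place, Pid = rpc_demo:start() followed by rpc_demo:rpc(Pid, {add, 2, 3}) returns the sum.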

There are many variations on this theme. The simplest RPC waits forever, so if a reply never comes the client hangs. We can fix this by adding a timeout:

    rpc(Pid, Request, Time) ->
        Pid ! {self(), Request},
        receive
            {Pid, Response} ->
                {ok, Response}
        after Time ->
            {error, timeout}
        end.

Suppose we wish an exception to be raised in the client if the remote machine dies in the middle of an RPC, then we define:

    rpc(Pid, Request) ->
        link(Pid),
        Pid ! {self(), Request},
        receive
            {Pid, Response} -> Response
        end.

The addition of the link will ensure that the client terminates if anything goes wrong in the RPC.
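A link terminates the client. If we would rather get an error value back, one possible variation (not from the original post) is to use a monitor instead of a link. The sketch below assumes a hypothetical module rpc_monitor. The calls erlang:monitor/2 and erlang:demonitor/2 are standard BIFs.

```erlang
-module(rpc_monitor).
-export([rpc/3]).

%% RPC that reports a dead server as a value instead of
%% terminating the client.
rpc(Pid, Request, Time) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {self(), Request},
    receive
        {Pid, Response} ->
            %% Drop the monitor and any 'DOWN' message already queued
            erlang:demonitor(Ref, [flush]),
            {ok, Response};
        {'DOWN', Ref, process, Pid, Reason} ->
            %% The server died, or was already dead (Reason = noproc)
            {error, {server_died, Reason}}
    after Time ->
        erlang:demonitor(Ref, [flush]),
        {error, timeout}
    end.
```

This is the same send and receive skeleton as before with a different error policy, which is the point: the policy is chosen by the programmer, not by a framework.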

Suppose we want to "parallelize" two rpcs:

    rpc(Pid1, Pid2, Request) ->
        Pid1 ! Pid2 ! {self(), Request},
        receive
            {Pid1, Response1} ->
                receive
                    {Pid2, Response2} ->
                        {Response1, Response2}
                end
        end.

(don't worry this does work, the order of the replies is irrelevant)
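The same idea extends from two servers to any number. The sketch below is an illustration under assumptions (the module name rpc_multi is hypothetical): it sends the request to every process first and then collects the replies. Because each receive pattern names the Pid it is waiting for, a reply that arrives early simply waits in the mailbox until its turn.

```erlang
-module(rpc_multi).
-export([rpc/2]).

%% Send Request to all Pids, then gather the replies in the
%% same order as Pids, regardless of arrival order.
rpc(Pids, Request) ->
    Self = self(),
    lists:foreach(fun(Pid) -> Pid ! {Self, Request} end, Pids),
    [receive {Pid, Response} -> Response end || Pid <- Pids].
```

The total waiting time is roughly that of the slowest server rather than the sum of all of them, since all the requests are in flight before the first receive.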

The point I am trying to make through a number of small examples is that the level of granularity in the RPC AND the error characteristics are under the precise control of the programmer.

If it turns out that these RPC abstractions do not do exactly what we want then we can easily code our solution with raw processes and messages.

So, for example, going from a message sequence diagram to some Erlang code is a trivial programming exercise.

"Standard" RPC also makes the following crazy assumption - "that the reply should go back to the client".

Interactions of the form tell X to do Y then send the result to Z are impossible to express in a standard RPC framework (like SOAP) but are simple in Erlang:

    rpc(tell, X, toDo, Y, replyTo, Z) ->
        X ! {Z, Y}.

(This assumes the convention I used earlier of always sending two-tuples as messages, with the Pid of the process that is expecting a reply as the first element of the tuple. By using self() in the earlier examples we forced the reply to come back to the originator.)
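Here is a small runnable sketch of this three-party interaction. The module name forward_demo, the {square, N} request and the collector process are all made up for the illustration. Only the rpc/6 function and the two-tuple convention come from the text.

```erlang
-module(forward_demo).
-export([run/0]).

%% Tell X to do Y and send the result to Z
rpc(tell, X, toDo, Y, replyTo, Z) ->
    X ! {Z, Y}.

%% X: does the work and replies to whoever is named in the
%% first element of the message, not necessarily the sender
worker() ->
    receive
        {ReplyTo, {square, N}} ->
            ReplyTo ! {self(), N * N}
    end.

%% Z: receives the result and passes it on to Parent so that
%% run/0 can return it
collector(Parent) ->
    receive
        {_Worker, Result} ->
            Parent ! {collected, Result}
    end.

run() ->
    Parent = self(),
    X = spawn(fun worker/0),
    Z = spawn(fun() -> collector(Parent) end),
    rpc(tell, X, toDo, {square, 7}, replyTo, Z),
    receive
        {collected, Result} -> Result
    after 1000 ->
        timeout
    end.
```

The worker never learns who originated the request. It just replies to the Pid it was given.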

Let's suppose we want to add versioning to our protocols, this is easy:

    rpc(Pid, Request, Vsn) ->
        Pid ! {self(), vsn, Vsn, Request},
        receive {Pid, Response} -> Response end.

The point here is to show that things like versioning, error handling, parallelisation etc. are easily added if we expose the interface between messaging and function calls and allow the user to custom build their own forms of interactions with remote code.
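On the server side, versioning then becomes a matter of pattern matching on the version field. The following is a sketch under assumptions: the module name vsn_demo, the {double, N} request and the difference between the two versions are invented for the example.

```erlang
-module(vsn_demo).
-export([start/0, rpc/3]).

start() ->
    spawn(fun loop/0).

%% Client side: the versioned RPC
rpc(Pid, Request, Vsn) ->
    Pid ! {self(), vsn, Vsn, Request},
    receive {Pid, Response} -> Response end.

%% Server side: one clause per protocol version
loop() ->
    receive
        %% Version 1 replies with a bare value
        {From, vsn, 1, {double, N}} ->
            From ! {self(), 2 * N};
        %% Version 2 wraps the reply in {ok, Value}
        {From, vsn, 2, {double, N}} ->
            From ! {self(), {ok, 2 * N}};
        %% Anything else gets an explicit error, not silence
        {From, vsn, Vsn, _Request} ->
            From ! {self(), {error, {not_understood, Vsn}}}
    end,
    loop().
```

Old and new clients can talk to the same server at the same time, and an unknown version gets an explicit error reply.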

Of course, certain common patterns of interaction between components will emerge - these are what are baked into the OTP libraries.

What is OTP?

OTP is a set of battle tested ways of doing things like RPC in fairly common cases. The OTP methods do not cover all error cases but they do cover the common cases. Often we have to step outside the OTP framework and design our own specialised error and recovery strategies but doing so is easy, since OTP itself is a message driven framework and all we have to do is strip away the stub functions that send and receive the message and replace these with our own custom routines.
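For comparison, here is what one of these common cases looks like with the standard gen_server behaviour. This is a minimal sketch, and the module counter_server and its increment request are made up for the example. The stub function increment/1 hides a call to gen_server:call/3, which sends the request, waits for the reply and fails in the caller if no reply arrives within the timeout.

```erlang
-module(counter_server).
-behaviour(gen_server).

-export([start_link/0, increment/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, 0, []).

%% The stub: an RPC with a 5000 ms timeout baked in
increment(Pid) ->
    gen_server:call(Pid, increment, 5000).

%% Callbacks: the state is just the current count
init(Count) ->
    {ok, Count}.

handle_call(increment, _From, Count) ->
    {reply, Count + 1, Count + 1}.

handle_cast(_Msg, Count) ->
    {noreply, Count}.
```

If this error policy does not fit, increment/1 is the function to replace with a hand-written send and receive.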

OTP should be re-branded as "OTP on Rails"; it's really just a framework for building fault tolerant systems.

Does this method of building software without excessive reliance upon one particular flavour of RPC work?

I'd say the answer is Yes and Yes with a vengeance.

This is the way we have built real-time server software at Ericsson for decades. We have used PLEX, EriPascal, Erlang and C++ with Rose-RT for years. The common factor of all of these is the non-reliance on RPC. We specify protocols then we terminate them with a number of different technologies.

These protocols are way more complex than can be specified using RPCs but by exposing the protocols and the failure modes we can make systems that are highly reliable.

I'd always thought that if we did things with RPCs then we'd run into trouble.

Steve went there and did that and found the problems - we went down a different road.

What's really interesting is that Steve's world and our world are starting to collide - we have a lot to learn from each other.