Monday, May 26, 2008

The Road we didn't go down

I've been following an interesting discussion on the Erlang mailing list where Steve Vinoski and friends have been telling us what's wrong with RPC.

The discussion started on 22 May; the general topic of conversation was the announcement that Facebook had deployed a chat server written in Erlang.

In one of the posts Steve said:
"What all those years of CORBA taught me, BTW, is that RPC, for a
number of reasons, is generally A Really Bad Idea. Call it a hard-won lesson. The Erlang flavor of RPC is great because the entire Erlang system has distribution fundamentally designed and built into it, but for normal languages, RPC creates more problems than it solves."
more...

-- Steve Vinoski
Subsequent posts asked Steve to elaborate on this.

Steve posted a long and brilliant summary of the problems with RPC to the Erlang mailing list:
"But if you don't have the time or energy, the fundamental problem is that RPC tries to make a distributed invocation look like a local one.
This can't work because the failure modes in distributed systems are
quite different from those in local systems, ..."
more ...

-- Steve Vinoski
Precisely - yes yes yes. As I read this my brain shouted YES YES YES - thank you Steve. Steve wrote more about this in RPC under fire ...

This is the road we didn't go down

Steve went down this road and saw what was there and saw that it stunk, but he came back alive and could tell us what he had seen.

The fundamental problem with taking a remote operation and wrapping it up so that it looks like a local operation is that the failure modes of local and remote operations are completely different.

If that's not bad enough, the performance aspects are also completely different. A local operation that takes a few microseconds, when performed through an RPC, can suddenly take milliseconds.

If programmers cannot tell the difference between local and remote calls then it will be impossible to write efficient code. Badly placed RPCs in the middle of some mess of software can (and do) destroy performance.
I have personally witnessed the failure of several large projects precisely because the distinction between local and remote procedure calls was unclear.
Note that this factor becomes even worse in large projects with dozens of programmers involved. If the team is small there is a chance that the participants know which calls are local and which calls are remote.

How do we do things in the Erlang world?

All Erlang programs are composed from sets of parallel processes; these processes can create other processes and send and receive messages. Doing so is easy and is a lightweight operation.
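
For readers new to Erlang, here is a minimal sketch of these primitives (the module name and the echo behaviour are purely illustrative):
-module(demo).
-export([start/0]).

start() ->
    Pid = spawn(fun loop/0),       %% create a new process
    Pid ! {self(), hello},         %% send it a message
    receive
        {Pid, Reply} -> Reply      %% wait for the echoed reply
    end.

loop() ->
    receive
        {From, Msg} ->
            From ! {self(), Msg},  %% echo the message back to the sender
            loop()
    end.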

Processes can be linked together for the purposes of error handling. If A is linked to B and A fails, then B will be sent an error signal, and vice versa. The link mechanism is completely orthogonal to the message send/receive mechanism.
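
A minimal sketch of how a process observes the failure of a linked process (by trapping exits, the error signal is delivered as an ordinary message):
process_flag(trap_exit, true),                 %% convert exit signals to messages
Pid = spawn_link(fun() -> exit(crashed) end),  %% linked process that dies at once
receive
    {'EXIT', Pid, Reason} ->
        io:format("linked process died: ~p~n", [Reason])
end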

When we are programming distributed systems, various forms of RPC are often extremely useful as programming abstractions, but the exact form of the RPC varies from problem to problem and with the architecture.

Freezing the exact form of an RPC into a rigid framework and disregarding the error cases is a recipe for disaster.

With send, receive and links the Erlang programmer can easily "roll their own RPC" with custom error handling.

There is no "standard RPC stub generator" in Erlang nor would it be wise for there to be such a generator.

In a lot of applications the simplest possible form of RPC suffices; we can define it as follows:
rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} ->
            Response
    end.
Nothing complicated; this code just sends a message and waits for the reply.
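
For this to work the server must follow the same convention and tag its reply with its own Pid. A minimal server loop might look like this (handle/2 is an assumed application callback):
loop(State) ->
    receive
        {From, Request} ->
            {Response, State1} = handle(Request, State),  %% handle/2 is assumed
            From ! {self(), Response},                    %% reply, tagged with our Pid
            loop(State1)
    end.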

There are many variations on this theme. The simplest RPC waits forever, so if a reply never comes the client hangs. We can fix this by adding a timeout:
rpc(Pid, Request, Time) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} ->
            {ok, Response}
    after Time ->
        {error, timeout}
    end.
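
One subtlety: if the reply arrives after the timeout it stays in the client's mailbox and could be mistaken for the answer to a later call. A common guard against such stale replies is to tag each request with a unique reference (a sketch, assuming the server echoes the tag back):
rpc(Pid, Request, Time) ->
    Ref = make_ref(),                 %% unique tag for this particular call
    Pid ! {self(), Ref, Request},
    receive
        {Pid, Ref, Response} ->       %% only matches a reply carrying our tag
            {ok, Response}
    after Time ->
        {error, timeout}
    end.
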
Suppose we wish an exception to be raised in the client if the remote machine dies in the middle of an RPC; then we define:
rpc(Pid, Request) ->
    link(Pid),
    Pid ! {self(), Request},
    receive
        {Pid, Response} ->
            Response
    end.
The addition of the link will ensure that the client terminates if anything goes wrong in the RPC.
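
If we would rather turn the failure into an ordinary return value than crash the client, we can trap exits (a sketch; note that trap_exit affects the whole calling process):
rpc(Pid, Request) ->
    process_flag(trap_exit, true),    %% exit signals now arrive as messages
    link(Pid),
    Pid ! {self(), Request},
    receive
        {Pid, Response} ->
            unlink(Pid),
            {ok, Response};
        {'EXIT', Pid, Reason} ->      %% the server died during the call
            {error, Reason}
    end.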

Suppose we want to "parallelize" two RPCs:
rpc(Pid1, Pid2, Request) ->
    Pid1 ! Pid2 ! {self(), Request},
    receive
        {Pid1, Response1} ->
            receive
                {Pid2, Response2} ->
                    {Response1, Response2}
            end
    end.
(don't worry, this does work; thanks to selective receive, the order in which the replies arrive is irrelevant)
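
The same trick generalises to any number of servers. A sketch that broadcasts a request and collects the replies in the order the Pids were given:
multi_rpc(Pids, Request) ->
    [Pid ! {self(), Request} || Pid <- Pids],     %% send all requests first
    [receive {Pid, Response} -> Response end      %% then collect each reply
        || Pid <- Pids].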

The point I am trying to make through a number of small examples is that the granularity of the RPC AND its error characteristics are under the precise control of the programmer.

If it turns out that these RPC abstractions do not do exactly what we want then we can easily code our solution with raw processes and messages.

So, for example, going from a message sequence diagram to some Erlang code is a trivial programming exercise.

"Standard" RPC also make the following crazy assumption - "that the reply should go back to the client".

Interactions of the form "tell X to do Y, then send the result to Z" are impossible to express in a standard RPC framework (like SOAP) but are simple in Erlang:
rpc(tell, X, toDo, Y, replyTo, Z) ->
    X ! {Z, Y}.
(This assumes the convention used earlier of always sending two-tuples as messages, with the Pid of the process that expects the reply as the first element of the tuple; by using self() in the earlier examples we forced the reply to come back to the originator.)
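
With a server loop like the one sketched earlier, X tags its reply with its own Pid and sends it straight to Z, so Z (assuming it knows X's Pid) just waits:
%% at Z: pick up the result that X was told to send here
receive
    {X, Result} -> Result
end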

Let's suppose we want to add versioning to our protocols; this is easy:

rpc(Pid, Request, Vsn) ->
    Pid ! {self(), vsn, Vsn, Request},
    receive
        ...
    end.
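
One plausible way to fill in the elided receive clause (the message shapes here are illustrative, not a fixed protocol):
rpc(Pid, Request, Vsn) ->
    Pid ! {self(), vsn, Vsn, Request},
    receive
        {Pid, vsn, Vsn, Response} ->  %% the server answered in our version
            {ok, Response};
        {Pid, badvsn} ->              %% the server rejected this version
            {error, badvsn}
    end.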

The point here is to show that things like versioning, error handling, parallelisation etc. are easily added if we expose the interface between messaging and function calls and allow users to custom build their own forms of interaction with remote code.

Of course, certain common patterns of interaction between components will emerge - these are what are baked into the OTP libraries.

What is OTP?

OTP is a set of battle-tested ways of doing things like RPC in fairly common cases. The OTP methods do not cover all error cases, but they do cover the common ones. Often we have to step outside the OTP framework and design our own specialised error and recovery strategies, but doing so is easy: since OTP itself is a message-driven framework, all we have to do is strip away the stub functions that send and receive the messages and replace them with our own custom routines.
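
For example, the hand-rolled rpc/2 and rpc/3 above correspond roughly to OTP's gen_server:call/2 and gen_server:call/3, which package the send, the selective receive, the timeout and the error handling into one library call:
Reply = gen_server:call(Pid, Request),         %% default 5 second timeout
Reply2 = gen_server:call(Pid, Request, 1000)   %% explicit timeout in milliseconds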

OTP should be re-branded as "OTP on rails" - it's really just a framework for building fault-tolerant systems.

Does this method of building software without excessive reliance upon one particular flavour of RPC work?

I'd say the answer is Yes and Yes with a vengeance.

This is the way we have built real-time server software at Ericsson for decades. We have used PLEX, EriPascal, Erlang and C++ with Rose-RT for years. The common factor of all of these is the non-reliance on RPC. We specify protocols then we terminate them with a number of different technologies.

These protocols are way more complex than can be specified using RPCs, but by exposing the protocols and the failure modes we can make systems that are highly reliable.

I'd always thought that if we did things with RPCs then we'd run into trouble.

Steve went there and did that and found the problems - we went down a different road.

What's really interesting is that Steve's world and our world are starting to collide - we have a lot to learn from each other.

16 comments:

Anonymous said...

Jim Waldo said it all first and most explicitly in his paper
"A Note on Distributed Computing"

http://research.sun.com/techrep/1994/smli_tr-94-29.pdf

Unknown said...

@Anonymous: yes, Jim and crew did a wonderful job on that paper. (I used to work with Jim at Apollo and HP before he went to Sun, and in fact worked with all four authors when HP and Sun were doing joint distributed systems development in the early 90s.) If you had read through the links Joe provided to my postings and articles, you'd have found that I reference Jim's paper liberally and often.

Without trying to take anything away from that paper, however, I must correct you a wee bit: you said that Jim's paper "said it all first," which isn't quite true. For example, there's nothing in that paper that talks about language mappings and impedance mismatch between "network" types and programming language types, nor does it talk about the brittleness of generating code from IDL, nor does it talk about the coupling introduced by such code, etc. That paper says some very important things, which is why I always refer people to it, but again, it doesn't "say it all."

chris said...

I've often wondered how the linking mechanism is implemented. More specifically, how are the signals/notifications implemented? I'm referring to the signals/notifications sent to a client when a remote machine (or service?) dies. I don't understand why the notification mechanism doesn't suffer from the same how-do-I-know-what's-going-on-over-there problems that basic RPCs suffer from in the first place. In other words, has the unreliability just been shuffled off into that linking/signalling/notification layer? No? Why not? What magic makes the notifications inherently more reliable than the RPCs? I understand that the link/signalling mechanism is orthogonal to the messaging mechanism--I'm not asking about the messaging, per se. Just wondering how much one can trust or rely upon the linking mechanism proper.

Thanks,
Chris

Anonymous said...

@Joe - the link to the RPC under fire article is broken.

@Anonymous: yes, Jim's paper is seminal, and it is even more impressive when you recognise he was faulting LAN-based systems, not cross-internet ones. All that Ed and I added in our criticism of JAX-RPC was to point out that it was still a victim of the same flaws: the assumption that the best way to present long-haul communications is by a blocking procedure (now method) call.

What needs to be considered is whether blocking method invocation is the main metaphor used in Java/C# style languages, because that is the only metaphor the language allows. You have objects, you have methods, so everything has to make your communication look like method calls on objects, passing other objects as parameters. So even though everyone has known for a long time that RPC as a metaphor sucks, the underlying languages offer no easy way out.

Thomas Lindgren said...

Chris,

Unless I'm misremembering or am out of date, the Erlang distribution protocol (which handles links between nodes, among other things) basically provides a heartbeat as part of its services.

More complex failure detection is possible. Per Bergqvist did a lot of work on this; see his presentations at the Erlang User Conferences around 2000 or so.

Anonymous said...

One of the things about RPC is that it is verb-based, which unduly couples caller and callee as they must have shared knowledge. A pure message-passing system is not only asynchronous, thus allowing for more parallelism, but it is noun-based - any unknown messages can be dropped and the errors handled at a higher level. This is one of the reasons that RESTful systems are less fragile than SOAP-based ones.

Mark Lee Smith said...

Correct me if I'm wrong, but the argument against RPC boils down to uninformed programmers not being able to tell the difference between local and remote calls?

Are you seriously telling me that the big problem with RPC is bad documentation and poor naming conventions? That sounds a hell of a lot more like a project management problem than a technological one. Maybe the RPC advocates should take this as a compliment.

Here’s a simple solution that addresses these problems: prefix remote calls with ‘remote’.

If we’re done being silly and dismissing whole families of language and technology with no real justification, what am I missing here? I’d love to start thinking highly of the Erlang community again. Please don’t take this the wrong way, I’ve had my fill of preachy Erlang users for now.

Mark Lee Smith said...

Oh, before someone jumps in and beats me ragged with my own prose, let me say that I’m quite fond of the actor-based concurrency model seen in languages like Erlang. I'm not fond of the Erlang community as a whole.

Chris said...

I too was going to cite Jim's paper, but I see someone has beaten me to it (and yes, @anonymous, there are liberal references to it from Steve's writings:-)

@thomas, heartbeats == FAIL. There's nothing worse than flooding the network with a bunch of useless heartbeats. One of those ideas that "seems like a good idea at the time"... then someone actually does the math.

@joe, I did want to clear up one misconception about SOAP from your post above. SOAP is _not_ an RPC framework. It is inherently a one-way messaging protocol (read the spec). That said, the way that SOAP has been used, primarily, has been to implement RPCs, and that is a shame. Unfortunately, certain unnamed persons in a position to make such decisions decided that JAXRPC would be the principal means by which the Java community leveraged SOAP.

Cedric said...

So how can I tell a remote process from a local one in Erlang?

Anonymous said...

@netytan: prefixing your method call with 'remote' isn't enough, because you are still blocking. The only way to model 'state' in a chain of blocking RPC calls is to create a private thread for the operation, or implement a state machine. Oh, and then there is the failure handling. Every Java RMI call throws a RemoteException, but you normally ignore that and throw it up, rather than look at the many ways it can go wrong and try and handle it. RPC was a wonderful idea, and I remember the joy of encountering it 20 years ago at CERN while still an undergraduate. It's time to move on.

@Chris. Heartbeats work if done efficiently. Java RMI Remote GC and DCOM are both examples of what not to do, but there is no reason why an entire cluster can't track the presence of other parts of the cluster at the expense of one multicast/node/heartbeat. What is wrong is for every app to do its own liveness protocol, or to have more than one.

Sam Leitch said...

Is this a discussion of the downfall of RPC or a discussion of synchronous vs. asynchronous? There are ways to do RPC asynchronously:

rpc_call(parameters, callback)

The only difference between this and your example is that I've used function parameters and you've used a message. Both examples have to convert to a network message before transmission. Both can account for timeout and loss of connections (through exceptions). The question then becomes "What's the best way to do async procedure calls?"

RPC doesn't have to mean "dumbing down" the interaction. I see it as a way to promote loose coupling and therefore separation of concerns.

RPC leaves the developer to worry about the program, not the interaction between remote computers. The interaction and error handling between remote computers is very important, and just as complicated as the problems that most applications I've written using RPC are trying to solve. However, it's much easier to solve things one problem at a time. So, by moving the interaction to a proxy and using RPC, you've delegated the problem of dealing with connectivity and message translation to the proxy. You're then free to deal with the problem that the application is trying to solve.

The problem of creating a robust, fault-tolerant proxy is not simple by any means. However, by separating the proxy from the application using RPC, the proxy is free to change from a high-impedance format (like most XML) to a low impedance format, local to remote, single-threaded to thread-per-connection or thread pooled, single-processor to multi-processor, etc.

If that eventually means moving to an asynchronous RPC interface, then both sides need to change, but it still promotes the same loose coupling.

Mark Lee Smith said...

@Steve: The problem outlined above was that programmers were accidentally making remote calls thinking they were local, and that this eventually led to the failure of the project.

Prefixing remote calls clearly distinguishes remote and local calls, so no more problem :).

As for the failure modes I'm certain that a good implementation could propagate the appropriate errors, allowing failures to be handled effectively. With the right abstraction there's no reason why the remote call action can't be performed in parallel. This is easier in some languages than others, but you can do it with standard languages.

I'm not a fan of exception handling as it appears in Java and others, but I don't see any inherent problem with RPC. All of the oppositions I've seen to RPC seem to hinge on poor implementation.

Objective-C's distributed object framework is very well designed. I've had no problems with it. Maybe that's because Objective-C is message-based, but I doubt it.

Alexei Polkhanov said...

After working with CORBA for several years I gradually became disappointed with its complexity and poor performance. I remember comments by Michi Henning, who co-authored with Steve Vinoski the now-famous CORBA manual for C++ developers ("Advanced CORBA Programming with C++"). The problem with technologies like CORBA and RPC is that in many cases you cannot ignore network errors, you cannot ignore security, and most of all you cannot ignore latency and bandwidth limitations. Since the "gold years" of CORBA there have been many attempts to create a middleware solution that does not have these problems.
Michi Henning himself is now working on yet another middleware solution, the Internet Communications Engine. But if anyone wants to know my opinion - remember the end-to-end principle? In my opinion even the TCP (and UDP) protocols are redundant and unnecessary layers, because in many cases developers still end up re-implementing many of the services they provide in the application itself. If you think I am insane - ask yourself why you send your own keep-alive and procedure-invocation ACK messages over TCP?
I think I cannot say it better than it has already been written here: "The Rise of the Middle and the Future of End-to-End".

Bwooce said...

@alexei "If you think I am insane - ask yourself why do you send your own keep-alive, and procedure invocation ACK messages over TCP? "

Hmm. I see this approach on some other protocols (notably XMPP, originally specified in terms of a TCP implementation).

The harsh reality is that TCP does not provide application to application reliability, or even (by default) useful heartbeating. The problems are:
1. the loose binding between the asynchronous sending of packets and any failures - it is impossible to determine how much was sent from the host TCP buffer when you get a socket_closed notification. There are OS/platform-specific ways of determining how much data was in the OS-level TCP send buffer, but the problem is mirrored on the recipient host, whose buffer can have just received a packet and acked it.
2. the heartbeating process defaults to "too slow" at 2 hours, and when sped up is still too slow for many applications. It ends up not being particularly useful because of problem #1.

I've seen very little discussion of point 1, although people who deal with TCP all the time do know about these problems. It seems that protocol designers are not always TCP-literate, though. It can be correctly argued that this is a limitation of the BSD socket API rather than the protocol, but this API/approach is a fairly universal implementation for every OS.

Off-topic: I despair when I see XMPP pushed as a generic reliable transport because of these issues. The default impl. over TCP is unreliable when compared with DIAMETER or AMQP, and the suggested response of implementing XMPP over HTTP (over TCP) makes me want to scream.

Anonymous said...

Sorry for a somewhat belated comment, but there is yet another good article: "A Critique of the Remote Procedure Call Paradigm" (Andrew S. Tanenbaum, Robbert van Renesse). I believe it was published in 1988.