Reliable Connections are Not



Introduction

When is a reliable connection not a reliable connection?

The answer is - when you really truly need it to be reliable. Connections provided by standard software are reliable up to a point. If you don't care, they are reliable enough. If you do care, they aren't as reliable as you want.

In assessing the reliability of a connection, we all point to TCP's contract and how it uses the theory of networking to guarantee delivery. As we shall see, this is a fallacy of extension, and for reliable applications - let's call that reliability engineering - the guarantee is not adequate.

And it is specifically not adequate for financial cryptography. For this reason, most reliable systems are written to use datagrams, in one sense or another. Capabilities systems are generally done this way, and all cash systems are done this way, eventually. Oddly enough, HTTP was designed around datagrams, and one of the great design flaws in later implementations was the persistent mashing of the clean request-response paradigm of datagrams onto connections.

To summarise the case for connections being reliable and guaranteed, recall that TCP has resends and checksums. It has a window that monitors packet reception, orders the packets, and delivers a stream guaranteed to be correctly ordered, with no repeats, where what you got is what was sent.

Sounds pretty good. Unfortunately, we cannot rely on it. Let's document the reasons why the promise falls short. Whether this affects you in practice depends, as we started out, on whether you really truly need reliability.

Specific Failure Modes in Reliable Connections

1. Access To Approximations

Much of the assumption behind TCP is that TCP delivers X and you get X. This is unfortunately only an approximation, due to multiple layers that may or may not impact your use of TCP. In practice, what you get are connection or pipe semantics, and these may only approximate TCP's.

Here are some specific layer issues.

1.a Access to TCP

You the programmer do not have access to TCP. What you have access to is sockets. Do these pass the semantics of TCP through to you or not? That's an open question which I've never really looked into, but it is doubtful that you could guarantee the claim, simply because we do not live in a world where we prove every protocol stack and every socket layer correct or equivalent.

1.b Access to Sockets

You as an applications programmer probably do not even have access to sockets. Applications programmers generally work with libraries ("frameworks") and languages (Java, PHP, Perl, Python, .NET...) rather than direct OS APIs. Most application libraries will set up sockets to be passed through as high-level constructs. All languages that do not compile down to machine code provide higher-layer methods to access system calls.

What do these high-level sockets do? Probably much the same as the OS sockets. There's the rub - probably. Who's got time to go read that code to find out? In some cases you can't read that code, and some languages (notably Java and Microsoft equivalents, where open source does not keep them honest) are particularly ropey when it comes to inserting trickery into their layers in the hope that nobody notices.
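
To make the layering concrete, here is a minimal sketch in Python (the URL is just a placeholder): the application-level call exposes neither the socket nor any of TCP's machinery, so whatever the library does underneath - pooling, retries, proxies - happens out of sight.

    import urllib.request

    # A typical application-level request.  There is no socket, no TCP
    # window, no acknowledgement visible anywhere -- just a library call.
    # The socket is buried two or three layers down, behaving however the
    # library's authors decided it should.
    with urllib.request.urlopen("https://example.com/") as response:   # placeholder URL
        body = response.read()

    # All we can say is that the library handed us some bytes.  What the
    # peer believes it sent, and what happened in between, is out of view.
    print(f"received {len(body)} bytes")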

2. Incomplete Guarantee

The guarantee is not complete. If I open a connection and write X data and then the connection drops, I do not have a guarantee that it got there - any, all or none! Consider that for TCP, if the amount of data is less than a full window length and the connection closes without a positive acknowledgement, then the sender is SOL ("somewhat out of luck").

The receiver has a guarantee that what was received was good (subject to the other constraints here). But the *sender* has no such guarantee. Specifically, the only guarantee the sender has is when a good close happens. And as we shall see, that's not even reliable.

Consider opening a connection and then writing a megabyte buffer. The app parcels up the buffer and passes it to the OS. The OS parcels up the buffer, potentially slices it, and passes it to the stack. The stack parcels up the buffer(s) and starts sending over TCP [1]. TCP parcels up the buffers and starts sending IP datagrams. Down deep over the network, all those little IP packets get passed across and chopped up and re-assembled and mixed up and.... then when they arrive at the other end, the process happens in reverse. But how the buffers are eventually reassembled and passed to the receiving application is not defined - excepting that when data is passed, it's the right stuff in the right order.

Then, something happens. Bang! We don't care what the failure is, but imagine any failure. The failure wends its way back to your stack, is turned into an exceptional close, and the original Megabyte call gets bounced up all these layers to your application and Bang! Your connection just failed. Got closed.

How much got to the other side? You have no idea. Some software will tell you that you sent X bytes, and then when you go to send more, it says the connection is closed. All software? We don't know. If you're unsure on this point, recall that proxies sit in the middle. What buffer size and block-writing optimisation settings are configured in your proxy?

The sender has no reliable way at the application layer to know how much data has been sent, partly because any failure may or may not overtake any acknowledgements, but also because, fundamentally, there is no way for all those layers to pass back the reliability information of how many bytes got through, except in the narrow case of "I want this again." That is, the implementation of guaranteed delivery is oriented entirely to the layer's needs and ultimately the receiver's needs, but not to the sender's needs.

The receiver gets a better guarantee anyway, because whatever data he gets is exactly what was sent. The sender has nothing until a good return comes back. Her guarantee is only as good as the last return. And even then, we can see that proxies could muck that up.
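
To see the sender's dilemma concretely, here is a minimal sketch in Python (the peer's hostname and port are made up for illustration): sendall() returning only means the local stack accepted the bytes, and the exception that eventually reports the failure carries no count of how much actually arrived.

    import socket

    data = b"x" * (1024 * 1024)      # one megabyte of application data

    # Hypothetical peer, for illustration only.
    sock = socket.create_connection(("peer.example.com", 9000))
    try:
        sock.sendall(data)           # returns once the *local* stack has taken
                                     # the bytes; it says nothing about delivery
        sock.sendall(b"more")        # a later write may be the first call to
                                     # notice that the connection has died
    except OSError:
        # The connection failed somewhere along the way.  How much of `data`
        # reached the application at the far end?  Any, all or none -- the
        # API cannot tell us.
        pass
    finally:
        sock.close()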

3. Tunnelling

As an application programmer, your job is to deliver. You really don't care how it is done, you just want it done. So you don't specify TCP sockets with pure semantics and nothing in between; what you specify is a connection. And you generally get quite flexible about how that connection is delivered.

The canonical case is those bloody firewalls. What does everyone do? Everyone tunnels over HTTP. Why? Because those bloody firewalls stop everything else - at least enough times that it is blackspot #1 for connecting applications.

In fact, we want our connections tunnelled over anything we can get, and more than likely we never see a raw connection. "What, you mean there's no tunnelling in place? How do I deal with that?"

So what are the semantics of HTTP? Or SSH tunnels? Or whatever product you happen to be lumbered with? Are they like TCP? Actually, no. Well, sometimes, yes. It is possible to tunnel, of course - that's what we all do - and enjoy something like a reliable connection, but there are some strange artifacts. For example, with HTTP we have redirects, proxies, HTTPS, connection caching and ...

3.a HTTP resends!

As soon as there is any layer in place, you are slave to variable semantics, as discussed above in #1. Here's one from tunnelling: HTTP has a weird mode whereby, if the client detects that the connection closed without any data being acknowledged, it is permitted to re-open the connection and retry the entire request [2].

Whoops - this means that if the data has indeed been sent the first time and the other end receives a close but the precise state is muffed in some sense, then it is possible to send the data twice. Perfectly. So if you are in some client that uses HTTP to send some important information then you just purchased two puts on the market dropping, launched two missiles, instructed your robot scalpel to slice 2cm not 1cm towards that femoral artery, or whatever...

OK, this is pretty unrealistic isn't it? Well, no, it's just rare and it's outside normal experience. Chances are that much of the software you use won't implement the HTTP resend feature or won't implement it aggressively enough to trip you up.

But Java does! I know this only because I've debugged pathological cases where connection data was reliably being sent twice. And have you ever wondered why those payments web sites instruct you: "Don't click the button twice!"? Well, maybe it is because a user once clicked it twice, or maybe it is because browsers implement the resend, and the website designer was not aware of this subtle difference.

Now, network engineers will just blame the user, or point out how the document clearly states that this should not be done, or those are the wrong circumstances (GET not POST), or that one or other of client or server is non-conformant, or you should be using SOAP or ....

Application programmers will nod politely, move away, and correctly conclude that network engineers are FOS ("full of subtlety?").

There is simply no way for the coder stuck at the top of a shaking, shuddering pyramid of networking layers to know what is correct and what is not ("idempotent ... what the hell does that mean? where can I buy one?"). The only thing you can conclude is that the stuff you are dealing with might just decide to resend your entire connection. Completely. Some documents (RFC2068) say it shouldn't do it more than twice. Gee, thanks guys.

(Evil Footnote. Must work out if REST requires idempotency ...)
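
There is one defence the application can mount itself. Here is a minimal sketch in Python - the X-Request-Id header name and the server-side bookkeeping are assumptions for illustration, not any particular product's API: the client pins one identifier to each logical operation, so a resent request is recognisable as a duplicate rather than a second missile launch.

    import json
    import urllib.request
    import uuid

    def post_payment(url: str, payload: dict) -> bytes:
        # One identifier per logical operation, held constant across retries.
        # A server that remembers identifiers it has already processed can
        # treat any resent request as a no-op instead of a second payment.
        request_id = str(uuid.uuid4())
        request = urllib.request.Request(
            url,
            data=json.dumps(payload).encode("utf-8"),
            headers={
                "Content-Type": "application/json",
                "X-Request-Id": request_id,   # assumed header; the server must honour it
            },
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            return response.read()

The point is not the header; it is that the deduplication happens end to end, above all the layers that might replay the request.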

4. Proxies

Anything in the path that interferes with the data can change it. Consider proxies. What are they doing? Are they changing the data? They can - easily - because the checking that provides the guarantee of the connection semantics is at the TCP level, and with a proxy we have, by definition, two TCP connections with the proxy sitting like an MITM in the middle.

Consider that there are firewalls that play around, redirects on your local machine to get access to port 80, and here's a new and upcoming one: Stateful Packet Inspection. What this does is look inside the packets to see what they're all about. Maybe this is a bad packet? From parsing your packet, it isn't a big step to rewriting it.

Sounds ludicrous? Read on:

5. Router Unreliability

There are some pathological failings with routers. These aren't *directly* strikes against TCP, but what they do is trip up all the other problems that exist with TCP, exacerbating the situation. Only by understanding the flaws in TCP *and* the router is it possible to see why these are such a nuisance.

In general, the cheaper your router(s) the more problems.

5.a DHCP ISPs

DHCP is a common mechanism for connecting without a static IP, by instead allocating an IP dynamically (plus all the other things needed). It is most usually used by ISPs for basic user connections. There are several aspects here: firstly, the original motive of a shortage of IP numbers; secondly, mobile users; and finally, the use of this to discriminate (in the marketing sense) between customers who pay less and those who pay more.

Some ISPs will allocate one static IP according to your account, so that while all the DHCP stuff works, it is in effect a static IP. This is good because it means that when modems and routers drop out for their many reasons, the TCP connections can recover.

Other ISPs drop the DHCP-assigned dynamic IP number and force you to get another. And they rotate them. This is bad (at least for this topic) because when it happens all the TCP connections break - they are no longer talking from the same place, and nobody out there knows where to send the new packets.

A related mode is that on some ISPs, the connections all get closed at some router layer. I'm not sure why this is, but it feels different from the DHCP reallocation.

5.b Game Mode

Routers sold into the commercial market had a special mode called Game Mode [3]. What this mode did (does?) was change any string in the stream that looked like your IP number. Effectively, if you set up your game on your local NAT-protected network (192.168.0.1) while the server out on the Internet saw your IP as some real IP number, the game would not work, because the data was saying one IP number and the network was saying another.

Routers were changing the data dynamically and repairing the checksums so as to trick the software checks. Something around 4GB of data was needed before such a string would roll around by chance, so it wasn't a big deal. The artifact was discovered by peer-to-peer applications used to move huge files (OS installs, MP3s, movies), which found that their end-to-end checking was failing.
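
To see how cheap that trick is, here is a sketch of the arithmetic in Python, using the 16-bit Internet checksum that TCP uses (the toy payload standing in for a segment is an assumption for illustration): a box that rewrites a field simply recomputes a valid checksum, and any check that relies on the checksum alone is none the wiser.

    def inet_checksum(data: bytes) -> int:
        """16-bit ones-complement Internet checksum (RFC 1071), as used by TCP."""
        if len(data) % 2:
            data += b"\x00"
        total = 0
        for i in range(0, len(data), 2):
            total += (data[i] << 8) | data[i + 1]
        while total >> 16:                        # fold carries back into 16 bits
            total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF

    def checks_out(data: bytes, checksum: int) -> bool:
        """All that a checksum-only receiver effectively verifies."""
        return inet_checksum(data) == checksum

    # What the sender put on the wire (toy payload standing in for a segment).
    sent = b"addr=192.168.0.11"
    sent_sum = inet_checksum(sent)

    # What a rewriting middlebox forwards: changed data, freshly repaired checksum.
    forwarded = b"addr=203.0.113.77"
    forwarded_sum = inet_checksum(forwarded)

    assert checks_out(forwarded, forwarded_sum)   # the receiver is satisfied...
    assert forwarded != sent                      # ...but it is not what was sent
    print(hex(sent_sum), "->", hex(forwarded_sum))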

5.c Checksums & CRCs

So just turn off your router's game mode and carry on? Nope. There is a 16-bit checksum in TCP which provides some protection. That's probably sufficient for a reliable link, but it's not OK for a lossy link that doesn't do underlying CRCs. If hundreds of megabytes of data are passed across, eventually enough errors occur to slide an incorrect but correctly-checksummed packet through; one empirical study found that

"After eliminating those errors that the checksum always catches, the data suggests that, on average, between one packet in 10 billion and one packet in a few millions will have an error that goes undetected [4]."

(From Philipp - research this for layer details.) Also, see this detailed investigation into a checksum bug found in the Linux kernel. Rare, but a killer:

"The Linux Kernel has a bug that causes containers that use veth devices for network routing (such as Docker on IPv6, Kubernetes, Google Container Engine, and Mesos) to not check TCP checksums. This results in applications incorrectly receiving corrupt data in a number of situations, such as with bad networking hardware. [5]."

5.d Router Crashes

The Tor faq reports:

Tor servers hold many connections open at once. This is more intensive use than your cable modem (or other home router) would ever get normally. So if there are any bugs or instabilities, they might show up now. If your router/etc keeps crashing, you've got two options. First, you should try to upgrade its firmware. If you need tips on how to do this, ask Google or your cable / router provider, or try the Tor IRC channel. Usually the firmware upgrade will fix it. If it doesn't, you will probably want to get a new (better) router.

Tor uses onion routing, which means (say) 3 nodes passing your connection on from one node to the next and to the next. This means that Tor is more susceptible to the problems described here, in proportion to how many nodes are in the link. That is, 3 layers in the onion means 3 times as many problems, and hence Tor is more than normally sensitive to these issues [6].

6. Secure Protocols

"Connections are an illusion. Secure connections doubly so."
-- Brian Warner, Zooko O'Whielacronx, and Dirk Gently

And of course there is security. That's why we use SSL for HTTP and SSH tunnelling for everything else of course. But, even that's not totally secure. Consider the case of the Great Firewall of China.

6.a The Dreaded MITM Bogeyman

In China, all speech is controlled. The Chinese government has erected a huge firewall (built out of western technology, eagerly supplied by brand-name suppliers, of course) to monitor speech. As a Chinese dissident you can diss all you want, as long as your SSL browsing connects up to their firewall, gets opened up and inspected, and is then forwarded on to wherever you wanted to talk privately to.

It's pretty easy to do this, because the browsers do not defend against a determined MITM (man-in-the-middle). The obvious ploy is to simply get a CA-signed certificate and use that, but there are plenty of other possibilities.

Most westerners would leap to the defence of SSL and say that only happens in China. Well, no, there is some anecdotal suggestion that it is possible more widely than that. Imagine an ISP that decided to datamine your traffic and sell it on. The best traffic is the protected stuff, so how could we get hold of that? Easy. Give every user a disk to install, or simply give them some instructions to insert special certificates into their browsers, so that the ISP's proxy is an acceptable SSL hop. Then, datamine all you want.

Now, I don't know how likely this is. There's enough fact in there to make it a concern. And enough conjecture to make it salacious. But for the wise applications programmer, this is just yet another strike against the reliable connection - SSL and SSH may promise you reliable and secure connections, but there are limits to that promise. If you want absolute reliability, using an SSx tool isn't going to give it to you.

(Notice here that the flaws are fairly evenly spread across all crypto tunnelling tools; although the flaw described above is a TLS one, we could make a similar case for SSH.)

Conclusions

By now, you've either stopped reading, or got this far and are mighty depressed. So what does all this mean?

For reliable applications you have to do it yourself [7]. That means unfortunately layering a TCP-like protocol across the top of TCP. Boring and stupid but that's the price of reliability.

The good news is that this is mostly done anyway. Most applications have forms of end-to-end checking. It might be an application layer checksum, it might be a digital signature, or it might extend out to the grey layer and have the user punch the big bright refresh button when she sees something odd.
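
For the curious, here is a minimal sketch in Python of what such an end-to-end check might look like - the key handling and message framing are illustrative assumptions, not a particular protocol: each message carries a sequence number and a MAC over the payload, and the receiver verifies both before believing anything the "reliable connection" delivered.

    import hashlib
    import hmac
    import struct

    KEY = b"shared secret agreed out of band"    # assumption: key exchanged elsewhere

    def seal(seq: int, payload: bytes) -> bytes:
        """Frame a message: sequence number + payload + MAC over both."""
        body = struct.pack("!Q", seq) + payload
        tag = hmac.new(KEY, body, hashlib.sha256).digest()
        return body + tag

    def open_sealed(frame: bytes, expected_seq: int) -> bytes:
        """Verify the MAC and the sequence number before trusting the payload."""
        body, tag = frame[:-32], frame[-32:]
        if not hmac.compare_digest(hmac.new(KEY, body, hashlib.sha256).digest(), tag):
            raise ValueError("end-to-end check failed: data altered in transit")
        seq, payload = struct.unpack("!Q", body[:8])[0], body[8:]
        if seq != expected_seq:
            raise ValueError("end-to-end check failed: duplicated or missing message")
        return payload

    # Whatever the connection, the tunnel or the proxies did along the way,
    # the receiving application only accepts data that passes its own check.
    frame = seal(42, b"pay 0100 to alice")
    assert open_sealed(frame, expected_seq=42) == b"pay 0100 to alice"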

The important thing to realise is that rarely does an application take the data offered and assume it is perfect. Applications work with bad data delivered by "reliable connections" and offer various strategies to correct it. At least, the apps that made it out of beta do. It is just that the developers do not realise where the problem really lies - in their assumption that using reliable connections means they get reliable data.

Which means what? Stop using connections?

No, not quite. It means that you should use connections for what they deliver - ordered data in a stream fashion, mostly with reliability. If that's good enough for you, then you are fine - and obviously connections and TCP and tunnels and proxies and the rest are all reliable enough for most purposes, else you wouldn't be able to read this page.

But if you really truly need reliability (like we do in financial cryptography) you will probably find yourself adding extra reliability at the higher layers. So consider how that affects the entire application - and make the app's needs drive your use of network protocols, not popular myths about reliability.

References

[1] René Pfeiffer, "TCP and Linux' Pluggable Congestion Control Algorithms," suggests that Linux starts at a 32k window size which may grow to 4 MBytes. (Added after first publication.)

[2] RFC 2068; RFC 2616, 8.2 Message Requirements, 8.1.4 Practical Considerations, 9.1.2 Idempotent Methods.

[3] See "Torrents stop at 99 percent" for some tips.

[4] Jonathan Stone & Craig Partridge, "When The CRC and TCP Checksum Disagree," SIGCOMM 2000. Also, Table 5: inverting High/DORM gives 15 million packets and rising. Also, Stone, Greenwald, Partridge & Hughes, "Performance of Checksums and CRCs over Real Data," IEEE/ACM Transactions on Networking, 1998 (ton98.pdf).

[5] Vijay Pandurangan, "Linux kernel bug delivers corrupt TCP/IP data to Mesos, Kubernetes, Docker containers," 2016.

[6] Thinking aloud here, one wonders if they initiate transparent restarts to overcome connection failures on the nodes, and if this leads to more than normal data problems? Note that the Tor people have a FAQ entry on datagrams that indicates in brief why they haven't done datagrams.

[7] This might also be called the End-to-End Principle, as espoused in: Saltzer, J., Reed, D., and Clark, D.D., "End-to-End Arguments in System Design," ACM Transactions on Computer Systems, 1984.