Abstract: Analysis of the architecture and message flows of Instant Messaging and Payments systems reveals that they are very similar; more so than would be expected for applications that are nominally separated in users' minds.
This paper examines merging payments and messaging for the Internet.
A funny thing happened when the second generation of the Ricardo payment system was trialled back in 1999. Our users found it convenient to send unrelated messages using the payment system. They would do this by typing some random chat message into the Memo field, and then paying the person for whom the message was intended.
Payments being reliable, the user's message would eventually get picked up by the software client and displayed, as part of the notification process. Once a payment was so received by a user, commonly she would then create a new payment, fill in the memo field and send back a reply.
This cycle became so pervasive that I found that the best way to test the software was to find some external user and hold a series of messaging cycles. In effect, we were conducting a chat session, using a clunky and inconvenient side effect of the payment system.
Not being the fastest of strategic cookies, we let this practice go on for 2 more years, until a discussion arose amongst programmers around integration of other functions into the payments client. It was mostly hypothetical, and exploratory, and went like this:
Manager: I think we need to consider how to improve support for the users... ideally, VOIP in the payments client ...
Programmer: No way.
Manager: Perhaps it would work for other purposes too.
Programmer: Hell will freeze over first.
Manager: Even chat would make our life easier ...
Programmer: This is impossible.
Manager: How does a help desk communicate?
Programmer: Not my problem.
But the seed was sown. What if chat was added to a payments client? In 2002, we took the plunge and commissioned a chat plugin for WebFunds. It was delivered by the end of the year, and trialled in time-honoured fashion by chatting with users.
Unfortunately, our protocol had one missing link: the ability for the server to send to the client. Adding this proved to require a radical rewrite of the protocols, partly because it had been rejected in our theoretical model of reliable communications over an unreliable Internet, but also because our implementations had been lazy about how they dealt with the request-response model. Leaving aside the details, wind forward to 2005 and we now have dynamic movement of messages from server to client (as well as the reverse, of course).
So now we have instant messaging over a payment system. Or is it instant payments over a messaging system? What does that mean, exactly?
In 1983, the New York Federal Reserve building in lower Manhattan was flooded. For a day, all the computer systems that drove the Fedwire system were down, and banks could not upload their tapes.
Only a day or so of time was lost, but the real damage was not discovered until the computer systems were back up. Then, as tapes were fed into the system, it became clear that nobody knew what the status of the system was.
As tapes were fed back in, transactions and transfers that had been done already were being redone, and ones that were missed out remained missed. In Internet terms, the tapes formed a connection over which a stream of data was fed in. On the surface, the system exhibited reliability, as the operator could load the tape, watch it spin to the end, and unload it. Yet, below the surface, this was an unreliable system, as there was no information available after the flood as to how far each tape had spun through its transactions.
The science of networking recognises the coordination problem [EC]. It states that a sender of a message does not know whether her message gets to the recipient, except if the recipient sends the same message back as indication of having received it. This acknowledgement by the recipient then becomes the key to all communications, but in its most basic form it carries a cost: a second copy of the message has to be sent back for every forward message.
When many messages need to be sent, many acknowledgements also are needed. To avoid the cost of these, there are many tricks that computer scientists can play. For example, in place of the entire packet, only a number representing the packet is sent. Or, better yet, only the last packet number is sent. The aggregation of these acknowledgements is done in a construct known as a connection, which is tasked to keep track of the details.
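The last of these tricks, sending back only the last packet number, can be sketched in a few lines. This is a minimal illustration and not any particular protocol; the Receiver class and its field names are invented here for the purpose.

```python
class Receiver:
    """Tracks arriving packets and replies with one cumulative acknowledgement."""

    def __init__(self):
        self.next_expected = 0   # lowest packet number not yet received
        self.buffered = set()    # packets that arrived ahead of a gap

    def receive(self, seq):
        """Accept packet number `seq` and return the cumulative acknowledgement:
        the highest number such that it and all earlier packets have arrived
        (-1 if packet 0 is still missing)."""
        if seq >= self.next_expected:
            self.buffered.add(seq)
        # Advance past every packet that now forms an unbroken run.
        while self.next_expected in self.buffered:
            self.buffered.remove(self.next_expected)
            self.next_expected += 1
        return self.next_expected - 1


receiver = Receiver()
# Packets 0 and 1 arrive in order, then 3 overtakes 2.
acks = [receiver.receive(seq) for seq in (0, 1, 3, 2)]
print(acks)  # [0, 1, 1, 3]
```

A single number stands in for every packet up to that point, which is where the saving comes from. It is also where the trouble starts: the sender learns nothing about packets beyond the number it last heard.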
The connection is of course employed routinely on the Internet, nominally to save bandwidth but more importantly because it reduces the programmer time needed to overcome the vagaries of lost packets.
In the Fed's computer center, each tape played this role, with a single acknowledgement for every entry on the tape (and sent manually back to the initiating bank). The flirtation with streaming had then to be unwound and each event needed to be captured as a discrete message with a format, a checksum and a unique identifier permitting safe resending of that single message.
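As a rough sketch of what such a discrete message might look like, the following builds a transfer with a format, a checksum and a unique identifier, and shows that feeding the same message in twice does no harm. The field names, the use of JSON and SHA-256, and the Ledger class are all assumptions made for illustration; they are not the Fed's format.

```python
import hashlib
import json
import uuid


def make_message(payer, payee, amount):
    """Build one self-contained transfer message with an id and a checksum."""
    body = {"id": str(uuid.uuid4()), "payer": payer, "payee": payee, "amount": amount}
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return {"body": body, "checksum": hashlib.sha256(encoded).hexdigest()}


def is_intact(message):
    """Recompute the checksum over the body and compare it with the one sent."""
    encoded = json.dumps(message["body"], sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest() == message["checksum"]


class Ledger:
    """Hypothetical receiving system that applies each message at most once."""

    def __init__(self):
        self.seen = set()      # identifiers of messages already applied
        self.balances = {}

    def apply(self, message):
        """Return 'applied', 'duplicate' or 'corrupt'."""
        if not is_intact(message):
            return "corrupt"
        body = message["body"]
        if body["id"] in self.seen:
            return "duplicate"   # resent message: acknowledge, do not redo
        self.seen.add(body["id"])
        self.balances[body["payer"]] = self.balances.get(body["payer"], 0) - body["amount"]
        self.balances[body["payee"]] = self.balances.get(body["payee"], 0) + body["amount"]
        return "applied"


ledger = Ledger()
transfer = make_message("bank_a", "bank_b", 100)
results = [ledger.apply(transfer), ledger.apply(transfer)]
print(results)  # ['applied', 'duplicate']
```

With this shape, an operator who is unsure how far a batch got can simply send everything again.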
Since the flood, the Federal Reserve has moved its financial system back from streamed tapes to a messaging or datagram paradigm. In 1985, SWIFT followed the Fed's lead and formalised its communications structures in messages. In a sense, this was a step backwards, as the world's financial systems had already been using messages since the days of the telegram. The availability of proprietary networks and magnetic tapes had promised enormous gains in volume, but only with a loss of status that was recognised fully under the disaster conditions of the 1983 flood.
This message orientation was to be unchallenged until the rise of the World Wide Web another decade later.
What's the web got to do with it?
Early net payment systems were message based [DC]. Yet they were not to survive, partly due to their costs of operation. A cheaper alternative emerged in stable form in 1998 and 1999, being the use of credit cards and the browser payment system [DC]. Use of the browser to secure payment instructions grew quickly, far more quickly than any alternative, because of the convenient access.
The browser client has a problem - it is connection based. Each session is protected by a cryptographic protocol called SSL that provides a secure connection between the client and server. All requests for the web page at hand are aggregated over that connection, and as the connection is fundamentally a stream, we strike the same problem that the Fed and its community of banks struck in 1983, except at an individual level.
The web page method of collecting payment instructions shows this clearly. On credit card collection pages, or payment systems such as Paypal or Goldmoney, pages will often include a warning to the effect of "do not click on this button twice!" The instruction to pay is part of the stream delivered from client to server, and there is no capability within the connection-based system to reliably deliver the message once and once only [DC].
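The footnote suggests the fix is to carry a unique Id in a hidden attribute on the page. A minimal sketch of that idea follows, assuming a hypothetical PaymentPage object on the server side and a hidden field named payment_id; neither is taken from any real system.

```python
import secrets


class PaymentPage:
    """Hypothetical server-side handler that tolerates a double click."""

    def __init__(self):
        self.issued = set()    # ids handed out with a rendered form, not yet used
        self.completed = {}    # id -> receipt for instructions already carried out
        self.payments = []     # payments actually performed

    def render_form(self):
        """Return the hidden field for a fresh form, plus the id inside it."""
        token = secrets.token_hex(16)
        self.issued.add(token)
        field = '<input type="hidden" name="payment_id" value="%s">' % token
        return field, token

    def submit(self, token, payee, amount):
        """Carry out the instruction once; repeat submissions get the same receipt."""
        if token in self.completed:
            return self.completed[token]   # second click: report, do not repay
        if token not in self.issued:
            raise ValueError("unknown payment id")
        self.issued.discard(token)
        receipt = {"payment_id": token, "payee": payee, "amount": amount}
        self.payments.append(receipt)
        self.completed[token] = receipt
        return receipt


page = PaymentPage()
field, token = page.render_form()
first = page.submit(token, "merchant", 25)
second = page.submit(token, "merchant", 25)   # the dreaded second click
print(len(page.payments))  # 1
```

The button can then be clicked as often as the user likes, and the warning on the page is no longer needed.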
This lack of a message-based architecture will come back later to haunt the browser payment system, but not in the sense of reliability of payments. What it did do was raise support costs on the payments to some non-trivial fraction.
Consider the costs of a payment system. The major cost driver for payment systems is support. It is not the cost of the payment that matters, but the cost of the failure. Each failure involves human time, which is charged at dollars to the minute. Each success involves a few numbers moving around in a database, or at the extreme, an email being sent out and some page updates.
Support for payment systems on the web started out simply by listing the telephone number. Soon, however, this was superseded by email support and, with some degree of success, chat-based support. The benefit of email was realised by the basic assumption that if a user has a web-based payments account, then she has an email account. And is quite comfortable using it!
Instant messaging, or chat, is possibly a superior method for support. One skilled operator can conduct 3 or 4 chat sessions at once, which is better than telephone! Yet, information can be built up efficiently based on feedback, making it much faster than email.
Which brings us full circle to where we started. It has been essential since the dawn of Internet time to supplement the payment system with a message-based communication system. At the fundamental level, a message needs to be sent to the receiving party. At the technical level, messages are needed to deal with reliability problems inherent in the net, and now we see that messages are needed to support the user.
There are then potential benefits to be had from adding messaging facilities to a payment system. But before looking at the possibilities, perhaps we can flip the statement around: are there potential benefits to be had from adding a payment facility to an instant messaging system?
Economics has it that a payment is part of an exchange of value. The Austrian school has extended this view to say that an exchange of value results in information passing through the economy about relative needs and wants.
In the Austrian sense, each payment is a message, but it is a message between actors in the wider economy, rather than a smiley between two chat lovers. Looking further, at what is exchanged, the goods for the payment, there are virtual goods that can be exchanged for money: music is one popular one at the moment.
What is more interesting is not the good that is being traded but the process of the trade itself. Some trades are simple - get a price, say yes or no. Others are more complex. Consider dealing with a travel agent in order to book a trip. By the time the deal is sewn up, some 100 discrete messages might have shuffled back and forth between the parties.
Hence, for that sort of merchant and trade, there needs to exist an information passing environment. This negotiation process will benefit far more from an efficient messaging system than from an efficient payment system, although both need to work in the end. It should then be no surprise that complex deal making is now being experimented with on instant messaging systems.
All that might then be needed to sew up the deal is to complete the payment. Adding the payment to the chat client would then close the deal efficiently.
It seems there is a case for putting either feature into the other's facility. We should add payments to instant messaging and we should add IM to payments. Can we go further? Can we, indeed, go so far as to make the payment a message, and the message a payment?
If we look at the structure of an Internet payment system and an Instant Messaging system, we see some similarity. Both have a single central server, and both have clients that talk to that server [p2p]. Likewise, the messaging that logically flows is similar. The initiating client sends a message to the server, the server records it for the receiving client, and notifies the receiver if he is online. Confirmation in both cases is primarily handled by the server.
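The shared flow can be sketched as one small server that does not care whether the data it carries is a chat line or a payment. The Server class, its method names and the shape of the message are all assumptions made for this illustration, not a description of Ricardo or of any IM system.

```python
from collections import defaultdict


class Server:
    """Hypothetical central server: record, then notify the receiver if online."""

    def __init__(self):
        self.log = []                      # every message the server has recorded
        self.pending = defaultdict(list)   # receiver -> messages awaiting delivery
        self.online = {}                   # receiver -> callback used to notify

    def connect(self, name, notify):
        """Mark a client online and hand over whatever was held for it."""
        self.online[name] = notify
        for message in self.pending.pop(name, []):
            notify(message)

    def disconnect(self, name):
        self.online.pop(name, None)

    def send(self, source, target, data):
        """Record a message for `target` and return the server's confirmation."""
        message = {"source": source, "target": target, "data": data}
        self.log.append(message)
        if target in self.online:
            self.online[target](message)           # receiver is online: notify now
        else:
            self.pending[target].append(message)   # otherwise hold it for later
        return "confirmed"


server = Server()
inbox = []
server.send("alice", "bob", {"memo": "hello"})                  # bob is offline
server.connect("bob", inbox.append)                             # held message arrives
server.send("alice", "bob", {"memo": "lunch", "amount": 10})    # bob is online
print(len(inbox))  # 2
```

The only difference between the two sends is the presence of an amount in the data; the routing, recording and confirmation are the same.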
At the data level, again there are similarities. A target and a source. Routing information, and some data attached. The amount
One would like to claim that a payment is more reliable and more important than a chat message, but on comparison the chat message comes out more favourably, as does email. Payments simply aren't that reliable, and would do well to take the advice of Internet engineers in some respects.
This is the age old battle that computer scientists know as the battle between two paradigms: connection-oriented and message-oriented communications. The connection-oriented or streaming approach adds reliability to overcome the lossiness of messages, but it does so at the expense of edge effects. When the connection goes down, it is impossible to tell how much additional data went through, and at the extreme, when all data is sent in a window large enough to cover the entire message, the connection does not say whether the data arrived zero, one or many times.
Using connections in programming is ultimately an expression of cost-effective, reasonable reliability when the result is not so important. Such a decision works well for a web page, for example, where reloading can be finessed by the user, and seeing the same web page twice does not cause a problem.
Yet for payments this could be a disaster, and connections are not advised.
Internet payments architects know this. Payments based on digital cash concepts (DigiCash) or nymous concepts (Ricardo) use message based packets, with unique identifying numbers. The messages are sent 0, 1 or many times and acknowledgements are returned as many times as needed. The result is deterministic, as the client repeats sending until the single action is done.
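A small sketch of that repeat-until-done loop follows. The Issuer class, the request identifier and the scripted network losses are hypothetical stand-ins chosen for illustration; they are not the DigiCash or Ricardo protocols.

```python
import itertools


class Issuer:
    """Server side: performs each numbered request once, acknowledges every copy."""

    def __init__(self):
        self.done = {}      # request id -> acknowledgement already produced
        self.executed = 0   # how many payments were actually performed

    def handle(self, request_id, amount):
        if request_id not in self.done:
            self.executed += 1
            self.done[request_id] = ("ack", request_id, amount)
        return self.done[request_id]


def send_until_acked(issuer, request_id, amount, losses, max_tries=20):
    """Resend one request until an acknowledgement gets back to the client.

    `losses` yields (request_lost, ack_lost) pairs and stands in for the network.
    """
    for attempt in range(1, max_tries + 1):
        request_lost, ack_lost = next(losses)
        if request_lost:
            continue      # this copy of the request never arrived
        ack = issuer.handle(request_id, amount)
        if ack_lost:
            continue      # performed, but the client cannot know that yet
        return ack, attempt
    raise TimeoutError("no acknowledgement received")


issuer = Issuer()
# Scripted network: first the request is lost, then an acknowledgement, then all is well.
network = itertools.chain([(True, False), (False, True)], itertools.repeat((False, False)))
ack, attempts = send_until_acked(issuer, "pay-0001", 50, network)
print(attempts, issuer.executed)  # 3 1
```

The request was sent three times and arrived twice, yet the payment was performed once. The client never has to guess what happened; it just keeps sending until it hears back.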
Consider phishing. The most prevalent form is an attack on the web browser's security model. This is simply because the web browser is the most likely software tool for the user to access their payments facilities.
It breaches the security model by introducing a spoof URL into the browser. That URL takes the user to a site which is a copy of her real banking site. There are many variations but we stick here to the canonical phishing attack.
The primary defence against phishing would be the browser's security model, which has in it the capability to detect that the site is unknown to the user [ssl]. Most attention however has been focused on the secondary defence, which is the mechanism of transmission of the phish. This is most commonly email, but instant messaging and even cellular phones and pagers have been employed.
The attention to the communications mechanism is misguided due to the difficulty of ensuring any related security requirements. Indeed, early efforts have handed phishing a bonus. Domain Keys, as predicted, enabled phishers to better identify targets and to better target attacks. Phisher takeup was far in excess of user takeup, almost guaranteeing a failure.
Need to check out how this all panned out.
This secondary defence is weak because of one issue: There is a dramatic distance between the payment mechanism and the message mechanism. That is, in security terms, even if the mail is secured (whatever that may mean) how is it possible to relate that security to the security requirements of the payment?
Yet, if the problem is recast as the message system being the payment system, that distance shrinks to zero. When the message and the payment have the same security characteristics, then a breach in the message system is a breach in the payment system. By a process of elimination (e.g., bankruptcy) we can now assume that a message sent by the payment system vendor to the user is correct. Links can be clicked on safely again.
What does this mean for payment systems? A lot depends on how viable a payment system is when done in a p2p framework. Assuming that this is viable, existing payment systems will face several hurdles.
Firstly, speed. Can the payment be settled as fast as an instant message between two parties? If not, consider rewriting.
Secondly, access. Can the payment system be made available to two cooperating parties over the Internet? Systems such as Paypal and Goldmoney exhibit these characteristics, but those of banks do not. Where a system is in terms of its availability for dynamic access will say a lot.
Thirdly, reliability. Do the messaging systems that are fielded exhibit sufficient reliability for Internet service? This is a subtle point. Most payment systems are unreliable at the technical level, and rely on customer support layers to clean up the failures. This will not work on the Internet, simply because the volume of payments is expected to rise dramatically. One system shows a velocity of money as about 10 times that of conventional systems.
Systems that are unreliable include Paypal and most banking operations. Systems that are reliable - hard money - include Goldmoney and other gold systems [May].
Similar questions can be asked about P2P systems, although the emphasis is different.
Firstly, are the basic messaging protocols reliable enough for money? Traditionally, p2p systems have not had to face the issue of a transactionally secure system, as a lost message is no big deal.
Secondly, are the systems secure? Nobody steals a chat message, but that all changes when there is money moving across the wires. Up until now, the security model of p2p has been ill thought out, with the normal misattention paid to the wire, and practically no attention paid to nodal threats. This reflects much of the conventional security thinking of the net, but it is fair to say that systems of security such as SSL did not evolve in an adverse environment, and thus their first test against a threat (phishing) was a cakewalk. For the phishers, that is.
Are the clients and nodes transactionally secure? Do they consider that a message that is a payment suddenly has a value well and truly beyond that of a smiley?
[EC] Adam Shostack, Ratty Signals Emergent Chaos blog, 01 Jan 2005.
[EC] FC FC
[DC] The earliest were based on DigiCash's concepts which presumed a downloaded client.
[DC] For example Paypal and e-gold.
[DC] The underlying software that is being used places no guarantees on the number of attempts to deliver the data, and the software itself can cause the connection to be restarted on loss, thus resulting in two deliveries of the same instruction. Such buttons are indicative of a failure in design and architecture; the solution is relatively simple, to include unique Ids in the hidden attributes on the page.
[p2p] For self-serving reasons, I am ignoring the case of direct peer-to-peer messaging. The eventual results will extend that far, but it is beyond the scope of this paper to follow.
[ssl] Ian Grigg, SSL Considered Harmful, numerous rants from 2003 onwards.
[May] JP May The May Scale,