Multimedia Sessions on the Internet



Layering telephone-type functions onto the existing Internet architecture is a challenge. Some of the basics are just not there. For example, the Web uses names asymmetrically. A huge number of Web sites can be reached by anonymous users with browsers: type in the URL or use a search engine, then click and go. But the Web site doesn’t normally try to find you, and you have no URL of your own. The Public Switched Telephone Network (PSTN), by contrast, names all its endpoints with telephone numbers. A telephone number is mapped to a device such as a mobile phone, or to a physical line for a fixed telephone. Various companies provide phone number directory services, and the phone itself provides a way to dial and to alert the called user by ringing. The basic Internet structure of routers and computer hosts provides little help in emulating this architecture. Somehow users need to register themselves with some kind of telephony directory on the Internet, and then there has to be a signaling mechanism that can look up the called party in that directory and place the call. The IETF (Internet Engineering Task Force) has been developing a suitable signaling protocol, SIP (Session Initiation Protocol), since around 1999, and many VoIP companies are using it.
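The register-then-look-up idea can be sketched in a few lines of Python. This is only an illustration of the directory concept, not SIP itself: the `Registrar` class, the `place_call` helper, the user names and the contact addresses are all hypothetical, and real signaling involves message exchange, authentication and much more.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Registrar:
    """Toy telephony directory: maps a public user name to a current contact address."""
    bindings: dict = field(default_factory=dict)

    def register(self, user, contact, expires=3600, now=None):
        # A binding is only valid for a limited time, so a device that
        # moves or switches off eventually drops out of the directory.
        now = time.time() if now is None else now
        self.bindings[user] = (contact, now + expires)

    def lookup(self, user, now=None):
        # Return the contact address, or None if unknown or expired.
        now = time.time() if now is None else now
        entry = self.bindings.get(user)
        if entry is None or entry[1] <= now:
            return None
        return entry[0]


def place_call(registrar, caller, callee, now=None):
    """Look up the called party and report where the call would be delivered."""
    contact = registrar.lookup(callee, now=now)
    if contact is None:
        return f"{caller} -> {callee}: not registered"
    return f"{caller} -> {callee}: ringing at {contact}"


if __name__ == "__main__":
    directory = Registrar()
    directory.register("alice@example.com", "192.0.2.10:5060")
    print(place_call(directory, "bob@example.com", "alice@example.com"))
    print(place_call(directory, "bob@example.com", "carol@example.com"))
```

The point of the sketch is the symmetry it restores: once every user has a name that resolves to a reachable device, either party can find the other, which is what the telephone number does in the PSTN.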

The next problem is finding an equivalent of the phone. A PC can handle sophisticated audio and video, multi-way conferencing, and data sharing. A PC, however, cannot easily be carried in a small pocket. Lightweight, physically small portable IP hosts are likely to have only a subset of a PC’s multimedia capabilities and cannot know in advance the capabilities of the called party’s terminal, which means more problems for the signaling protocol. A further reason for the relative immaturity of interactive multimedia services is the lack of wide-coverage mobile networks and terminals that are optimized for IP and permit Internet access. The further diffusion of WiFi and WiMAX, and possibly lower charges on 3G cellular networks, will hopefully resolve this over the next few years.
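One way to picture the capability problem is as an offer and an answer: the caller lists what it can do, and the called terminal picks what it can match. The sketch below assumes a simple "first common codec per media type" rule; the `negotiate` function and the capability tables are hypothetical illustrations, not the procedure of any particular protocol.

```python
def negotiate(offer, supported):
    """Return, per media type, the first offered codec the callee also supports.

    offer:     dict of media type -> list of codecs in the caller's preference order
    supported: dict of media type -> set of codecs the callee can handle
    Media types with no codec in common are dropped from the session.
    """
    agreed = {}
    for media, codecs in offer.items():
        common = [c for c in codecs if c in supported.get(media, ())]
        if common:
            agreed[media] = common[0]
    return agreed


# A PC offers audio and video; a small handset supports one audio codec only.
pc_offer = {"audio": ["G.729", "G.711"], "video": ["H.263"]}
handset = {"audio": {"G.711"}}

print(negotiate(pc_offer, handset))
```

Here the session quietly degrades to audio only, which is the behaviour a user would expect when calling a limited device from a full-featured one.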

Can the Internet, and IP networks in general, really be trusted to carry high-quality isochronous traffic (real-time interactive audio-video)? Whole books have been written on the topic (Crowcroft, Handley, and Wakeman 1999) and it remains contentious. My own view is as follows. In the access part of the network, where bandwidth is constrained and there are a relatively small number of flows, some of which may be high-bandwidth (e.g., movie downloads), some form of class of service prioritisation and call admission control will be necessary. In the core of the network, traffic is already sufficiently aggregated that statistical effects normalise the traffic load, even at the carrier’s Provider Edge router. With proper traffic engineering, Quality of Service (QoS) is automatically assured, and complex, expensive bandwidth management schemes are not required. As traffic continues to grow, this situation will get better, not worse, due to the law of large numbers. Many carriers implementing architectures such as IMS (IP Multimedia Subsystem) take a different view today and are busy specifying and implementing complex per-session resource reservation schemes and bandwidth management functions, as they historically did in the PSTN. My belief is that by saddling themselves with needless cost and complexity that fails to scale, they will succeed only in securing for themselves a competitive disadvantage. This point applies regardless of whether, for commercial reasons, the carriers introduce and rigidly enforce service classes on their networks: the service classes will inherently be aggregated and will not require per-flow bandwidth management in the core.
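The law-of-large-numbers argument can be made concrete with a rough calculation. The sketch below assumes each flow is an independent on-off source that sends at a fixed peak rate for a fixed fraction of the time; this model, the 64 kbps peak rate, the 0.4 activity factor and the "mean plus three standard deviations" provisioning rule are all illustrative assumptions, not figures from the text. Real traffic is not perfectly independent, so treat the result as indicative only.

```python
import math


def aggregate_stats(n_flows, peak_kbps=64.0, activity=0.4):
    """Mean and standard deviation of the total load of n independent on-off flows."""
    mean = n_flows * activity * peak_kbps
    std = peak_kbps * math.sqrt(n_flows * activity * (1 - activity))
    return mean, std


def overprovision_factor(n_flows, k=3.0, peak_kbps=64.0, activity=0.4):
    """Capacity needed to cover mean + k standard deviations, relative to the mean load."""
    mean, std = aggregate_stats(n_flows, peak_kbps, activity)
    return (mean + k * std) / mean


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(f"{n:>7} flows: provision {overprovision_factor(n):.3f} x mean load")
```

Under these assumptions the relative headroom shrinks in proportion to one over the square root of the number of flows: a link carrying a handful of flows needs capacity well above its mean load, while a heavily aggregated link needs only slightly more than the mean.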

After establishing a high-quality multimedia session, the next issue of concern is how secure that call is likely to be. By default, phone calls have never been intrinsically secure, as the ease of wiretapping (lawful interception) demonstrates. Most people’s lack of concern about this is based upon the physical security of the phone company’s equipment, and the difficulty of hacking into it from dumb or closed end-systems like phones. One of the most striking characteristics of the Internet is that it permits open access in principle from any host to any other host. This means that security has to be explicitly layered onto a service. Most people are familiar with secure browser access to Web sites (HTTPS), which uses a protocol embedded in the browser and the Web server, SSL (Secure Sockets Layer), and happens entirely automatically from the point of view of a user. Deploying a symmetric security protocol (e.g., IPsec) between IP-phones for interactive multimedia has been more challenging, and arguably we are not quite there yet. IMS implements hop-by-hop encryption, partially to allow for lawful interception. Most VoIP today is not encrypted; again, Skype is a notable exception. As I observe, Skype looked for a while to be proof against third-party eavesdropping, but following the eBay acquisition, I would not bet on it now.

