RTP: Audio and Video for the Internet
Multiplexing has been an area of some controversy, and considerable discussion, within the IETF. Although TCRTP is the recommended best current practice, there are other proposals that merit further discussion. These include Generic RTP Multiplexing (GeRM), which is one of the few alternatives to TCRTP that maintains RTP semantics, and several application-specific multiplexes.

GeRM
Generic RTP Multiplexing (GeRM) was proposed at the IETF meeting in Chicago in August 1998 but was never developed into a complete protocol specification [45]. GeRM uses the ideas of RTP header compression, but instead of compressing the headers between packets, it applies compression to multiple payloads multiplexed within a single packet. All compression state is reinitialized in each new packet, and as a result, GeRM can function effectively end-to-end.

CONCEPTS AND PACKET FORMAT
Figure 12.4 shows the basic operation of GeRM. A single RTP packet is created, and multiple RTP packets, known as subpackets, are multiplexed inside it. Each GeRM packet has an outer RTP header that contains the header fields of the first subpacket, but the RTP payload type field is set to a value indicating that this is a GeRM packet.

Figure 12.4. A GeRM Packet Containing Three Subpackets
The first subpacket header will compress completely except for the payload type field and length, because the full RTP header and the subpacket header differ only in the payload type. The second subpacket header will then be encoded on the basis of predictable differences between the original RTP header for that subpacket and the original RTP header for the first subpacket. The third subpacket header is then encoded relative to the original RTP header for the second subpacket, and so forth. Each subpacket header comprises a single mandatory octet, followed by several extension octets, as shown in Figure 12.5.

Figure 12.5. GeRM Subpacket Header
The meanings of the bits in the mandatory octet are as detailed here:
Any CSRC fields present in the original RTP header then follow the GeRM headers. Following this is the RTP payload.

APPLICATION SCENARIOS
The bandwidth saving due to GeRM depends on the similarity of the headers between the multiplexed packets. Consider two scenarios: arbitrary packets and packets produced by cooperating applications.

If arbitrary RTP packets are to be multiplexed, the multiplexing gain is small. If there is no correlation between the packets, all the optional fields will be present and the subpacket header will be 14 octets in length. Compared to nonmultiplexed RTP, there is still a gain here because a 14-octet subheader is smaller than the 40-octet RTP/UDP/IP header that would otherwise be present, but the bandwidth saving is relatively small compared to the saving from standard header compression.

If the packets to be multiplexed are produced by cooperating applications, the savings due to GeRM may be much greater. In the simplest case, all the packets to be multiplexed have the same payload type, length, and CSRC list, so three octets are removed in all but the first subpacket header. If the applications generating the packets cooperate, they can collude to ensure that the sequence numbers and timestamps in the subpackets match, saving an additional six octets. Even more saving can be achieved if the applications generate packets with consecutive synchronization source identifiers, allowing the SSRC to be removed also.

Of course, such collusion between implementations is stretching the bounds of what is legal RTP. In particular, an application that generates nonrandom SSRC identifiers can cause serious problems in a session with standard RTP senders. Such nonrandom SSRC use is acceptable in two scenarios:
At best, GeRM can produce packets with a two-octet header per multiplexed packet, which is a significant saving compared to nonmultiplexed RTP. GeRM will always reduce the header overheads, compared to nonmultiplexed RTP.

THE FUTURE OF GERM
GeRM is not a standard protocol, and there are currently no plans to complete its specification. There are several reasons for this, primary among them being concern that the requirements for applications to collude in their production of RTP headers will limit the scope of the protocol and cause interoperability problems if GeRM is applied within a network. In addition, the bandwidth saving is relatively small unless such collusion occurs, which may make GeRM less attractive.

The concepts of GeRM are useful as an application-specific multiplex, between two gateways that source and sink multiple RTP streams using the same codec, and that are willing to collude in the generation of the RTP headers for those streams. The canonical example is IP-to-PSTN gateways, in which the IP network acts as a long-distance trunk circuit between two PSTN exchanges. GeRM allows such systems to maintain most RTP semantics, while providing a multiplex that is efficient and can be implemented solely at the application layer.

Application-Specific Multiplexing
In addition to the general-purpose multiplexing protocols such as TCRTP and GeRM, various application-specific multiplexes have been proposed. The vast majority of these multiplexes have been targeted toward IP-to-PSTN gateways, in which the IP network acts as a long-distance trunk circuit between two PSTN exchanges. These gateways have many simultaneous voice connections between them, which can be multiplexed to improve efficiency, enabling the use of low bit-rate voice codecs, and to improve scalability.

Such gateways often use a very restricted subset of the RTP protocol features. All the flows to be multiplexed commonly use the same payload format and codec, and it is likely that they do not employ silence suppression. Furthermore, each flow represents a single conversation, so there is no need for the mixer functionality of RTP. The result is that the CC, CSRC, M, P, and PT fields of the RTP header are redundant, and the sequence number and timestamp have a constant relation, allowing one of them to be elided. After these fields are removed, the only things left are the sequence number/timestamp and synchronization source (SSRC) identifier. Given such a limited use of RTP, there is a clear case for using an application-specific multiplex in these scenarios.

A telephony-specific multiplex may be defined as an operation on the RTP packets, transforming several RTP streams into a single multiplex with reduced headers. At its simplest, such a multiplex may concatenate packets with only the sequence number and a (possibly reduced) synchronization source into UDP packets, with out-of-band signaling being used to define the mapping between these reduced headers and the full RTP headers. Depending on the application, the multiplex may operate on real RTP packets, or it may be a logical operation with PSTN packets being directly converted into multiplexed packets. There are no standard solutions for such application-specific multiplexing.
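As a rough illustration of the simplest multiplex described above, the following Python sketch concatenates reduced-header entries into a single UDP payload and splits them apart again. The 4-octet mini-header layout (a 1-octet channel identifier standing in for the reduced SSRC, a 16-bit sequence number, and a 1-octet payload length) and the names `mux` and `demux` are assumptions made for this example only; they are not taken from any standard or proposal. The mapping from channel identifier to full RTP header is assumed to have been agreed through out-of-band signaling.

```python
import struct

# Hypothetical mini-header, invented for this sketch:
#   channel id (1 octet, a reduced SSRC), sequence number (2 octets,
#   network byte order), payload length (1 octet).
MINI_HEADER = struct.Struct("!BHB")


def mux(entries):
    """Concatenate (channel_id, seq, payload) tuples into one UDP payload."""
    out = bytearray()
    for channel_id, seq, payload in entries:
        if len(payload) > 255:
            raise ValueError("payload too long for 1-octet length field")
        out += MINI_HEADER.pack(channel_id, seq & 0xFFFF, len(payload))
        out += payload
    return bytes(out)


def demux(data):
    """Split a multiplexed UDP payload into (channel_id, seq, payload) tuples."""
    entries = []
    offset = 0
    while offset < len(data):
        channel_id, seq, length = MINI_HEADER.unpack_from(data, offset)
        offset += MINI_HEADER.size
        payload = data[offset:offset + length]
        if len(payload) != length:
            raise ValueError("truncated multiplexed packet")
        entries.append((channel_id, seq, payload))
        offset += length
    return entries
```

With this assumed layout, three 20-octet voice frames occupy 72 octets of UDP payload, so the three conversations share one UDP/IP header instead of each carrying its own 40-octet RTP/UDP/IP header. A real gateway would also need to handle the signaling, loss, and timing recovery that this sketch leaves out.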
As an alternative, it may be possible to define an RTP payload format for TDM (Time Division Multiplexing) payloads, which would allow direct transport of PSTN voice without first mapping it to RTP. The result is a "circuit emulation" format, defined to transport the complete circuit without caring for its contents. In this case the RTP header will relate to the circuit. The SSRC, sequence number, and timestamp relate to the circuit, not to any of the individual conversations being carried on that circuit; the payload type identifies, for example, "T1 emulation"; the mixer functionality (CC and CSRC list) is not used, nor are the marker bit and padding. Figure 12.6 shows how the process might work, with each T1 frame forming a single RTP packet.

Figure 12.6. Voice Circuit Emulation
Of course, direct emulation of a T1 line gains little because the RTP overhead is large. However, it is entirely reasonable to include several consecutive T1 frames in each RTP packet, or to emulate a higher-rate circuit, both of which reduce the RTP overhead significantly.

The IETF has a Pseudo-Wire Edge-to-Edge Emulation working group, which is developing standards for circuit emulation, including PSTN (Public Switched Telephone Network), SONET (Synchronous Optical Network), and ATM (Asynchronous Transfer Mode) circuits. These standards are not yet complete, but an RTP payload format for circuit emulation is one of the proposed solutions. The circuit emulation approach to IP-to-PSTN gateway design is a closer fit with the RTP philosophy than are application-specific multiplexing solutions. Circuit emulation is highly recommended as a solution for this particular application.
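To make the overhead argument for T1 emulation concrete, the short Python sketch below estimates the fraction of each packet taken up by headers when one or more T1 frames are carried per RTP packet. It assumes the 40-octet RTP/UDP/IP header mentioned earlier, 8000 T1 frames per second, and 24 octets of channel data per frame (24 channels of one octet each, with the framing bit not carried). The last of these is a simplifying assumption of this example, not part of any defined payload format.

```python
HEADER_OCTETS = 40        # RTP (12) + UDP (8) + IPv4 (20)
T1_FRAME_OCTETS = 24      # assumption: 24 channels x 1 octet, framing bit dropped
FRAMES_PER_SECOND = 8000  # one T1 frame every 125 microseconds


def overhead_fraction(frames_per_packet):
    """Fraction of each packet occupied by RTP/UDP/IP headers."""
    payload = frames_per_packet * T1_FRAME_OCTETS
    return HEADER_OCTETS / (HEADER_OCTETS + payload)


def packet_rate(frames_per_packet):
    """Packets per second needed to carry the circuit."""
    return FRAMES_PER_SECOND / frames_per_packet


for n in (1, 4, 8):
    print(f"{n} frame(s)/packet: "
          f"{overhead_fraction(n) * 100:.1f}% header overhead, "
          f"{packet_rate(n):.0f} packets/s")
```

Under these assumptions a single frame per packet spends 62.5% of each packet on headers, while eight frames per packet reduces that to about 17%, at the cost of one additional millisecond of packetization delay.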