RTP: Audio and Video for the Internet

Forward error correction, which relies on the addition of information to the media stream to provide protection against packet loss, is one form of channel coding. The media stream can be matched to the loss characteristics of a particular network path in other ways as well, some of which are discussed in the following sections.

Partial Checksum

Most packet loss in the public Internet is caused by congestion in the network. However, as noted in Chapter 2, Voice and Video Communication over Packet Networks, in some classes of network (for example, wireless) noncongestive loss and packet corruption are common. Although discarding packets with corrupted bits is appropriate in many cases, some RTP payload formats can make use of corrupted data (for example, the AMR audio codecs [41]). You can make use of partially corrupt RTP packets either by disabling the UDP checksum (if IPv4 is used) or by using a transport with a partial checksum.

When using RTP with a standard UDP/IPv4 stack, it is possible to disable the UDP checksum entirely (for example, using sysctl net.inet.udp.checksum=0 on UNIX machines supporting sysctl, or using the UDP_NOCHECKSUM socket option with Winsock2). Disabling the UDP checksum has the advantage that packets with corrupted payload data are delivered to the application, allowing some part of the data to be salvaged. The disadvantage is that the packet header may be corrupted, resulting in packets being misdirected or otherwise made unusable.

Note that some platforms do not allow UDP checksums to be disabled, and others allow it as a global setting but not on a per-stream basis. In IPv6-based implementations, the UDP checksum is mandatory and must not be disabled (although the UDP Lite proposal may be used).

A better approach is to use a transport with a partial checksum, such as UDP Lite [53]. This is a work in progress that extends UDP to allow the checksum to cover only part of the packet, rather than all or none of it. For example, the checksum could cover just the RTP/UDP/IP headers, or the headers and the first part of the payload. With a partial checksum, the transport can discard packets in which the headers (or other important parts of the payload) are corrupted, yet pass those that have errors only in the unimportant parts of the payload.
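The accept-or-discard decision described above can be sketched in a few lines of Python. This is a simplified illustration, not the UDP Lite wire format: it omits the pseudo-header and carries the checksum outside the packet. The names `internet_checksum`, `accept`, and `flip_bit` are hypothetical helpers introduced only for this example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over 16-bit words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero octet
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def accept(packet: bytes, coverage: int, checksum: int) -> bool:
    """Receiver side: verify only the first `coverage` octets.

    Errors beyond the covered prefix are not detected, so the packet
    is still passed up to the application.
    """
    return internet_checksum(packet[:coverage]) == checksum


def flip_bit(packet: bytes, index: int) -> bytes:
    """Simulate a single bit error in the octet at `index`."""
    damaged = bytearray(packet)
    damaged[index] ^= 0x01
    return bytes(damaged)


# Assumed layout: 12 important octets followed by less sensitive media bits.
important = bytes(range(12))
packet = important + b"less sensitive media bits"
coverage = len(important)
checksum = internet_checksum(packet[:coverage])

passed_with_payload_error = accept(flip_bit(packet, 20), coverage, checksum)
passed_with_header_error = accept(flip_bit(packet, 3), coverage, checksum)
```

With this layout, the packet damaged at octet 20 is still delivered, whereas the packet damaged at octet 3 fails verification and would be discarded.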

The first RTP payload format to make significant use of partial checksum was the AMR audio codec [41]. This is the codec selected for many third-generation cellular telephony systems, and hence the designers of its RTP payload format placed high priority on robustness to bit errors. Each frame of the codec bit stream is split into class A bits, which are vital for decoding, and class B and C bits, which improve quality if they are received, but are not vital. One or more frames of AMR output are placed into each RTP packet, with the option of using a partial checksum that covers the RTP/UDP/IP headers and class A bits, while the other bits are left unprotected. This lack of protection allows an application to ignore errors in the class B and class C bits, rather than discarding the packets. In Figure 9.9, for example, the shaded bits are not protected by a checksum. This approach appears to offer little advantage, because there are relatively few unprotected bits, but when header compression (see Chapter 11) is used, the IP/UDP/RTP headers and checksum are reduced to only four octets, increasing the gain due to the partial checksum.

Figure 9.9. An Example of the Use of Partial Checksums in the AMR Payload Format

The AMR payload format also supports interleaving and redundant transmission, for increased robustness. The result is a very robust format that copes well with the bit corruption that is common in cellular networks.
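The effect of header compression on the value of the partial checksum can be illustrated with some rough arithmetic. The sketch below assumes a single AMR 12.2 kbit/s frame per packet with 81 class A, 103 class B, and 60 class C bits (other modes use different splits), and it ignores the AMR payload header and padding. The four-octet compressed size is the figure quoted above; `unprotected_fraction` is a hypothetical helper.

```python
# Uncompressed IPv4 + UDP + RTP headers, in octets.
IP_UDP_RTP_OCTETS = 20 + 8 + 12
# Headers and checksum after header compression (figure taken from the text).
COMPRESSED_OCTETS = 4

# Assumed class sizes for one AMR 12.2 kbit/s frame.
CLASS_A_BITS = 81
CLASS_B_BITS = 103
CLASS_C_BITS = 60


def unprotected_fraction(header_octets: int) -> float:
    """Fraction of the packet's bits that the partial checksum leaves uncovered."""
    unprotected = CLASS_B_BITS + CLASS_C_BITS
    total = header_octets * 8 + CLASS_A_BITS + unprotected
    return unprotected / total


without_compression = unprotected_fraction(IP_UDP_RTP_OCTETS)
with_compression = unprotected_fraction(COMPRESSED_OCTETS)
```

Under these assumptions, roughly 29% of the bits are unprotected when full headers are sent, rising to roughly 59% with compressed headers. A bit error is therefore about twice as likely to fall in a region where the packet can still be used.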

Partial checksums are not a general-purpose tool, because they don't improve performance in networks in which packet loss is due to congestion. As wireless networks become more common, however, it is expected that future payload formats will also make use of partial checksums.

Reference Picture Selection

Many payload formats rely on interframe coding, in which it is not possible to decode a frame without using data sent in previous frames. Interframe coding is most often used in video codecs, in which motion vectors allow panning of the image, or motion of parts of the image, to occur without resending the parts of the preceding frame that have moved. Interframe coding is vital to achieving good compression efficiency, but it amplifies the effects of packet loss (clearly, if a frame depends on the packet that is lost, that frame cannot be decoded).

One way to make interframe encodings more robust to packet loss is reference picture selection, as used in some variants of H.263 and MPEG-4. This is another form of channel coding: if a frame on which others are predicted is lost, future frames are coded on the basis of another frame that was received (see Figure 9.10). This process saves significant bandwidth compared to sending the next frame with no interframe compression (only intraframe compression).

Figure 9.10. Reference Picture Selection

To change the reference picture, it is necessary for the receiver to report individual packet losses to the sender. Mechanisms for feedback are discussed in the next section in the context of retransmission; the same techniques can be used for reference picture selection with minor modification. Work on a standard for the use of reference picture selection in RTP is ongoing, as part of the retransmission profile discussed next.
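The sender-side bookkeeping implied by this feedback loop might look like the following sketch. It is an illustration only, not the mechanism of any particular codec or RTP profile. `ReferenceSelector` and its methods are hypothetical names, frames are assumed to be numbered in increasing order, and the limited size of a real decoder's reference buffer is ignored.

```python
class ReferenceSelector:
    """Tracks which sent frames the receiver is believed able to decode."""

    def __init__(self):
        # Frame number -> frame it was predicted from (None for an intra frame).
        self.reference_of = {}
        # Frames reported lost, plus frames that depend on them.
        self.damaged = set()

    def record_sent(self, frame, reference):
        """Remember how a transmitted frame was predicted."""
        self.reference_of[frame] = reference
        # A frame predicted from a damaged frame cannot be decoded either.
        if reference in self.damaged:
            self.damaged.add(frame)

    def report_loss(self, lost_frame):
        """Apply receiver feedback: mark the lost frame and its dependants."""
        self.damaged.add(lost_frame)
        # Visiting frames in ascending order propagates damage down each chain.
        for frame in sorted(self.reference_of):
            if self.reference_of[frame] in self.damaged:
                self.damaged.add(frame)

    def choose_reference(self):
        """Most recent frame believed decodable, or None to force an intra frame."""
        good = [f for f in self.reference_of if f not in self.damaged]
        return max(good) if good else None


selector = ReferenceSelector()
selector.record_sent(0, None)   # intra frame
selector.record_sent(1, 0)
selector.record_sent(2, 1)
selector.record_sent(3, 2)      # sent before the loss report arrives
selector.report_loss(2)         # receiver reports that frame 2 was lost
next_reference = selector.choose_reference()
```

Here frame 3 is marked as undecodable because it was predicted from the lost frame 2, so the next frame is predicted from frame 1 rather than being sent as a full intra frame.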
