RTP: Audio and Video for the Internet
Each application in an RTP session will maintain a database of information about the participants and about the session itself. The session information, from which the RTCP timing is derived, can be stored as a set of variables:
In addition, the implementation needs to maintain variables to include in RTCP SR packets:
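As a minimal sketch of how this session-level state might look in C, the structure below borrows the variable names that RFC 3550 section 6.3 uses for its RTCP timing rules (tp, tn, pmembers, members, senders, rtcp_bw, we_sent, avg_rtcp_size, initial). The struct name, the helper functions, and the two sender-report counters are illustrative assumptions, not a definitive layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Session-level state. Timing fields follow the names in RFC 3550
 * section 6.3; everything else here is a hypothetical layout. */
struct rtp_session {
    double   tp;            /* time the last RTCP packet was sent, seconds */
    double   tn;            /* next scheduled RTCP transmission time */
    int      pmembers;      /* member estimate when tn was last computed */
    int      members;       /* current estimate of session members */
    int      senders;       /* current estimate of senders */
    double   rtcp_bw;       /* target RTCP bandwidth, octets per second */
    bool     we_sent;       /* sent data since the 2nd previous RTCP report */
    double   avg_rtcp_size; /* average compound RTCP packet size, octets */
    bool     initial;       /* true until the first RTCP packet is sent */

    /* Counters copied into each RTCP sender report (SR). */
    uint32_t packets_sent;  /* RTP data packets sent */
    uint32_t octets_sent;   /* RTP payload octets sent */
};

/* Initial values for a participant joining a session, as listed in
 * RFC 3550 section 6.3.1. The caller schedules the first report (tn). */
static void rtp_session_init(struct rtp_session *s, double rtcp_bw,
                             double first_rtcp_size)
{
    assert(s != NULL);
    s->tp = 0.0;
    s->tn = 0.0;
    s->pmembers = 1;
    s->members = 1;
    s->senders = 0;
    s->rtcp_bw = rtcp_bw;
    s->we_sent = false;
    s->avg_rtcp_size = first_rtcp_size;
    s->initial = true;
    s->packets_sent = 0;
    s->octets_sent = 0;
}

/* Account for one transmitted RTP packet (hypothetical helper). */
static void rtp_session_on_rtp_sent(struct rtp_session *s, uint32_t payload_len)
{
    s->packets_sent += 1;
    s->octets_sent += payload_len;
    s->we_sent = true;
}

/* Fraction of members that are senders, used by the RTCP timing rules. */
static double rtp_session_sender_fraction(const struct rtp_session *s)
{
    return s->members > 0 ? (double)s->senders / (double)s->members : 0.0;
}
```

Every field is a simple type, so the structure can be zeroed, copied, or embedded in a larger application context without special handling.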
A session data structure containing these variables is also a good place to store the SSRC being used, the SDES information for the implementation, and the file descriptors for the RTP and RTCP sockets. Finally, the session data structure should contain a database for information held on each participant. In terms of implementation, the session data can be stored simply: as a single structure in a C-based implementation, or as a class in an object-oriented system. With the exception of the participant-specific data, each variable in the structure or class is a simple type: integer, text string, and so on. The format of the participant-specific data is described next.
To generate RTCP packets properly, each participant also needs to maintain state for the other members in the session. A good design makes the participant database an integral part of the operation of the system, holding not just RTCP-related information, but all state for each participant. The per-participant data structure may include the following:
Within an RTP session, members are identified by their synchronization source identifier. Because there may be many participants and they may need to be accessed in any order, the appropriate data structure for the participant database is a hash table, indexed by SSRC identifier. In applications that deal with only a single media format, this is sufficient. However, lip synchronization also requires the capability to look up sources by their CNAME. As a result, the participant database should be indexed by a double hash table: once by SSRC and once by CNAME.
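The double indexing described above can be sketched in C as two chained hash tables that share the same participant records. The bucket count, the function names, and the choice of FNV-1a as the string hash are assumptions made for illustration; a real implementation would also store the remaining per-participant state in the record.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DB_BUCKETS 64   /* hypothetical table size */
#define CNAME_MAX  256  /* SDES items carry at most 255 octets */

struct participant {
    uint32_t ssrc;
    char     cname[CNAME_MAX];       /* empty until an SDES CNAME arrives */
    struct participant *next_ssrc;   /* chain in the SSRC-indexed table */
    struct participant *next_cname;  /* chain in the CNAME-indexed table */
};

struct participant_db {
    struct participant *by_ssrc[DB_BUCKETS];
    struct participant *by_cname[DB_BUCKETS];
};

static unsigned hash_ssrc(uint32_t ssrc) { return ssrc % DB_BUCKETS; }

/* 32-bit FNV-1a string hash, reduced to a bucket index. */
static unsigned hash_cname(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) { h ^= (unsigned char)*s++; h *= 16777619u; }
    return h % DB_BUCKETS;
}

static struct participant *db_find_ssrc(struct participant_db *db, uint32_t ssrc)
{
    struct participant *p = db->by_ssrc[hash_ssrc(ssrc)];
    while (p != NULL && p->ssrc != ssrc) p = p->next_ssrc;
    return p;
}

/* Returns the first participant with this CNAME, or NULL. */
static struct participant *db_find_cname(struct participant_db *db, const char *cname)
{
    struct participant *p = db->by_cname[hash_cname(cname)];
    while (p != NULL && strcmp(p->cname, cname) != 0) p = p->next_cname;
    return p;
}

/* Add a validated source; returns the existing record if already present. */
static struct participant *db_add(struct participant_db *db, uint32_t ssrc)
{
    struct participant *p = db_find_ssrc(db, ssrc);
    if (p != NULL) return p;
    p = calloc(1, sizeof *p);
    if (p == NULL) return NULL;
    p->ssrc = ssrc;
    unsigned b = hash_ssrc(ssrc);
    p->next_ssrc = db->by_ssrc[b];
    db->by_ssrc[b] = p;
    return p;
}

/* Record the CNAME learned from SDES and link the record into the second
 * index. In this sketch a CNAME is indexed once and never changed. */
static void db_set_cname(struct participant_db *db, struct participant *p,
                         const char *cname)
{
    assert(db != NULL && p != NULL && cname != NULL);
    if (cname[0] == '\0' || p->cname[0] != '\0') return;
    strncpy(p->cname, cname, CNAME_MAX - 1);  /* calloc left the terminator */
    unsigned b = hash_cname(p->cname);
    p->next_cname = db->by_cname[b];
    db->by_cname[b] = p;
}
```

Because both tables point at the same records, a lookup by CNAME for lip synchronization reaches exactly the state that the SSRC lookup on the media path updates.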
Participants should be added to the database after a validated packet has been received from them. The validation step is important: an implementation should not create state for a participant unless it is certain that the participant is valid. Here are some guidelines:
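One widely used validation rule for RTP data comes from RFC 3550 Appendix A.1: a new source is accepted only after MIN_SEQUENTIAL packets with consecutive sequence numbers have arrived. The sketch below applies that rule using a small fixed-size table with a short timeout. The table size, the timeout value, and all names other than MIN_SEQUENTIAL are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_SEQUENTIAL    2    /* value used in RFC 3550 Appendix A.1 */
#define PROBATION_SLOTS   16   /* hypothetical fixed table size */
#define PROBATION_TIMEOUT 2.0  /* hypothetical entry lifetime, seconds */

struct probation_entry {
    bool     in_use;
    uint32_t ssrc;
    uint16_t max_seq;    /* highest sequence number seen so far */
    int      remaining;  /* in-order packets still needed */
    double   last_seen;  /* arrival time of the latest packet, seconds */
};

struct probation_table {
    struct probation_entry slot[PROBATION_SLOTS];
};

/* Feed one RTP packet from a source that is not yet in the participant
 * database. Returns true once the source has sent MIN_SEQUENTIAL packets
 * in sequence; the caller then creates full participant state. */
static bool probation_packet(struct probation_table *t, uint32_t ssrc,
                             uint16_t seq, double now)
{
    struct probation_entry *e = NULL, *spare = NULL;

    assert(t != NULL);
    for (int i = 0; i < PROBATION_SLOTS; i++) {
        struct probation_entry *c = &t->slot[i];
        if (c->in_use && now - c->last_seen > PROBATION_TIMEOUT)
            c->in_use = false;              /* time out stale entries */
        if (c->in_use && c->ssrc == ssrc)
            e = c;
        else if (!c->in_use && spare == NULL)
            spare = c;
    }
    if (e == NULL) {
        if (spare == NULL)
            return false;                   /* table full: ignore the source */
        e = spare;
        e->in_use = true;
        e->ssrc = ssrc;
        e->max_seq = (uint16_t)(seq - 1);   /* so this packet counts as in order */
        e->remaining = MIN_SEQUENTIAL;
    }
    e->last_seen = now;
    if (seq == (uint16_t)(e->max_seq + 1)) {
        e->max_seq = seq;
        if (--e->remaining == 0) {
            e->in_use = false;              /* promoted: release the slot */
            return true;
        }
    } else {
        /* Out of sequence: restart the count from this packet. */
        e->remaining = MIN_SEQUENTIAL - 1;
        e->max_seq = seq;
    }
    return false;
}
```

Bounding the table and expiring entries quickly keeps the cost of a misdirected stream to a few dozen bytes per bogus source.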
This implies that the implementation should maintain an additional, lightweight table of probationary sources (sources from which only a single RTP packet has been received). To prevent bogus sources of RTP and RTCP data from using too much memory, this table should be aggressively timed out and should have a fixed maximum size. It is difficult to protect against an attacker who purposely generates many different sources to use up all memory of the receivers, but these precautions will prevent accidental exhaustion of memory if a misdirected non-RTP stream is received.
Each CSRC (contributing source) in a valid RTP packet also counts as a participant and should be added to the database. You should expect to receive SDES information for participants identified only by CSRC.
When a participant is added to the database, an application should also update the session-level count of the members and the sender fraction. Addition of a participant may also cause RTCP forward reconsideration, which will be discussed shortly.
Participants are removed from the database after a BYE packet is received or after a specified period of inactivity. This sounds simple, but there are several subtle points. There is no guarantee that packets are received in order, so an RTCP BYE may be received before the last data packet from a source. To prevent state from being torn down and then immediately reestablished, a participant should be marked as having left after a BYE is received, and its state should be held over for a few seconds (my implementation uses a fixed two-second delay). The important point is that the delay is longer than both the maximum expected reordering and the media playout delay, thereby allowing for late packets and for any data in the playout buffer for that participant to be used.
Sources may be timed out if they haven't been heard from for more than five times the reporting interval.
If the reporting interval is less than 5 seconds, the 5-second minimum is used here (even if a smaller interval is used when RTCP packets are being sent). When a BYE packet is received or when a member times out, RTCP reverse reconsideration takes place, as described in the section titled BYE Reconsideration later in this chapter.
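The two removal rules above, a held-over BYE and an inactivity timeout, can be sketched as a single predicate that the application evaluates periodically for each participant. The structure and function names are hypothetical; the two-second hold, the factor of five, and the 5-second minimum are the values given in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BYE_HOLD_SECONDS   2.0  /* fixed hold-over after a BYE */
#define MIN_RTCP_INTERVAL  5.0  /* minimum interval used for timeouts */
#define TIMEOUT_MULTIPLIER 5.0  /* intervals of silence before timeout */

/* Hypothetical per-participant timing state, times in seconds. */
struct participant_timers {
    double last_heard;  /* arrival time of the latest RTP or RTCP packet */
    bool   got_bye;     /* a BYE has been received */
    double bye_time;    /* arrival time of the first BYE */
};

static void on_packet(struct participant_timers *p, double now)
{
    p->last_heard = now;
}

/* Mark the participant as having left, but keep its state for now. */
static void on_bye(struct participant_timers *p, double now)
{
    if (!p->got_bye) {
        p->got_bye = true;
        p->bye_time = now;
    }
}

/* True when the participant's state can be deleted. rtcp_interval is the
 * current reporting interval in seconds. */
static bool should_remove(const struct participant_timers *p, double now,
                          double rtcp_interval)
{
    assert(p != NULL && rtcp_interval > 0.0);
    if (p->got_bye)
        return now - p->bye_time >= BYE_HOLD_SECONDS;

    double interval = rtcp_interval < MIN_RTCP_INTERVAL
                          ? MIN_RTCP_INTERVAL : rtcp_interval;
    return now - p->last_heard > TIMEOUT_MULTIPLIER * interval;
}
```

In this sketch, late data packets that arrive during the hold-over still update last_heard and can be played out, while the got_bye flag prevents the participant from being treated as newly joined.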