Introduction to Voice Protocols
Several call signaling and control protocols are available for VoIP. Which one you should use depends on the type of gateway, endpoint host, and call agent, and on the capabilities you need in the network. Different portions of the network might use different protocols. After the call is set up, IP voice and video traffic is carried by a separate protocol, the Real-Time Transport Protocol (RTP). This section gives an overview of the IP signaling and media protocols used in a VoIP network.
Media Gateway Control Protocol
MGCP is a client-server call control protocol that is built on a centralized control architecture. All the dial plan information resides on a separate call agent. The call agent, which controls the ports on the gateway, performs call control. The gateway does media translation between the PSTN and the VoIP networks for external calls. In a Cisco-based network, CallManagers function as the call agents. This book deals with CallManager-controlled MGCP gateways only; other call agents are beyond the scope of this book.
MGCP is defined by the Internet Engineering Task Force (IETF) in several RFCs, including 2705 and its successor, 3435. Its capabilities can be extended by the use of "packages" that include, for example, the handling of DTMF tones, secure RTP, call hold, and call transfer.
An MGCP gateway is relatively easy to configure. Because the call agent has all the call-routing intelligence, you do not need to configure the gateway with all the dial peers it would otherwise need. A downside, however, is that a call agent must always be available. Cisco MGCP gateways can use SRST and MGCP fallback to allow the H.323 protocol to take over and provide local call routing in the absence of a CallManager. In that case, you must configure dial peers on the gateway for use by H.323.
Note
See Chapter 2, "Media Gateway Control Protocol," for more information on this protocol.
H.323
H.323 is an International Telecommunication Union Telecommunication Standardization Sector (ITU-T) standard protocol. It has its roots in legacy telecommunications protocols, so it communicates well with hosts on the PSTN. H.323 is actually a suite of protocols that specify the functions involved in sending real-time voice, video, and data over packet-switched networks. Unlike an MGCP gateway, an H.323 gateway does not require a call agent; it is built on a distributed architecture model. Gateways can independently locate a remote host and establish a media stream; thus, you must configure them with call routing information.
Although an H.323 gateway does not require a call agent, you can use it in a CallManager network. The CallManager directs calls that are bound for the PSTN to the gateway, which uses plain old telephone service (POTS) dial peers to route them. The gateway has a VoIP dial peer pointing to CallManager for calls that are bound inside the VoIP network. You can configure IP phones to register directly with an H.323 gateway using SRST when their CallManager is unavailable.
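The routing split described above can be modeled as a small lookup table. The following Python sketch is a toy model, not Cisco IOS configuration: the `DialPeer` class, the digit prefixes, and the targets are hypothetical, and matching is simplified to a longest-prefix match on the dialed digits.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DialPeer:
    tag: int
    kind: str    # "pots" (toward the PSTN) or "voip" (toward the IP network)
    prefix: str  # leading digits this peer matches (hypothetical values)
    target: str  # where the call is sent (hypothetical values)


# Hypothetical gateway table: 9xxx goes out to the PSTN,
# 5xxx goes back to CallManager over IP.
DIAL_PEERS: List[DialPeer] = [
    DialPeer(tag=1, kind="pots", prefix="9", target="PSTN trunk"),
    DialPeer(tag=2, kind="voip", prefix="5", target="CallManager at 10.1.1.10"),
]


def route(dialed: str, peers: List[DialPeer] = DIAL_PEERS) -> Optional[DialPeer]:
    """Return the peer with the longest matching prefix, or None if no peer matches."""
    matches = [p for p in peers if dialed.startswith(p.prefix)]
    return max(matches, key=lambda p: len(p.prefix), default=None)


if __name__ == "__main__":
    for number in ("95551234", "5001", "7000"):
        peer = route(number)
        print(number, "->", peer.target if peer else "no matching dial peer")
```

The point of the sketch is that the gateway itself holds the table and makes the decision, which is what distinguishes the distributed H.323 model from a call-agent-controlled gateway.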
The H.323 standard defines four components of an H.323 system: terminals, gateways, gatekeepers, and multipoint control units (MCUs).
- Terminals: The user endpoints, such as videoconferencing units, that communicate with each other.
- Gateways: Used to communicate with terminals on other networks (primarily across the PSTN).
- Gatekeepers: Translate phone numbers to IP addresses, and control and route calls.
- Multipoint control units (MCUs): Enable multiple parties to join a videoconference.
Note
See Chapter 3, "H.323," for more information on this protocol.
Session Initiation Protocol
SIP, like MGCP, is an IETF standard, which is defined in a number of RFCs. Its control extends to audio, video, data, and instant messaging communications, allowing them to interoperate. SIP uses a distributed architecture based somewhat on the Internet model, using clear-text request and response messages and URLs for host addressing. The protocol addresses only session initiation and teardown. It borrows its message format from HTTP and relies on other protocols for the rest, such as Session Description Protocol (SDP) for negotiating capabilities and the Domain Name System (DNS) for locating servers by name. The driving force behind SIP is enabling next-generation multimedia networks that use the Internet and Internet applications.
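Because SIP messages are clear text in an HTTP-like format, they are easy to build and read with ordinary string handling. The following Python sketch assembles a minimal INVITE request and parses it back into a start line and headers. All addresses, the branch and tag values, and the helper names `build_invite` and `parse_request` are made up for illustration, and the SDP body that a real INVITE normally carries is omitted.

```python
CRLF = "\r\n"


def build_invite(caller: str, callee: str, call_id: str) -> str:
    """Return a minimal, body-less SIP INVITE as clear text (illustrative values)."""
    lines = [
        f"INVITE sip:{callee} SIP/2.0",  # start line: method, request URI, version
        "Via: SIP/2.0/UDP client.example.com;branch=z9hG4bK776asdhds",
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag=1928301774",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Length: 0",
    ]
    # A blank line separates the headers from the (here empty) body.
    return CRLF.join(lines) + CRLF + CRLF


def parse_request(message: str):
    """Split a SIP request into method, request URI, version, and a header dict."""
    head, _, _body = message.partition(CRLF + CRLF)
    start_line, *header_lines = head.split(CRLF)
    method, request_uri, version = start_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")  # split on the first colon only
        headers[name.strip()] = value.strip()
    return method, request_uri, version, headers


if __name__ == "__main__":
    msg = build_invite("alice@example.com", "bob@example.org",
                       "a84b4c76e66710@client.example.com")
    method, uri, version, headers = parse_request(msg)
    print(method, uri, version)
    print(headers["Call-ID"])
```

The request URI and the To and From headers show the URL-style addressing mentioned above; a response uses the same header layout with a status line in place of the request line.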
SIP uses several functional components in its call setup and teardown. Because these are logical functions, one device could serve several functions. SIP entities can act as a client or server, and some can act as both. Clients initiate requests for a service or information, and servers respond. One call might involve several requests and responses from several devices. Some SIP functions are as follows:
- User agent: The SIP endpoint, such as a SIP phone, which generates requests when it places a call and answers requests when it receives a call.
- Proxy server: The server that handles requests to and from a user agent, either responding to them or forwarding them as appropriate.
- Redirect server: The server that maintains routing information for remote locations and responds to requests from proxy servers for the location of remote servers.
- Registrar server: The server that keeps a database of user agents in its domain and responds to requests for this information.
- Presence server: The server that supports SIP for Instant Messaging and Presence Leveraging Extensions (SIMPLE) applications. It collects and communicates user and device status, communications capabilities, and other attributes.
SIP is a developing standard; therefore, interoperability between vendors and with other VoIP protocols can be a challenge. Much work is being done in this area, as SIP becomes more widely adopted.
Note
See Chapter 4, "Session Initiation Protocol," for more information on this protocol.
Skinny Client Control Protocol
Cisco IP phones use the Cisco proprietary Skinny Client Control Protocol (SCCP), or "Skinny," to communicate with their call agent. As the "skinny" portion of the name implies, SCCP is a lightweight protocol that is built on a client-server model. Call control messages are sent over TCP. End stations use SCCP to register with their call agent, and then to send and receive call setup and teardown instructions.
Routers in SRST mode can use SCCP to communicate with the Cisco IP phones they control. Some analog gateway devices, such as the VG224 and the Analog Telephone Adapter (ATA), can also use Skinny to communicate with a call agent.
Real-Time Transport Protocol
MGCP, H.323, SIP, and SCCP are protocols that handle call signaling and control. The Real-Time Transport Protocol (RTP) is an IETF standard protocol that uses User Datagram Protocol (UDP) to carry voice and video media after the call is set up. Its header includes a sequence number, so that the receiver can detect lost or out-of-order packets, and a timestamp, which the receiver uses to calculate jitter. The RTP, UDP, and IP headers together total 40 bytes, which can be much larger than the voice data carried in the packet. RTP header compression (cRTP) compresses these three headers to as few as 2 to 4 bytes to save bandwidth on low-speed links. The Real-Time Transport Control Protocol (RTCP) allows monitoring of the call data, including counts of the number of packets and bytes transmitted. Secure versions of RTP (SRTP) and RTCP (SRTCP) encrypt the media streams between hosts. Chapter 8, "Connecting to an IP WAN," has more information on using cRTP and SRTP with gateways.
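The header fields and the overhead arithmetic above can be checked with a short Python sketch. It packs and parses the 12-byte fixed RTP header (version, marker, payload type, sequence number, timestamp, and synchronization source identifier) and then compares header overhead with and without compression. The helper names and sample values are illustrative assumptions; the sketch assumes an IPv4 header without options and an RTP header with no CSRC entries or extensions.

```python
import struct

# Fixed RTP header: two flag bytes, 16-bit sequence number,
# 32-bit timestamp, 32-bit SSRC, all in network byte order.
RTP_HEADER = struct.Struct("!BBHII")
IP_HEADER_BYTES = 20   # assumes IPv4 with no options
UDP_HEADER_BYTES = 8


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Build a 12-byte RTP header (version 2, no padding, extension, or CSRCs)."""
    byte0 = 2 << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return RTP_HEADER.pack(byte0, byte1, sequence & 0xFFFF,
                           timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def parse_rtp_header(packet):
    """Decode the fixed RTP header at the start of a packet."""
    byte0, byte1, sequence, timestamp, ssrc = RTP_HEADER.unpack_from(packet)
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,    # lets the receiver detect loss and reordering
        "timestamp": timestamp,  # lets the receiver estimate jitter
        "ssrc": ssrc,
    }


def header_share(payload_bytes, header_bytes):
    """Fraction of each packet that is header rather than voice payload."""
    return header_bytes / (header_bytes + payload_bytes)


if __name__ == "__main__":
    header = pack_rtp_header(payload_type=18, sequence=4711,
                             timestamp=160000, ssrc=0x1234ABCD)
    print(parse_rtp_header(header))
    total = IP_HEADER_BYTES + UDP_HEADER_BYTES + RTP_HEADER.size
    print("IP + UDP + RTP header bytes:", total)
    print("uncompressed header share:", round(header_share(20, total), 3))
    print("cRTP (2-byte) header share:", round(header_share(20, 2), 3))
```

With a 20-byte payload (for example, one G.729 packet carrying 20 ms of speech), the 40 bytes of headers make up two-thirds of the 60-byte packet. With a 2-byte compressed header, the header share falls to about 9 percent.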