Signaling
The general idea behind the design of WebRTC has been to fully specify how to control
the media plane, while leaving the signaling plane as much as possible to the application
layer. The rationale is that different applications may prefer to use different standardized
signaling protocols (e.g., SIP or eXtensible Messaging and Presence Protocol [XMPP])
or even something custom.
The session description is the most important piece of information that needs to be
exchanged. It specifies the transport (and Interactive Connectivity Establishment [ICE])
information, as well as the media type, format, and all associated media configuration
parameters needed to establish the media path.
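
To make these contents more tangible, the following sketch extracts the media type, transport protocol, payload formats, codec mappings, and ICE username fragment from a small SDP fragment. The sample SDP is hand-written for illustration, and the helper name `summarizeSdp` is an assumption of this sketch rather than anything defined by WebRTC; real session descriptions carry many more attributes.

```javascript
// Hand-written SDP fragment for illustration only; real descriptions are much longer.
const sampleSdp = [
  "v=0",
  "o=- 20518 0 IN IP4 0.0.0.0",
  "s=-",
  "t=0 0",
  "m=audio 54400 UDP/TLS/RTP/SAVPF 111 0",
  "a=ice-ufrag:F7gI",
  "a=ice-pwd:x9cml/YzichV2+XlhiMu8g",
  "a=rtpmap:111 opus/48000/2",
  "a=rtpmap:0 PCMU/8000",
].join("\r\n");

// Hypothetical helper: returns one summary object per media ("m=") section.
function summarizeSdp(sdp) {
  const sections = [];
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith("m=")) {
      // m=<media> <port> <proto> <fmt> ...
      const [media, port, proto, ...formats] = line.slice(2).split(" ");
      sections.push({
        media,
        port: Number(port),
        proto,
        formats,
        codecs: {},
        iceUfrag: null,
      });
    } else if (sections.length > 0) {
      // Attributes after an m= line belong to that media section.
      const current = sections[sections.length - 1];
      if (line.startsWith("a=ice-ufrag:")) {
        current.iceUfrag = line.slice("a=ice-ufrag:".length);
      } else if (line.startsWith("a=rtpmap:")) {
        // a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
        const rest = line.slice("a=rtpmap:".length);
        const space = rest.indexOf(" ");
        current.codecs[rest.slice(0, space)] = rest.slice(space + 1);
      }
    }
  }
  return sections;
}

const summary = summarizeSdp(sampleSdp);
```

Here `summary` holds a single entry describing the audio section: the transport information comes from the `m=` line and the ICE attributes, while the media format details come from the `a=rtpmap` lines.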
The original idea of exchanging session description information in the form of Session
Description Protocol (SDP) “blobs” presented several shortcomings, some of which
turned out to be really hard to address. For this reason, the IETF is now standardizing
the JavaScript Session Establishment Protocol (JSEP). JSEP provides the interface an
application needs to deal with the negotiated local and remote session descriptions (with
the negotiation carried out through whatever signaling mechanism might be desired),
together with a standardized way of interacting with the ICE state machine.
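
As a rough sketch of what this interface looks like from the application's side, the functions below walk through one offer/answer exchange. They assume that `pc` behaves like a browser `RTCPeerConnection` (using its `createOffer`, `createAnswer`, `setLocalDescription`, `setRemoteDescription`, and `onicecandidate` members), and that `sendToPeer` is a hypothetical callback supplied by the application's own signaling layer. The function names and the message shapes are assumptions made for illustration.

```javascript
// Caller side: create an offer, apply it locally, then hand it to the signaling layer.
// `pc` is expected to behave like an RTCPeerConnection; `sendToPeer` is hypothetical.
async function makeOffer(pc, sendToPeer) {
  // Local ICE candidates are surfaced to the application, which forwards them itself.
  pc.onicecandidate = (event) => {
    if (event.candidate) {
      sendToPeer({ kind: "candidate", candidate: event.candidate });
    }
  };
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  sendToPeer({ kind: "offer", description: pc.localDescription });
}

// Callee side: apply the remote offer, then create, apply, and send the answer.
async function handleOffer(pc, offer, sendToPeer) {
  await pc.setRemoteDescription(offer);
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  sendToPeer({ kind: "answer", description: pc.localDescription });
}

// Caller side again: the remote answer completes the negotiation.
async function handleAnswer(pc, answer) {
  await pc.setRemoteDescription(answer);
}
```

Nothing in this sketch says how `sendToPeer` delivers its argument; that choice (WebSocket, SIP, XMPP, or something custom) stays with the application.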
The JSEP approach delegates the responsibility for driving the signaling state machine
entirely to the application: the application must call the right APIs at the right times,
and convert the session descriptions and related ICE information into the defined
messages of its chosen signaling protocol, rather than simply forwarding the messages
emitted by the browser to the remote side.
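
The conversion step might look like the following minimal sketch, which maps session descriptions and ICE candidates onto the messages of an invented JSON signaling protocol and back. The wire format, the action names (`invite`, `accept`, `candidate`), the `callId` field, and all function names are hypothetical; an application using SIP or XMPP would map onto that protocol's own messages instead. The `type`/`sdp` and `candidate`/`sdpMid`/`sdpMLineIndex` fields mirror the shapes used by the browser API.

```javascript
// Hypothetical wire format for a custom JSON signaling protocol.
// The action names are made up for this sketch.
const ACTION_BY_TYPE = new Map([["offer", "invite"], ["answer", "accept"]]);
const TYPE_BY_ACTION = new Map([["invite", "offer"], ["accept", "answer"]]);

// Outgoing: session description -> protocol message.
function descriptionToMessage(callId, description) {
  const action = ACTION_BY_TYPE.get(description.type);
  if (!action) {
    throw new Error("Unsupported description type: " + description.type);
  }
  return JSON.stringify({ callId, action, sdp: description.sdp });
}

// Outgoing: ICE candidate -> protocol message.
function candidateToMessage(callId, candidate) {
  return JSON.stringify({
    callId,
    action: "candidate",
    candidate: candidate.candidate,
    sdpMid: candidate.sdpMid,
    sdpMLineIndex: candidate.sdpMLineIndex,
  });
}

// Incoming: protocol message -> object the application can pass to the browser API.
function messageToSignal(raw) {
  const msg = JSON.parse(raw);
  if (msg.action === "candidate") {
    return {
      callId: msg.callId,
      candidate: {
        candidate: msg.candidate,
        sdpMid: msg.sdpMid,
        sdpMLineIndex: msg.sdpMLineIndex,
      },
    };
  }
  const type = TYPE_BY_ACTION.get(msg.action);
  if (!type) {
    throw new Error("Unknown action: " + msg.action);
  }
  return { callId: msg.callId, description: { type, sdp: msg.sdp } };
}
```

On the receiving side, the application would typically pass a decoded `description` to `setRemoteDescription` and a decoded `candidate` to `addIceCandidate`.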