Explaining the WebRTC Secure Real-Time Transport Protocol (SRTP)
The Secure Real-time Transport Protocol (SRTP) is a security framework that extends the Real-time Transport Protocol (RTP) and allows a suite of crypto mechanisms.
WebRTC uses DTLS-SRTP to add encryption, message authentication and integrity, and replay attack protection. It provides confidentiality by encrypting the RTP payload and supporting origin authentication. SRTP is one component of this security in WebRTC. It gives comfort to developers looking for a reliable and secure API. But what is SRTP really, and how does it work?
What is the Secure Real-time Transport Protocol?
SRTP extends RTP to bring upgraded security features. It was initially published by the Internet Engineering Task Force (IETF) in RFC 3711 in March 2004.
SRTP provides confidentiality by encrypting the RTP payload, not including the RTP header. It supports source origin authentication and is widely used as the security mechanism for RTP. While it can be used in its entirety, it is also possible to disable or enable specific security features. The main hurdle with SRTP is key management, as many options exist, including DTLS-SRTP, MIKEY in SIP, Security Description (SDES) in SDP, ZRTP, and others.
SRTP uses Advanced Encryption Standard (AES) as the default cipher. This includes two cipher modes: Segmented Integer Counter Mode and f8-mode. Segmented Integer Counter Mode is standard, and is critical for running traffic over an unreliable network with a possible loss of packets. f8-mode is used for 3G mobile networks, and is a variation of output feedback mode designed to be seekable with an altered initialization function.
SRTP also allows developers to disable encryption with the NULL cipher. The NULL cipher does not perform any encryption, and instead operates as an identity function. It copies the input stream directly to the output stream without any changes.
In WebRTC, NULL Cipher is not recommended, as discretion is typically fairly critical to end-users. In fact, valid WebRTC implementations MUST support encryption, currently with DTLS-SRTP.
In order to maintain message integrity in SRTP, an algorithm is used over the packet payload and parts of the packet header to create an authentication tag, which is then added to the RTP packet. This authentication tag is used to validate the payload contents, which prevents against potential forged data.
Authentication also provides the foundation to prevent replay attacks. In order to block replay attacks, each packet is assigned an index that is sequenced and compared. A new message may only be admitted if its index is the next in order and has not already been received. These indices are made effective by the message integrity implemented above, as without it, the potential exists to spoof message indices.
In WebRTC, while HMAC-SHA1 is the algorithm used as the basis for ensuring message integrity in SRTP, it is strongly recommended to choose cipher suites with Perfect Forward Secrecy (PFS) over non-PFS and Authenticated Encryption with Associated Data (AEAD) over non-AEAD. Most current WebRTC implementations use DTLS v1.2 with the TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 cipher suite.
SRTP uses a key derivation function to derive keys based off of one master key. This master key is exchanged by the key management protocol to create any and all session keys. By using unique keys per session based off of the master key, each individual session can be secured. In this way, if one session were to be infiltrated, all other sessions would still be secure. A key management protocol is used for the master key, which is usually ZRTP or MIKEY, though others exist.
What about SRTP and WebRTC?
Figure 1: The RTP IP Stack
With WebRTC, streams are secured by one of two protocols: SRTP or Datagram Transport Layer Security (DTLS). DTLS is used for encrypting data streams, while SRTP is used for encrypting media streams. However, DTLS-SRTP is typically used to perform the SRTP key exchange to detect any man-in-the-middle attacks. The IETF documents WebRTC security and security arch cover these aspects in detail.
Secure Real-time Transport Control Protocol (SRTCP)
SRTP also has a complimentary protocol, the Secure Real-time Transport Control Protocol (SRTCP). SRTCP is an extension of the Real-time Transport Control Protocol (RTCP) that contributes the same security features SRTP brings for RTP, including encryption and authentication. Similar to SRTP, almost all security features with SRTCP can be disabled. The only exception to this is message authentication, which is required in SRTCP.
What are the Potential Pitfalls of SRTP?
SRTP encrypts the payload of RTP packets, but not the RTP Extension headers. This leaves a security vulnerability, as the extension header of an RTP packet may contain sensitive information, such as the per-packet sound levels of the media data in the RTP payload. This may indicate to an eavesdropper on the network that a particular conversation between two individuals is occuring, which could be leveraged and breach their privacy. This was addressed by the IETF’s Request for Comments: 6904, which requires that all future SRTP encryption transforms specify how RTP header extensions are to be encrypted.
In some instances such as multi-party conferencing, an intermediary such as a Selective Forwarding Mixer (SFM) may be necessary to optimize some RTP parameters when forwarding a subset of streams. This introduces a go-between that disrupts the end-to-end security that is maintained by a purely peer-to-peer system. Currently, this requires the endpoints trust an additional source - the intermediary. In order to overcome this restriction, IETF’s Privacy Enhanced RTP Conferencing (PERC) is working on solutions such as SRTP Double Encryption Procedures (PERC). PERC provides hop-by-hop and end-to-end security guarantees based on two separate yet related cryptographic contexts. We will discuss this in a future blog post!