Have you ever wondered how Microsoft Teams, Zoom, or any other web-based conferencing tool does what it does? In many cases, WebRTC is the underlying technology.
Recently, I developed a proof of concept (POC) using WebRTC for one of VI Company's clients. This client wanted to see what their users were viewing in their web application. This was done in a controlled setting, in which the user gave full consent and could end the sharing session at any moment.
In this article, I want to share the technical lessons I learned during the development of the POC.
What is WebRTC?
WebRTC is an abbreviation for Web Real-Time Communication. It's a free, open-source project for peer-to-peer communication over the web. WebRTC enables you to share data, messages, webcam video, or even complete application views (as in the POC). This web standard is supported by Apple, Google, Microsoft, Mozilla, and Opera, so it works in all major modern browsers.
From a security perspective, WebRTC is top-notch. Encryption is mandatory, and a regular user cannot turn it off. The content is protected with DTLS-SRTP. This means the peers first perform a DTLS handshake (TLS adapted for datagram transport) to agree on keys, so the generated keys cannot be captured in transit. After that, media between peers is encrypted with SRTP using those keys, and HMAC-SHA1 is commonly used to authenticate each packet.
WebRTC's terms & definitions
Before we dive into the lessons learned, it's important to understand the default process of a WebRTC connection. Some common terms also need to be clarified. WebRTC is not that complicated, but its specific flow and protocols require some explanation.
Signal server: A server that exchanges setup information between peers. This is typically your own web server, using WebSockets. Aside from exchanging information, it can perform additional business checks. For us, a signal server built with ASP.NET SignalR or Socket.IO made the most sense.
STUN: Session Traversal Utilities for NAT. A STUN server tells a peer what its publicly accessible address is. STUN servers are inexpensive, and free ones are available.
TURN: Traversal Using Relays around NAT. A TURN server relays the traffic between peers when the network topology prevents a direct peer-to-peer connection, for example because the peers are not in the same network. TURN servers are expensive because all traffic is relayed through them.
ICE: Interactive Connectivity Establishment. A protocol for discovering and exchanging connectivity information between peers.
NAT: Network Address Translation. The process of translating internal IPs to external IPs to bridge the gap between private and public networks.
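To make the signal server's role a bit more concrete, here is a minimal in-memory sketch of the relaying logic. It is not the code from the POC: the `SignalRelay` class, the room concept, and the message shapes are all hypothetical. In practice the `deliver` callback would be backed by a WebSocket, SignalR, or Socket.IO connection.

```typescript
// Hypothetical message shapes; WebRTC does not prescribe a signaling format.
type SignalMessage =
  | { kind: "offer"; sdp: string }
  | { kind: "answer"; sdp: string }
  | { kind: "candidate"; candidate: string };

// Callback that pushes a message to one connected peer (assumed to wrap a socket).
type Deliver = (from: string, message: SignalMessage) => void;

class SignalRelay {
  // room id -> (peer id -> delivery callback)
  private rooms = new Map<string, Map<string, Deliver>>();

  join(room: string, peerId: string, deliver: Deliver): void {
    const peers = this.rooms.get(room) ?? new Map<string, Deliver>();
    peers.set(peerId, deliver);
    this.rooms.set(room, peers);
  }

  leave(room: string, peerId: string): void {
    const peers = this.rooms.get(room);
    if (!peers) return;
    peers.delete(peerId);
    if (peers.size === 0) this.rooms.delete(room);
  }

  // Forward a message to every other peer in the room and return how many
  // peers received it. Senders that never joined the room are ignored; this
  // is also where an additional business check could live.
  relay(room: string, from: string, message: SignalMessage): number {
    const peers = this.rooms.get(room);
    if (!peers || !peers.has(from)) return 0;
    let delivered = 0;
    peers.forEach((deliver, peerId) => {
      if (peerId === from) return;
      deliver(from, message);
      delivered++;
    });
    return delivered;
  }
}
```

The relay never inspects the SDP or the candidates. It only passes them along, which is why any transport that can push messages to a browser can serve as a signal server.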
WebRTC's basic flow
The diagram below shows WebRTC's basic flow. The more business rules or clients you add, the more complex it will grow.
1. Browser A (first peer) initiates its data stream. This is important to set up first: a stream or data channel needs to be present before any other communication can be set up.
2. Browser A creates an offer and sends it to the signal server, which passes it on to browser B. The offer is based on the capabilities of browser A and the data streams initiated in step 1.
3. Browser B (second peer) receives the offer and keeps it for the lifetime of the page session.
4. Browser B sends back the answer, which contains its own capabilities.
5. Browser A receives the answer and keeps it for the lifetime of the page session.
6. Then Interactive Connectivity Establishment (ICE) takes place. This process can be performed with a STUN/TURN server and happens on both browser A and browser B. NAT traversal determines how to set up the peer-to-peer connection between the two browsers. When a TURN server is used, it also relays the data once the connection is made.
7. The routing information to get from the browser to the STUN/TURN server must be shared with the other browser. It is sent to the signal server.
8. The signal server forwards it to the other connected browser, which keeps it for the lifetime of the page.
9. When both browsers have finished ICE, data can be exchanged!
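The flow above can be sketched in code as follows. This is a simplified illustration, not the POC's implementation. The `PeerLike` interface is a hypothetical stand-in that mirrors only the `RTCPeerConnection` members used here, so the sketch also compiles outside a browser. `SendSignal` is assumed to hand a message to your signal server, and the message shapes are made up for this example.

```typescript
// Hypothetical stand-ins for the browser's session description and ICE candidate objects.
interface SessionDescriptionLike { type: string; sdp?: string }
interface IceCandidateLike { candidate: string; sdpMid?: string | null; sdpMLineIndex?: number | null }

// Mirrors only the RTCPeerConnection members this sketch needs.
interface PeerLike {
  onicecandidate: ((event: { candidate: IceCandidateLike | null }) => void) | null;
  createOffer(): Promise<SessionDescriptionLike>;
  createAnswer(): Promise<SessionDescriptionLike>;
  setLocalDescription(description: SessionDescriptionLike): Promise<void>;
  setRemoteDescription(description: SessionDescriptionLike): Promise<void>;
  addIceCandidate(candidate: IceCandidateLike): Promise<void>;
}

// Made-up signaling messages; the signal server only has to pass them along.
type SignalingMessage =
  | { kind: "description"; description: SessionDescriptionLike }
  | { kind: "candidate"; candidate: IceCandidateLike };
type SendSignal = (message: SignalingMessage) => void;

// Share every locally gathered ICE candidate with the other browser.
// A null candidate marks the end of gathering and is not forwarded.
function forwardCandidates(pc: PeerLike, send: SendSignal): void {
  pc.onicecandidate = (event) => {
    if (event.candidate) send({ kind: "candidate", candidate: event.candidate });
  };
}

// Browser A: streams or data channels must already be added to pc.
async function startOffer(pc: PeerLike, send: SendSignal): Promise<void> {
  forwardCandidates(pc, send);
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  send({ kind: "description", description: offer });
}

// Both browsers: handle whatever the signal server delivers.
async function handleSignal(pc: PeerLike, send: SendSignal, message: SignalingMessage): Promise<void> {
  if (message.kind === "candidate") {
    await pc.addIceCandidate(message.candidate);
    return;
  }
  await pc.setRemoteDescription(message.description);
  if (message.description.type === "offer") {
    // Browser B: reply to the offer with an answer.
    forwardCandidates(pc, send);
    const answer = await pc.createAnswer();
    await pc.setLocalDescription(answer);
    send({ kind: "description", description: answer });
  }
}
```

In a browser, `pc` would be created with `new RTCPeerConnection(...)`. The sketch assumes the signal server delivers messages in order, so the offer arrives before any candidates. A production setup would also need error handling and a way to deal with both sides sending an offer at the same time.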
Lessons learned
Use a TURN server from the start. Even if you are only building a POC, start with a TURN server as soon as it has to demonstrate communication between remote peers in different locations. For far too long I thought I could get by with a free, open STUN server. However, because I work remotely over a VPN, I needed the relaying of a TURN server.
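One way to notice early that STUN alone will not be enough is to look at the types of ICE candidates your browser gathers. Each candidate string contains a `typ` field: `host` (local address), `srflx` (public address learned via STUN), `prflx` (peer reflexive), or `relay` (address on a TURN server). The helper below is a small sketch for counting them. The function names and the sample candidates (which use documentation IP addresses) are made up for illustration.

```typescript
type CandidateType = "host" | "srflx" | "prflx" | "relay";

// Extract the "typ" field from an ICE candidate string, i.e. the value of the
// `candidate` property of an RTCIceCandidate. Returns undefined if absent.
function candidateType(candidateLine: string): CandidateType | undefined {
  const match = /\btyp (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? (match[1] as CandidateType) : undefined;
}

// Count how many candidates of each type were gathered.
function summarizeCandidates(lines: string[]): Record<CandidateType, number> {
  const counts: Record<CandidateType, number> = { host: 0, srflx: 0, prflx: 0, relay: 0 };
  for (const line of lines) {
    const type = candidateType(line);
    if (type) counts[type]++;
  }
  return counts;
}

// Illustrative candidate strings, not captured from a real session.
const sampleCandidates = [
  "candidate:1 1 udp 2122260223 192.168.1.10 54321 typ host generation 0",
  "candidate:2 1 udp 1686052607 203.0.113.7 54321 typ srflx raddr 192.168.1.10 rport 54321 generation 0",
  "candidate:3 1 udp 41885439 198.51.100.20 3478 typ relay raddr 203.0.113.7 rport 54321 generation 0",
];
const sampleSummary = summarizeCandidates(sampleCandidates);
```

If the `relay` count stays at zero, no TURN server is being used, and a connection across a VPN or a strict firewall is likely to fail.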
Use TURN over TCP and port 443. Since corporate networks often block certain ports or even UDP altogether, make sure your TURN server communicates over TCP on port 443, which is usually open. The downside of this decision is that TCP delivers data reliably, and that reliability comes with more overhead and latency than UDP. I don't want to suggest that you can't use UDP, and WebRTC even lets you offer multiple options so ICE can try several candidates, but in my experience simply using a TURN server over TCP on port 443 works better.
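As an illustration, this is roughly what such a configuration could look like. The host name and credentials are placeholders, and `turnOverTcpConfig` is a hypothetical helper. The local interfaces mirror the `iceServers` and `iceTransportPolicy` fields of the browser's `RTCConfiguration`, so the sketch also compiles outside a browser.

```typescript
// Local mirrors of the RTCConfiguration fields used here.
interface IceServer { urls: string | string[]; username?: string; credential?: string }
interface PeerConfig { iceServers: IceServer[]; iceTransportPolicy?: "all" | "relay" }

// Build a configuration that reaches the TURN server over TCP on port 443.
// With relayOnly = true, ICE only uses relay candidates, which is handy for
// verifying that the TURN server really works.
function turnOverTcpConfig(
  host: string,
  username: string,
  credential: string,
  relayOnly = false,
): PeerConfig {
  return {
    iceServers: [
      {
        urls: `turn:${host}:443?transport=tcp`,
        username,
        credential,
      },
    ],
    iceTransportPolicy: relayOnly ? "relay" : "all",
  };
}

// Placeholder values; use your own TURN host and credentials.
const turnConfig = turnOverTcpConfig("turn.example.com", "demo-user", "demo-secret");
// In a browser: const pc = new RTCPeerConnection(turnConfig);
```

The `?transport=tcp` suffix is what asks the browser to contact the TURN server over TCP instead of UDP.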
Debug your application in both Firefox and Chrome. Whenever I got stuck in my POC, I would switch to Firefox, because some events are logged more clearly there. However, the debug page in Chrome provides in-depth information, which helped me as well.
Twilio offers STUN/TURN as a service. Although I am not paid to say this, Twilio's STUN/TURN service is a big help. Additionally, they provide some free usage for curious developers like us.
- General WebRTC website with good info and examples.
- Deep explanation of terms in the WebRTC domain.
- Twilio STUN/TURN as service.
- Can I use WebRTC?.
- A more in-depth article on security in WebRTC.
- chrome://webrtc-internals (open this link in the Chrome browser).
- about:webrtc (open this link in the Firefox browser).
Big thanks to Pim van Die for proof-reading my article and being a fellow WebRTC enthusiast!
Questions, remarks or suggestions?
Are you curious to learn more about WebRTC? Our Back-End Developer Casper Broeren is happy to tell you all about it. You can contact him via firstname.lastname@example.org.