SFU vs MCU vs P2P: WebRTC Architectures Explained

Several WebRTC architectures are available today: SFU, MCU, and P2P. Choosing one depends on several factors, including the following.

Network Conditions

Bandwidth availability

High Bandwidth

If participants have good, stable bandwidth, then an SFU or peer-to-peer calling will work well for them.

Variable Bandwidth

If the participants do not have good bandwidth and there are a lot of participants, then going with an MCU is a good idea.

Latency Sensitivity

Low Latency Required

If the situation requires low latency, then going with an SFU and/or peer-to-peer is a good idea.

Latency Tolerance

If latency is not critical, then an MCU can also be considered. The MCU processes the streams, which adds latency. In return, the MCU reduces the number of streams that must be delivered to each participant and thus reduces the demand on client resources.

Server Capacity

Server Resources

Limited Resources

With peer-to-peer connections and with an SFU, less CPU is required on the server and more on the client. With an MCU, the CPU requirement is high on the server side and low on the client side.

If you have limited resources on the client side, it is advisable to go with an MCU. Large meetings with many streams where the clients are IoT devices could be a good use case for an MCU.

Moderate Resources

If you have moderate resources on the client side, such as mobile phones and smart devices, and the meetings are not huge (thousands of people), then going with an SFU is a good idea.

It provides good-quality streams that you do not get with an MCU and is moderately resource intensive on both the client and the server.

High Resources

If you have devices that can handle high resource requirements, such as smartphones on a 4G connection, then you can consider peer-to-peer as well.

Peer-to-peer puts minimal load on the server, but every client device receives a stream from every other device in the meeting and must itself send its own stream to every other device.
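The trade-off between the three architectures can be made concrete by counting streams. The sketch below is only an illustration: it assumes every participant publishes exactly one stream and that an SFU forwards every stream to every other participant (a real SFU may forward fewer). The names `Topology`, `StreamCounts`, and `streamCounts` are made up for this example.

```typescript
type Topology = "p2p" | "sfu" | "mcu";

interface StreamCounts {
  clientUp: number;   // streams each client sends
  clientDown: number; // streams each client receives
  serverIn: number;   // streams the media server receives
  serverOut: number;  // streams the media server sends
}

// Counts streams for a meeting of n participants, one published stream each.
function streamCounts(topology: Topology, n: number): StreamCounts {
  if (!Number.isInteger(n) || n < 2) {
    throw new RangeError("a meeting needs at least 2 participants");
  }
  switch (topology) {
    case "p2p":
      // Full mesh: each client sends to and receives from every other client.
      return { clientUp: n - 1, clientDown: n - 1, serverIn: 0, serverOut: 0 };
    case "sfu":
      // One upload per client; the SFU fans out up to n - 1 streams to each client.
      return { clientUp: 1, clientDown: n - 1, serverIn: n, serverOut: n * (n - 1) };
    case "mcu":
      // One upload and one mixed download per client.
      return { clientUp: 1, clientDown: 1, serverIn: n, serverOut: n };
  }
}
```

For a 10-person meeting, this gives 9 uploads per client for P2P, 1 upload and 9 downloads for an SFU, and 1 upload and 1 download for an MCU, which is why the MCU is the lightest option for weak client devices.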

Use Cases

One to One calls

Peer-to-peer is a good fit for one-to-one calls, as it does not require a lot of resources and the two devices can communicate with each other directly, falling back to a TURN server when a direct connection is not possible.

Small Group Meetings

Here an SFU, an MCU, and peer-to-peer all work. The choice depends on the number of users in the meeting and on the client device capabilities, including the bandwidth and CPU capacity to deal with incoming and outgoing streams.

Large Webinars/ Conferences

SFUs are well suited to this, as the SFU takes in the incoming streams and forwards each stream only to the devices that need it.

Interactive Live Events

For this, both an SFU and an MCU can be considered. An MCU is going to cost a lot more than an SFU because it processes the incoming streams and creates a single stream that is then sent to all the users.

Quick Comparison Table between SFU, MCU and P2P

| Feature | SFU (Selective Forwarding Unit) | MCU (Multipoint Control Unit) | P2P (Peer-to-Peer) |
| --- | --- | --- | --- |
| Scalability | Highly scalable | Moderately scalable | Low scalability, suitable for small groups |
| Latency | Low to moderate | High, because of mixing and processing of streams | Low to moderate |
| Bandwidth usage | Efficient, streams are selectively forwarded | Low on the client, because each user receives a single mixed stream; high on the server | Variable, depends on how many users are connected |
| Stream quality | Excellent quality | Excellent quality | Excellent quality |
| Implementation cost | Moderate | High | Low |
| Server load | Moderate load | High load | Minimum load |

SFU (Selective Forwarding Unit) in Detail

An SFU, or Selective Forwarding Unit, is a server component in the WebRTC ecosystem.

The SFU receives multiple media streams from different participants and selectively forwards them to other participants without mixing them.

One thing to note is that the SFU does not process the streams; it just routes them to participants based on need.

How does SFU work

Stream Reception

Each participant in the SFU model sends their stream to the SFU

Stream Selection

The SFU analyses all the incoming streams and decides which streams need to be sent to which participant.

Stream Forwarding

Based on its decision the SFU forwards the streams to each participant, thus optimizing bandwidth and CPU load
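The selection and forwarding steps can be pictured as building a routing table. The following is a minimal sketch, assuming each participant tells the SFU which publishers it wants to receive. The `Participant` shape and the `forwardingPlan` function are hypothetical and not taken from any particular SFU implementation.

```typescript
interface Participant {
  id: string;
  subscribedTo: Set<string>; // ids of the publishers this participant wants to receive
}

// Returns, for every publisher, the ids of the receivers its stream is forwarded to.
function forwardingPlan(participants: Participant[]): Map<string, string[]> {
  const plan = new Map<string, string[]>();
  participants.forEach((publisher) => {
    const receivers = participants
      // Never send a stream back to its own publisher.
      .filter((p) => p.id !== publisher.id && p.subscribedTo.has(publisher.id))
      .map((p) => p.id);
    plan.set(publisher.id, receivers);
  });
  return plan;
}
```

The streams themselves are not touched here, which matches the idea that the SFU only routes packets and does no mixing.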

Adaptive Bitrate Streaming

SFUs can implement adaptive bitrate streaming. This means that the SFU can adjust the quality of the stream it forwards based on the bandwidth and CPU power of the receiving participant.

This ensures that participants with lower bandwidth or CPU power can also take part in the meeting.

Simulcast

Simulcast is another technique used with SFUs. Here, a participant's client device sends multiple streams of different quality.

The SFU then forwards the stream that is most appropriate for each participant, based on the receiving participant's bandwidth and CPU availability.
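A simple way to picture simulcast selection is "pick the best layer that fits". The sketch below assumes the SFU already has a bandwidth estimate for the receiver. The `SimulcastLayer` shape, the layer labels, and the bitrates are illustrative assumptions, not values from any specification.

```typescript
interface SimulcastLayer {
  rid: string;         // layer label chosen by the sender (hypothetical names below)
  bitrateKbps: number; // approximate bitrate of this layer
}

// Picks the highest-bitrate layer that fits the receiver's estimated bandwidth,
// falling back to the lowest layer when nothing fits.
function pickLayer(layers: SimulcastLayer[], availableKbps: number): SimulcastLayer {
  if (layers.length === 0) {
    throw new Error("publisher sent no layers");
  }
  const ascending = [...layers].sort((a, b) => a.bitrateKbps - b.bitrateKbps);
  // reduce() without an initial value starts from the lowest layer.
  return ascending.reduce((best, layer) =>
    layer.bitrateKbps <= availableKbps ? layer : best
  );
}

const exampleLayers: SimulcastLayer[] = [
  { rid: "low", bitrateKbps: 150 },
  { rid: "mid", bitrateKbps: 500 },
  { rid: "high", bitrateKbps: 1500 },
];
console.log(pickLayer(exampleLayers, 800).rid); // prints "mid"
```

A production SFU would also consider CPU load, viewport size, and how often it switches layers, all of which this sketch leaves out.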

Advantages

Scalability

An SFU can handle a large number of users efficiently by forwarding streams, and it can work with many streams on a fairly small server. This makes it a good solution for many simultaneous users in webinars and meetings.

Lower Latency

SFUs do not mix or process the streams, so they add little latency.

Efficient resource use

SFUs require comparatively fewer server resources, which lowers server costs when handling a large number of users.

Exploring MCU (Multipoint Control Unit)

The MCU is a server component in the WebRTC ecosystem. It receives media streams from multiple participants, mixes and combines them into a single stream, and sends that stream back to the participants.

How does MCU work

Stream Reception

Each participant sends their stream to the MCU.

Stream Processing

The MCU collects the streams from all the participants in a meeting, processes them, and creates a single audio and video stream.

Stream Distribution

Then the MCU sends this single stream to all the participants that are connected in the meeting
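To give a feel for the processing step, here is a minimal sketch of audio mixing only, assuming the MCU has already decoded each participant's audio into frames of 16-bit PCM samples. The function name `mixAudioFrames` is made up for this example, and a real MCU would also composite video and re-encode the result before sending it.

```typescript
// Mixes frames of 16-bit PCM samples by summing them sample by sample
// and clamping the result to the valid 16-bit range.
function mixAudioFrames(frames: Int16Array[]): Int16Array {
  if (frames.length === 0) {
    return new Int16Array(0);
  }
  // Only mix as many samples as the shortest frame provides.
  const length = Math.min(...frames.map((f) => f.length));
  const mixed = new Int16Array(length);
  for (let i = 0; i < length; i++) {
    let sum = 0;
    for (const frame of frames) {
      sum += frame[i];
    }
    mixed[i] = Math.max(-32768, Math.min(32767, sum));
  }
  return mixed;
}
```

This decode, mix, and re-encode work happens for every frame of every meeting, which is where the extra latency and server CPU cost of the MCU come from.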

Advantages

Reduced bandwidth and CPU requirements for the client in large meetings:

In other WebRTC architectures such as peer-to-peer or SFU, each participant can receive a separate stream from every other participant. In a large meeting, each participant gets many streams from the other participants, which consumes bandwidth and CPU on the client devices.

With an MCU this is avoided because each client gets a single stream from the MCU, which saves precious client bandwidth and CPU time.

Consistent streaming quality:

With an MCU, all the participants get the same quality stream from the MCU. This is not the case with P2P, where each peer sends its own stream to all the other peers and the quality of each stream depends on the peer that produces it.

Disadvantages

Server load considerations:

With an MCU, all the streams are sent to the MCU, which processes and combines them to create a new stream. This requires a lot of CPU processing power on the server.

So the MCU requires a lot of power on the server but reduces the load on the client devices.

Scalability challenges

Because the MCU moves all the processing onto the server, there are scalability challenges.

MCUs are well suited to video calling on client devices that are low on CPU and bandwidth, like mobile devices or IoT devices. But scaling to a large number of participants sharply increases the compute and bandwidth required on the server, because every incoming stream has to be processed and mixed.

Exploring P2P (Peer-to-Peer)

Peer-to-peer is a WebRTC communication model in which media streams (audio, video, and data) are sent directly from client to client, although they are often relayed through a TURN server in order to traverse NAT.

This model provides direct communication between clients, enabling efficient, low-latency communication.

Metered TURN servers

API: TURN server management with a powerful API. You can add or remove credentials, retrieve per-user, per-credential, and user metrics, enable or disable credentials, and retrieve usage data by date, all via the API.

Global Geo-Location targeting: Automatically directs traffic to the nearest servers, for lowest possible latency and highest quality performance. less than 50 ms latency anywhere around the world

Servers in all the regions of the world: Toronto, Miami, San Francisco, Amsterdam, London, Frankfurt, Bangalore, Singapore, Sydney, Seoul, Dallas, New York

Low Latency: less than 50 ms latency, anywhere across the world.

Cost-Effective: pay-as-you-go pricing with bandwidth and volume discounts available.

Easy Administration: Get usage logs, emails when accounts reach threshold limits, billing records and email and phone support.

Standards Compliant: Conforms to RFCs 5389, 5769, 5780, 5766, 6062, 6156, 5245, 5768, 6336, 6544, 5928 over UDP, TCP, TLS, and DTLS.

Multi‑Tenancy: Create multiple credentials and separate the usage by customer, or different apps. Get Usage logs, billing records and threshold alerts.

Enterprise Reliability: 99.999% Uptime with SLA.

Enterprise Scale: With no limit on concurrent traffic or total traffic. Metered TURN Servers provide Enterprise Scalability

5 GB/mo Free: Get 5 GB every month free TURN server usage with the Free Plan

Runs on port 80 and 443

Supports TURNS + SSL to allow connections through deep packet inspection firewalls.

Supports both TCP and UDP

Free Unlimited STUN

How does Peer to Peer work

Connection Establishment with Signalling

A peer-to-peer connection requires a signalling server, which is used to exchange information such as each peer's IP address and port number so that the peers can identify each other and start the connection process.
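WebRTC does not define how signalling messages are transported, so every application brings its own channel. The sketch below is a toy in-memory relay that only shows the idea of passing messages between peers. In practice the addresses and ports travel inside session descriptions (offers and answers) and ICE candidates. The `SignalMessage` shape and the `InMemorySignalling` class are assumptions made for this example, and a real deployment would typically use WebSockets or a similar transport.

```typescript
type SignalMessage =
  | { kind: "offer"; from: string; sdp: string }
  | { kind: "answer"; from: string; sdp: string }
  | { kind: "candidate"; from: string; candidate: string };

type Listener = (message: SignalMessage) => void;

class InMemorySignalling {
  private listeners = new Map<string, Listener>();

  // Registers a peer and the callback that receives messages addressed to it.
  join(peerId: string, onMessage: Listener): void {
    this.listeners.set(peerId, onMessage);
  }

  // Delivers a message to every joined peer except the sender.
  send(message: SignalMessage): void {
    this.listeners.forEach((listener, peerId) => {
      if (peerId !== message.from) {
        listener(message);
      }
    });
  }
}
```

The signalling server never carries media; once the peers have exchanged this information, the media flows over the connection that ICE establishes.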

ICE and TURN server for connection establishment

WebRTC uses the ICE framework to find the best path to connect the peers. ICE first tries to connect the peers directly, using a STUN server to discover their public addresses, and if that fails it tries to connect the peers through a TURN server.
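In a browser, the STUN and TURN servers that ICE may use are passed to `RTCPeerConnection` through its `iceServers` option. The sketch below builds such a list. The hostnames `stun.example.com` and `turn.example.com` are placeholders, and `IceServerEntry` and `buildIceServers` are names invented for this example so that it runs outside a browser.

```typescript
// Local type mirroring the shape of an entry in the iceServers option.
interface IceServerEntry {
  urls: string | string[];
  username?: string;
  credential?: string;
}

// Builds a STUN entry plus a TURN entry that needs credentials.
function buildIceServers(turnUsername: string, turnCredential: string): IceServerEntry[] {
  return [
    // STUN: lets the client discover its public address.
    { urls: "stun:stun.example.com:3478" },
    // TURN: relay used as a fallback when a direct path cannot be found.
    {
      urls: [
        "turn:turn.example.com:3478?transport=udp",
        "turns:turn.example.com:443?transport=tcp",
      ],
      username: turnUsername,
      credential: turnCredential,
    },
  ];
}

// In a browser this would be used roughly as:
// const pc = new RTCPeerConnection({ iceServers: buildIceServers(user, pass) });
```

Listing a TURN URL on port 443 over TLS is a common choice because restrictive firewalls usually allow that port.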

Stream Transmission

Once the connection is established, the media streams are transmitted between the peers either directly or, as is often the case, through a TURN server.

Data passing through a TURN server is end-to-end encrypted, so no one, not even the TURN server, knows what data is passing through it.

Advantages

Minimal latency

The main advantage of a peer-to-peer connection is minimal latency, because the connection is established directly between the peers.

Direct Communication

In a peer-to-peer connection, the connection runs directly from one peer to another. Sometimes a TURN server is used, which also does not process the streams and just forwards them to the other peer.

Technical Deep Dive

TURN

TURN stands for Traversal Using Relays around NAT. A TURN server relays traffic from one peer to another.

Here is an article on TURN servers that explains in detail what TURN servers are and how they work.

STUN

STUN stands for Session Traversal Utilities for NAT. A STUN server helps client devices discover their own external IP address and port number. Client devices do not know their external IP address and port number because the NAT hides them. The client device sends a request to the STUN server, which replies with the external IP address and port number that the request came from, and so the device learns what its external IP address and port number are.
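In a STUN response, the external address usually comes back in an attribute called XOR-MAPPED-ADDRESS, where the port and address are XORed with the fixed STUN magic cookie `0x2112A442`. The following sketch decodes only the 8-byte value of an IPv4 attribute. It does not parse a full STUN message, and the names `MappedAddress` and `decodeXorMappedAddress` are made up for this example.

```typescript
// Bytes of the fixed STUN magic cookie 0x2112A442.
const MAGIC_COOKIE = [0x21, 0x12, 0xa4, 0x42];

interface MappedAddress {
  ip: string;
  port: number;
}

// Value layout: 1 reserved byte, 1 family byte (0x01 = IPv4),
// 2 bytes of XORed port, 4 bytes of XORed IPv4 address.
function decodeXorMappedAddress(value: Uint8Array): MappedAddress {
  if (value.length !== 8 || value[1] !== 0x01) {
    throw new Error("expected an 8-byte IPv4 XOR-MAPPED-ADDRESS value");
  }
  // The port is XORed with the top 16 bits of the magic cookie.
  const port = ((value[2] << 8) | value[3]) ^ 0x2112;
  // Each address byte is XORed with the matching magic cookie byte.
  const octets = [0, 1, 2, 3].map((i) => value[4 + i] ^ MAGIC_COOKIE[i]);
  return { ip: octets.join("."), port };
}

const sampleValue = Uint8Array.from([0x00, 0x01, 0xa1, 0x47, 0xe1, 0x12, 0xa6, 0x43]);
console.log(decodeXorMappedAddress(sampleValue)); // { ip: "192.0.2.1", port: 32853 }
```

The decoded pair is the address and port that the NAT assigned to the request, which the client can then share with the other peer through signalling.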

If you are interested in learning more about how STUN servers work you can refer to the article:

Stun Server: What is Session Traversal Utilities for NAT?

ICE Framework

ICE (Interactive Connectivity Establishment) is the framework that WebRTC uses to find the best path to connect two devices in order to create a WebRTC connection.

The ICE framework first tries to create a direct connection using a STUN server, and if that fails it tries to create a connection using TURN servers.
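The preference for direct paths over relayed ones shows up in how ICE ranks candidates. The sketch below uses the candidate priority formula and the recommended type preferences from the ICE specification (RFC 8445). The default local preference of 65535 and component ID of 1 are assumptions that fit a simple single-interface client, and the function name is made up for this example.

```typescript
type CandidateType = "host" | "prflx" | "srflx" | "relay";

// Recommended type preferences: direct (host) highest, relayed (TURN) lowest.
const TYPE_PREFERENCE: Record<CandidateType, number> = {
  host: 126,
  prflx: 110,
  srflx: 100,
  relay: 0,
};

// priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
function candidatePriority(
  type: CandidateType,
  localPreference: number = 65535,
  componentId: number = 1
): number {
  return 2 ** 24 * TYPE_PREFERENCE[type] + 2 ** 8 * localPreference + (256 - componentId);
}

console.log(candidatePriority("host"));  // 2130706431
console.log(candidatePriority("srflx")); // 1694498815 (address learned via STUN)
console.log(candidatePriority("relay")); // 16777215 (address allocated on a TURN server)
```

Because relay candidates get the lowest priority, a path through a TURN server is normally selected only when the higher-priority candidate pairs fail their connectivity checks.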

There are many ICE servers available in the market today. If you would like to learn more about ICE servers, here is a guide.

NAT Issues

NAT does not let devices connect directly with each other. It blocks incoming connections (some types of NAT allow them, others do not), hides from the devices what their external IP address and port number are, and creates other issues.

There are different types of NAT some allow external connections and others do not. There are also firewall rules that block incoming connections from external devices

For these purposes you need a TURN server