Interaction between WebRTC and IP cameras

WebRTC Media Gateways for media interoperability

To integrate IP cameras into WebRTC applications, media interoperability must be achieved first. This means that the media stream provided by the camera must be made compatible with the codecs and formats supported by WebRTC browsers; in other words, whatever the IP camera emits has to be translated into something a WebRTC browser understands. This usually requires a piece of technology called a WebRTC media gateway. To understand how such a gateway works, consider the following points:

Most IP cameras available on the market (excluding exotic ones) publish media through one of these mechanisms:

  • RTSP/H.264: This type of camera is commonly used in security applications. It uses the RTSP protocol to establish an RTP media session. In other words, signaling takes place through RTSP, while the media itself is transported over plain RTP. Different camera vendors may support different RTP profiles but, on most cameras I’ve seen, AVP is the only option available. Typically, H.264 is also the only codec option on these cameras.

  • HTTP/MJPEG: This type of camera uses HTTP streaming for both signaling and transport, and encodes video as a sequence of JPEG images. The hardware of these cameras is relatively simple and requires very few resources to operate, which is why they are more common where battery consumption or weight is a concern (robots, drones, etc.). The downside is that their video quality is significantly lower.

Therefore, to achieve WebRTC interoperability, the media gateway must perform the media processing shown in the figure below. The gateway must first speak the camera's protocol (RTSP/RTP or HTTP), then decode the video stream received from the camera (H.264 or MJPEG), then re-encode it as VP8 (the codec commonly supported by WebRTC browsers), and finally send it to the WebRTC client using the WebRTC protocol stack.

Figure 1: A WebRTC media gateway provides a common solution for media interoperability between RTSP/H.264 or HTTP/MJPEG cameras and WebRTC browsers. The media (dark red line) requires protocol and codec adaptation to convert the format provided by the camera into one supported by the WebRTC client. However, this is not enough to work on real networks: the gateway also needs to handle the RTCP feedback that browsers send to manage packet loss and congestion. This is essential for a satisfactory QoE, and not all WebRTC gateways provide appropriate termination semantics for RTCP feedback.

Dealing with the network: making a production-ready application

Media adaptation is not enough to achieve a working application. You also need to cope with how real networks behave. To this end, the WebRTC protocol stack uses the SAVPF profile, where the final ‘F’ stands for “Feedback”. This feedback corresponds to the RTCP packets depicted in Figure 1, which carry information about network conditions that may affect quality and are sent from the WebRTC client to the gateway.

As mentioned above, most IP cameras only support AVP (without the ‘F’), which means that the gateway cannot simply forward the feedback to the camera (as many SFU architectures do with their senders) but must handle it entirely by itself. In technical terms, the gateway must terminate the RTCP feedback.
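The difference between the two profiles is visible in the `m=` lines of the session descriptions each side produces. The following is a minimal sketch, not part of any gateway: the function name `mediaProfiles` and both SDP fragments are made up for illustration, and real descriptions carry many more attributes.

```javascript
// Return the RTP profile announced by each media section of an SDP string.
// m=<media> <port> <proto> <fmt> ...   e.g. "m=video 0 RTP/AVP 96"
function mediaProfiles(sdp) {
  return sdp
    .split(/\r?\n/)
    .filter(function (line) { return line.indexOf("m=") === 0; })
    .map(function (line) {
      var fields = line.slice(2).trim().split(/\s+/);
      var profile = fields[2].split("/").pop(); // "AVP", "AVPF", "SAVP" or "SAVPF"
      return {
        media: fields[0],
        profile: profile,
        feedback: /F$/.test(profile) // only the ...F profiles carry RTCP feedback
      };
    });
}

// Illustrative fragments (assumptions, not captured from real devices).
var cameraSdp = [
  "v=0",
  "m=video 0 RTP/AVP 96",
  "a=rtpmap:96 H264/90000"
].join("\r\n");

var browserSdp = [
  "v=0",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
  "a=rtcp-fb:96 nack pli",
  "a=rtcp-fb:96 goog-remb"
].join("\r\n");

console.log(mediaProfiles(cameraSdp));
console.log(mediaProfiles(browserSdp));
```

A gateway sits between these two descriptions: the browser side expects its feedback to be answered, while the camera side has no channel to receive it.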

Terminating the feedback is very important. You must make sure that the WebRTC gateway you use genuinely terminates RTCP feedback and gives it proper semantics. When RTCP feedback is not terminated, the result is a disastrous QoE: the video basically freezes.

To understand why the video freezes, let’s analyze what happens when the gateway does not terminate two simple RTCP feedback packets, PLI and REMB:

  • If the gateway does not handle PLI (Picture Loss Indication) RTCP packets, the video freezes as soon as packet loss occurs on the network. This is due to the way the VP8 encoder works: it may not produce a keyframe for long periods of time (typically minutes). Whenever the gateway fails to answer a PLI by generating a new keyframe, the WebRTC client cannot decode until the next periodic keyframe arrives (which, again, may take minutes). Some gateways work around this by generating keyframes very frequently (e.g. every two seconds), but this significantly degrades VP8 video quality, because keyframes consume much more bandwidth than other frames.

  • If the gateway does not handle REMB (Receiver Estimated Maximum Bitrate) RTCP packets and has no other congestion control mechanism, it will never react to congestion by commanding the VP8 encoder to lower its bitrate. This means that as soon as the link between the gateway and the WebRTC client becomes congested, the browser is flooded with more video than the link can carry, packets are lost, the quality of experience degrades further, and eventually the video freezes.
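The two cases above can be sketched as a small piece of gateway logic. This is only an illustration under stated assumptions: the `encoder` object with `requestKeyFrame()` and `setMaxBitrate()` is hypothetical (it stands for whatever controls the gateway's VP8 encoder), the function names are made up, and the byte layout follows the payload-specific feedback format of RFC 4585 (packet type 206, FMT 1 for PLI) and the REMB draft (FMT 15 with the "REMB" identifier, 6-bit exponent and 18-bit mantissa).

```javascript
var RTCP_PT_PSFB = 206; // payload-specific feedback
var FMT_PLI = 1;        // Picture Loss Indication
var FMT_AFB = 15;       // application layer feedback, which REMB uses

// Classify a single RTCP packet given as a byte array.
// Returns {type: "PLI"}, {type: "REMB", bitrate: <bits/s>} or null.
function parseFeedback(bytes) {
  if (bytes.length < 12) return null;
  if ((bytes[0] >> 6) !== 2 || bytes[1] !== RTCP_PT_PSFB) return null;

  var fmt = bytes[0] & 0x1f;
  if (fmt === FMT_PLI) return { type: "PLI" };

  if (fmt === FMT_AFB && bytes.length >= 20 &&
      String.fromCharCode(bytes[12], bytes[13], bytes[14], bytes[15]) === "REMB") {
    var exp = bytes[17] >> 2;
    var mantissa = ((bytes[17] & 0x03) << 16) | (bytes[18] << 8) | bytes[19];
    return { type: "REMB", bitrate: mantissa * Math.pow(2, exp) };
  }
  return null;
}

// Terminate the feedback at the gateway instead of forwarding it to the camera.
// `encoder` is a hypothetical handle on the gateway's VP8 encoder.
function terminateFeedback(bytes, encoder) {
  var fb = parseFeedback(bytes);
  if (!fb) return false;
  if (fb.type === "PLI") {
    encoder.requestKeyFrame();          // lets the client resume decoding quickly
  } else {
    encoder.setMaxBitrate(fb.bitrate);  // follow the receiver's bandwidth estimate
  }
  return true;
}
```

A gateway that skips either branch shows exactly the symptoms described above: without the first, a lost packet stalls decoding until the next periodic keyframe; without the second, the encoder keeps sending at a rate the link cannot carry.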

Doing it right with Kurento Media Server

The Kurento Media Server toolbox makes it possible to create rich WebRTC media gateways flexibly, programming in Java or JavaScript. Interoperating between WebRTC and IP cameras is simple and safe in Kurento. You only need to consider three aspects:

  • The Kurento Media Server PlayerEndpoint supports reading video streams from different sources, including RTSP/RTP and HTTP/MJPEG. In other words, the PlayerEndpoint takes care of capturing media from IP cameras.

  • The Kurento Media Server WebRtcEndpoint supports publishing media streams to WebRTC browsers and provides complete termination of RTCP feedback. This means that every time a PLI packet is received, the WebRtcEndpoint commands the VP8 encoder to generate a new keyframe. It also means that REMB feedback and congestion control are honored by commanding the VP8 encoder to reduce quality.

  • Kurento Media Server implements an “agnostic media” capability, so that when two incompatible media elements are interconnected, all appropriate conversions happen transparently for the developer. Therefore, by connecting the PlayerEndpoint source to the WebRtcEndpoint sink, the H.264/MJPEG to VP8 transcoding is performed automatically.

Figure 2: Kurento Media Server implementing a WebRTC gateway that supports RTSP/H.264 and HTTP/MJPEG. A few lines of code instantiate the PlayerEndpoint and WebRtcEndpoint elements and connect them to create the gateway. The internal logic of Kurento Media Server performs the necessary codec adaptation and manages the RTCP feedback without the developer having to care about it.

Therefore, as shown in Figure 2, creating a WebRTC media gateway that lets RTSP/H.264 and HTTP/MJPEG cameras interoperate with WebRTC is simple. The JavaScript source code implementing this logic is as follows:

    var pipeline = ...//Use Kurento Client API for obtaining your pipeline.

    //Create the PlayerEndpoint for receiving from the IP camera. HTTP and RTSP URIs are supported.
    pipeline.create("PlayerEndpoint", {uri: "rtsp://your.rtsp.address"}, function(error, playerEndpoint){
        if (error) return console.error(error);

        //Create the WebRtcEndpoint
        pipeline.create("WebRtcEndpoint", function(error, webRtcEndpoint){
            if (error) return console.error(error);

            //If working with trickle ICE, some code for candidate management is required here.

            //Connect playerEndpoint to webRtcEndpoint. This connection activates the agnostic media
            //capability and the appropriate transcodings are configured and activated.
            playerEndpoint.connect(webRtcEndpoint, function(error){
                if (error) return console.error(error);

                //Media starts flowing ... enjoy
                playerEndpoint.play(function(error){
                    if (error) return console.error(error);
                });
            });
        });
    });

The best part is that adding more capabilities to your gateway, such as video recording or even content analysis, is still very simple: you just need to instantiate the corresponding media elements and connect them into the desired media topology. This is the advantage of using Kurento: modularity.
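As an illustration of that modularity, recording the camera could look roughly like the sketch below. It assumes the same callback-style client API used in the gateway code, with a RecorderEndpoint element created from a `uri` option and started with `record()`. The helper name `addRecorder` and the recording URI are hypothetical, so check the element's options against the Kurento version you run.

```javascript
// Attach a recorder to an existing PlayerEndpoint (helper name is hypothetical).
// `pipeline` and `playerEndpoint` are the objects obtained in the gateway code.
function addRecorder(pipeline, playerEndpoint, recordingUri, callback) {
  pipeline.create("RecorderEndpoint", { uri: recordingUri }, function (error, recorderEndpoint) {
    if (error) return callback(error);

    // Same connect primitive as for the WebRtcEndpoint: the camera source
    // now feeds the recorder in addition to any other connected sink.
    playerEndpoint.connect(recorderEndpoint, function (error) {
      if (error) return callback(error);

      // Start writing media to the given URI.
      recorderEndpoint.record(function (error) {
        callback(error || null, recorderEndpoint);
      });
    });
  });
}

// Usage (URI is a placeholder):
// addRecorder(pipeline, playerEndpoint, "file:///tmp/camera.webm", function (error) {
//   if (error) console.error(error);
// });
```

The existing PlayerEndpoint to WebRtcEndpoint connection is left untouched; the recorder is just one more sink in the topology.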
