In ffmpeg, why is the network video stream h264 converted to YUV by default instead of other formats?

When building online video applications, getting a few core video-programming concepts straight early on will save you a lot of detours. With transcoding and transmission in particular, if you jump straight into the code, it is easy to feel that you understand everything while still being unable to explain why things are done the way they are.

For example: In ffmpeg, why is the network video stream h264 converted to YUV by default instead of other formats?

Starting from this question, let's work through a few concepts step by step, so that the programming that follows is easier to understand.

What is H264

H264, also known as AVC (Advanced Video Coding), is a video compression standard: an efficient method of encoding video that significantly reduces the bandwidth and storage required while maintaining high quality. H264-encoded video can be played on a wide variety of devices, including TVs, computers, and smartphones. In video processing, H264 is used to compress video data so that it is easier to transmit and store.

What is YUV

YUV is a color encoding system commonly used in video systems. The YUV model defines a color space where Y represents luminance (brightness, i.e. the grayscale image) and U and V represent chrominance (hue and saturation). The advantage of this encoding is that color information can be compressed more aggressively, because the human eye is much more sensitive to luminance than to chrominance. In video processing, YUV is often used to reduce the required bandwidth or storage while preserving visual quality.
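
A quick back-of-the-envelope sketch shows how much this saves. In the common 4:2:0 layout (what FFmpeg calls AV_PIX_FMT_YUV420P), every pixel keeps its own Y sample, while U and V are each stored at a quarter resolution. The 1080p frame size below is just an example:

#include <cstdio>

int main() {
    // Hypothetical 1080p frame, chosen only for illustration
    const int width = 1920, height = 1080;

    // YUV420P: full-resolution Y plane, quarter-resolution U and V planes
    const int ySize = width * height;             // one luma byte per pixel
    const int uSize = (width / 2) * (height / 2); // chroma subsampled 2x2
    const int vSize = uSize;

    const int yuvSize = ySize + uSize + vSize; // 1.5 bytes per pixel
    const int rgbSize = width * height * 3;    // RGB24: 3 bytes per pixel

    printf("YUV420P frame: %d bytes\n", yuvSize); // 3,110,400
    printf("RGB24 frame:   %d bytes\n", rgbSize); // 6,220,800
    return 0;
}

Before any codec even runs, the raw YUV 4:2:0 frame is already half the size of the equivalent RGB frame.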

There is another concept we often mention alongside these: the RGB format.

What is RGB

RGB is an additive color model, where R represents red, G represents green, and B represents blue. The RGB model mixes red, green, and blue light in different proportions to produce other colors. It is mainly used in display devices, such as computer screens, televisions, and mobile phones, because these devices produce images by emitting red, green, and blue light. The advantage of the RGB model is that it can represent a wide range of colors and is intuitive and easy to understand.

It can be said that RGB and YUV are products of different eras. Black-and-white TVs only needed a grayscale signal to display a picture, so the image was carried entirely by luminance. When color TV arrived, the two chrominance components U and V were added on top of that luminance signal, which introduced color while remaining compatible with black-and-white sets.

In the era of LED and LCD displays, red, green, and blue light are mixed in different proportions to produce every other color: that is RGB.
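
The two models describe the same colors and are related by a simple linear transform. Here is a rough sketch using the classic BT.601 coefficients (other standards, such as BT.709, use slightly different constants, so treat the exact numbers as illustrative):

#include <cstdio>

// Convert one RGB pixel (components in [0, 1]) to analog YUV, BT.601 style.
// Y carries all the brightness; U and V carry only color differences.
void rgb2yuv(double r, double g, double b, double& y, double& u, double& v) {
    y = 0.299 * r + 0.587 * g + 0.114 * b; // luminance: weighted sum of R, G, B
    u = 0.492 * (b - y);                   // blue color difference
    v = 0.877 * (r - y);                   // red color difference
}

int main() {
    double y, u, v;
    rgb2yuv(1.0, 1.0, 1.0, y, u, v); // pure white
    // For white (and any gray), U and V come out as zero: all the
    // information sits in Y, which is why a grayscale picture only
    // needs the luminance channel.
    printf("white -> Y=%.3f U=%.3f V=%.3f\n", y, u, v);
    return 0;
}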

Why is H264 usually decoded into YUV format instead of RGB format

H264-encoded video is usually decoded to YUV format first rather than RGB format, mainly for the following reasons:

  1. Compression efficiency: The YUV color encoding is better suited to video compression. In YUV, luminance information (Y) and chrominance information (UV) are separated, which allows the chrominance to be compressed much more heavily, because the human eye is far more sensitive to luminance than to chrominance. This means that at the same visual quality, video data in YUV format is usually smaller than in RGB format.

  2. Compatibility: Many video devices and systems, including televisions and DVD players, use the YUV format. Therefore, decoding to YUV format ensures video compatibility on these devices and systems.

  3. Color space conversion: Although H264-encoded video can be decoded into RGB format, this requires an additional color space conversion step on top of decoding. In contrast, decoding directly to YUV is simpler and more efficient (see the sketch after this list).
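
To make point 3 concrete, here is roughly what that extra step looks like using libswscale. This is a minimal sketch: it assumes the decoded frame is in YUV420P and omits error handling.

extern "C" {
#include <libswscale/swscale.h>
#include <libavutil/frame.h>
}

// 'frame' is a decoded AV_PIX_FMT_YUV420P frame, as in the examples below.
AVFrame* yuv_to_rgb(const AVFrame* frame) {
    // Set up the color space converter for this frame geometry
    SwsContext* sws = sws_getContext(frame->width, frame->height, AV_PIX_FMT_YUV420P,
                                     frame->width, frame->height, AV_PIX_FMT_RGB24,
                                     SWS_BILINEAR, NULL, NULL, NULL);
    AVFrame* rgb = av_frame_alloc();
    rgb->format = AV_PIX_FMT_RGB24;
    rgb->width = frame->width;
    rgb->height = frame->height;
    av_frame_get_buffer(rgb, 0); // allocate the RGB pixel buffer

    // The extra conversion pass: three YUV planes in, packed RGB out
    sws_scale(sws, frame->data, frame->linesize, 0, frame->height,
              rgb->data, rgb->linesize);
    sws_freeContext(sws);
    return rgb;
}

If you only need to display or re-encode the video, skipping this pass saves a full per-frame copy and conversion.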

Therefore, although H264 encoded video can be decoded into RGB format, it is usually decoded into YUV format first due to considerations of compression efficiency, compatibility, and processing efficiency.

Is YUV or H264 used for video transmission?

In video transmission, the H264 format is usually used. This is because H264 is an efficient video compression standard that can significantly reduce the bandwidth and storage space required while maintaining high quality. Due to its efficient compression performance, H264 has become the mainstream encoding format for online videos and streaming media.

However, this does not mean that YUV plays no part in video transmission. In fact, before video is encoded into H264 format, it is usually converted to YUV format first: the encoder consumes raw YUV frames and emits compressed H264 packets.
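
As a minimal sketch of that step through the FFmpeg API (the resolution and frame rate here are assumptions, and error handling is omitted):

extern "C" {
#include <libavcodec/avcodec.h>
}

void encode_example() {
    // Find the default H.264 encoder (typically libx264 when available)
    const AVCodec* codec = avcodec_find_encoder(AV_CODEC_ID_H264);
    AVCodecContext* enc = avcodec_alloc_context3(codec);
    enc->width = 1920;                  // assumed source size
    enc->height = 1080;
    enc->pix_fmt = AV_PIX_FMT_YUV420P;  // the encoder consumes YUV, not RGB
    enc->time_base = AVRational{1, 30}; // assumed 30 fps
    avcodec_open2(enc, codec, NULL);

    AVFrame* frame = av_frame_alloc();  // holds one raw YUV420P picture
    frame->format = enc->pix_fmt;
    frame->width = enc->width;
    frame->height = enc->height;
    av_frame_get_buffer(frame, 0);
    // ... fill frame->data[0..2] with the Y, U, V planes ...

    AVPacket* pkt = av_packet_alloc();
    avcodec_send_frame(enc, frame);     // push raw YUV in
    while (avcodec_receive_packet(enc, pkt) == 0) {
        // pkt->data now holds compressed H264 bytes ready for transmission
        av_packet_unref(pkt);
    }

    av_packet_free(&pkt);
    av_frame_free(&frame);
    avcodec_free_context(&enc);
}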

Therefore, although the H264 format is mainly used in video transmission, the YUV format still plays an important role in the video processing and encoding process.

Use FFmpeg to convert H264 to YUV

Here is a basic example using the FFmpeg library. It shows how to open an H.264 video file, read each packet, and decode it into YUV frames:

extern "C" {<!-- -->
    #include <libavcodec/avcodec.h>
    #include <libavformat/avformat.h>
}

int main() {<!-- -->
    // Register all codecs and formats
    av_register_all();

    //Open video file
    AVFormatContext* formatContext = NULL;
    if (avformat_open_input( & amp;formatContext, "a.h264", NULL, NULL) < 0) {<!-- -->
        // Error handling...
    }

    // Find video stream
    if (avformat_find_stream_info(formatContext, NULL) < 0) {<!-- -->
        // Error handling...
    }
    int videoStreamIndex = -1;
    for (int i = 0; i < formatContext->nb_streams; i + + ) {<!-- -->
        if (formatContext->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {<!-- -->
            videoStreamIndex = i;
            break;
        }
    }
    if (videoStreamIndex == -1) {<!-- -->
        // Error handling...
    }

    // Find and open the decoder
    AVCodec* codec = avcodec_find_decoder(formatContext->streams[videoStreamIndex]->codecpar->codec_id);
    AVCodecContext* codecContext = avcodec_alloc_context3(codec);
    avcodec_parameters_to_context(codecContext, formatContext->streams[videoStreamIndex]->codecpar);
    if (avcodec_open2(codecContext, codec, NULL) < 0) {<!-- -->
        // Error handling...
    }

    // Read and decode video frames
    AVPacket packet;
    AVFrame* frame = av_frame_alloc();
    while (av_read_frame(formatContext, & amp;packet) >= 0) {<!-- -->
        if (packet.stream_index == videoStreamIndex) {<!-- -->
            if (avcodec_send_packet(codecContext, & amp;packet) < 0) {<!-- -->
                // Error handling...
            }
            while (avcodec_receive_frame(codecContext, frame) == 0) {<!-- -->
                // At this time, the frame contains YUV data
                // ...process frame data ...
            }
        }
        av_packet_unref( & amp;packet);
    }

    // Release resources
    av_frame_free( & amp;frame);
    avcodec_close(codecContext);
    avcodec_free_context( & amp;codecContext);
    avformat_close_input( & amp;formatContext);

    return 0;
}

In this example, we first open the video file and locate the video stream. Then we find and open the appropriate decoder. Next, we read and decode the video packets; whenever a new frame arrives, we can process its YUV data. Finally, we release all resources.
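
For instance, "process frame data" could simply mean appending the raw planes to a .yuv file. Here is a sketch for YUV420P output; note that each row must be written separately, because linesize may include padding beyond the visible width:

#include <cstdio>
extern "C" {
#include <libavutil/frame.h>
}

// Append one decoded YUV420P frame to an open FILE* (e.g. fopen("out.yuv", "wb")).
void write_yuv420p(FILE* f, const AVFrame* frame) {
    // Y plane: full resolution
    for (int y = 0; y < frame->height; y++)
        fwrite(frame->data[0] + y * frame->linesize[0], 1, frame->width, f);
    // U and V planes: half resolution in both dimensions
    for (int y = 0; y < frame->height / 2; y++)
        fwrite(frame->data[1] + y * frame->linesize[1], 1, frame->width / 2, f);
    for (int y = 0; y < frame->height / 2; y++)
        fwrite(frame->data[2] + y * frame->linesize[2], 1, frame->width / 2, f);
}

The resulting file can be played with a raw-video player, which is a handy way to verify the decode path.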

Use SDL to display video

In SDL2, YUV data can be displayed directly, which makes playback relatively easy. The decode-and-display loop looks like this (the SDL window, renderer, and texture setup is sketched after the loop):

AVPacket packet;
AVFrame* frame = av_frame_alloc();
while (av_read_frame(pFormatCtx, &packet) >= 0) {
    if (packet.stream_index == videoindex) {
        if (avcodec_send_packet(codecContext, &packet) < 0) {
            // Error handling...
        }
        while (avcodec_receive_frame(codecContext, frame) == 0) {
            // At this point the frame contains YUV data

            // Upload the Y, U and V planes to the texture and display it
            SDL_UpdateYUVTexture(sdlTexture, &sdlRect,
                frame->data[0], frame->linesize[0],
                frame->data[1], frame->linesize[1],
                frame->data[2], frame->linesize[2]);

            SDL_RenderClear(sdlRenderer);
            SDL_RenderCopy(sdlRenderer, sdlTexture, NULL, &sdlRect);
            SDL_RenderPresent(sdlRenderer);
            // SDL end ----------------------

            // Delay 1000/60 ms, assuming 60 frames per second
            SDL_Delay(1000 / 60);
        }
    }
    // Unref the packet whether or not it belonged to the video stream
    av_packet_unref(&packet);
}
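
For completeness, the loop above assumes the SDL objects were created beforehand. A minimal setup sketch, reusing codecContext from the decoder example (the window title is arbitrary):

#include <SDL2/SDL.h>

SDL_Init(SDL_INIT_VIDEO);
SDL_Window* sdlWindow = SDL_CreateWindow("player",
    SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED,
    codecContext->width, codecContext->height, 0);
SDL_Renderer* sdlRenderer = SDL_CreateRenderer(sdlWindow, -1, 0);
// IYUV is SDL's name for planar YUV 4:2:0; it matches AV_PIX_FMT_YUV420P
SDL_Texture* sdlTexture = SDL_CreateTexture(sdlRenderer,
    SDL_PIXELFORMAT_IYUV, SDL_TEXTUREACCESS_STREAMING,
    codecContext->width, codecContext->height);
SDL_Rect sdlRect = {0, 0, codecContext->width, codecContext->height};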

When run, the decoded video plays back in the SDL window.

As mentioned earlier, YUV does not strictly need the U and V channels: the Y channel alone is enough to display a black-and-white picture. If we modify the code to fill both chroma planes with the neutral value 128, the video displays in grayscale.

The code is modified as follows:

// Overwrite the chroma planes with the neutral value 128 (no color).
// For YUV420P, each chroma plane is height/2 rows tall.
memset(frame->data[1], 128, frame->linesize[1] * frame->height / 2);
memset(frame->data[2], 128, frame->linesize[2] * frame->height / 2);
SDL_UpdateYUVTexture(sdlTexture, &sdlRect,
    frame->data[0], frame->linesize[0],
    frame->data[1], frame->linesize[1],
    frame->data[2], frame->linesize[2]);


All of the code is available on git.