Building a simple video player with FFmpeg, SDL, and Qt

This article describes how to build a simple video player using FFmpeg, SDL, and Qt.

FFmpeg is an open-source library for audio and video processing. It provides a C interface for encoding, decoding, muxing, demuxing, and stream processing. In this tutorial, FFmpeg is mainly used to decapsulate (demux) and decode container files.

SDL is an open-source cross-platform multimedia library; here it is used for video rendering and audio playback.

Qt is used to build a simple UI for the player: a play/pause button and checkboxes for selecting audio and video.

First, some basic knowledge about audio and video. Files such as MP4 and MKV are container (encapsulated) files, which generally hold two streams, one audio and one video. Each stream carries information such as its codec parameters and time base.

The overall flow of the player is as follows. First it pulls a stream from a server or opens a local video file (with FFmpeg the interface is the same; only the address differs). After opening, it decapsulates the container and reads one packet at a time. Audio packets are decoded into audio frames and video packets into video frames; the frames are then converted into formats suitable for playback and rendering.

The program flow is as follows. First write the Qt UI, initialize FFmpeg and SDL, open the input source, open the corresponding decoders, choose the playback formats, and create converters based on the input and playback formats. Then create the audio and video playback threads and start reading the file. For each packet read, stream_index tells whether it is audio or video: an audio packet is decoded into an audio frame, converted, and pushed into the audio cache; a video packet is likewise decoded, converted, and pushed into the video cache. Then the next packet is read. Because audio and video are played by different threads, synchronization is also required; generally the audio clock is used as the reference, and video frames are rendered according to the audio playback time.

Therefore, the whole process is explained in five parts below: initialization, decapsulation, decoding, conversion, and playback. The project uses Qt as its framework, so a Qt application project is needed.

Initialization

FFmpeg network initialization: because the program pulls streams from an RTMP server, the network layer must be initialized first.

avformat_network_init();

SDL initialization, initializes the audio and video modules and the time module.

if (SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER)) {
    TestNotNull(NULL, SDL_GetError());
}

After creating the window and its buttons, lay them out and connect the slot functions.

ui.audio->setCheckState(Qt::Checked);
ui.video->setCheckState(Qt::Checked);

//Horizontal layout, control buttons
QBoxLayout *ctlLayout = new QHBoxLayout;
ctlLayout->addWidget(ui.pauseBtn, 5, Qt::AlignCenter);
ctlLayout->addWidget(ui.video);
ctlLayout->addWidget(ui.audio);

//Vertical layout: video player, progress bar, control button layout
QBoxLayout *mainLayout = new QVBoxLayout;
mainLayout->addWidget(videoWidget);
mainLayout->addLayout(ctlLayout);

//Set layout
mainWindowWidget->setLayout(mainLayout);

//Set slot function
connect(ui.pauseBtn, SIGNAL(clicked()), this, SLOT(ChangePlay()));
connect(ui.audio, SIGNAL(stateChanged(int)), this, SLOT(ChangeAudio()));
connect(ui.video, SIGNAL(stateChanged(int)), this, SLOT(ChangeVideo()));

Decapsulation

Before decapsulating, you need to open the input and find the indices of the audio and video streams, which will later be used to determine which stream each packet read from the file belongs to.

//Open input
if (avformat_open_input(&pFmtCtx, fileName.c_str(), NULL, NULL) < 0) {
    TestNotNull(NULL, "Failed to open input file: " + fileName);
}
//Find stream information
if (avformat_find_stream_info(pFmtCtx, NULL) < 0) {
    TestNotNull(NULL, "Failed to find stream information.");
}
//Find the audio stream index
for (int i = 0; i < pFmtCtx->nb_streams; i++) {
    if (pFmtCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_AUDIO)
        aIndex = i;
}

//Find the video stream index
for (int i = 0; i < pFmtCtx->nb_streams; i++) {
    if (pFmtCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO)
        vIndex = i;
}

To open the audio and video decoders, you would normally allocate a separate decoder AVCodecContext, but many of the examples on the official site simply use the context embedded in the stream (streams[i]->codec, deprecated in newer FFmpeg versions) and open it directly, so that is what I did.


//Open the audio decoder
pAudioCdcCtx = pFmtCtx->streams[aIndex]->codec;
AVCodec* pAudioCdc = avcodec_find_decoder(pAudioCdcCtx->codec_id);
TestNotNull(pAudioCdc, "Failed to find audio decoder.");
if (avcodec_open2(pAudioCdcCtx, pAudioCdc, NULL) < 0) {
    TestNotNull(NULL, "Failed to open audio codec.");
}

//Open the video decoder
pVideoCdcCtx = pFmtCtx->streams[vIndex]->codec;
AVCodec* pVideoCdc = avcodec_find_decoder(pVideoCdcCtx->codec_id);
TestNotNull(pVideoCdc, "Failed to find video decoder.");
if (avcodec_open2(pVideoCdcCtx, pVideoCdc, NULL) < 0) {
    TestNotNull(NULL, "Failed to open video codec.");
}

Then call av_read_frame() to read one packet from the file.

AVPacket pkt = { 0 };
av_init_packet(&pkt);
av_read_frame(pFmtCtx, &pkt);

Conversion

Video conversion mainly changes the pixel format and dimensions of each video frame.

//Create the converter (note: the third argument is the *input* pixel format)
pSwsCtx = sws_getContext(pIn->width,
    pIn->height,
    (AVPixelFormat)pIn->format,
    pOut->width,
    pOut->height,
    (AVPixelFormat)pOut->format,
    SWS_BILINEAR, NULL, NULL, NULL);
....
//Convert
sws_scale(pSwsCtx, (uint8_t const* const*)pInFrm->data,
    pInFrm->linesize, 0, pInFrm->height, pFrmYUV->data, pFrmYUV->linesize);

The audio conversion mainly changes the sample format, channel layout, and sample rate.

//Create the converter
pSwrCtx = swr_alloc_set_opts(NULL,
    pOutPara->channel_layout,
    (AVSampleFormat)pOutPara->format,
    pOutPara->sample_rate,
    pInPara->channel_layout,
    (AVSampleFormat)pInPara->format,
    pInPara->sample_rate,
    0, NULL);
swr_init(pSwrCtx);  //the context must be initialized before use
.....
//Convert
swr_convert(pSwrCtx, &(pBuff->dataPos), pBuff->dataLen,
    (const uint8_t**)pFrm->data, pFrm->nb_samples);

Play

Audio playback is the more involved part. First fill in an SDL_AudioSpec structure, which includes a callback function; once the device has been opened with SDL_OpenAudio(), SDL invokes this callback whenever it needs more audio data to play. My callback function is as follows:

void CallBackFunc(void* userdata, Uint8* stream, int len)
{
    SDL_memset(stream, 0, len);
    PlayBuffer* playData = (PlayBuffer*)userdata;
    if (playData->dataLen == 0)
        return;

    /* Mix as much data as possible */
    len = (len > playData->dataLen ? playData->dataLen : len);

    SDL_MixAudio(stream, playData->dataPos, len, SDL_MIX_MAXVOLUME);
    playData->dataPos += len;
    playData->dataLen -= len;
}

In addition, I set up a thread that continuously reads blocks of data from the audio cache queue and updates the userdata buffer, so the callback always has fresh data. (In fact, this thread's work could be done by the callback itself, reading directly from the cache queue.) The thread code is as follows:

void Player::PlayAudio()
{
    while (true)
    {
        if (ctrl.GetPlayStatus() && ctrl.GetAudioStatus()) {
            SDL_PauseAudio(0);
            aPlay.WaitToPlay();
        }
        else {
            SDL_PauseAudio(1);
        }
    }
}

The logic of video playback is easier to follow: set up the window, then continuously read frames from the cache queue and decide, based on the audio playback time, whether to render each one.

The window setup code is below. Note that window creation and rendering must happen on the same thread, otherwise rendering fails with a Reset() INVALIDCALL error. winid is the id of the Qt widget, which embeds the SDL window inside the Qt window.

int VideoPlay::SDLInit(int w, int h, void* winid)
{
    //Create the SDL window from the existing Qt widget
    pScreen = SDL_CreateWindowFrom(winid);
    TestNotNull(pScreen, SDL_GetError());

    //Create the renderer
    pRender = SDL_CreateRenderer(pScreen, -1,
        SDL_RENDERER_ACCELERATED);
    TestNotNull(pRender, SDL_GetError());

    //SDL_HINT_VIDEO_WINDOW_SHARE_PIXEL_FORMAT
    pTexture = SDL_CreateTexture(pRender,
        SDL_PIXELFORMAT_YV12,
        SDL_TEXTUREACCESS_STREAMING,
        rect.w, rect.h);
    TestNotNull(pTexture, SDL_GetError());
    return 0;
}

Video playback thread function:

void Player::PlayVideo(void* p)
{
    vPlay.SDLInit(0, 0, p);
    while (true) {
        //Get the audio playback time
        int64_t ts = aPlay.GetCurrentTime();
        if (ctrl.GetPlayStatus() && ctrl.GetVideoStatus()) {
            //Controller allows playback
            vPlay.Render(ts);
        }
        else {
            //Display a black screen
            vPlay.BlackScreen();
        }
    }
}

Rendering function:

int VideoPlay::Render(int64_t ts) {

    //Get a frame from the queue
    if (pFrame == NULL) {
        pFrame = (AVFrame*)cache->ComsumeElem();
    }
    int64_t diff = pFrame->pts / VIDEO_TIME_BASE - ts / AUDIO_TIME_BASE;
    if (diff < -25 || diff > 1000) {
        //More than 25 ms late, or more than 1000 ms early (bad timestamp): drop
        printf("drop frame, diff = %lld\n", (long long)diff);
        av_frame_free(&pFrame);
        pFrame = NULL;
        return 0;
    }
    if (diff > 20)
    {
        //Not time to show this frame yet: sleep briefly and retry on the next loop
        SDL_Delay(1);
        return 0;
    }
    //Refresh the texture
    int ret = SDL_UpdateYUVTexture(pTexture, &rect,
        pFrame->data[0], pFrame->linesize[0],
        pFrame->data[1], pFrame->linesize[1],
        pFrame->data[2], pFrame->linesize[2]);
    if (ret < 0) {
        TestNotNull(NULL, "texture is not valid.");
    }
    //Clear, copy, present
    //if (SDL_RenderClear(pRender) < 0) {
    //    TestNotNull(NULL, SDL_GetError());
    //}
    if (SDL_RenderCopy(pRender, pTexture, &rect, NULL) < 0) {
        TestNotNull(NULL, SDL_GetError());
    }
    SDL_RenderPresent(pRender);

    printf("play video.\n");
    av_frame_free(&pFrame);
    pFrame = NULL;
    return 0;
}