Android audio and video: H.264 video stream decoding

1. Introduction

H.264 is a widely used digital video compression format, applied mainly to live streaming and to on-demand playback on video websites. Many developers have also begun to adopt H.265 for video compression, which offers a significant improvement in compression performance over H.264. This article focuses on decoding raw H.264 byte-stream data with MediaCodec hardware decoding. For more background on H.264 itself, see the introduction to the H.264 structure in the reference articles.

2. Using MediaCodec hardware decoding

2.1 Introduction to MediaCodec

  • The MediaCodec class provided by Android gives access to the low-level multimedia codecs. It is part of Android's low-level multimedia architecture and is usually used together with MediaExtractor, MediaMuxer, and AudioTrack. It can encode and decode common audio and video formats such as H.264, H.265, AAC, and 3GP.

Android's underlying multimedia module is built on the OpenMax framework, and any implementation of an Android low-level codec module must comply with the OpenMax standard. Google provides a series of software codecs by default, including OMX.google.h264.encoder, OMX.google.h264.decoder, OMX.google.aac.encoder, OMX.google.aac.decoder, and so on, while hardware encoding and decoding must be implemented by the chip manufacturer in accordance with the OpenMax framework standard. As a result, the implementation and performance of hardware encoding and decoding generally differ between phones built on different chip models.

The Android application layer accesses all of these audio and video codec functions through the MediaCodec API; parameter configuration determines which codec is used, whether hardware acceleration is applied, and so on.
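To see which of these OpenMax components a particular device actually provides, you can enumerate them at runtime. The following is a minimal sketch (API 21+, not from the original article) that lists every H.264 decoder on the device; software components show up as OMX.google.* (or c2.android.*) names, hardware components carry the vendor's name:

    import android.media.MediaCodecInfo;
    import android.media.MediaCodecList;

    public class CodecLister {
        /**
         * Print the name of every H.264 (video/avc) decoder on this device.
         */
        public static void listAvcDecoders() {
            MediaCodecList list = new MediaCodecList(MediaCodecList.REGULAR_CODECS);
            for (MediaCodecInfo info : list.getCodecInfos()) {
                if (info.isEncoder()) continue; // decoders only
                for (String type : info.getSupportedTypes()) {
                    if (type.equalsIgnoreCase("video/avc")) {
                        System.out.println(info.getName());
                    }
                }
            }
        }
    }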

2.2 MediaCodec workflow

The codec processes input data and produces output data. MediaCodec uses a set of input and output buffers to process data asynchronously. Briefly, the general steps are as follows (a sketch of the asynchronous callback API that implements this loop appears after the list):

  • Request an empty input buffer from the codec.
  • Fill it with data and hand it back to MediaCodec.
  • Once MediaCodec has processed the data, it places the result in an empty output buffer.
  • Dequeue the filled output buffer, consume the data in it, and then release the buffer back to MediaCodec.
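Since API 21, MediaCodec also exposes an asynchronous callback mode that maps directly onto these four steps. Below is a minimal sketch (not from the original article); fillFrameData is a hypothetical helper that writes one frame into the buffer and returns its size:

    codec.setCallback(new MediaCodec.Callback() {
        @Override
        public void onInputBufferAvailable(MediaCodec mc, int index) {
            // Steps 1-2: an empty input buffer is available; fill and submit it
            ByteBuffer in = mc.getInputBuffer(index);
            int size = fillFrameData(in); // hypothetical helper
            mc.queueInputBuffer(index, 0, size, 0, 0);
        }

        @Override
        public void onOutputBufferAvailable(MediaCodec mc, int index,
                                            MediaCodec.BufferInfo info) {
            // Steps 3-4: output is ready; render it to the Surface and release it
            mc.releaseOutputBuffer(index, true);
        }

        @Override
        public void onError(MediaCodec mc, MediaCodec.CodecException e) {
            e.printStackTrace();
        }

        @Override
        public void onOutputFormatChanged(MediaCodec mc, MediaFormat format) {
            // The output format (e.g. resolution) changed
        }
    });
    // Note: setCallback() must be called before configure()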

2.3 MediaCodec API Description

MediaCodec processes a video stream mainly through the following methods; a sketch of the typical call order follows the list:

  • configure: configure the component as an encoder or decoder.
  • start: after the component has been configured successfully, call start to begin processing.
  • getInputBuffers: get the array of input buffers (ByteBuffer[]) to be filled with data for encoding/decoding.
  • dequeueInputBuffer: take an input buffer from the input queue so it can be filled.
  • queueInputBuffer: submit a filled input buffer to the codec for processing.
  • getOutputBuffers: get the array of output buffers (ByteBuffer[]) that hold the encoded/decoded data.
  • dequeueOutputBuffer: take a buffer of processed data from the output queue.
  • releaseOutputBuffer: return an output buffer to the codec once its data has been consumed.
  • flush: flush the input and output ports of the component.
  • stop: finish the decode/encode session; note that the codec instance remains allocated and can be configured and started again.
  • release: release the resources used by the codec instance.
  • reset: return the codec to its initial (uninitialized) state.
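Putting these together, a typical synchronous decoder session looks roughly like this (a sketch only, with error handling omitted; format and surface are assumed to have been prepared by the caller):

    MediaCodec codec = MediaCodec.createDecoderByType("video/avc");
    codec.configure(format, surface, null, 0); // decoder: no crypto, flags = 0
    codec.start();
    // ... the dequeueInputBuffer / queueInputBuffer / dequeueOutputBuffer /
    // releaseOutputBuffer loop runs here (see section 2.4) ...
    codec.stop();    // back to the uninitialized state, still allocated
    codec.release(); // free the codec instance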

2.4 Talk is cheap. Show me the code

Initialize MediaCodec

    /**
     * Video MIME type
     */
    private final static String MIME_TYPE = "video/avc";

    /**
     * Initialize playback
     */
    private void initVideo(SurfaceHolder holder) {
        try {
            // Initialize MediaCodec. There are two creation methods: by name and by type.
            // Create by type here.
            mMediaCodec = MediaCodec.createDecoderByType(MIME_TYPE);
            // Get the width and height of the surface
            mVideoWidth = holder.getSurfaceFrame().width();
            mVideoHeight = holder.getSurfaceFrame().height();
            // MediaFormat carries the bit rate, frame rate, key-frame interval, etc.
            // The width/height passed here must match the dimensions of the H.264 stream.
            // (Too low a bit rate produces mosaic-like artifacts when encoding.)
            mMediaFormat = MediaFormat.createVideoFormat(MIME_TYPE, 1080, 1920);
            // Set the bit rate (an encoder-oriented key; optional for a decoder)
            mMediaFormat.setInteger(MediaFormat.KEY_BIT_RATE,
                    mVideoHeight * mVideoWidth * 5);
            // Set the frame rate
            mMediaFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 30);

            if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP) {
                // Key describing the desired bitrate mode to be used by the encoder:
                // BITRATE_MODE_CQ: no control over the bit rate at all; preserves
                //                  image quality as much as possible
                // BITRATE_MODE_CBR: the encoder tries to hold the output bit rate
                //                   at the configured value
                // BITRATE_MODE_VBR: the encoder adjusts the output bit rate with the
                //                   complexity of the image content (the amount of change
                //                   between frames): complex images get a higher bit rate,
                //                   simple images a lower one
                mMediaFormat.setInteger(MediaFormat.KEY_BITRATE_MODE,
                        MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_VBR);
            }
            mMediaFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);

            // Sample SPS/PPS NAL units (with Annex-B start codes) for one particular
            // stream; csd-0/csd-1 must match the stream actually being decoded
            byte[] headerSps = {0, 0, 0, 1, 103, 66, 0, 41, -115, -115, 64, 80,
                    30, -48, 15, 8, -124, 83, -128};
            byte[] headerPps = {0, 0, 0, 1, 104, -54, 67, -56};

            mMediaFormat.setByteBuffer("csd-0", ByteBuffer.wrap(headerSps));
            mMediaFormat.setByteBuffer("csd-1", ByteBuffer.wrap(headerPps));

            mMediaCodec.configure(mMediaFormat, holder.getSurface(), null, 0);
            mMediaCodec.start();

        } catch (IOException e) {
            e.printStackTrace();
        }
    }
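The SPS/PPS bytes above are hardcoded for one particular sample stream. In practice they are usually extracted from the byte stream itself; the sketch below (not from the original article) shows how to recognize them, assuming 4-byte Annex-B start codes. The low 5 bits of the byte following the start code hold the nal_unit_type: 7 means SPS and 8 means PPS.

    /**
     * Check for a 4-byte Annex-B start code (0x00 0x00 0x00 0x01) at offset i.
     */
    private static boolean isStartCode(byte[] data, int i) {
        return data[i] == 0 && data[i + 1] == 0
                && data[i + 2] == 0 && data[i + 3] == 1;
    }

    /**
     * NAL unit type of the unit whose start code begins at offset i:
     * 7 = SPS, 8 = PPS, 5 = IDR slice, 1 = non-IDR slice, ...
     */
    private static int nalUnitType(byte[] data, int i) {
        return data[i + 4] & 0x1F;
    }

Scanning the first few NAL units of a stream with these helpers yields the buffers to set as csd-0 (SPS) and csd-1 (PPS) on the MediaFormat.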

Video decoding code

Pass each frame of data (a byte[] received from the network or read from a file) into onFrame:

    /**
     * Decode a frame of data and render the video
     *
     * @param buf    video data array
     * @param offset data offset
     * @param length valid data length
     */
    private void onFrame(byte[] buf, int offset, int length) {
        try {
            ByteBuffer[] inputBuffers = mMediaCodec.getInputBuffers();
            // With a timeout of 0 this returns immediately; if no input
            // buffer is free, the frame is simply dropped
            int inputBufferIndex = mMediaCodec.dequeueInputBuffer(0);
            if (inputBufferIndex >= 0) {
                ByteBuffer inputBuffer = inputBuffers[inputBufferIndex];
                inputBuffer.clear();
                inputBuffer.put(buf, offset, length);
                // mCount * 30 serves as a crude, monotonically increasing
                // presentation timestamp (in microseconds)
                mMediaCodec.queueInputBuffer(inputBufferIndex, 0, length,
                        mCount * 30, 0);
                mCount++;
            }
            MediaCodec.BufferInfo bufferInfo = new MediaCodec.BufferInfo();
            int outputBufferIndex = mMediaCodec.dequeueOutputBuffer(bufferInfo, 0);
            while (outputBufferIndex >= 0) {
                // true: render this output buffer to the Surface before releasing it
                mMediaCodec.releaseOutputBuffer(outputBufferIndex, true);
                outputBufferIndex = mMediaCodec.dequeueOutputBuffer(bufferInfo, 0);
                if (!isPlayingSound) {
                    mHandler.postDelayed(() -> isPlayingSound = true, 1000);
                }
            }
        } catch (Throwable t) {
            t.printStackTrace();
        }
    }
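For completeness, here is one way to drive onFrame from a raw Annex-B .h264 file: read the file into memory, split it on start codes, and feed one NAL unit (start code included) at a time. This is a sketch, not the original author's code; the 33 ms sleep is a crude stand-in for real frame pacing:

    private void playFile(String path) throws IOException, InterruptedException {
        // Read the whole file into memory (fine for short test clips)
        byte[] data;
        try (FileInputStream fis = new FileInputStream(path);
             ByteArrayOutputStream bos = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = fis.read(chunk)) > 0) {
                bos.write(chunk, 0, n);
            }
            data = bos.toByteArray();
        }
        // Split on 0x00000001 start codes and feed each unit to onFrame()
        int start = 0;
        for (int i = 4; i + 4 <= data.length; i++) {
            if (data[i] == 0 && data[i + 1] == 0
                    && data[i + 2] == 0 && data[i + 3] == 1) {
                onFrame(data, start, i - start);
                start = i;
                Thread.sleep(33); // roughly 30 fps
            }
        }
        onFrame(data, start, data.length - start); // last unit
    }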

3. Decoding using FFmpeg

For usage and principles, please see reference article 3; for the specific code implementation, refer to this class.

4. Reference articles

  1. Android MediaCodec official documentation
  2. Android native codec interface MediaCodec – complete analysis
  3. FFmpeg decoding H.264
  4. H.264 structure

Author: Kong_zZ
