Audio Unit Audio Capture in Practice

Requirements

Audio Unit is used on iOS to capture audio data as raw, lossless PCM. Audio Unit cannot capture compressed data directly; audio compression will be covered in later articles.

Implementation principle

Use an Audio Unit to capture audio from the hardware input, such as the built-in microphone or other external input devices with microphone capability (headsets with microphones, external microphones, etc.), provided they are compatible with Apple devices.

Read Prerequisites

  • Basic principles of Core Audio: Jianshu, Juejin, Blog

  • Audio Unit concepts: Jianshu, Juejin, Blog

  • Audio Session basics: Jianshu, Juejin, Blog

  • Basic knowledge of audio and video

  • Basic knowledge of C and C++

This article goes straight to the hands-on implementation. If you need the theoretical background, refer to the links above; here we focus on the points that need attention in practice.

The project is designed for low coupling and high cohesion, so you can drag the relevant modules directly into your own project and just configure the parameters.

GitHub address (with code): Audio Unit Capture

Jianshu address: Audio Unit Capture

Juejin address: Audio Unit Capture

Blog address: Audio Unit Capture

1. Specific implementation

1.1 Code structure

The code is divided into two classes overall: one responsible for capture and one responsible for recording audio to a file. You can start and stop the Audio Unit whenever your application needs to, and while the Audio Unit is running you can record audio files. The requirements above only need the following four APIs.

// Start / Stop audio capture
[[XDXAudioCaptureManager getInstance] startAudioCapture];
[[XDXAudioCaptureManager getInstance] stopAudioCapture];

// Start / Stop audio file recording
[[XDXAudioCaptureManager getInstance] startRecordFile];
[[XDXAudioCaptureManager getInstance] stopRecordFile];
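
For illustration only, the sketch below shows how a hypothetical view controller might drive these four calls from two toggle buttons; the action names and button handling are assumptions, not part of the project.

// Hypothetical usage: toggle capture and file recording from buttons.
// Recording is only meaningful while the Audio Unit is capturing.
- (IBAction)captureButtonTapped:(UIButton *)sender {
    if (!sender.selected) {
        [[XDXAudioCaptureManager getInstance] startAudioCapture];
    } else {
        [[XDXAudioCaptureManager getInstance] stopAudioCapture];
    }
    sender.selected = !sender.selected;
}

- (IBAction)recordButtonTapped:(UIButton *)sender {
    if (!sender.selected) {
        [[XDXAudioCaptureManager getInstance] startRecordFile];
    } else {
        [[XDXAudioCaptureManager getInstance] stopRecordFile];
    }
    sender.selected = !sender.selected;
}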

1.2 Initialize the audio unit

This example is implemented as a singleton, so the audio unit is set up in the initializer, which runs only once. If the audio unit is destroyed, the initialization API must be called again from the outside. In general, repeatedly destroying and re-creating the audio unit is not recommended; it is best to configure it once in the singleton's initialization and afterwards only start and stop it.

iPhone devices only support mono capture by default. If you configure two channels, initialization fails. If you need to simulate stereo, you can manually duplicate the mono data in code; the specific method is mentioned later in the article.

Note: the capture buffer size and the buffer duration cannot be set arbitrarily. In other words, once the buffer duration is fixed, the buffer size we set cannot exceed the maximum amount of data that can be captured in that time. The relationship between duration and data size is given by the formula below.

Sampling formula:

Data rate (bytes per second) = sample rate (Hz) × bits per sample (bit) × channel count / 8

-(instancetype)init {
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        _instace = [super init];
        
        // Note: audioBufferSize can not be more than the max size for durationSec.
        [_instace configureAudioInfoWithDataFormat:&m_audioDataFormat
                                          formatID:kAudioFormatLinearPCM
                                        sampleRate:44100
                                      channelCount:1
                                   audioBufferSize:2048
                                       durationSec:0.02
                                          callBack:AudioCaptureCallback];
    });
    return _instace;
}

- (void)configureAudioInfoWithDataFormat:(AudioStreamBasicDescription *)dataFormat formatID:(UInt32)formatID sampleRate:(Float64)sampleRate channelCount:(UInt32)channelCount audioBufferSize:(int)audioBufferSize durationSec:(float)durationSec callBack:(AURenderCallback)callBack {
    // Configure ASBD
    [self configureAudioToAudioFormat:dataFormat
                      byParamFormatID:formatID
                           sampleRate:sampleRate
                         channelCount:channelCount];
    
    // Set sample time
    [[AVAudioSession sharedInstance] setPreferredIOBufferDuration:durationSec error:NULL];
    
    // Configure Audio Unit
    m_audioUnit = [self configreAudioUnitWithDataFormat:*dataFormat
                                        audioBufferSize:audioBufferSize
                                               callBack:callBack];
}

1.3 Set the audio stream data format ASBD

  • Note

Note that the audio data format is directly tied to the hardware. For the best performance, it is best to use the hardware's own properties, such as its native sample rate and channel count. If we manually change the sample rate, for example, the Audio Unit will perform an internal conversion; although this is invisible in the code, it still reduces performance to some degree.

iOS does not support capturing two channels directly. If you want to simulate stereo, you can duplicate the audio data yourself; the details will be covered in future articles. If you find this useful, please keep following.
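
As a rough sketch of that idea (an assumption for illustration, not code from this project), interleaved 16-bit mono PCM can be expanded into two channels by copying each sample to both the left and right slots:

// Hypothetical helper: duplicate 16-bit mono PCM into interleaved stereo.
// The caller is responsible for freeing the returned buffer.
static int16_t *CopyMonoPCMToStereo(const int16_t *monoSamples, size_t sampleCount) {
    int16_t *stereo = (int16_t *)malloc(sampleCount * 2 * sizeof(int16_t));
    if (!stereo) return NULL;
    for (size_t i = 0; i < sampleCount; i++) {
        stereo[2 * i]     = monoSamples[i];   // left channel
        stereo[2 * i + 1] = monoSamples[i];   // right channel
    }
    return stereo;
}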

  • Get the audio property value

The AudioSessionGetProperty function queries the value of a specified property of the current hardware. For example, kAudioSessionProperty_CurrentHardwareSampleRate queries the current hardware sample rate, and kAudioSessionProperty_CurrentHardwareInputNumberChannels queries the current number of input channels. Because manually assigning these values is more flexible in this example, the queried values are not actually used.
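
Note that AudioSessionGetProperty belongs to the long-deprecated Audio Session C API. As a sketch (not part of the original code), the same information can be read from AVAudioSession, which this example already uses for the buffer duration:

// Querying the current hardware values via AVAudioSession. These values are
// informational here, since this example assigns sample rate and channel count manually.
AVAudioSession *session = [AVAudioSession sharedInstance];
double hardwareSampleRate = session.sampleRate;                    // e.g. 44100 or 48000
NSInteger hardwareInputChannels = session.inputNumberOfChannels;   // usually 1 on iPhone
NSLog(@"hardware sample rate: %f, input channels: %ld", hardwareSampleRate, (long)hardwareInputChannels);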

  • Set properties for different format customizations

First, you must distinguish between uncompressed formats (PCM, ...) and compressed formats (AAC, ...). Capturing uncompressed data on iOS gives you exactly the data collected by the hardware; because the audio unit cannot directly capture AAC-type data, only raw PCM is captured here.

For the PCM format you must set mFormatFlags, which describes how each sample is stored; mBitsPerChannel, the bit width of each sample per channel (iOS uses 16 bits per channel); and mFramesPerPacket, the number of frames per packet (for uncompressed PCM there is exactly 1 frame per packet). The number of bytes per packet, which for PCM equals the number of bytes per frame, can then be computed simply as shown below.

Note that for compressed formats most of the fields above should not be set individually and are left at their default of 0. For compressed data, the number of frames per packet and the number of bytes per packet can vary, so they cannot be set in advance; mFramesPerPacket, for example, is only known once compression has finished.

#define kXDXAudioPCMFramesPerPacket 1
#define KXDXAudioBitsPerChannel 16

-(void)configureAudioToAudioFormat:(AudioStreamBasicDescription *)audioFormat byParamFormatID:(UInt32)formatID sampleRate:(Float64)sampleRate channelCount:(UInt32)channelCount {
    AudioStreamBasicDescription dataFormat = {0};
    UInt32 size = sizeof(dataFormat.mSampleRate);
    // Get hardware origin sample rate. (Recommended it)
    Float64 hardwareSampleRate = 0;
    AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareSampleRate,
                             &size,
                             &hardwareSampleRate);
    // Manual set sample rate
    dataFormat.mSampleRate = sampleRate;
    
    size = sizeof(dataFormat.mChannelsPerFrame);
    // Get hardware origin channels number. (Must refer to it)
    UInt32 hardwareNumberChannels = 0;
    AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareInputNumberChannels,
                             &size,
                             &hardwareNumberChannels);
    dataFormat.mChannelsPerFrame = channelCount;
    
    dataFormat.mFormatID = formatID;
    
    if (formatID == kAudioFormatLinearPCM) {
        dataFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
        dataFormat.mBitsPerChannel = KXDXAudioBitsPerChannel;
        dataFormat.mBytesPerPacket = dataFormat.mBytesPerFrame = (dataFormat.mBitsPerChannel / 8) * dataFormat.mChannelsPerFrame;
        dataFormat.mFramesPerPacket = kXDXAudioPCMFramesPerPacket;
    }

    memcpy(audioFormat, &dataFormat, sizeof(dataFormat));
    NSLog(@"%@: %s - sample rate:%f, channel count:%d",kModuleName, __func__,sampleRate,channelCount);
}

1.4 Set the sampling time

Use AVAudioSession to set the I/O buffer duration. Note that once the duration is fixed, the buffer size we set cannot exceed the maximum amount of data that can be captured in that time:

Data rate (bytes per second) = sample rate (Hz) × bits per sample (bit) × channel count / 8

For example: with a sample rate of 44.1 kHz, 16 bits per sample, 1 channel, and a buffer duration of 0.01 seconds, the maximum amount of data is 88200 × 0.01 = 882 bytes. Even if we request a larger buffer, the system will deliver at most 882 bytes of audio data per I/O cycle.

[[AVAudioSession sharedInstance] setPreferredIOBufferDuration:durationSec error:NULL];
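
As a quick sanity check, the maximum buffer size for a given duration follows directly from the formula above; the helper below is only an illustrative sketch, not part of the project.

// Hypothetical helper: maximum number of bytes that can be captured in one
// I/O cycle of durationSec seconds, from the data-rate formula above.
static UInt32 MaxCaptureBufferSize(Float64 sampleRate, UInt32 bitsPerChannel,
                                   UInt32 channelCount, Float64 durationSec) {
    Float64 bytesPerSecond = (sampleRate * bitsPerChannel * channelCount) / 8.0;
    return (UInt32)(bytesPerSecond * durationSec);
}

// MaxCaptureBufferSize(44100, 16, 1, 0.01) ≈ 882 bytes, matching the example above.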

1.5 Configuring Audio Units

m_audioUnit = [self configreAudioUnitWithDataFormat:*dataFormat audioBufferSize:audioBufferSize callBack:callBack];
                                               
- (AudioUnit)configreAudioUnitWithDataFormat:(AudioStreamBasicDescription)dataFormat audioBufferSize:(int)audioBufferSize callBack:(AURenderCallback)callBack {
    AudioUnit audioUnit = [self createAudioUnitObject];
    
    if (!audioUnit) {
        return NULL;
    }
    
    [self initCaptureAudioBufferWithAudioUnit:audioUnit channelCount:dataFormat.mChannelsPerFrame dataByteSize:audioBufferSize];
    
    
    [self setAudioUnitPropertyWithAudioUnit:audioUnit dataFormat:dataFormat];
    
    [self initCaptureCallbackWithAudioUnit:audioUnit callBack:callBack];
    
    // Calls to AudioUnitInitialize() can fail if called back-to-back on different ADM instances. A fall-back solution is to allow multiple sequential calls with as small delay between each. This factor sets the max number of allowed initialization attempts.
    OSStatus status = AudioUnitInitialize(audioUnit);
    if (status != noErr) {
        NSLog(@"%@: %s - couldn't init audio unit instance, status : %d \
",kModuleName,__func__,status);
    }
    
    return audioUnit;
}
  • Create an audio unit object

Here you can specify which kind of audio unit to create. The kAudioUnitSubType_VoiceProcessingIO subtype used here provides echo cancellation and voice enhancement. If you only need the raw, unprocessed audio data, you can use the kAudioUnitSubType_RemoteIO subtype instead. To learn more about audio unit types, see the links at the top of the article.

AudioComponentFindNext: pass NULL as the first parameter to search, in the system-defined order, for the first audio component matching the description. If you pass a previously found audio component instead, the function continues searching for the next component that matches the description.

-(AudioUnit)createAudioUnitObject {
    AudioUnit audioUnit;
    AudioComponentDescription audioDesc;
    audioDesc.componentType = kAudioUnitType_Output;
    audioDesc.componentSubType = kAudioUnitSubType_VoiceProcessingIO;//kAudioUnitSubType_RemoteIO;
    audioDesc.componentManufacturer = kAudioUnitManufacturer_Apple;
    audioDesc.componentFlags = 0;
    audioDesc.componentFlagsMask = 0;
    
    AudioComponent inputComponent = AudioComponentFindNext(NULL, &audioDesc);
    OSStatus status = AudioComponentInstanceNew(inputComponent, &audioUnit);
    if (status != noErr) {
        NSLog(@"%@: %s - create audio unit failed, status : %d \
",kModuleName, __func__, status);
        return NULL;
    } else {
        return audioUnit;
    }
}
  • Create a data structure that receives captured audio data

kAudioUnitProperty_ShouldAllocateBuffer: defaults to true, which lets the audio unit allocate the buffer handed to the callback. Here we set it to false and define our own bufferList to receive the captured audio data.

- (void)initCaptureAudioBufferWithAudioUnit:(AudioUnit)audioUnit channelCount:(int)channelCount dataByteSize:(int)dataByteSize {
    // Disable AU buffer allocation for the recorder, we allocate our own.
    UInt32 flag = 0;
    OSStatus status = AudioUnitSetProperty(audioUnit,
                                           kAudioUnitProperty_ShouldAllocateBuffer,
                                           kAudioUnitScope_Output,
                                           INPUT_BUS,
                                            &flag,
                                           sizeof(flag));
    if (status != noErr) {
        NSLog(@"%@: %s - could not allocate buffer of callback, status : %d \
", kModuleName, __func__, status);
    }
    
    AudioBufferList * buffList = (AudioBufferList*)malloc(sizeof(AudioBufferList));
    buffList->mNumberBuffers = 1;
    buffList->mBuffers[0].mNumberChannels = channelCount;
    buffList->mBuffers[0].mDataByteSize = dataByteSize;
    buffList->mBuffers[0].mData = (UInt32 *)malloc(dataByteSize);
    m_buffList = buffList;
}
  • Set audio unit properties

    • kAudioUnitProperty_StreamFormat: Set the format of the audio data stream through the previously created ASBD

    • kAudioOutputUnitProperty_EnableIO: enable/disable for input/output

input bus / input element: Connect to the hardware input of the device (eg: microphone)

output bus / output element: Connect to the hardware output of the device (eg: speaker)

input scope / output scope: each element has an input scope and an output scope. Taking capture as an example, audio flows from the hardware into the input scope of the input element, but our code can only read the data from that element's output scope, because the input scope is where the audio unit interacts with the hardware. That is why the code below sets INPUT_BUS together with kAudioUnitScope_Output.

The remote I/O audio unit has the output element enabled and the input element disabled by default. Since this article uses the audio unit for audio capture, we need to enable the input element and disable the output element.

- (void)setAudioUnitPropertyWithAudioUnit:(AudioUnit)audioUnit dataFormat:(AudioStreamBasicDescription)dataFormat {
    OSStatus status;
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_StreamFormat,
                                  kAudioUnitScope_Output,
                                  INPUT_BUS,
                                   &dataFormat,
                                  sizeof(dataFormat));
    if (status != noErr) {
        NSLog(@"%@: %s - set audio unit stream format failed, status : %d \
",kModuleName, __func__,status);
    }
    
    /*
     // Bypass voice processing (echo cancellation); had no noticeable effect in testing.
     UInt32 echoCancellation = 0;
     AudioUnitSetProperty(m_audioUnit,
     kAUVoiceIOProperty_BypassVoiceProcessing,
     kAudioUnitScope_Global,
     0,
      &echoCancellation,
     sizeof(echoCancellation));
     */
    
    UInt32 enableFlag = 1;
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_EnableIO,
                                  kAudioUnitScope_Input,
                                  INPUT_BUS,
                                   &enableFlag,
                                  sizeof(enableFlag));
    if (status != noErr) {
        NSLog(@"%@: %s - could not enable input on AURemoteIO, status : %d \
",kModuleName, __func__, status);
    }
    
    UInt32 disableFlag = 0;
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_EnableIO,
                                  kAudioUnitScope_Output,
                                  OUTPUT_BUS,
                                   &disableFlag,
                                  sizeof(disableFlag));
    if (status != noErr) {
        NSLog(@"%@: %s - could not enable output on AURemoteIO, status : %d \
",kModuleName, __func__,status);
    }
}
  • Register callback function to receive audio data

- (void)initCaptureCallbackWithAudioUnit:(AudioUnit)audioUnit callBack:(AURenderCallback)callBack {
    AURenderCallbackStruct captureCallback;
    captureCallback.inputProc = callBack;
    captureCallback.inputProcRefCon = (__bridge void *)self;
    OSStatus status = AudioUnitSetProperty(audioUnit,
                                                            kAudioOutputUnitProperty_SetInputCallback,
                                                            kAudioUnitScope_Global,
                                                            INPUT_BUS,
                                                             &captureCallback,
                                                            sizeof(captureCallback));
    
    if (status != noErr) {
        NSLog(@"%@: %s - Audio Unit set capture callback failed, status : %d \
",kModuleName, __func__,status);
    }
}

1.6 Turn on the audio unit

Simply call AudioOutputUnitStart to start the audio unit. If everything above is configured correctly, the audio unit begins capturing immediately.

- (void)startAudioCaptureWithAudioUnit:(AudioUnit)audioUnit isRunning:(BOOL *)isRunning {
    OSStatus status;
    
    if (*isRunning) {
        NSLog(@"%@: %s - start recorder repeat \
",kModuleName,__func__);
        return;
    }
    
    status = AudioOutputUnitStart(audioUnit);
    if (status == noErr) {
        *isRunning = YES;
        NSLog(@"%@: %s - start audio unit success \
",kModuleName,__func__);
    } else {
        *isRunning = NO;
        NSLog(@"%@: %s - start audio unit failed \
",kModuleName,__func__);
    }
}

1.7 Processing audio data in the callback function

  • inRefCon: developer-defined context data. The instance of this class is usually passed in, because Objective-C properties and methods cannot be called directly from a C callback; this parameter serves as the bridge between the callback function and the Objective-C object.

  • ioActionFlags: flags describing the context of this render call.

  • inTimeStamp: the timestamp of the captured samples.

  • inBusNumber: the bus (element) that invoked this callback.

  • inNumberFrames: the number of frames of audio data available in this call.

  • ioData: the audio data. For an input callback it is NULL; the data has to be pulled with AudioUnitRender.

  • AudioUnitRender: use this function to render the captured audio data into the buffer list we defined earlier (the global m_buffList).

static OSStatus AudioCaptureCallback(void *inRefCon,
                                     AudioUnitRenderActionFlags *ioActionFlags,
                                     const AudioTimeStamp *inTimeStamp,
                                     UInt32 inBusNumber,
                                     UInt32 inNumberFrames,
                                     AudioBufferList *ioData) {
    AudioUnitRender(m_audioUnit, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, m_buffList);
    
    XDXAudioCaptureManager *manager = (__bridge XDXAudioCaptureManager *)inRefCon;
    
    /* Test audio fps
     static Float64 lastTime = 0;
     Float64 currentTime = CMTimeGetSeconds(CMClockMakeHostTimeFromSystemUnits(inTimeStamp->mHostTime))*1000;
     NSLog(@"Test duration - %f", currentTime - lastTime);
     lastTime = currentTime;
     */
    
    void *bufferData = m_buffList->mBuffers[0].mData;
    UInt32 bufferSize = m_buffList->mBuffers[0].mDataByteSize;
    
    // NSLog(@"demon = %d",bufferSize);
    
    if (manager.isRecordVoice) {
        [[XDXAudioFileHandler getInstance] writeFileWithInNumBytes:bufferSize
                                                      ioNumPackets:inNumberFrames
                                                          inBuffer:bufferData
                                                      inPacketDesc:NULL];
    }
    
    return noErr;
}

1.8 Stop the audio unit

AudioOutputUnitStop: stops the audio unit.

-(void)stopAudioCaptureWithAudioUnit:(AudioUnit)audioUnit isRunning:(BOOL *)isRunning {
    if (*isRunning == NO) {
        NSLog(@"%@: %s - stop capture repeat \
",kModuleName,__func__);
        return;
    }
    
    *isRunning = NO;
    if (audioUnit != NULL) {
        OSStatus status = AudioOutputUnitStop(audioUnit);
        if (status != noErr){
            NSLog(@"%@: %s - stop audio unit failed. \
",kModuleName,__func__);
        } else {
            NSLog(@"%@: %s - stop audio unit successful",kModuleName,__func__);
        }
    }
}

1.9 Release the audio unit

When the audio unit is no longer needed at all, we can release all of its related resources. Note that the release must happen in order: first stop the audio unit, then uninitialize it to return it to its uninitialized state, and finally dispose of the instance to free all of its memory.

- (void)freeAudioUnit:(AudioUnit)audioUnit {
    if (!audioUnit) {
        NSLog(@"%@: %s - repeat call!",kModuleName,__func__);
        return;
    }
    
    OSStatus result = AudioOutputUnitStop(audioUnit);
    if (result != noErr){
        NSLog(@"%@: %s - stop audio unit failed.",kModuleName,__func__);
    }
    
    result = AudioUnitUninitialize(audioUnit);
    if (result != noErr) {
        NSLog(@"%@: %s - uninitialize audio unit failed, status : %d",kModuleName,__func__,result);
    }
    
    // It will trigger audio route change repeatedly
    result = AudioComponentInstanceDispose(audioUnit);
    if (result != noErr) {
        NSLog(@"%@: %s - dispose audio unit failed. status : %d",kModuleName,__func__,result);
    } else {
        audioUnit = nil;
    }
}

1.10 Audio file recording

For this part, refer to the companion article on audio file recording; a minimal file-writing sketch is also included after the links below.

  • Jianshu address: Audio File Record

  • Juejin address: Audio File Record

  • Blog address: Audio File Record
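
For reference, the sketch below shows one way the captured PCM could be written to a CAF file with AudioToolbox's AudioFile API. It is a minimal stand-in for illustration, assuming the same ASBD used for capture; it is not the XDXAudioFileHandler implementation.

// Minimal sketch of PCM file writing with the AudioFile API (hypothetical helpers).
#import <AudioToolbox/AudioToolbox.h>

static AudioFileID g_audioFile;
static SInt64      g_packetIndex;

// Create the file with the same ASBD used for capture.
static BOOL StartRecordPCMFile(NSURL *url, AudioStreamBasicDescription asbd) {
    OSStatus status = AudioFileCreateWithURL((__bridge CFURLRef)url,
                                             kAudioFileCAFType,
                                             &asbd,
                                             kAudioFileFlags_EraseFile,
                                             &g_audioFile);
    g_packetIndex = 0;
    return status == noErr;
}

// Call from the capture callback: for PCM, 1 frame == 1 packet, no packet descriptions.
static void WriteCapturedPCM(void *data, UInt32 byteSize, UInt32 numPackets) {
    UInt32 ioNumPackets = numPackets;
    OSStatus status = AudioFileWritePackets(g_audioFile,
                                            false,          // don't use cache
                                            byteSize,
                                            NULL,           // no packet descriptions for PCM
                                            g_packetIndex,
                                            &ioNumPackets,
                                            data);
    if (status == noErr) g_packetIndex += ioNumPackets;
}

static void StopRecordPCMFile(void) {
    AudioFileClose(g_audioFile);
}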

Original link: Audio Unit Audio Capture in Practice – Juejin