In our previous case study, "From Chaos to Harmony – How We Built a Custom Audio Module in React Native When Popular Libraries Failed," we gave a high-level overview of the challenges we faced while working on an advanced audio-video application in React Native. We showed how conflicts between popular libraries led us to create our own native solution.
This article is a technical continuation of that story. Here, we'll focus on how exactly we overcame the challenges by designing and building a custom Expo module with centralized audio session management at its core.
The core issue, which manifested as unbalanced volume levels, wasn't a bug in a single library but a fundamental architectural conflict. We were using several powerful but independent tools:
- expo-av: This library loaded sounds slowly, some of its settings didn't work, and, most importantly, it drastically lowered the volume of other sounds during recording.
- react-native-voice: It had an inconvenient API and poor error handling.
- react-native-vision-camera: It lacked built-in audio normalization mechanisms.
Each of these packages is a "separate entity" that tried to manage the native audio session (the way the app communicates with the operating system about sound) on its own. The result was a constant battle for resources and unpredictable behavior that couldn't be fixed at the JavaScript configuration level. To be clear, these tools are great for many use cases and served us well until we needed deeper customization.
We realized that to regain stability, we had to create a single, central place that would be the sole "source of truth" for all audio operations. Our solution was to build a custom Expo Module that, under the hood, used Apple's native AVFoundation framework.
The module's architecture is based on three pillars that directly address the diagnosed challenges:
Dual Audio Management System
We created separate managers for main and temporary audio files. This allowed for independent control over playback, which was crucial for managing dialogues without affecting other in-app sounds.
```swift
// Independent managers for main and temporary audio files
private let mainAudioManager = AudioManager()
private let tempAudioManager = AudioManager()
// Recorder instance for microphone input
private let recorder = AudioRecorder()
```
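To illustrate why two separate manager instances give independent control, here is a minimal sketch of what such a manager could look like. The article does not show its real AudioManager, so `SimpleAudioManager` and its methods are hypothetical names; only the `AVAudioPlayer` calls are actual AVFoundation API.

```swift
import AVFoundation

/// Hypothetical, simplified stand-in for the article's AudioManager.
final class SimpleAudioManager {
    private var player: AVAudioPlayer?

    /// Loads a local audio file and pre-buffers it so playback starts quickly.
    func load(url: URL) throws {
        let newPlayer = try AVAudioPlayer(contentsOf: url)
        newPlayer.prepareToPlay()
        player = newPlayer
    }

    /// Starts playback; returns false if nothing is loaded or playback fails.
    @discardableResult
    func play() -> Bool { player?.play() ?? false }

    func stop() { player?.stop() }

    /// Sets this player's own volume (clamped to 0...1) without touching other players.
    func setVolume(_ volume: Float) {
        player?.volume = min(max(volume, 0), 1)
    }
}

// Two independent instances: changing one never affects the other.
let mainAudio = SimpleAudioManager()
let tempAudio = SimpleAudioManager()
tempAudio.setVolume(0.5)  // only the temporary player is affected
```

Because each instance owns its own `AVAudioPlayer`, stopping or fading a temporary dialogue clip leaves the main track untouched.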
Enhanced Audio Session Control
Direct access to the native audio session configuration ensured consistent behavior across different application states and better integration with the video recording process. The module ensures that audio settings are always optimized for Vision Camera's current configuration.
```swift
AsyncFunction("prepareAudioSession") { (promise: Promise) in
  do {
    try AudioManager.prepareAudioSession()
    promise.resolve(true)
  } catch {
    // Notify JavaScript listeners, then reject the pending promise
    self.sendEvent("onError", [
      "type": ErrorCodes.AudioSessionError.rawValue,
      "message": error.localizedDescription,
    ])
    promise.reject(ErrorCodes.AudioSessionError.rawValue, error.localizedDescription)
  }
}
```
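The body of `AudioManager.prepareAudioSession()` is not shown in this article. As a rough sketch only, a centralized session setup on iOS might look like the following. The type name `AudioSessionSketch` and the chosen category, mode, and options are assumptions for illustration, not the module's actual configuration.

```swift
import AVFoundation

/// Hypothetical stand-in for the session setup (iOS only); the article's
/// actual AudioManager.prepareAudioSession() implementation is not shown.
enum AudioSessionSketch {
    static func prepareAudioSession() throws {
        let session = AVAudioSession.sharedInstance()
        // .playAndRecord allows playback and microphone capture at the same time.
        // .defaultToSpeaker sends output to the loudspeaker rather than the
        // receiver (earpiece), which is the default route for this category on iPhone.
        try session.setCategory(.playAndRecord, mode: .default, options: [.defaultToSpeaker])
        // Activate the session so the configuration takes effect.
        try session.setActive(true)
    }
}
```

Keeping this in one function means every feature (playback, recording, camera) goes through the same configuration instead of each library setting its own category.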
Improved Volume Management
By implementing native volume control and audio routing, we achieved an ideal balance between playback and microphone input. This completely eliminated the volume degradation issue that had occurred with expo-av.
For playback, we used AVAudioPlayer, and for recording, AVAudioRecorder. Communication between the native layer and JavaScript was handled via events.
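As an illustration of event-based communication, the sketch below forwards an `AVAudioPlayer` completion callback to JavaScript through the Expo Modules API. The module name `AudioSketch`, the event name `onPlaybackFinished`, and the `PlaybackObserver` helper are hypothetical; the article only confirms that events were used, not how they were wired.

```swift
import AVFoundation
import ExpoModulesCore

/// Hypothetical helper: AVAudioPlayerDelegate requires an NSObject subclass,
/// so the callback is forwarded to a closure the module can set.
final class PlaybackObserver: NSObject, AVAudioPlayerDelegate {
  var onFinish: ((Bool) -> Void)?

  func audioPlayerDidFinishPlaying(_ player: AVAudioPlayer, successfully flag: Bool) {
    onFinish?(flag)
  }
}

public class AudioSketchModule: Module {
  // A player created elsewhere would set `player.delegate = observer`.
  private let observer = PlaybackObserver()

  public func definition() -> ModuleDefinition {
    Name("AudioSketch")

    // Events must be declared before they can be sent to JavaScript.
    Events("onPlaybackFinished")

    OnCreate {
      self.observer.onFinish = { [weak self] success in
        self?.sendEvent("onPlaybackFinished", ["success": success])
      }
    }
  }
}
```

On the JavaScript side, a listener subscribed to `onPlaybackFinished` would receive the `success` flag without polling the native layer.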
Creating a centralized session manager brought immediate and measurable benefits. Our custom module not only solved the original problem but also introduced a range of improvements:
the open source solutions in key parts of the application, such as the player and dialogue reader services.
Building our own audio session management module was a strategic decision that allowed us to transform chaos into harmony. For complex multimedia applications where reliability is key, creating a custom, centralized system for controlling native resources is not just a way to solve problems; it's the foundation for building a stable, high-quality product.