From Chaos to Harmony – How We Built a Custom Audio Module in React Native When Popular Libraries Failed

Synchronizing advanced audio and video features is one of the hardest problems in mobile development, especially in a cross-platform environment like React Native. Developers often rely on an ecosystem of off-the-shelf libraries, but what happens when these tools create more problems than they solve?
Tags
AI Assistant
React Native


We faced this exact challenge while working on an innovative application for actors. The project required simultaneous video recording, playback of dialogue tracks, and speech recognition. This story is a case study of how the limitations of popular libraries forced us to go down to the native layer to deliver uncompromising quality and reliability.

The Client's Challenge: Unbalanced Volume Levels

The client's primary problem was seemingly simple but critical to the application's usability. Actors recording "self-tapes" needed to hear dialogue lines read by the app. However, this created chaos in the volume levels:

  • When a user increased the volume to hear the dialogue from a distance, the lines became disproportionately loud in the recorded video compared to their own voice.
  • When they lowered the device's volume to balance the sounds in the final recording, the dialogue became too quiet to be heard comfortably during the performance.

Initially, we considered two approaches: hardcoding the volume levels as a temporary fix, or the much more difficult long-term solution of audio normalization, either in real time or in post-processing.
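
To make the normalization option more concrete, here is a minimal sketch of level matching in post-processing. It is an illustration only, not the code used in the project: the function names (rms, normalizeToRms), the target level of 0.3, and the assumption that the audio is available as mono PCM floats in the range [-1, 1] are all hypothetical.

```typescript
// Hypothetical helpers: not part of expo-av, react-native-voice, or Vision Camera.
// Samples are assumed to be mono PCM floats in the range [-1, 1].

/** Root-mean-square level of a buffer; returns 0 for an empty buffer. */
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

/**
 * Scales a buffer so that its RMS level matches targetLevel.
 * Samples are clamped to [-1, 1], so heavy clipping can leave the
 * result slightly below the target. Silent buffers are copied unchanged
 * to avoid dividing by zero.
 */
function normalizeToRms(samples: Float32Array, targetLevel: number): Float32Array {
  const current = rms(samples);
  if (current === 0) return samples.slice();
  const gain = targetLevel / current;
  return samples.map((s) => Math.max(-1, Math.min(1, s * gain)));
}

// Toy data: dialogue played loudly next to a much quieter voice.
const loudDialogue = new Float32Array([0.8, -0.8, 0.8, -0.8]);
const quietVoice = new Float32Array([0.1, -0.1, 0.1, -0.1]);

const targetRms = 0.3; // assumed target level, chosen only for this example
const balancedDialogue = normalizeToRms(loudDialogue, targetRms);
const balancedVoice = normalizeToRms(quietVoice, targetRms);
```

After this step both buffers sit at roughly the same level. A real implementation would have to separate the two sources first, which is much harder once they are mixed into one microphone signal, and that is part of why normalization was the more difficult option.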

Problem Analysis: Why Off-the-Shelf Solutions Weren't Enough

Our investigation revealed that the problem wasn't in the application's logic, but in the fundamental limitations and lack of cooperation between the three key libraries the app relied on.

  • expo-av: This library suffered from slow audio loading and had settings that were either unavailable or non-functional. Worse, it drastically lowered the volume of all other playing sounds while in recording mode, which was the source of our main problem.
  • react-native-voice: It was characterized by an inconvenient API and poor error handling, which complicated implementation and debugging.
  • react-native-vision-camera: Despite its video handling capabilities, it lacked built-in audio normalization mechanisms. It also had issues with preview orientation, for which a community-provided patch was thankfully available.

The key conclusion was this: each of these packages is a separate entity. Each one managed the native audio session on its own and overwrote the configuration set by the others. Trying to find a "magic combination" of settings that would work for all of them was like hoping for a miracle. We had no cohesive control over the audio system as a whole.
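
The "last writer wins" effect can be shown with a small model. This is a simplified sketch, not the AVAudioSession API: the SharedAudioSession class, the SessionConfig fields, and the library names are hypothetical and only mimic the idea of one shared session that several independent callers configure.

```typescript
// Simplified model of one shared audio session. The field names are
// illustrative and do not correspond to real native options.
type SessionConfig = {
  category: "playback" | "record" | "playAndRecord";
  duckOthers: boolean;
};

class SharedAudioSession {
  private config: SessionConfig = { category: "playback", duckOthers: false };
  readonly writers: string[] = [];

  // Any caller may replace the whole configuration; the last write wins.
  configure(owner: string, config: SessionConfig): void {
    this.config = { ...config };
    this.writers.push(owner);
  }

  current(): SessionConfig {
    return { ...this.config };
  }
}

// Uncoordinated: three hypothetical libraries each set what they need.
const sharedSession = new SharedAudioSession();
sharedSession.configure("libraryA", { category: "playAndRecord", duckOthers: false });
sharedSession.configure("libraryB", { category: "playback", duckOthers: false });
sharedSession.configure("libraryC", { category: "record", duckOthers: true });
const uncoordinated = sharedSession.current(); // only libraryC's settings survive

// Coordinated: a single owner merges every requirement and writes once.
type Needs = { playback: boolean; recording: boolean };

function resolveConfig(needs: Needs[]): SessionConfig {
  const playback = needs.some((n) => n.playback);
  const recording = needs.some((n) => n.recording);
  const category: SessionConfig["category"] =
    playback && recording ? "playAndRecord" : recording ? "record" : "playback";
  return { category, duckOthers: false };
}

const ownedSession = new SharedAudioSession();
ownedSession.configure(
  "customModule",
  resolveConfig([
    { playback: true, recording: false }, // dialogue playback
    { playback: false, recording: true }, // video and speech capture
  ])
);
const coordinated = ownedSession.current();
```

In the uncoordinated case the final state depends on call order, which the app does not control. With a single owner, the configuration is derived once from all requirements.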

Our Solution: A Custom Native Module Built on AVFoundation

To regain full freedom in managing audio settings, we decided to build our own native module for iOS using AVFoundation. This allowed us to bypass the libraries' limitations and create a solution perfectly tailored to the application's needs.

Our module's architecture is built on three pillars:

  1. Dual Audio Management System: We created separate managers for main and temporary audio files. This enabled independent playback control, which was crucial for managing dialogue lines without affecting other in-app sounds.
  2. Enhanced Audio Session Control: Direct access to the native audio session configuration ensured consistent and predictable audio behavior across different app states, especially during simultaneous video recording. The module keeps its audio settings aligned with Vision Camera's configuration.
  3. Improved Volume Management: By implementing native volume control and audio routing, we achieved a perfect balance between the played dialogue and the microphone input. This eliminated the volume degradation issue that occurred with expo-av.
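
The first pillar can be sketched as a small state model. This is a hypothetical TypeScript illustration of the idea only: the real module is native and built on AVFoundation, and none of the names below (DualAudioManager, Channel, PlayerState) come from it. No audio is played here; the sketch only shows that each channel keeps its own state.

```typescript
// Hypothetical model of two independent players, "main" and "temporary".
type Channel = "main" | "temporary";

interface PlayerState {
  uri: string | null;
  volume: number; // 0 (silent) to 1 (full)
  playing: boolean;
}

class DualAudioManager {
  private players: Record<Channel, PlayerState> = {
    main: { uri: null, volume: 1, playing: false },
    temporary: { uri: null, volume: 1, playing: false },
  };

  // Loading a new file resets playback for that channel only.
  load(channel: Channel, uri: string): void {
    this.players[channel].uri = uri;
    this.players[channel].playing = false;
  }

  play(channel: Channel): void {
    if (this.players[channel].uri === null) {
      throw new Error(`No audio loaded on the ${channel} channel`);
    }
    this.players[channel].playing = true;
  }

  stop(channel: Channel): void {
    this.players[channel].playing = false;
  }

  // Volume is clamped to [0, 1] and applies to one channel only.
  setVolume(channel: Channel, volume: number): void {
    this.players[channel].volume = Math.max(0, Math.min(1, volume));
  }

  // Returns a copy so callers cannot mutate internal state.
  state(channel: Channel): PlayerState {
    return { ...this.players[channel] };
  }
}

// Usage: lower the dialogue without touching the temporary sound.
const manager = new DualAudioManager();
manager.load("main", "dialogue-line.m4a"); // file names are placeholders
manager.load("temporary", "cue.m4a");
manager.play("main");
manager.play("temporary");
manager.setVolume("main", 0.4);
manager.stop("temporary");
```

Keeping the two channels in separate state objects is what makes the controls independent: a volume change or stop on one channel never reads or writes the other.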

We also created the basic structure for an Android module to be implemented in the future.

Proof of Success: The Client's Verdict

The best summary of our work's impact is the feedback we received from the client. It confirms not only that we solved the key technical issues but also highlights the value of deep expertise combined with product awareness.

"I met The Widlarz Group when my team was in deep trouble. We needed to ship advanced AI audio features with a unique user interface but lacked the internal expertise. TWG guided us through complex challenges with precision, including audio normalization in react-native-vision-camera, slow sound loading in expo-av, and achieving cohesive control across multiple libraries. Beyond solving these deep technical issues, they seamlessly integrated everything into our app with exceptional UX/UI design. Thanks to their expertise, everything now runs smoothly, and the user experience is significantly better. They are a perfect blend of deep technical expertise and product awareness, and their work made all the difference in delivering a high-quality product."
